An OpenAQ Impact Story, Matt Lane, March 2025
Those of us fortunate enough to live in an area with clean air may lack awareness of the sheer magnitude of the global air pollution health epidemic. In fact, hazardous air quality accounts for more than 8 million premature deaths each year [1], with the vast majority of the impact falling on low- and middle-income countries. Vulnerable populations often reside in areas with the highest levels of dangerous pollution, emitted by sources such as vehicles, power plants, and factories.
Further, many of the most affected areas lack comprehensive air quality monitoring, a significant problem because it blocks accurate diagnosis and forecasting. Without a thorough view of local pollution levels, it is difficult for scientists to analyze the health impacts and understand the sources of pollution, for governments to adequately warn citizens of harm, and for policymakers to take corrective action.
One cause for hope is the scientific community, which is working hard to close this monitoring gap. I talked to two research groups helping advance access to comprehensive air pollution data no matter the location. Both are showcasing innovative techniques that combine satellite data, which covers the entire planet, with data generated by traditional ground-level air quality monitors, which are more accurate and granular.
The first group of atmospheric researchers I met with works on the joint NOAA/NSF NCAR project MELODIES MONET. MELODIES MONET is essentially a toolkit that helps scientists conduct air quality research more easily and efficiently while integrating remotely sensed atmospheric data with ground data. Rebecca Schwantes, a research lead on the project, is excited about the group’s progress and upcoming version 1.0 release. “Our community can now seamlessly compare simulations from numerous research and operational models against a variety of surface, aircraft, and satellite observations. This leads to a better understanding of air quality while also enabling our users more time to work on actual scientific discovery.”
I was encouraged to learn that OpenAQ is a key, foundational surface dataset that can be ingested into MELODIES MONET. Jordan Schnell, a researcher on the project, walked me through a wildfire smoke forecast use case using a next-generation experimental prediction system called RRFS (Rapid Refresh Forecast System). Such a smoke model could be used for various operational improvements, for example to accurately warn at-risk populations of a hazardous air quality forecast during a severe wildfire.
In an example below, Jordan evaluates different forecast initializations of the RRFS model at a site near Jackson, Wyoming, against on-the-ground particulate matter (PM2.5) readings shared on OpenAQ’s data platform. He can then work to improve the fit of the satellite-based model with surface measurements.
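Separately from Jordan's own analysis, the kind of model-versus-monitor comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the MELODIES MONET workflow: the function name `evaluate_forecast`, the two "initializations", and all PM2.5 values are made up for the example, and a real evaluation would first pair forecast and observation times and handle missing data.

```python
import math


def evaluate_forecast(forecast, observed):
    """Return mean bias, RMSE, and Pearson correlation for paired values."""
    if len(forecast) != len(observed) or len(forecast) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(forecast)
    errors = [f - o for f, o in zip(forecast, observed)]
    mean_bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    # Assumes neither series is constant (otherwise the denominator is zero).
    pearson_r = cov / math.sqrt(var_f * var_o)
    return {"mean_bias": mean_bias, "rmse": rmse, "pearson_r": pearson_r}


# Hypothetical hourly PM2.5 values in µg/m³, for illustration only.
observed = [12.0, 18.0, 35.0, 60.0, 42.0, 25.0]    # ground monitor
forecast_a = [10.0, 20.0, 30.0, 50.0, 45.0, 30.0]  # assumed earlier initialization
forecast_b = [12.0, 17.0, 36.0, 58.0, 44.0, 24.0]  # assumed later initialization

for name, forecast in [("init A", forecast_a), ("init B", forecast_b)]:
    stats = evaluate_forecast(forecast, observed)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Scoring each initialization with the same statistics makes it easy to see which forecast fits the surface measurements better, and whether its errors come from a consistent bias or from poor timing of the smoke peak.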
The MELODIES MONET team explains that both the convenience and the volume of data flowing from OpenAQ into the toolkit are very important. “OpenAQ brings in so many observations all the way from reference-grade to low-cost sensors. More data means better models and better science,” says Jordan. Finally, the team highlights that the global nature of OpenAQ may be its most important element. “Satellite-based aerosol analysis in different countries is made possible by the global nature of the OpenAQ dataset,” notes Zachary Moon, a NOAA research modeler.
Next, I talked to two researchers from the University of Eastern Finland and Finnish Meteorological Institute, Andrea Porcheddu and Antti Lipponen, who are also tackling the pollution monitoring coverage challenge but from a different angle, via machine learning.
They set out to improve the accuracy of converting satellite aerosol optical depth (AOD) data, which again offers ubiquitous coverage, into airborne PM2.5 estimates that can be used for air pollution monitoring [2]. They developed a novel machine-learning technique to significantly improve these satellite-based conversions. The model used data from the EU’s Sentinel-3 satellites, NASA’s MERRA-2-based AOD-to-PM conversion model, data from OpenAQ, and other datasets to approximate and then improve the conversion ratios.
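The general idea of correcting a baseline satellite conversion against ground monitors can be illustrated with a toy sketch. This is not the authors' model: a one-variable least-squares fit stands in for their machine-learning technique, the data are synthetic, and `BASELINE_RATIO`, the "true" relationship, and the noise level are all assumptions chosen for the example. The printed improvement does not reproduce any result from the paper.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y ≈ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


rng = random.Random(0)   # fixed seed so the sketch is repeatable
BASELINE_RATIO = 45.0    # hypothetical fixed AOD-to-PM2.5 factor (µg/m³ per unit AOD)


def synthetic_sample():
    """One made-up (AOD, ground PM2.5) pair; the relationship is assumed."""
    aod = rng.uniform(0.05, 0.8)
    ground_pm = 60.0 * aod + 4.0 + rng.gauss(0.0, 1.5)
    return aod, ground_pm


train = [synthetic_sample() for _ in range(200)]
test = [synthetic_sample() for _ in range(50)]

# Step 1: baseline conversion, then its error against ground monitors.
train_baseline = [BASELINE_RATIO * aod for aod, _ in train]
train_error = [pm - base for (_, pm), base in zip(train, train_baseline)]

# Step 2: learn to predict that error (linear fit as a stand-in for ML).
slope, intercept = fit_line(train_baseline, train_error)

# Step 3: add the predicted error back to the baseline on held-out data.
test_baseline = [BASELINE_RATIO * aod for aod, _ in test]
test_truth = [pm for _, pm in test]
test_corrected = [base + slope * base + intercept for base in test_baseline]

baseline_rmse = rmse(test_baseline, test_truth)
corrected_rmse = rmse(test_corrected, test_truth)
improvement_pct = 100.0 * (baseline_rmse - corrected_rmse) / baseline_rmse
print(f"baseline RMSE:  {baseline_rmse:.2f} µg/m³")
print(f"corrected RMSE: {corrected_rmse:.2f} µg/m³")
print(f"improvement:    {improvement_pct:.0f}%")
```

The ground measurements play two roles here: they supply the training target for the correction, and a held-out portion of them measures how much the correction actually helps.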
The good news is that their approach led to clear improvements in the accuracy of the air pollution predictions (from 30% to significantly more, depending on the metric). As an example, Figure 4 below shows how, at a specific site outside of Paris (below left), their satellite-based corrections track closely with the actual ground-based measurements found on OpenAQ (below right).
Next, Andrea and Antti describe to me the importance of OpenAQ in their success. “OpenAQ was crucial. The project would not have been possible without it as it would have taken too long to gather all the data,” remarks Antti. They also highlight OpenAQ’s extensive set of sources, openness, and global coverage as key strengths.
My discussions conclude on a hopeful note. It is wonderful to see OpenAQ being used by passionate researchers doing such important work. Both teams share a vision for a future where all areas affected by harmful pollution, no matter their location in the world, can leverage open science to generate accurate pollution forecasts. This, in turn, brings us one step closer to a world with clean air for all.
References
[1] Health Effects Institute. 2024. State of Global Air 2024. Special Report. Boston, MA. Health Effects Institute. https://www.stateofglobalair.org/resources/report/state-global-air-report-2024
[2] Porcheddu, Andrea, Ville Kolehmainen, Timo Lähivaara, and Antti Lipponen. 2024. Post-process correction improves the accuracy of satellite PM2.5 retrievals. Atmospheric Measurement Techniques. https://doi.org/10.5194/amt-17-5747-2024
Related OpenAQ use cases
Predicting What We Breathe, a project applying machine learning to space data and ground-based data to predict air quality
UNICEF Venture Fund-backed Startup Building Global Air Pollution Model to Map Children’s Exposure to Air Pollution, a project leveraging artificial intelligence and machine learning to fuse together real-time satellite imagery and ground-based sensing to replace stale air quality data being used in health impact assessments