How in the world do you access air quality data older than 90 days on the OpenAQ platform? One way is to use Amazon Athena.

August 2020 Update: You can now access data older than 90 days from the API, going back up to two years! Read more here. If you’d like to access even older data, the information below still applies.

This post was written by Heidi Yoon, OpenAQ’s Community Engagement Officer, and is based on a GitHub gist by our co-founder, Joe Flasher, and the User Guide for Amazon Athena.

If you would like to access air quality data from the last 3 months, then you can access the data from the OpenAQ platform using our API and other tools available through our website!

If you would like to access air quality data older than 90 days, then you can access all of the data from our S3 buckets, which are cloud storage containers managed by Amazon Web Services (AWS).

As a historical note, this was a change that occurred in late 2017 (see our blog post here!). At that point, the OpenAQ platform was housing over 100 million data points in parallel, in S3 buckets and in a database accessed by the API. Once we were managing that many data points, API performance slowed and the cost of maintaining the database increased. So, we decided that the bulk of the data would continue to be stored in S3 buckets, while only the last 90 days would be available through the API.

Anyone can access our S3 buckets.

To query the S3 buckets in ways similar to what you may have done before with the OpenAQ API, you will need a distributed query tool like Amazon Athena, Apache Spark, or Google BigQuery.

In this blog post, I’ll describe in detail how to use Amazon Athena to query our data and save your results.

As an aside, we are also working to find other ways to make the historical data more accessible. If you have ideas, please contact us! (via info@openaq.org, GitHub, or Slack.)

On to ATHENA!

Using the AWS Management Console

If this is the first time you are opening Athena, you will go to a Getting Started page. Choose Get Started. The tutorial launches automatically. Feel free to take the tutorial or close it. You can always run the tutorial later, if you wish, by clicking on Tutorial in the upper right hand corner.

Selecting Your Access Region

Creating a Table for the OpenAQ Dataset

Running Queries Using Athena

Here are a few example queries.

-We can query all the data for a particular location (or city or country), like Manama, the capital of Bahrain, using the query below.

-Or, we could conduct a more specific query. Let’s say we only need the date and PM10 values for the station named Otoka in Sarajevo, Bosnia and Herzegovina. Then the query would be as follows.
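The original query listings are not reproduced in this text, so here is a minimal sketch of what the two queries described above might look like, written as a small Python helper that builds the SQL strings. It is only an illustration and rests on several assumptions: the table is named `openaq`, and it has the columns `city`, `location`, `parameter`, `value`, and a `date` struct with a `utc` field. The exact strings stored in the data (`Manama`, `Otoka`, `pm10`) are also assumed. Check all of these against the table you created in Athena.

```python
def sql_literal(text: str) -> str:
    """Quote a Python string as a SQL string literal, doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def city_query(table: str, city: str) -> str:
    """Build a query for every measurement recorded for one city."""
    return f"SELECT *\nFROM {table}\nWHERE city = {sql_literal(city)}"


def station_parameter_query(table: str, location: str, parameter: str) -> str:
    """Build a query for only the timestamp and value of one pollutant at one station."""
    return (
        "SELECT date.utc AS date_utc, value\n"
        f"FROM {table}\n"
        f"WHERE location = {sql_literal(location)}\n"
        f"  AND parameter = {sql_literal(parameter)}\n"
        "ORDER BY date.utc"
    )


# Assumed table name; use whatever name you chose when creating the table.
TABLE = "openaq"

# First example: all data for Manama.
print(city_query(TABLE, "Manama"))
print()
# Second example: date and PM10 values for the Otoka station.
print(station_parameter_query(TABLE, "Otoka", "pm10"))
```

The printed SQL can be pasted into the Athena query editor. Selecting only the columns you need, as in the second query, also keeps the exported results smaller.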

Joe has some more examples of sample queries in his GitHub gist here! By using Athena or any other distributed query tool, you should be able to easily access any and all of the air quality data that you wish from the OpenAQ platform.

Saving Queries and Results

By clicking on the save icon in the upper corner of the Results window, you can easily export the data as a CSV file. If you don’t save your results immediately, you can also export as a CSV file later by accessing the History tab.

Athena automatically retains queries and their results for 45 days, and you can view all your recent queries using the History tab or in the S3 bucket where Athena stores query results for your AWS account.

We hope this helps unravel some of the mystery of Amazon Athena and that you can access all of the OpenAQ data for your air inequality work! As we said above, we are working to find other ways to make the historical data more accessible. Please let us know if you have ideas! (via info@openaq.org, GitHub, or Slack.)

We're making an open, real-time air quality data hub because we think it'll let people do amazing things. Want to help out? Find us at openaq.org.
