This is the web scraper that our app Nebulo uses to collect air quality data.
It's built to run on Heroku via a scheduled task that runs `npm start`.
It's perhaps the most lo-fi setup ever.
Run `npm install` and then `npm start`, which will create a bunch of JSON files in the `output/` directory.
Each scraper writes its results to `output/<scraper>.json`. All results are also combined into `output/_all.json`.
`_all.json` is a JSON array of city objects with the following shape:
```json
[
  {
    "name": "string — city or station name",
    "region": "string — country or region identifier",
    "location": {
      "lat": "number — latitude",
      "lng": "number — longitude"
    },
    "data": "number — AQI (Air Quality Index) reading"
  }
]
```

- Clone the repo
- Run `npm install`
- Copy `.env.example` to `.env` and populate the values
Found a bug or have a suggestion? Feel free to create an issue.
MIT