Session by Gianna-Carina Gruen from DW's Data Journalism Team
How to ...
- collect data from a url
- parse the data into the needed format using Python's `pandas` library
- use Python's `datawrapper` library to create a chart
- set up the script to run automatically on GitHub Actions
Basic coding or Python knowledge is helpful but not required.
- required: Access to colab.google.com (if you already have a Google account, it's included), so you can open this notebook
- required: Datawrapper API token (please set it up ahead of the session)
- required: GitHub account (if you want to set up the workflow to be automated) (please set it up ahead of the session)
- optional: Code or text editor (if you want to code along in the session), like Atom or Sublime Text
- optional: Distill Browser Plugin (if you want to set up the workflow on click)
Code: To follow along, you'll need to make yourself a copy of this Jupyter notebook on Google Colab.
Data source: For this session, we'll be working with the UNHCR data on arrivals to Europe via land and sea, more specifically with the URLs provided on the page that return the data as JSON:
- all arrivals (sea + land): "https://data.unhcr.org/population/get/timeseries?widget_id=588956&sv_id=100&population_group=4797,4798,5634&frequency=month&fromDate=2016-01-01"
- sea arrivals: "https://data.unhcr.org/population/get/timeseries?widget_id=588957&sv_id=100&population_group=4797,5634&frequency=month&fromDate=2016-01-01"
- land arrivals: "https://data.unhcr.org/population/get/timeseries?widget_id=588958&sv_id=100&population_group=4798&frequency=month&fromDate=2016-01-01"
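
As a rough illustration of the "collect" and "parse" steps, the sketch below downloads one of the URLs above and turns the response into a `pandas` DataFrame. It is not the notebook's code. The helper names (`fetch_json`, `timeseries_to_frame`) are hypothetical, and the JSON layout (records under `data` → `timeseries` with `year`, `month` and `individuals` keys) is an assumption that should be checked against the real response. The sample values are made up so the parsing step can run offline.

```python
import json
from urllib.request import urlopen

import pandas as pd

# "All arrivals" URL, copied from the list above.
ALL_ARRIVALS_URL = (
    "https://data.unhcr.org/population/get/timeseries"
    "?widget_id=588956&sv_id=100&population_group=4797,4798,5634"
    "&frequency=month&fromDate=2016-01-01"
)


def fetch_json(url):
    """Download a URL and decode its JSON body (standard library only)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def timeseries_to_frame(payload):
    """Turn the (assumed) payload shape into a tidy monthly DataFrame."""
    # ASSUMPTION: records live under payload["data"]["timeseries"] and carry
    # "year", "month" and "individuals" keys. Adjust if the real keys differ.
    records = payload["data"]["timeseries"]
    df = pd.DataFrame(records)
    # Build a proper date column from year and month (first day of the month).
    df["date"] = pd.to_datetime(df[["year", "month"]].assign(day=1))
    df["individuals"] = pd.to_numeric(df["individuals"])
    return df[["date", "individuals"]].sort_values("date").reset_index(drop=True)


# Hand-made sample in the assumed shape, with made-up numbers.
sample = {
    "data": {
        "timeseries": [
            {"year": 2016, "month": 2, "individuals": 200},
            {"year": 2016, "month": 1, "individuals": 100},
        ]
    }
}
arrivals = timeseries_to_frame(sample)
print(arrivals)
```

With network access, `timeseries_to_frame(fetch_json(ALL_ARRIVALS_URL))` would run the same steps on the live data, provided the response matches the assumed layout.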
To automatically run your script on GitHub Actions, you'll need three things (which are also included in this GitHub repo and can be downloaded at the top):
- your script as a `.py` file
- a `requirements.txt` file
- a `.yml` file placed in a folder `.github/workflows`
You'll be adding all of them to a repository in your own GitHub account. Then switch to the "Actions" tab to see whether your automated run started properly or ran into an issue that needs debugging.
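
For orientation, a workflow file of this kind might look roughly like the sketch below. It is not the file shipped in the repo. The file name, the script name `script.py`, the schedule, the Python version and the secret name `DATAWRAPPER_TOKEN` are all assumptions chosen for illustration and should be replaced with your own values.

```yaml
# .github/workflows/update-chart.yml  (hypothetical file name)
name: Update chart

on:
  schedule:
    - cron: "0 6 * * *"    # every day at 06:00 UTC (example schedule)
  workflow_dispatch:        # also allow manual runs from the "Actions" tab

jobs:
  run-script:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4          # check out the repository
      - uses: actions/setup-python@v5      # install Python
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: python script.py              # hypothetical script name
        env:
          # hypothetical secret name; add it in the repository settings
          DATAWRAPPER_TOKEN: ${{ secrets.DATAWRAPPER_TOKEN }}
```

Storing the Datawrapper token as a repository secret and passing it in as an environment variable keeps it out of the committed script. The script would then read it from the environment instead of hard-coding it.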