For this assignment, you will use Python to pull data from an API, work with it in pandas, then visualize it with matplotlib and Leaflet. Open data can be messy, and you might have to get creative. This assignment should stretch your skills and knowledge.
- Pick an interesting data set from https://data.sfgov.org/ (or more than one if necessary for the following analysis). SF uses the same Socrata platform we saw in class today.
- On the web page of the data set(s) you've chosen, click the light blue Export button and choose SODA API to access the endpoint, like we did in class.
- Create a new IPython notebook to download the data from this endpoint and manipulate it with pandas. See example below.
- Create 3 matplotlib plots of interesting aspects of the data. Describe what you're visualizing with comments. As one example, if you retrieved bike parking data from the API, you might create a histogram showing which streets in SF have the most bike racks.
- Create 2 Leaflet maps depicting features from your data. The data set(s) you use will need lat-long data to do this. Or, if you're feeling ambitious, you can use a data set with street addresses and geocode them to lat-long with the geopy Python library.
- Submit 6 files to bCourses (don't zip them up): 3 images of your matplotlib plots, 2 screenshots of your Leaflet maps, and your IPython notebook.
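For the map step, here is a minimal sketch of one possible approach, not necessarily the one used in class: write a standalone HTML page that loads Leaflet from a CDN and drops one marker per point, then open the file in a browser and take your screenshot. The sample points, the `build_map_html` helper, and the `map.html` file name are all made up for illustration; in your notebook the points would come from the lat-long columns of your own data set.

```python
import json
from pathlib import Path

# Hypothetical sample points; real rows would come from your data set.
points = [
    {"name": "Civic Center", "lat": 37.7793, "lon": -122.4193},
    {"name": "Ferry Building", "lat": 37.7955, "lon": -122.3937},
]

# Page template: Leaflet 1.9.4 from the unpkg CDN, OpenStreetMap tiles.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css">
<script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
<style>#map { height: 600px; }</style>
</head>
<body>
<div id="map"></div>
<script>
var map = L.map('map').setView([37.7749, -122.4194], 12);
L.tileLayer('https://tile.openstreetmap.org/{z}/{x}/{y}.png', {
    attribution: '&copy; OpenStreetMap contributors'
}).addTo(map);
var points = __POINTS__;
points.forEach(function (p) {
    L.marker([p.lat, p.lon]).addTo(map).bindPopup(p.name);
});
</script>
</body>
</html>
"""


def build_map_html(points):
    """Return a standalone HTML page with one Leaflet marker per point."""
    # json.dumps produces a valid JavaScript array literal for simple records.
    return PAGE.replace("__POINTS__", json.dumps(points))


html = build_map_html(points)
Path("map.html").write_text(html, encoding="utf-8")
```

The popup text is inserted as HTML, so this is only suitable for data you trust or have cleaned first.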
Here's a code snippet showing how to get data via the API in Python. It displays the address of each parking lot with a regcap greater than 2500.
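import json
import requests
import pandas as pd

# SODA endpoint; $where keeps only rows with regcap greater than 2500
url = 'http://data.sfgov.org/resource/4vvz-yypg.json?$where=regcap>2500'
response = requests.get(url)
response.raise_for_status()  # fail early on an HTTP error
data = json.loads(response.text)  # list of dicts, one per row
df = pd.DataFrame(data)  # one DataFrame row per parking lot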
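Once the data is in a DataFrame, a common next step is cleaning and plotting. SODA JSON responses often deliver numbers as quoted strings, so it is worth converting them before doing any arithmetic. The sketch below uses a few made-up records in place of a live API call; the `street` column, the sample values, and the output file name are assumptions for illustration, so check the column names in your own data set.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical records shaped like a SODA response; note the quoted numbers.
data = [
    {"street": "Mission St", "regcap": "2600"},
    {"street": "Mission St", "regcap": "2700"},
    {"street": "Howard St", "regcap": "3000"},
]
df = pd.DataFrame(data)

# Convert the string column to numbers; unparseable values become NaN.
df["regcap"] = pd.to_numeric(df["regcap"], errors="coerce")

# Total capacity per street, largest first.
total = df.groupby("street")["regcap"].sum().sort_values(ascending=False)

# Bar chart of capacity per street, saved as an image file.
fig, ax = plt.subplots()
total.plot(kind="bar", ax=ax)
ax.set_xlabel("Street")
ax.set_ylabel("Total regcap")
ax.set_title("Parking capacity by street (sample data)")
fig.tight_layout()
fig.savefig("capacity_by_street.png", dpi=150)
```

Saving with `savefig` gives you an image file you can submit directly, rather than a screenshot of the notebook.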