Data Curious 18.10.2018 - Mapping America, analysing habitable planets and predicting taxi fares with ML

18.10.2018
Welcome back. It’s another edition of Data Curious. ICYMI, the first edition is here.
This week I found loads of datasets to play with, along with some tutorials on using Python for scraping, machine learning and making charts with Bokeh. There are also some great reads on the design thinking behind data visualisation.
Let’s dive in.
Read_
Planning data projects, truth vs beauty, and making better charts
How can I plan a new data project from start to finish?
The folks over at data.world produced a checklist for planning your next data project. It's a handy resource when you're building data products with a team.
Yeah I want it →
How many people are in danger from heat exhaustion?
Climate scientists teamed up with the NYT graphics team to visualise the danger of a warming climate in the hottest regions of the world.
Sweating just thinking about it →
Is data visualisation about being accurate or being visual? i.e. should everything just be a line or bar chart?
Ah yes, the age-old question. I rediscovered this great writeup from Andy Cotgreave on the inevitable criticism within the data viz community. He sums it up nicely with a universal law:
"The longer an innovative visualization exists, the probability someone says it should have been a line/bar chart approaches 1."
So true. But beauty plays an important role in getting people to engage with the data. The key is finding a balance.
Agree / Disagree? →
How can I make better charts and data visualisations?
It's a broad question, but I honestly feel that Lisa Rost of Datawrapper may have all the answers. She has been writing some stellar how-tos and thought pieces on the Datawrapper blog for the past year, and this week she posted a "best of" post to round up the top pieces.
Reading list sorted →
Explore_
NHS waiting times, mapping all the buildings in America, and racial inequality in schools
How does your NHS Trust perform on key indicators compared to the rest of the UK?
An important question for anyone who has ever had to wait at A&E. This new interactive is both insightful and practically helpful.
Explore →
Where are all the buildings in America?
Here. The NYT broke data viz ground and mapped every single one of them in a series of beautiful black-and-white interactive maps.
How do you find racial inequality in US schools?
ProPublica published a new data viz app to find out. Explore the map and click into each district to see how schools are doing. You can also download and use all of their data.
Analyse_
Potentially habitable planets and the happiest professionals (plus more)
How many planets has NASA discovered and which are potentially habitable?
This dataset from the Planetary Habitability Laboratory has all the info you could want. You can also check out their page on possibly habitable planets...could make for some interesting visualisations.
Feelin spacy →
How can I visualise and analyse health measures across different regions of the UK?
Are certain demographics more at risk for diabetes? Where in England are people most likely to smoke? Here's the data, which you can also access via a handy API.
What is the average life satisfaction for different occupations in the UK?
What Works Wellbeing posted an analysis of life satisfaction by occupation. Most surprising finding? Clergy are happiest, despite not making nearly as much as finance professionals. You can download the data with occupation names, mean satisfaction rates and gross annual salary here for further analysis.
Learn_
Scraping with Python, a machine learning intro, and interactive charts with Bokeh
How can I scrape a website quickly for data with Python?
Julia Kho posted a nice tutorial showing how in a Jupyter Notebook. She claims it only takes four minutes too.
Unleash the spiders →
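To give a flavour of what scraping involves, here is a minimal sketch that pulls download links out of a page using only Python's standard library. It is not the tutorial's code: the `LinkCollector` class, the `extract_links` helper and the sample HTML are all made up for illustration, and the page is inlined as a string so the example runs offline. A real scraper would fetch the page over HTTP first and would likely use a dedicated parsing library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return every link target in the given HTML string, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Made-up page content, inlined so the sketch needs no network access.
SAMPLE_HTML = """
<html><body>
  <a href="data/week1.csv">Week 1</a>
  <a href="data/week2.csv">Week 2</a>
  <a href="about.html">About</a>
</body></html>
"""

# Keep only the links that point at CSV files.
csv_links = [link for link in extract_links(SAMPLE_HTML) if link.endswith(".csv")]
print(csv_links)
```

The same pattern (parse the page, filter the tags you care about, keep the attributes) carries over to most scraping jobs, whichever library you end up using.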
How can I use machine learning to predict the fare of a taxi in NYC?
Looking for a good walkthrough tutorial for your first machine learning project? Start here. William Koehrsen of Feature Labs has loads of great detailed Python tutorials on Medium, so check him out.
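If you want a feel for the idea before committing to a full tutorial, here is a deliberately tiny sketch: a one-feature linear regression that predicts a fare from trip distance, fitted by ordinary least squares in plain Python. This is a much simpler technique than a full machine learning walkthrough would cover, and the trips below are invented numbers, not real NYC taxi data. The names `fit_line` and `predict_fare` are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented example trips: distance in miles, fare in dollars.
distances = [1.0, 2.0, 3.0, 4.0]
fares = [5.0, 7.5, 10.0, 12.5]

slope, intercept = fit_line(distances, fares)


def predict_fare(distance):
    """Predict a fare for a trip of the given distance using the fitted line."""
    return intercept + slope * distance


print(round(predict_fare(5.0), 2))
```

Real fare prediction needs more features (pickup time, location, passenger count) and a proper train/test split, which is where a full tutorial earns its keep.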
How can I get started building interactive charts with Bokeh?
This tutorial from the Quant Institute is a helpful primer if you're starting from scratch.

Thanks for reading. More to come next week.
Ben_ say hi: twitter | medium | github