Next in a series of posts mining crowd-sourced Capital Bikeshare data. This one does a simple correlation between temperature and usage. If you’re really bored, see also parts 1, 2, 3, 4, 5, 6, 7, and 8.
How much does the weather matter in Capital Bikeshare customers’ decisions to hop on a big red bike? Here’s a quick overlay of Bikeshare usage vs. daily temperatures recorded at National Airport, via WeatherSource. That’s all I could easily get my hands on. (Anyone have temperature and precipitation, by day or hour, for all of 2011?)
First, here’s a quick look at Bikeshare usage and daily mean temperature recorded at National Airport in 2011. Since Bikeshare use is concentrated in the daylight hours, it would make sense to use something higher than the mean temperature, but hey. Keep in mind, as always, that the system expanded over the same period, both in demand (users and trips) and in supply (stations and bikes), so we’re also watching system growth over the course of the year.
A few observations from this:
- Take early 2011 with a grain of salt; the system was still taking hold at that point.
- During a few weeks of uncomfortable heat in July last year, usage dropped off. Makes sense to me – it was really sticky there for a while.
- What’s going on with the very low usage outlier days in the shoulder seasons? Precipitation? Special events? (Need precipitation data)
- If you discount the spring, usage tends to track pretty well with temperatures.
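For anyone who wants to put a number on "tracks pretty well," here is a minimal sketch of a Pearson correlation between daily temperature and daily trips. I didn't compute this for the chart above, so treat it as one way to do it rather than what I did. The function name `pearson_r` is my own, and the two short lists are made-up placeholders, not the real 2011 data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two series, and each series' own variation.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values for illustration only (NOT the real data).
mean_temp_f = [35, 42, 50, 58, 66, 74, 80, 84]
daily_trips = [900, 1400, 2100, 2600, 3400, 3900, 4100, 3800]

r = pearson_r(mean_temp_f, daily_trips)
print(f"Pearson r = {r:.2f}")
```

One caveat: because the system was growing all year, a single correlation over the full year mixes the temperature effect with system growth, so it's probably worth running it on a shorter window too.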
But let’s look at this just along the dimensions of usage and temperature, regardless of seasonal effects. This time, I’ll look at high temperature, rather than mean temperature. I picked a polynomial line of best fit that seemed right to me:
- It’s interesting that the data are very tightly clustered at the right and left of this graph, but more scattershot in the middle temperatures. In other words, when it dips below 40 or climbs above the mid 80s, Bikeshare use is pretty easy to predict using temperature. In the middle (say, 50s through 70s), temperature is less of a driving force in people’s decisions to bike.
- Low temperatures (below the 40s) tend to discourage use significantly; very high temperatures do too, and just as consistently, but ridership is still higher in summer heat than in winter cold.
- But the pattern is far less temperature-driven in the middle range. My guess is that precipitation, humidity and the “feel” of the weather may have more to do with people’s decision to bike in the shoulder seasons. As a year-round bike commuter I know firsthand that a day that will eventually get into the 60s and 70s can start out feeling pretty raw in the morning.
There’s probably a lot more to be learned here from ridership vs. weather, and this just begins to scratch the surface. Anyone else want to give it a whirl? Got better weather data to play with? Let me know!
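If you want to try a polynomial line of best fit yourself, here is a rough sketch of a least-squares quadratic fit of trips against high temperature. The degree (2) is an assumption on my part for the example, not necessarily what the chart above uses; `fit_quadratic` is a hypothetical helper name, and the data are made-up placeholders rather than the real 2011 numbers.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c. Returns (a, b, c)."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need equal-length series with at least 3 points")
    # Sums needed for the normal equations: s[k] = sum(x^k), t[k] = sum(x^k * y).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (c, b, a).
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("x values do not determine a quadratic")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [u - factor * v for u, v in zip(m[row], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


# Made-up placeholder values for illustration only (NOT the real data).
high_temp_f = [30, 38, 45, 52, 60, 68, 75, 82, 90, 97]
daily_trips = [700, 1100, 1900, 2300, 3300, 3500, 4200, 4300, 4000, 3500]

a, b, c = fit_quadratic(high_temp_f, daily_trips)
predicted = [a * x * x + b * x + c for x in high_temp_f]
# Residuals show how far each day sits from the fitted curve.
residuals = [y - p for y, p in zip(daily_trips, predicted)]
for temp, resid in zip(high_temp_f, residuals):
    print(f"{temp:>3} F  residual {resid:+.0f} trips")
```

Grouping those residuals by temperature band (say, below 40, 50s through 70s, above the mid 80s) and comparing their spread would be one way to check the "tight at the ends, scattershot in the middle" pattern described above. For anything beyond a quick look, a library routine such as NumPy's `polyfit` is likely the more robust choice.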


Good stuff. Thanks for doing this.
By: jokecamp on March 1, 2012
at 10:18 pm
[...] hot (or cold) to ride? Capital Bikeshare stats man JDAntos is at it again with another post about CaBi’s usage and its relationship to temperature. Good [...]
By: Link Love: No Time to Waste Edition « chasing mailboxes d.c. on March 2, 2012
at 4:51 am
Didn’t realize you were already running with this…though I did drop the ball in getting the hourly data to you. Been a hectic 2 weeks! Should do a more detailed look based on the hourly data. I don’t think daily mean/daily high captures it enough.
By: Froggie on March 2, 2012
at 7:07 am
Would love more granular data to play with! Gotta get at how it “feels” outside – humidity, precipitation, etc.
By: jdantos on March 6, 2012
at 9:04 pm
That’s kinda my thought process behind it as well. I can set up formulas to create wind chill and heat index.
By: Froggie on March 7, 2012
at 8:22 am
I have the hourly data for DC – have through Aug 2011 now, just need to get the Sept through Dec. numbers. Will provide a link later today
By: Bilsko on March 2, 2012
at 10:24 am
Sorry – it took me a while to get around to it today. Here you go:
http://min.us/mbfx0AQs0T
I think there are some blank hours – sorry, bad data-set
By: Bilsko on March 2, 2012
at 8:23 pm
I have the full datasets sitting on my office computer (I work for NOAA at the present time). It’s a matter of me getting over there to copy them down since I’m on temporary duty elsewhere this month.
By: Froggie on March 2, 2012
at 10:16 pm
Thanks for posting this. What’s the source for the hourly temps?
By: Kyle on March 6, 2012
at 7:01 pm
I posted a comment following up with details on the weather data, but it looks like it hasn’t been approved yet.
In the meantime, the source for the weather data that I provided was from Pepco’s electric interval data system for a building here in DC. The weather data that Pepco uses is provided by WeatherBank – I’m not sure where they source their data.
Additional resources for weather data are WeatherUnderground and the AWOS data sets from the Utah Climate Center at Utah State University.
By: Bilsko on March 7, 2012
at 10:12 am
A bit more info on the data sets. I model energy systems (microgrids) and frequently use weather/temperature data sets to analyze the energy consumption for a building or group of buildings. (Correlating temps and electricity consumption or gas consumption helps parse out the different energy loads for a given project.)
There are a few resources out there, depending on what level of granularity you need:
1. WeatherUnderground has good historical data with a few limitations. If you need a sizeable span of data (a year, for example), then the datasets are limited to daily values (max/min, etc.). This is fine for calculating CDD/HDD values ( https://en.wikipedia.org/wiki/Heating_degree_day ), but not much help for hourly values. WUnderground does have hourly values if you’re just looking up a single day.
2. Probably the best resource that aggregates various datasets is the Utah State University’s Climate Center. For hourly data, they provide the AWOS data that the FAA gathers at airports all around the country. Link here: http://climate.usurf.usu.edu/products/data.php?tab=awos
In addition to hourly temps, it provides precipitation too.
The system is a bit clunky, but worth bookmarking.
3. The dataset that I linked above was actually provided with the electric interval data for one of my projects here in DC. Pepco’s Interval Data system (CEO Online) has a subscription to WeatherBank (I think) which probably just pulls from another data set somewhere.
Froggie, I’m not familiar with any of the NOAA data sets – are they easily accessible (i.e., free)?
By: Bilsko on March 3, 2012
at 10:04 am
Bilsko – so sorry about that! I didn’t even realize they were sitting in the queue for approval – but they are published now, and I know where to look now. Still getting the hang of this WordPress thing.
By: jdantos on March 7, 2012
at 11:26 pm
no worries – apologies for the double post – I wasn’t sure if it was an automated approval thing or not. Probably some issue with comment length and links or something flagged the comment for approval.
By: Bilsko on March 8, 2012
at 9:21 am
They’re accessible, but they’re not free to the general public (I believe it’s like $3 or $4 per station per month). However, as a NOAA employee, I have free access to the data and so that’s what I had done before I went on temporary duty elsewhere. I plan on heading back to the office at some point this weekend to retrieve the data I pulled for DCA.
By: Froggie on March 8, 2012
at 12:11 am
So I wonder if the AWOS data sets that the Utah Climate Center hosts (along with COOP, CRN, and GSOD sets) are the same as what NOAA produces. If it’s airport-based measurement, then the AWOS is probably the same as the NOAA data.
By: Bilsko on March 8, 2012
at 9:24 am
love the charts. thanks for sharing!
By: ultrarunnergirl on March 5, 2012
at 10:21 am
Here’s the comment stuck in the moderation queue – maybe it will make it through this time:
By: Bilsko on March 7, 2012
at 10:51 am
[...] Stats man Justin (@jdantos) and I talked about the progress being made at the College Park Metro. He was telling me about his efforts to convince WMATA to make me the CP Bike and Ride King and absolute ruler (this part may have been a bit exaggerated by me). [...]
By: ‘It’s Not a Party Without You’ Utilitaire: #11 « Bicycle Bug's Blog on March 8, 2012
at 7:35 am
I wonder what the purpose of each trip is. I can’t understand how anyone riding anywhere where they need to be presentable would rather bike when it’s 85 or 90 degrees than 40 degrees.
By: Jack Cochrane on March 9, 2012
at 12:18 pm
[...] Part 1 in a (perhaps?) series of posts analyzing Capital Bikeshare usage data. This post focuses on system-level usage by a few dimensions. Check out parts 2, 3, 4, 5, 6, 7 (maps of travel patterns), 8, 9 (weather). [...]
By: Capital Bikeshare Data, Part 1 « JDAntos on May 8, 2012
at 2:05 pm
[...] Part 2 in a (now?) series of posts analyzing Capital Bikeshare usage data. This post focuses on system-level usage by trip duration. See parts 1, 3, and 4, 5, 6, 7 (maps of travel patterns), 8, 9 (weather). [...]
By: Capital Bikeshare Data, Part 2 « JDAntos on May 8, 2012
at 2:08 pm
[...] Next in a series of posts mining crowd-sourced Capital Bikeshare data. This one focuses on net “balanced-ness” across the system. See also parts 1, 2, 3, 4, 5, 6, 7, 8, 9. [...]
By: Capital Bikeshare Data, Part 6 « JDAntos on May 9, 2012
at 8:25 pm
[...] Next in a series of posts mining crowd-sourced Capital Bikeshare data. This one maps a bunch of data by station. See also parts 1, 2, 3, 4, 5, 6, 8, and 9. [...]
By: Capital Bikeshare Data, Part 7: Maps Edition « JDAntos on May 9, 2012
at 8:36 pm
[...] tracked a year’s movement on a single bike. Another created a scatter plot of bikeshare usage by temperature. And this guy actually used the data to calculate that the average biker was riding downhill at an [...]
By: Washington, District Of Cycling — SEO Freelance Writer on September 7, 2012
at 11:28 am
[...] tracked a year’s movement on a single bike. Another created a scatter plot of bikeshare usage by temperature. And this guy actually used the data to calculate that the average biker was riding downhill at an [...]
By: Washington, District Of Cycling - Socially Savvy! on September 10, 2012
at 4:15 am