Posted by: jdantos | March 1, 2012

Capital Bikeshare: Usage and Weather

Next in a series of posts mining crowd-sourced Capital Bikeshare data. This one does a simple correlation between temperature and usage. If you’re really bored, see also parts 1, 2, 3, 4, 5, 6, 7, and 8.

How much does the weather matter in Capital Bikeshare customers’ decisions to hop on a big red bike?  Here’s a quick overlay of Bikeshare usage vs. daily temperatures recorded at National Airport, via WeatherSource.  That’s all I could easily get my hands on.  (Anyone have temperature and precipitation, by day or hour, for all of 2011?)

First, here’s a quick look at Bikeshare usage and daily mean temperature recorded at National Airport in 2011. Since Bikeshare use is concentrated in the daylight hours, it would make more sense to use something higher than the mean temperature, but hey. Keep in mind, as always, that the system expanded over the same period, both in users/trips/demand and in stations/bikes/supply, so we’re also watching system growth over the course of the year.

Bikeshare usage vs. average temperatures in 2011

A few observations from this:

  • Take early 2011 with a grain of salt; the system was still taking hold at that point.
  • During a few weeks of uncomfortable heat in July last year, usage dropped off. Makes sense to me – it was really sticky there for a while.
  • What’s going on with the very low usage outlier days in the shoulder seasons? Precipitation? Special events? (Need precipitation data)
  • If you discount the spring, usage tends to track pretty well with temperatures.
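For anyone who wants to put a number on "tracks pretty well," here is a minimal sketch of a Pearson correlation between daily mean temperature and daily trips. The function name `pearson_r` and the sample values are my own placeholders (the numbers are made up for illustration, not real Bikeshare or National Airport data). Note that a plain correlation like this does not separate out system growth over the year.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values, NOT real Capital Bikeshare or weather data.
mean_temps_f = [30, 38, 45, 55, 63, 72, 80, 85]
daily_trips = [900, 1200, 1800, 2600, 3100, 3900, 4200, 4000]

r = pearson_r(mean_temps_f, daily_trips)
print(f"Pearson r = {r:.2f}")
```

With real data you would feed in one temperature and one trip count per day, in the same date order.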

But let’s look at this just along the dimensions of usage and temperature, regardless of seasonal effects. This time, I’ll look at high temperature, rather than mean temperature. I picked a polynomial line of best fit that seemed right to me:

Average high temperature vs. Bikeshare usage in 2011. Each dot = one day in 2011.

  • It’s interesting that the data are very tightly clustered at the right and left of this graph, but more scattershot in the middle temperatures. In other words, when it dips below 40 or climbs above the mid-80s, Bikeshare use is pretty easy to predict using temperature. In the middle (say, 50s through 70s), temperature is less of a driving force in people’s decisions to bike.
  • Low temperatures (below the 40s) tend to discourage use significantly; very high temperatures discourage it just as consistently, though ridership in summer heat is still higher than in the cold.
  • But the pattern is far less temperature-driven in the middle range. My guess is that precipitation, humidity and the “feel” of the weather may have more to do with people’s decisions to bike in the shoulder seasons. As a year-round bike commuter I know firsthand that a day that will eventually get into the 60s and 70s can start out feeling pretty raw in the morning.
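One way to check the "tight at the extremes, scattered in the middle" observation is to fit a curve and compare how far the points sit from it in each temperature band. The sketch below is not how the chart above was made. It assumes a quadratic (the post does not state the polynomial's degree), and the names `fit_quadratic` and `residual_spread_by_band`, the band edges, and the sample values are all hypothetical. The numbers are made up purely to show the mechanics.

```python
from statistics import pstdev


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # Power sums of x fill the 3x3 normal-equation matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix [A | t].
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def residual_spread_by_band(xs, ys, coeffs, bands):
    """Standard deviation of fit residuals within each [low, high) band."""
    a, b, c = coeffs
    spread = {}
    for low, high in bands:
        residuals = [
            y - (a + b * x + c * x * x)
            for x, y in zip(xs, ys)
            if low <= x < high
        ]
        spread[(low, high)] = pstdev(residuals) if residuals else None
    return spread


# Made-up illustrative values, NOT real Capital Bikeshare or weather data.
high_temps_f = [32, 35, 38, 52, 55, 60, 65, 70, 75, 88, 92, 95]
daily_trips = [800, 900, 1000, 1500, 3200, 2000, 4000, 2500, 4300, 4200, 4000, 3800]

coeffs = fit_quadratic(high_temps_f, daily_trips)
bands = [(0, 50), (50, 80), (80, 110)]  # assumed cut points, in degrees F
spread = residual_spread_by_band(high_temps_f, daily_trips, coeffs, bands)
for band, sd in spread.items():
    print(band, None if sd is None else round(sd))
```

A larger spread in a band means temperature alone explains less of the day-to-day variation there, which is where precipitation or humidity data would be worth adding.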

There’s probably a lot more to be learned here from ridership vs. weather, and this just begins to scratch the surface. Anyone else want to give it a whirl? Got better weather data to play with? Let me know!

Responses

  1. Good stuff. Thanks for doing this.

  2. […] hot (or cold) to ride? Capital Bikeshare stats man JDAntos is at it again with another post about CaBi’s usage and its relationship to temperature. Good […]

  3. Didn’t realize you were already running with this…though I did drop the ball in getting the hourly data to you. Been a hectic 2 weeks! Should do a more detailed look based on the hourly data. I don’t think daily mean/daily high captures it enough.

    • Would love more granular data to play with! Gotta get at how it “feels” outside – humidity, precipitation, etc.

      • That’s kinda my thought process behind it as well. I can set up formulas to create wind chill and heat index.

  4. I have the hourly data for DC – have through Aug 2011 now, just need to get the Sept through Dec. numbers. Will provide a link later today

  5. Sorry – it took me a while to get around to it today. Here you go:

    http://min.us/mbfx0AQs0T

    I think there are some blank hours – sorry, bad data-set

    • I have the full datasets sitting on my office computer (I work for NOAA at the present time). It’s a matter of me getting over there to copy them down since I’m on temporary duty elsewhere this month.

    • Thanks for posting this. What’s the source for the hourly temps?

      • I posted a comment following up with details on the weather data, but it looks like it hasn’t been approved yet.

        In the meantime, the source for the weather data that I provided was from Pepco’s electric interval data system for a building here in DC. The weather data that Pepco uses is provided by WeatherBank – I’m not sure where they source their data.

        Additional resources for weather data are WeatherUnderground and the AWOS data sets from the Utah Climate Center at Utah State University.

  6. A bit more info on the data sets. I model energy systems -microgrids- and frequently use weather/temperature data sets to analyze the energy consumption for a building or group of buildings. (Correlating temps and electricity consumption or gas consumption helps parse out the different energy loads for a given project.)

    There are a few resources out there, depending on what level of granularity you need:
    1. WeatherUnderground has good historical data with a few limitations. If you need a sizeable span of data (a year, for example), then the datasets are limited to daily values (max/min, etc.). This is fine for calculating CDD/HDD values ( https://en.wikipedia.org/wiki/Heating_degree_day ), but not much help for hourly values. WUnderground does have hourly values if you’re just looking up a single day.
    2. Probably the best resource that aggregates various datasets is the Utah State University’s Climate Center. For hourly data, they provide the AWOS data that the FAA gathers at airports all around the country. Link here: http://climate.usurf.usu.edu/products/data.php?tab=awos
    In addition to hourly temps, it provides precipitation too.
    The system is a bit clunky, but worth bookmarking.
    3. The dataset that I linked above was actually provided with the electric interval data for one of my projects here in DC. Pepco’s Interval Data system (CEO Online) has a subscription to WeatherBank (I think) which probably just pulls from another data set somewhere.

    Froggie, I’m not familiar with any of the NOAA data sets – are they easily accessible (i.e. free)?

    • Bilsko – so sorry about that! I didn’t even realize they were sitting in the queue for approval – but they are published now, and I know where to look now. Still getting the hang of this WordPress thing.

      • no worries – apologies for the double post – I wasn’t sure if it was an automated approval thing or not. Probably some issue with comment length and links or something flagged the comment for approval.

    • They’re accessible, but they’re not free to the general public (I believe it’s like $3 or $4 per station per month). However, as a NOAA employee, I have free access to the data, so that’s what I had done before I went on temporary duty elsewhere. I plan on heading back to the office at some point this weekend to retrieve the data I pulled for DCA.

      • So I wonder if the AWOS data sets that the Utah Climate Center hosts (along with COOP, CRN, and GSOD sets) are the same as what NOAA produces. If it’s airport-based measurement, then the AWOS is probably the same as the NOAA data.

  7. love the charts. thanks for sharing!

  9. […] Stats man Justin (@jdantos) and I talked about the progress being made at the College Park Metro. He was telling me about his efforts to convince WMATA to make me the CP Bike and Ride King and absolute ruler (this part may have been a bit exaggerated by me). […]

  10. I wonder what the purpose of each trip is. I can’t understand how anyone riding anywhere where they need to be presentable would rather bike when it’s 85 or 90 degrees than 40 degrees.

  11. […] Part 1 in a (perhaps?) series of posts analyzing Capital Bikeshare usage data. This post focuses on system-level usage by a few dimensions. Check out parts 2, 3, 4, 5, 6, 7 (maps of travel patterns), 8, 9 (weather). […]

  12. […] Part 2 in a (now?) series of posts analyzing Capital Bikeshare usage data. This post focuses on system-level usage by trip duration. See parts 1, 3, and 4, 5, 6, 7 (maps of travel patterns), 8, 9 (weather). […]

  13. […] Next in a series of posts mining crowd-sourced Capital Bikeshare data. This one focuses on net “balanced-ness” across the system. See also parts 1, 2, 3, 4, 5, 6, 7, 8, 9. […]

  14. […] Next in a series of posts mining crowd-sourced Capital Bikeshare data. This one maps a bunch of data by station. See also parts 1, 2, 3, 4, 5, 6, 8, and 9. […]

  15. […] tracked a year’s movement on a single bike. Another created a scatter plot of bikeshare usage by temperature. And this guy actually used the data to calculate that the average biker was riding downhill at an […]


