Posted by: jdantos | February 13, 2012

Capital Bikeshare Data, Part 7: Maps Edition

Next in a series of posts mining crowd-sourced Capital Bikeshare data. This one maps a bunch of data by station. See also parts 1, 2, 3, 4, 5, 6, 8, and 9.

Ridership Over Time: I want to highlight from the comments a great visualization from Bilsko, showing ridership over time of day and year all at once. This “heat map” really shows how the “summer effect” on Bikeshare has as much to do with the longer hours of daylight in the evenings as with the pleasant temperatures. This is definitely worth a good look. Very cool!

Where Are All the Casual Users? Are Capital Bikeshare’s casual riders all taking long rides on the Mall, as the stereotype would lead you to believe? Are all the registered users using it to commute to work? Or are “casual” users really tourists, or just infrequent native users? To help shed some light, I mapped the percentage of trips originating at each station, by user type:

[Map: Bikeshare trips begun by casual users, by station.]

A few observations:

  • While the stations around the Mall, the White House, and Georgetown are unusually “casual” compared to the rest of the system, there are exceptions to that rule.
    • Casual ridership is relatively high in areas that aren’t your “typical” tourist destinations – Courthouse, Pentagon City, and the SW Waterfront, and two stations to the east of the river – near Fort Dupont at Minnesota Ave and Branch Ave, and at Pennsylvania and Branch Ave (although they’re relatively new).
    • Conversely, casual ridership seems low in areas where I’d expect to see more tourists – look at the swath between Farragut Square, Dupont, and U Street. Maybe that’s because there’s so much ridership in that area that the regular riders are simply overwhelming the casual users.
  • Casual ridership may be high at new stations, as people try it out. As time goes by, people may sign up for the year, and the station turns colors.
  • Look at how “local” the H Street corridor is (I live near there, so it catches my eye). Also, the area towards Petworth I called “Mid-City North” earlier – very local.
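
For anyone who wants to reproduce the casual-share figure behind a map like this, here is a minimal sketch. It assumes each trip record carries a start station and a user type; the field layout, the "Casual"/"Registered" labels, and the sample rows are all made up for illustration and are not the actual dataset.

```python
from collections import defaultdict

def casual_share_by_station(trips):
    """Return {station: fraction of trips begun there by casual users}.

    `trips` is an iterable of (start_station, user_type) pairs.
    The "Casual" label is an assumption about how the data is coded.
    """
    totals = defaultdict(int)
    casual = defaultdict(int)
    for start_station, user_type in trips:
        totals[start_station] += 1
        if user_type == "Casual":
            casual[start_station] += 1
    return {station: casual[station] / totals[station] for station in totals}

# Made-up sample rows, only to show the shape of the input.
sample_trips = [
    ("Station A", "Casual"),
    ("Station A", "Casual"),
    ("Station A", "Registered"),
    ("Station B", "Registered"),
    ("Station B", "Registered"),
]
shares = casual_share_by_station(sample_trips)
```

Note that a share like this says nothing about volume: a station with three trips can look just as "casual" as one with three thousand.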

Where Is the Balance? It’s clear some stations are more imbalanced than others, over the long term – but how does that look on the map? Are low-lying stations net receivers, and hilltop stations net “senders”?

[Map: Net sender (red) or receiver (blue) Capital Bikeshare stations.]

Sure enough, the stations in Mt. Pleasant, Columbia Heights, and other neighborhoods uphill from downtown, are net senders. (Corey also confirmed that the average Bikeshare trip is net downhill.) Stations around the Mall are net receivers. I’ve shown this in raw numbers rather than as percent of total ridership, so the tan stations that look “in balance” may in fact be low ridership overall.
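
The raw balance described above can be sketched as departures minus arrivals per station. This is only an illustration under assumptions: trips are (start, end) pairs, a positive number means net sender, and the station names and rows below are invented.

```python
from collections import Counter

def net_balance(trips):
    """Return {station: departures - arrivals}.

    Positive values are net senders, negative values net receivers.
    `trips` is an iterable of (start_station, end_station) pairs.
    """
    trips = list(trips)
    departures = Counter(start for start, _ in trips)
    arrivals = Counter(end for _, end in trips)
    stations = set(departures) | set(arrivals)
    # Counter returns 0 for stations that never appear on one side.
    return {s: departures[s] - arrivals[s] for s in stations}

# Made-up sample: three trips leave "Hilltop", one comes back.
sample_trips = [
    ("Hilltop", "Downtown"),
    ("Hilltop", "Downtown"),
    ("Downtown", "Hilltop"),
    ("Hilltop", "Mall"),
]
balance = net_balance(sample_trips)
```

Because every trip adds one departure and one arrival, the balances always sum to zero across the system. Dividing each by the station's total trips would give the percent-of-ridership view mentioned above.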

How Long Do You Ride? Whenever I see people on those red bikes, I wonder where they’re going. Is this a 5-minute dash to make a long-ish walk a lazy ride? Or is this a serious 25-minute haul? Does this pattern change depending on whether you’re in Woodley Park or Crystal City? How does the system serve different needs in different places? Here’s average trip duration, in minutes, by station of origin:

[Map: Capital Bikeshare trips by duration, by origin station.]

I had to exclude trips over 4 hours, since I’m working with a mean (not a median), and some very long trips were throwing things off.

  • The Rosslyn-Ballston corridor is interesting – it looks like trips from Ballston, Virginia Square, and the stations further west near Clarendon stay local, with shorter trips. But at Courthouse and Rosslyn, average trip length goes up to 25+ minutes, suggesting riders are going across the Potomac River to DC. The “breakpoint” seems to be about Courthouse.
  • The different trip lengths at Crystal City mystify me – any suspicion on what’s going on here?
  • Interesting that the stations in Ward 8 (more or less) seem to be shorter than the stations in Ward 7. I would guess it has something to do with the speed and quality of the Anacostia river crossings, but I don’t know why trips from Minnesota Ave. area would be so much longer than those from historic Anacostia.
  • The gap in stations on the Mall, plus the tourist ridership, seem to be driving longer trip lengths. I imagine this will drop this summer, as stations go in on NPS land on the Mall.
  • I’m surprised at how short the trips are, even from neighborhoods north of U Street NW (“Mid-City North”). Trips from these neighborhoods seem to take less time than trips from neighborhoods similarly far from downtown. Any ideas what’s going on here?
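
To show why the 4-hour cutoff matters when working with a mean, here is a rough sketch that computes the trimmed mean alongside the raw mean and the median for each origin station. The input layout (station, duration in minutes) and the sample numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, median

MAX_MINUTES = 4 * 60  # trips longer than this are left out of the trimmed mean

def duration_summary(trips, max_minutes=MAX_MINUTES):
    """Summarize trip durations (in minutes) by origin station.

    `trips` is an iterable of (start_station, duration_minutes) pairs.
    """
    by_station = defaultdict(list)
    for station, minutes in trips:
        by_station[station].append(minutes)

    summary = {}
    for station, durations in by_station.items():
        kept = [d for d in durations if d <= max_minutes]
        summary[station] = {
            "trimmed_mean": mean(kept) if kept else None,
            "raw_mean": mean(durations),
            "median": median(durations),
        }
    return summary

# Made-up sample: two ordinary trips and one 10-hour outlier.
sample_trips = [("Station A", 10), ("Station A", 14), ("Station A", 600)]
summary = duration_summary(sample_trips)
```

In this toy case the single 10-hour trip drags the raw mean to 208 minutes, while the trimmed mean (12) and the median (14) both stay close to what a typical ride looks like.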

Okay, next time I want to delve into maps like this, but I realize I need to adjust for the length of time each station has been open. Stay tuned:

Preview of visualizations to come. Still need to adjust for length of time the station has been open.
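
One possible way to make that adjustment is to divide each station's trips by the number of days it was open during the period. This is a sketch of that idea only, not a settled method; the opening dates and trip counts below are invented.

```python
from datetime import date

def trips_per_day(total_trips, opened, period_start, period_end):
    """Average trips per day over the days the station was open.

    Days are counted inclusively from the later of `opened` and
    `period_start` through `period_end`. Returns None if the station
    was not open at all during the period.
    """
    first_day = max(opened, period_start)
    days_open = (period_end - first_day).days + 1
    if days_open <= 0:
        return None
    return total_trips / days_open

year_start, year_end = date(2011, 1, 1), date(2011, 12, 31)

# Made-up stations: one open all year, one installed on December 1.
old_station = trips_per_day(3650, date(2010, 9, 20), year_start, year_end)
new_station = trips_per_day(310, date(2011, 12, 1), year_start, year_end)
```

Here both stations come out at 10 trips per day even though their yearly totals differ by more than a factor of ten. The catch is seasonality: the new station's rate reflects December only, so it is still not a like-for-like comparison.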


Responses

  1. Another excellent post. I think one of the main differences in trip duration between the mid-city stations vs the other non-downtown stations is the density.

    For example, in Georgetown, Tenleytown, or Rosslyn, or EOTR there are only a handful of stations within a 15-minute ride (due in part to the rivers and Rock Creek Park). In contrast, a ride from 16th and U has roughly 40 stations within a 10-minute ride, making it much more likely to be a short-duration heavy station.

    • Thanks Jacques! Good point about density… how about plotting “number of stations nearby” against “average trip duration”?

      I also wonder about the effect of loop trips that begin and end at the same (or very nearby) station. That would produce long (duration) trips, even at stations further from other stations.

  2. Interesting stuff. I think the 2nd to last map (trip duration) might be more useful or look more interesting if you had set ranges for the times. Having one end be 0-12 minutes and then several 1.5 minute intervals doesn’t help us much.

    Also, an interesting variation on that map would be if you plotted each station as a distance circle for its average trip length. (maybe based on trip duration multiplied by 8mph or something?)

    • Thanks Matt! You’re right, I chose “quartiles” as the scale on the map to force some more differentiation in colors, since there is such a large number of trips in the 10-20 minute range. This probably overemphasizes differences in trip duration – but I figured it was worth it, since the averages over a lot of trips tend to cluster together.

  3. I’d have done the “balance” map a little differently, with a grey color as the “neutral color” and something a little bolder for the net receiver stations.

    How many multi-hour trips originated in Crystal City? Reason I’m asking is because I occasionally see CaBi bikes in Old Town and even further south into Fairfax County on the MVT. It’s possible that these trips are skewing the Crystal City numbers.

    • Good idea Froggie – do you want to take a crack at it? I’m happy to send over the data.

  4. Crystal City: I tend to think of them as two different types of stations. One, a hub-and-spoke system from 18th/Bell to the outlying stations near large office/residential buildings. You can see these as the green stations. The red/orange stations are either too close to 18th/Bell to be used in this way and/or are the closest ones to the GW Parkway trail entrance and are likely inflated by trips starting/ending there.

    As for the net sender issues, you can almost draw a line along Florida Avenue from Rock Creek Park to Georgia Avenue, and that divides the stations. It’s the elevation change as well as the fact that uphill trips in the winter would be in the dark. I know it’s not exactly the most statistically sound measure with a data set like this, but there’s an r-squared of 0.89 between start station elevation and (downhill) elevation change of the end station. And anecdotally I know many people who only ride downhill from Columbia Heights in the morning and Metro home in the evening.

    (Also, I assume you got the tweet earlier for the dataset of station elevation and distance, if not it’s at http://www.coreyholman.com/dist.csv)

    • Awesome, thanks Corey. I’ll check out the data and try to integrate it!

      Interesting about the Florida Avenue boundary. It makes sense, but I wonder why (or if?) we don’t see a similar pattern at other elevation changes, e.g. Capitol Hill, or upper NW?

      I often wonder about this as I slog (anecdotally) up the hill over the Capitol Grounds on my bike every night…

    • Wow, btw, an r-sq. of 0.89 between two variables right off the bat like that is usually pretty meaningful. Is this at the individual trip level, or the OD pair level? Can you show a chart?

      • It’s just station elevation (from the CSV file) vs. the average elevation change of each trip from the originating station. So 140 observations. Chart:

        The biggest outliers are:
        *Wilson & Franklin (small sample size; trips so far have tended to stay above the Rosslyn/Courthouse hill, so there are low elevation changes for each trip)
        *Tenleytown (over 51% of trips stay in the Upper NW node of Tenleytown/AU/Cathedral Heights with little elevation change)
        *Calvert & Woodley (this is the noted last-mile trip between here and Adams Morgan which is slightly uphill so the average trip is uphill from here despite a rather high station elevation)

        Again, a simple linear regression violates all sorts of basic statistics given what we know about the population, but still.

  5. Interesting stats. As for Crystal City, maybe more of the trips from the station near the Metro end up in D.C. It’s tricky to ride on a slow CaBi bike from Crystal City to most areas in D.C. in less than 30 minutes. Only a few D.C. stations lie within a comfortable 30-min. range. That should change once the National Mall stations are in place. Crystal City to Jefferson Memorial is easy to cover in much less than 30 minutes, even on a CaBi bike.

    I also want to point out that there are no stations in Ballston or Virginia Square yet. Those stations will be added soon, with the first stations in Virginia Square being installed by the end of the month. Or so we’ve been promised.

    • Thanks Michael! Good point about Crystal City; I don’t know the area that well. I’ve only ridden a CaBi across the 14th St. Bridge once, to get to the farmer’s market – and it took me 45 minutes from downtown.

      You’re right about Ballston – my mistake, I’ll change the text.

    • Crystal/Pentagon City. In terms of leaving the CC-PC node, 29% of trips originating from the 20th/Crystal station leave the node. That’s consistent with it being the closest station to the GW Parkway Trail. Only 3% of the trips from the S Glebe/Potomac Ave station leave the node. Here’s a table of station, share of trips leaving the node, then that share broken down by (registered / casual):

      20th & Crystal Dr 29% (26% / 35%)
      S Joyce & Army Navy Dr 23% (22% / 26%)
      15th & Crystal Dr 22% (20% / 24%)
      12th & Hayes St 16% (13% / 27%)
      15th & Hayes St 15% (11% / 22%)
      18th & Hayes St 13% (12% / 19%)
      18th & Bell St 12% (10% / 21%)
      20th & Bell St 11% (5% / 27%)
      12th & Army Navy Dr 9% (5% / 26%)
      23rd & Crystal Dr 8% (5% / 19%)
      23rd & Eads 7% (5% / 16%)
      26th & Crystal Dr 5% (4% / 18%)
      27th & Crystal Dr 5% (2% / 18%)
      S Glebe & Potomac Ave 3% (2% / 10%)

  6. By the way, what do you all think is the fairest way to adjust for the fact that some stations have been in place only a short time, and some a long time, in these maps? E.g., the last map in this post shows total trips in 2011, but a lot of the Rosslyn-Ballston corridor looks very small because many of the stations have only been in service for a couple of months.

    Should I divide by the number of days the station has been open, i.e. “trips per day”? Or, easier, should I just show data for, say, December? Showing December only would show a “winter” bias. On the other hand, the newer stations will HAVE to show a winter bias since they don’t have a summer under their belt yet.


