
Behind the scenes with MBTA data.

This article compares month-to-month ridership across multiple modes.

The MBTA Customer Opinion Panel is a group of several thousand riders who have signed up to tell us about their experiences every few months. Panel members receive a survey every three months (so a third of the panel receives a survey every month). The Panel survey collects information on the riders’ most recent trip and their perceptions of the service quality for that trip and for the MBTA in general. If you’re interested in contributing, you may sign up for the panel here.

We know from other surveys that satisfaction with service quality varies by mode, so in order to generalize to the MBTA population, we use the reported trip to classify respondents as Commuter Rail riders, Rapid Transit riders, Bus riders, or both Rapid Transit and Bus riders (more details here: http://www.mbtabackontrack.com/blog/56-passenger-satisfaction-by-mode).

Recently, we noticed something unexpected when we looked at our panelists' most recent trip. Different months seemed to show different proportions of bus, subway, or bus-and-subway usage. 

For example, respondents in April 2017 were more likely than respondents in February 2017 to report a most recent trip taken by rapid transit only or by commuter rail, and less likely to report a trip taken by bus only or by both bus and rapid transit. The percentage of respondents who had used only rapid transit rose from 28% to 33%, and the percentage who had used the commuter rail at all rose from 37% to 40%. Meanwhile, the percentage who had used only bus service fell from 16% to 12%, and the percentage who had used both bus and rapid transit service fell from 18% to 15%.

We were curious if this represented actual differences in customers' behavior in different months. For example, maybe customers use services closer to their homes in cold winter weather - someone could choose to walk longer to a train station in nice weather, but take a bus to the train station in cold weather. In that case, we would expect to see more mixed-mode (bus-and-subway) usage in winter months.

On the other hand, the Customer Opinion Panel survey has a relatively small sample size each month, and each month's panel is composed of a different mix of panelists. So, the variation could be random rather than a reflection of overall travel patterns. We wanted to find out which explanation was more likely.

Method

We turned to data from our automated fare collection (AFC) system, which records cash payments, tickets, and CharlieCard taps. This gives us a much larger and more complete record of bus and rail usage than the customer survey.

We decided to test two Tuesdays in different seasons: one in February, and one in April. This initial exploration would tell us whether we needed to launch a larger investigation into systematic differences in bus vs. rail usage between seasons. If our fare collection system recorded roughly the same proportion of bus, rail, and combined bus/rail trips in April and February, we would feel comfortable assuming that the monthly variation in the survey was probably noise from the small sample.

The AFC data gives us the tap time, payment method, fare collection device (faregate or farebox), and the location of the device, which we used to determine if the tap happened onboard a bus or Light Rail vehicle, or at a gated station.  

But we couldn't just look at individual taps. Remember that we needed to know which trips were solely bus, which were solely subway, and which involved both bus and subway. Each CharlieCard tap represents a ride on a single vehicle (or multiple subway vehicles, in the case of behind-the-gate transfers). But a complete journey from Point A to Point B might involve multiple rides on multiple vehicles. All bus-and-subway trips are, by definition, multi-ride journeys. So, to group taps—or rides—up to the journey level, we needed to know which rides were part of the same journey.

This turned out to be a challenge. For one thing, we couldn't use tap data from cash payments or single-ride tickets. If you pay cash onboard a bus, then later take a second bus with cash, we have no way of knowing that the same person took those two rides. So, we limited our search to CharlieCard or Ticket users, and assumed that cash riders have the same distribution (or at least that this distribution doesn’t vary by season independently of the card and ticket variations). 

Next, we had to decide which rides by a passenger constituted part of the same journey. Imagine a commuter who takes 4 rides in one day. Here's how her rides might appear in our data.

CharlieCard / Ticket Number    Tap Time    Mode
12345                          7:05 AM     Bus
12345                          7:21 AM     Subway
12345                          3:45 PM     Subway
12345                          4:12 PM     Bus

Looking at this data, it seems clear that this rider takes two journeys: one in the morning, and one in the afternoon. So how are we, as humans, making this call? Rides that are close together in time, like the 7:05 AM ride and the 7:21 AM ride, appear to be part of the same journey. Rides that are far apart, like the 7:21 AM ride and the 3:45 PM ride, appear to be part of different journeys. So, our algorithm looked at the interval of time between rides by the same passenger. 

CharlieCard / Ticket Number    Tap Time    Mode      Interval Since Previous Ride (h:mm)
12345                          7:05 AM     Bus       n/a
12345                          7:21 AM     Subway    0:16
12345                          3:45 PM     Subway    8:24
12345                          4:12 PM     Bus       0:27

We used this interval to number the journeys. The first ride of the day started journey #1. For each subsequent ride, if the interval since the previous ride was under 2 hours, we treated that ride as part of the same journey (it gets the same journey number as the previous ride). If the interval was more than 2 hours, we treated that ride as the start of the next journey.

CharlieCard / Ticket Number    Tap Time    Mode      Interval Since Previous Ride (h:mm)    Journey Number
12345                          7:05 AM     Bus       n/a                                    1
12345                          7:21 AM     Subway    0:16                                   1
12345                          3:45 PM     Subway    8:24                                   2
12345                          4:12 PM     Bus       0:27                                   2
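
The numbering rule can be sketched in a few lines of Python. This is a minimal illustration, not the code we ran: the function name, the tuple layout, and the calendar date are hypothetical, the input is assumed to cover a single service day, and a gap of exactly 2 hours (which the rule above does not address) is assumed to start a new journey.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Threshold from the rule above; a gap of exactly 2 hours starting a new
# journey is an assumption of this sketch.
JOURNEY_GAP = timedelta(hours=2)


def number_journeys(taps):
    """Assign a journey number to each tap.

    taps: iterable of (card_id, tap_time, mode) tuples for one service day.
    Returns a list of (card_id, tap_time, mode, journey_number) tuples.
    """
    # Sort by card, then by time, so each card's rides are in order.
    ordered = sorted(taps, key=lambda tap: (tap[0], tap[1]))
    numbered = []
    for card_id, card_taps in groupby(ordered, key=lambda tap: tap[0]):
        journey = 0
        previous_time = None
        for _, tap_time, mode in card_taps:
            # The first ride, or a ride after a long gap, starts a new journey.
            if previous_time is None or tap_time - previous_time >= JOURNEY_GAP:
                journey += 1
            numbered.append((card_id, tap_time, mode, journey))
            previous_time = tap_time
    return numbered


# The example commuter from the table above (the date is arbitrary).
example_taps = [
    ("12345", datetime(2016, 2, 16, 7, 5), "Bus"),
    ("12345", datetime(2016, 2, 16, 7, 21), "Subway"),
    ("12345", datetime(2016, 2, 16, 15, 45), "Subway"),
    ("12345", datetime(2016, 2, 16, 16, 12), "Bus"),
]
example_journeys = [row[3] for row in number_journeys(example_taps)]
print(example_journeys)
```

For the example commuter, the 16-minute and 27-minute gaps stay within a journey, while the 8-hour-24-minute gap starts a new one.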

With each ride now identified as part of a journey, we could look at vehicle combinations at the journey level. 
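
As a rough sketch of this journey-level step, the Python below groups rides by card and journey number, labels each journey by the set of modes it used, and computes the share of each combination. The function names and the sample rides are hypothetical, and each ride is assumed to have already been labeled as either "Bus" or "Subway".

```python
from collections import Counter, defaultdict


def mode_combination(modes):
    """Label a journey by the set of modes its rides used."""
    modes = set(modes)
    if modes == {"Bus"}:
        return "Bus only"
    if modes == {"Subway"}:
        return "Subway only"
    if modes == {"Bus", "Subway"}:
        return "Bus and Subway"
    raise ValueError(f"unexpected modes: {sorted(modes)}")


def journey_shares(rides):
    """Return the percent of journeys in each mode combination.

    rides: iterable of (card_id, journey_number, mode) tuples.
    """
    modes_by_journey = defaultdict(set)
    for card_id, journey_number, mode in rides:
        modes_by_journey[(card_id, journey_number)].add(mode)
    counts = Counter(mode_combination(m) for m in modes_by_journey.values())
    total = sum(counts.values())
    return {combo: 100 * n / total for combo, n in counts.items()}


# Hypothetical rides: card 12345 is the example commuter above; the other
# two cards are invented for illustration.
sample_rides = [
    ("12345", 1, "Bus"), ("12345", 1, "Subway"),
    ("12345", 2, "Subway"), ("12345", 2, "Bus"),
    ("67890", 1, "Bus"),
    ("24680", 1, "Subway"), ("24680", 1, "Subway"),
]
sample_shares = journey_shares(sample_rides)
print(sample_shares)
```

Note that a journey with two subway rides still counts as "Subway only", because only the set of modes matters, not the number of rides.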

Results

Tuesday, February 16, 2016

Mode Combination    Percent of Journeys
Bus only            27%
Subway only         54%
Bus and Subway      18%

Total number of non-cash journeys = 594,743

Tuesday, April 12, 2016

Mode Combination    Percent of Journeys
Bus only            28%
Subway only         54%
Bus and Subway      18%

Our results were remarkably similar across the two days, leading us to believe that the variation in the panel survey did not reflect real-world differences, only differences in the panel itself.

Conclusion

The share of journeys by type of service appears to be extremely consistent across the two days we tested. Based on these results, we are going to weight panel survey respondents by their reported trip, using the ratio of 28% bus only, 54% subway only, and 18% both bus and subway.

Since the panel also has Commuter Rail and Commuter Boat riders, we weight the trips that have a commuter rail component as 9.5% of the ridership and trips that have a commuter boat (but no commuter rail) component as 0.5% of the ridership based on prior data collection. Unfortunately, because the Commuter Rail payment system is not included in the MBTA’s AFC system, we cannot replicate this analysis to see if the mode split including Commuter Rail also remains consistent by month/season.
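
To make the weighting concrete, here is a minimal Python sketch of one way these shares could be combined. It rests on an assumption that goes beyond the description above: the 28/54/18 split is scaled to the 90% of ridership left after the Commuter Rail and Commuter Boat shares are set aside. The function names and respondent counts are hypothetical, and each group's weight is simply its target share divided by its share of the sample.

```python
# Shares stated in the article.
BUS_SUBWAY_SPLIT = {"Bus only": 0.28, "Subway only": 0.54, "Bus and Subway": 0.18}
COMMUTER_RAIL_SHARE = 0.095
COMMUTER_BOAT_SHARE = 0.005


def target_shares():
    """Target ridership share per group.

    Assumption: the bus/subway split applies to whatever ridership remains
    after the Commuter Rail and Commuter Boat shares are removed.
    """
    remaining = 1 - COMMUTER_RAIL_SHARE - COMMUTER_BOAT_SHARE
    targets = {group: remaining * share for group, share in BUS_SUBWAY_SPLIT.items()}
    targets["Commuter Rail"] = COMMUTER_RAIL_SHARE
    targets["Commuter Boat"] = COMMUTER_BOAT_SHARE
    return targets


def respondent_weights(sample_counts):
    """Weight per respondent in each group: target share / sample share."""
    total = sum(sample_counts.values())
    targets = target_shares()
    return {
        group: targets[group] / (count / total)
        for group, count in sample_counts.items()
        if count > 0
    }


# Hypothetical respondent counts for one month, for illustration only.
hypothetical_counts = {
    "Bus only": 120,
    "Subway only": 330,
    "Bus and Subway": 150,
    "Commuter Rail": 395,
    "Commuter Boat": 5,
}
weights = respondent_weights(hypothetical_counts)
for group, weight in weights.items():
    print(f"{group}: {weight:.2f}")
```

In this hypothetical month, bus-only respondents are underrepresented relative to the target, so each one counts for more than one response, while Commuter Rail respondents are overrepresented and each counts for less.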