Behind the scenes with MBTA data.

Investigation into bus ridership changes using regression analysis.

Like many US transit agencies, the MBTA has seen a slight overall ridership decline in the past couple of years. As discussed in multiple presentations to the Fiscal and Management Control Board (FMCB), we are monitoring these changes and analyzing ridership data to better understand the reasons for the decline. We noticed that not all services, days or routes dropped at the same rate; some services have mostly steady or even increasing ridership.

One of our analyses focused on explaining the variance in the change among bus routes. This post describes the regression model we created to try to tease out some of the correlations between certain characteristics of bus routes and their gains or losses in ridership using the change in ridership between Fiscal Year 2016 and Fiscal Year 2017. 

Exploratory Analysis

The first thing that stood out to us in examining the data was the difference between types of day. Bus ridership fell the most (on a percentage basis) on weekends, and was closer to steady on weekdays:

Chart showing MBTA ridership on bus by day type for the last four years.

The scatter plot below shows the distribution of ridership change among both key and local buses. There does not appear to be a pattern based on ridership (high-ridership routes are spread just as widely along the ridership change axis as lower-ridership routes) or on key bus classification (some key buses lost ridership while others gained it, as did local buses).

Scatterplot showing average ridership by MBTA bus route over the change in average ridership between FY16 and FY17

Mapping the routes by percentage change, we did notice some spatial clustering. In particular, routes in Roxbury, Dorchester and Mattapan lost ridership. To investigate these patterns further, we will create a spatial dataset in future research.

Map showing the change in ridership among MBTA bus routes as they exist spatially.

Our research question for this analysis is: are there either service quality or rider characteristics of MBTA buses that help explain how ridership on each route changed from FY16 to FY17?


We selected bus routes with reliable automated fare collection (AFC) data that had at least 1,000 average weekday riders in FY2017, resulting in 92 routes. To precisely identify the route being operated, these data were crosswalked with vehicle location data. This process excluded routes like the SL1 to the airport, as many passengers board there without interacting with fareboxes. Mean ridership on these routes was 2,900 on an average weekday in FY2017; the busiest route, the #66, had 10,500 average weekday riders. Of the 92 routes, 86 had reliable Saturday ridership data and 77 had reliable Sunday ridership data.

The table below summarizes the ridership changes on the included bus routes. Weekday ridership changes vary more among bus routes, while Saturdays and Sundays show losses that are both more consistent and proportionally larger.

                                                 Number of Routes   Minimum   Maximum   Mean
Avg. Weekday Ridership Change FY17 over FY16     92                 -14%      +11%      -3%
Avg. Saturday Ridership Change FY17 over FY16    86                 -22%      +1%       -9%
Avg. Sunday Ridership Change FY17 over FY16      77                 -23%      +7%       -7%
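
As a rough illustration of how summary figures like those in the table above can be derived, the sketch below computes a percent change for each route and then the minimum, maximum, and mean across routes. The route names and ridership counts are made up for the example, and treating the mean as an unweighted average of route-level changes is an assumption, not a statement of how the table was produced.

```python
from statistics import mean

# Hypothetical average weekday ridership per route as (FY16, FY17).
# These values are invented for illustration; they are not MBTA data.
ridership = {
    "A": (1200, 1150),
    "B": (3000, 3150),
    "C": (5000, 4500),
}


def pct_change(fy16, fy17):
    """Percent change in ridership, FY17 over FY16."""
    return 100.0 * (fy17 - fy16) / fy16


# Route-level percent change.
changes = {route: pct_change(*pair) for route, pair in ridership.items()}

# Summary across routes. The mean here is unweighted (an assumption):
# each route counts equally regardless of how many riders it carries.
summary = {
    "routes": len(changes),
    "min": min(changes.values()),
    "max": max(changes.values()),
    "mean": mean(changes.values()),
}

for key, value in summary.items():
    print(key, round(value, 1))
```

An unweighted mean answers "how did the typical route change?", which is a different question from the system-wide percentage change, where high-ridership routes would dominate.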

Service quality was measured by the metrics set in the Service Delivery Policy, specifically by each route’s cost effectiveness rank and its crowding, reliability, span of service, and frequency metrics. 

Ridership and route characteristics were measured by each route’s proportion of riders paying a reduced fare (senior/student/TAP), from AFC data; its proportion of journeys involving a transfer to or from another MBTA service, calculated from our ODX model; and its proportion of minority riders and proportion of weekday trips to or from work, both collected by a System-wide Passenger Survey. An indicator of whether the route was a key bus was also included.

Not all the information was available for all routes, so the final analyses only included 81 routes, which limits the number of variables we can include in the analyses and the ability of analyses to find significant effects. Only weekday ridership change could be analyzed, since even fewer routes had all the information for Saturdays and Sundays.

Service Quality Regression Model

A model estimating the Average Weekday Ridership Change with only service quality measures is not very predictive. It only explains about 7% of the variation among bus routes in ridership change. 

The only significant predictor is reliability, in the expected direction (higher reliability corresponds with an increase in percent ridership change). The size of the effect is fairly minor: a 10% increase in reliability corresponds with a 1.5% increase in ridership change. The scatter plot below shows the relationship between reliability of a route and the ridership change between FY16 and FY17.

Scatterplot showing MBTA bus route ridership change over the MBTA's reliability metric
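
To make the "percent of variation explained" and "effect size" language concrete, here is a minimal sketch of a least-squares fit with a single predictor. It is a simplification: the model described above includes several service quality measures at once, whereas this example regresses ridership change on reliability alone. The data points and the function name fit_simple_ols are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical (reliability %, weekday ridership change %) pairs, one per route.
# Invented for illustration; these are not MBTA data.
routes = [
    (60.0, -6.0),
    (65.0, -4.0),
    (70.0, -5.0),
    (75.0, -2.0),
    (80.0, -3.0),
]


def fit_simple_ols(points):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return intercept, slope, 1.0 - ss_res / ss_tot


intercept, slope, r_squared = fit_simple_ols(routes)
print(f"slope per 1 point of reliability: {slope:.3f}")
print(f"implied ridership change per 10 points of reliability: {10 * slope:.2f}")
print(f"share of variance explained (R^2): {r_squared:.2f}")
```

The slope is the effect size (how much ridership change moves per unit of reliability), while R^2 is the share of route-to-route variation the fitted line accounts for. A predictor can be statistically significant and still leave most of the variation unexplained, which is the situation described above.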

Route and Ridership Characteristics Regression Model

This model is more predictive, explaining 16% of the variance between routes in ridership change. The only significant variable is percent of riders paying a reduced fare, and its effect is negative (a higher proportion of riders paying a reduced fare corresponds with a decrease in ridership change). The effect size is moderate: a 10% increase in percent of riders paying a reduced fare corresponds with a 2.3% decrease in ridership change.

Percent minority was excluded from this model after analysis because it was highly correlated with the included variables. Excluding it did not reduce the explanatory power of the model. 
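
One common way to spot this kind of overlap between predictors is to compute pairwise correlations before fitting the model. The sketch below does so with a hand-written Pearson correlation; the variable names, the values, and the 0.8 cutoff are all hypothetical choices for illustration, not the procedure or thresholds used in this analysis.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical route-level proportions (one value per route); not MBTA data.
pct_minority = [0.20, 0.35, 0.50, 0.65, 0.80]
other_predictors = {
    "pct_reduced_fare": [0.10, 0.15, 0.22, 0.27, 0.35],
    "pct_transfer": [0.30, 0.25, 0.40, 0.20, 0.35],
}

THRESHOLD = 0.8  # arbitrary illustrative cutoff for "highly correlated"

# Correlation of the candidate variable with each other predictor.
correlations = {
    name: pearson(pct_minority, values)
    for name, values in other_predictors.items()
}

for name, r in correlations.items():
    flag = "high" if abs(r) >= THRESHOLD else "ok"
    print(f"pct_minority vs {name}: r = {r:.2f} ({flag})")
```

When two predictors move together this closely, a regression cannot reliably separate their individual effects, so dropping one of them usually costs little explanatory power.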

The scatter plot below shows the relationship between proportion of a route’s riders paying reduced fare in FY2016 and the ridership change between FY16 and FY17.

Scatterplot showing the relationship between proportion of a route’s riders paying reduced fare in FY2016 and the ridership change between FY16 and FY17

Combined Model

The combined model (with variables from both service quality and rider characteristics) explains 21% of the variance between routes in ridership change and retains the two predictors that were significant in the separate models: reliability and percentage of riders paying reduced fare. Percentage of riders paying reduced fare has a somewhat bigger effect than reliability on predicting the route’s ridership change.

The significance of the proportion of reduced fares does not necessarily mean that reduced-fare trips specifically were lost. More likely, the proportion of reduced fares reflects some other aspect of bus service (perhaps the spatial distribution of bus routes or the number of discretionary trips) that better explains the route’s ridership change.


This investigation only explained a small portion of the variance in ridership change between bus routes. We think that this can be improved by measuring the service quality and rider characteristics over a longer period of time and with more nuanced measures. In addition, we think there are likely to be variables that are still missing entirely from this analysis, in particular spatial variables like land use and demographic shifts, along with trip-level variables that will contribute to explaining the variance in ridership beyond the service quality and rider characteristics.  

This analysis is part of a larger effort the MBTA is undertaking to explain ridership changes. This larger effort will consider elements that have been found to have predictive effects in other regions including: fare and pass multiple pricing; land use and demographic changes; shifts in high-transit ridership populations (immigrants, zero-vehicle households, etc.) in the area; recent changes in visitor patterns to the Boston region; ride-hailing (Uber, Lyft) usage and other mode shifts; service quality and capacity; and changes in types and lengths of trips, among other variables. We will post results to sections of this research project as we finish them.