Behind the scenes with MBTA data.

The MBTA usually evaluates bus routes at the “route” level, with the entire route considered as one unit. This makes sense for most MBTA bus routes, which function as “feeder” routes that take riders to a transfer connection at one end of the route. For these routes, it makes sense to treat the route as a single entity, because ridership tends to be homogeneous, with the same group of riders boarding the vehicle and traveling the length of the route. 

However, there are several bus routes that, by design, are likely to have a lot of rider turnover. Long cross-town routes with multiple transfer/connection options are especially likely to serve different riders along their length, with relatively few people riding the entire route. In terms of design and function, these routes operate more like separate services than as a coherent entity: for example, very few people travel the entire length of routes 1, 66, and 86. 

We wanted to evaluate whether it makes sense to treat these routes as a single service, or if it would be more meaningful to evaluate them at a route segment level that would more closely resemble the service provided on our other bus routes. 

We took route 1 as an example, as its ridership is high and it travels a long distance from Cambridge to Dudley Square. The boardings and alightings are distributed fairly well along the route (not, for instance, with most of the riders boarding at one end and alighting at the other). Additionally, the neighborhoods through which this route runs are very distinct in terms of population demographics, and there are multiple transfer points to rapid transit lines. This led us to hypothesize that there are certain zones/stops on the route where there could be a clear shift in the composition of riders as people board and alight at different stops. 

We used Rider Census data to create a “line profile” of the demographics along the bus route. To get a high-level overview of the data, we categorized all the stops on the route into three zones: north zone, transit core, and south zone. We thought we might be able to segment the route and establish different demographics for each “zone.” Each zone had a large enough sample from our Rider Census to be evaluated separately (that is, if each of these segments were its own route, we could compare them to each other and to other routes in our system). 

We chose to look at minority status information because we had a high response rate for that question and it was easy to categorize riders as either “minority” or “nonminority” based on their response to the survey.  Route 1’s bus ridership overall is 37% minority.

We dug into the data by looking into the boarding/alighting behavior of riders in these zones. The graph below depicts the average daily load profile (from automated passenger counters) of the Route 1, heading southbound on a weekday. The ‘load’ of a bus is defined as the number of people who are on the bus at a given stop. The stops of interest are those where we see large peaks and dips in the number of riders.  Note the three large “alighting” stops, indicating turnover of riders at those locations. 
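As a rough illustration of how a load profile is derived from boardings and alightings, here is a minimal Python sketch. The function name and the stop counts are hypothetical (they are not MBTA passenger counter data), and the sketch assumes that “load” means the number of riders on board as the bus leaves each stop.

```python
from itertools import accumulate


def load_profile(boardings, alightings):
    """Return the passenger load as the bus leaves each stop, in stop order.

    The load is the running total of boardings minus alightings.
    """
    if len(boardings) != len(alightings):
        raise ValueError("boardings and alightings need one entry per stop")
    net_change = [on - off for on, off in zip(boardings, alightings)]
    return list(accumulate(net_change))


# Hypothetical counts for a five-stop trip (illustrative only).
ons = [20, 10, 5, 8, 0]
offs = [0, 4, 15, 6, 18]
loads = load_profile(ons, offs)
print(loads)
```

Stops where the load drops sharply (the third and fifth stops in this made-up trip) are the “alighting” stops that indicate rider turnover.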

We then proceeded to recreate a similar load profile chart for our two demographic groups of interest. We arrived at the visual below that shows the behavior of minority and non-minority riders on route 1.

As noted above, the ridership of Route 1 as a whole is 37% minority (excluding respondents who did not report their minority status); if we examine the graph, we can see that there are some stops where there are sudden dips and peaks in the number of riders in each demographic group. We also see that there is a small population of riders who fall in the ‘unknown’ group. This refers to the portion of survey respondents who chose not to disclose their Minority/Non-Minority status.
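The percent-minority figure excludes respondents who did not report a status. A minimal sketch of that calculation, applied per zone, might look like the following. The zone responses are made up for illustration, the labels `"minority"`, `"nonminority"`, and `"unknown"` are assumed encodings, and any survey weighting is ignored.

```python
from collections import Counter


def percent_minority(responses):
    """Percent minority among respondents who reported a status.

    "unknown" responses are left out of the denominator.
    Returns None when nobody reported a status.
    """
    counts = Counter(responses)
    known = counts["minority"] + counts["nonminority"]
    if known == 0:
        return None
    return 100.0 * counts["minority"] / known


# Hypothetical responses by zone (not Rider Census data).
zone_responses = {
    "north zone": ["minority", "nonminority", "nonminority", "unknown"],
    "transit core": ["minority", "nonminority", "unknown", "unknown"],
    "south zone": ["minority", "minority", "minority", "nonminority"],
}
zone_shares = {zone: percent_minority(r) for zone, r in zone_responses.items()}
```

Leaving the “unknown” group out of the denominator keeps the zone figures comparable with the route-wide percentage, which is computed the same way.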

At the bus stop at Mass Ave and Pearl St in Cambridge, we see a small peak in the percentage of Minority riders. At the stop at Mass Ave and Harrison Ave, we see a distinct increase in Minority riders with a simultaneous decline in Non-minority riders. This stop falls in the south zone, near the Lower Roxbury neighborhood of Boston. In the transit core, we see that the trend line remains almost constant for both categories. This could be because the stops in the transit core are all major bus stops, with plenty of connections to other buses and light rail (at the Symphony and Hynes stations). Riders would be typically making transfers here, either onto the route 1 or to some other service.

If we were to evaluate each of the segments separately, the North Segment would be similar to Route 57, the Transit Core Segment would be similar to the 39, and the South Segment would be similar to other routes that serve the area near Dudley Square (8, 15). Percent minority ridership is spatially distributed, so it is not surprising that on MBTA routes that cross multiple distinct neighborhoods and have many points of transfer or destination, percent minority ridership changes along the path of the route. And in practice, all parts of Route 1 would likely still be classified as “minority” bus routes for Title VI purposes. However, this pattern provides further evidence that Route 1 functions as separate segments rather than as a single entity.

In terms of certain measures of service quality, riders may have different experiences on different segments of Route 1 (or any long route). For example, one section of a route may have frequent crowding as the route approaches a transfer point, while another segment would usually be less crowded. Since an individual rider is unlikely to travel across multiple segments, their experience is best reflected by the quality of service on the segment they do travel rather than on the route as a whole. Where there is sufficient rider turnover in the middle of a route and if data exists to make evaluation feasible, it may make sense to evaluate performance in segments.

Other routes that we expect to have distinct segments would be the CT1, CT2, and CT3, along with 39, 66, and 86. 

The next steps toward this reporting structure would be to analyze all the routes to see which ones should be segmented, and to investigate whether the measures of interest, like crowding, can be adjusted to work at the segment level.

MBTA passengers can sign up to receive “T-Alerts” to get information about service delays. The alerts arrive via email or text, and subscribers specify which segments of service are of interest to them.  Because service disruptions can be complex and tend to unfold over time, there is a constant trade-off between the timeliness of sending the alert and the accuracy of the estimated length of impact to service. We discovered through customer feedback, including in our monthly satisfaction survey, that our passengers were relatively dissatisfied with our approach. In March 2018, two departments collaborated to design and implement a different alert system. Our job at the Office of Performance Management and Innovation (OPMI) was to evaluate these changes: did our passengers notice the changes, and if so, are we moving in the right direction of better communication at the critical points in our passengers’ experiences? 


Every month, we send a survey to a few thousand passengers who have signed up to give us feedback on our system. Respondents report on their most recent trip and answer questions about various aspects of their experiences. We measure their satisfaction on a multitude of topics, consistently using the same 7-point scale, with 1 as “Extremely dissatisfied,” 4 as “Neutral,” and 7 as “Extremely satisfied.” This allows us to track big-picture trends over time and to do deep dives into different topics while maintaining comparability. From September 2017 through March 2018, three of our surveys included questions that specifically address T-Alert satisfaction. 

In September 2017, of our panel respondents who received T-Alerts, 43% thought that T-Alerts did not arrive in time for them to make decisions, and 37% did not think the alerts accurately reflected the service they were experiencing. When asked about their overall satisfaction with T-Alerts, respondents reported an average satisfaction of 4.74, which is between “neutral” and “somewhat satisfied” on our scale. In December, none of their answers to these questions had changed. To put this in perspective, respondents were slightly more satisfied with T-Alerts than with MBTA communication overall, and less satisfied with T-Alerts than most other channels of communication, including countdown signs in stations and the MBTA website.

Focus on Timeliness

Based on customer feedback through many channels, the Customer Technology Department collaborated with the Operations Control Center to redesign the alert system to fit an overarching philosophy that values timeliness and transparency. 

In early March 2018, the new T-Alert system was released with three substantive changes. When a disruption unfolds, and there is still uncertainty with how it will affect service, the new strategy issues alerts as soon as possible with the best information we have. As more information is gathered, relevant updates are sent out. The new strategy uses clear and straightforward language, and instead of classifying disruptions into the three ambiguous categories, “Minor/Moderate/Severe,” now the alerts provide estimates in minutes for how long the delay is expected to be. On the back end, there are new processes to ensure that alerts are cleared and closed out in a timely manner as well. 

These changes reflect an emphasis on being proactive and being willing to convey uncertainty. They prioritize speed over knowing the final outcome of the delay, and have a high emphasis on transparency. 

Measuring Satisfaction

In mid-March, we asked our passengers to rate their satisfaction with T-Alerts once again. 

We kept the wording of the survey question the same in order to keep their response unbiased; part of what we were looking for was if our passengers even noticed the change.  But first, we had to isolate the effect of the change from all the other factors that could have influenced satisfaction with T-Alerts (for example, weather patterns generally have an effect on satisfaction, and this effect is likely to show up between December and March). 

So, to mitigate possible confounding variables, we analyzed T-Alert satisfaction relative to other metrics collected in the same month. We chose to compare satisfaction with T-Alerts to satisfaction with communication overall, because we believe any change in overall happiness would affect satisfaction with individual channels and with communication overall in a similar way. Looking at satisfaction with T-Alerts relative to satisfaction with communication overall helps isolate the effect of the change we are interested in.

Chart showing the change in satisfaction from the December to the March survey.

The chart above shows the difference between panel respondents’ satisfaction with various channels of communication and their satisfaction with MBTA communication overall. Positive values signify that people are more satisfied with that specific channel, and the larger the value, the larger the gap in satisfaction. In December people who used third-party apps reported the highest level of relative satisfaction (0.38 pts more satisfied than with communication overall). T-Alerts hovered around 0.1 points more satisfied. 

The difference between the March “gaps” and the December “gaps” measures the relative change in satisfaction between December and March. Larger differences in gaps signify larger improvements. T-Alerts showed the greatest improvement relative to communication satisfaction; passengers reported a 0.2-point increase in relative satisfaction, which is a value we consider both significant and meaningful. This increase changed the ranking of the channels by satisfaction: in December, passengers who used third-party apps were most satisfied with their respective channel, but in March, people who used text message T-Alerts reported the highest levels of satisfaction.
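To make the “gap” arithmetic concrete, here is a small Python sketch of the calculation described above. The function names, channel keys, and scores are hypothetical and chosen only to show the mechanics; they are not our survey results.

```python
def relative_gaps(channel_means, overall_mean):
    """Gap between each channel's mean satisfaction and the mean
    satisfaction with communication overall, for one survey month."""
    return {channel: mean - overall_mean for channel, mean in channel_means.items()}


def change_in_gaps(before, after):
    """Difference between two months' gaps, for channels present in both."""
    return {ch: after[ch] - before[ch] for ch in before if ch in after}


# Hypothetical mean scores on the 7-point scale (illustrative only).
december = relative_gaps({"t_alerts": 4.6, "third_party_apps": 4.9}, overall_mean=4.5)
march = relative_gaps({"t_alerts": 4.5, "third_party_apps": 4.6}, overall_mean=4.2)
change = change_in_gaps(december, march)

for channel, delta in change.items():
    print(f"{channel}: {delta:+.2f}")
```

In this made-up example every raw score falls between December and March, yet the T-Alerts gap still widens. That is the reason for working with gaps: a shift that moves all scores together (weather, for instance) cancels out, and only channel-specific change remains.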

There are similar improvements in satisfaction with Twitter. We expected Twitter satisfaction to increase because the changes in alert strategy detailed above were also applied to the MBTA Twitter feed. 

Given the unique increase in T-Alert satisfaction ratings, we believe passengers are more satisfied with T-Alerts due to our intervention. 


There are two main takeaways for us from this experiment.

First, we successfully implemented a change that made our passengers measurably more satisfied with our communication. Currently over 70,000 of our passengers are subscribed to T-Alerts. We strive to match our communication to their preferences, especially in moments of service disruption. Prior analyses have repeatedly shown that quality communication is extremely valuable to our passengers and that it plays a significant role in their overall satisfaction with our system. These changes demonstrate that we can have a positive impact on our passengers’ experiences in the short term, while we conduct ongoing work system-wide. 

The second takeaway is that we are able to use data to make hypotheses about how to improve communication, implement a change, and then test whether it has the desired impact. We will continue using analysis from customer surveys to inform how we can improve and to confirm whether our changes worked. If you would like to help us, please sign up for our monthly customer opinion panel. 

Why Evaluate MBTA Coverage?

A key component of transit service planning is offering service to the largest number of people possible. Understanding how much of the population the MBTA currently covers, and where that population is located, is important to understanding how well the T is serving its constituents and where the MBTA should expand or modify its service. In 2017 the MBTA set coverage standards as part of its Service Delivery Policy.

One application of the coverage evaluation is the Better Bus Project, an ongoing initiative to improve bus service. As the MBTA focuses on bus planning, it is important to be able to evaluate the coverage impact of proposed changes to bus stops or routes. Automating the process makes it possible to evaluate coverage frequently and consistently, so that whenever changes are proposed, the T can quickly assess their coverage impact and compare it to that of other proposed changes. 

Coverage Automation Tool Overview

To automate the coverage evaluation process, we (the Office of Performance Management and Innovation at the T) created a coverage evaluation tool using ArcGIS ModelBuilder. This tool uses census population data and location data for transit stops to compute the population of the area within walking distance of MBTA service. The model then calculates the percentage of the population covered by MBTA service in MBTA cities and towns by dividing the population within walking distance of transit stops by the total population of those cities and towns. This post will walk you through the MBTA’s 2017 base coverage analysis. Base coverage is the percent of the total population of the MBTA cities and towns living within a 0.5-mile walk of any MBTA-operated or MBTA-subsidized transit stop or station, regardless of the frequency or span of service provided.

Data Inputs for 2017 Base Coverage Analysis

Our first step to evaluating base coverage was downloading the most reliable data available for our analysis. This data includes:

  • All MBTA stops in the fall of 2017, downloaded as text files from the MBTA GTFS feed
  • Route data (shapefiles) for privately operated routes subsidized by the MBTA, such as the Lexpress
  • American Community Survey 2016 5-Year Total Population Estimates, downloaded from American FactFinder
  • TIGER census block groups for the seven counties served by MBTA service
  • MBTA Towns from MassGIS
  • The area of water in Massachusetts
  • MassDOT Road Inventory 2017 Road Network


This analysis consisted of three components. We automated the process of finding:

  1. The area and population of all block groups within MBTA Cities & Towns
  2. The area within walking distance from MBTA transit stops and stations
  3. The population living within a 0.5-mile walking distance of MBTA transit stops and stations, calculated as the percentage of each block group's population within the walkshed, assuming that the population is evenly distributed. 

Step 1: Finding the Area and Total Population of all MBTA Cities and Towns at Block Group Level

Because block group population data comes as a spreadsheet from the American Community Survey, we first transformed it into a spatial dataset. To do this, we joined ACS block group population data with TIGER block group geography. Because the block group geography covers entire counties rather than just the MBTA service area, we then clipped the block group polygons to the shape of MBTA cities and towns. 

This step resulted in tiny slivers, as the block group polygon boundaries do not precisely overlap with the MBTA cities and towns boundaries. To delete these slivers, we created a 0.05-mile buffer around the boundary of the MBTA cities and towns polygon and deleted all block group polygons located completely within this buffer. No real census block group is that small, so we knew all block groups deleted in this process were slivers. 

After spatially displaying the population data at block group level, we then calculated the area of each block group in square miles. To get a better estimate of the area where people actually live, we erased water features first, then calculated the area. 

Step 2: Finding the Area Within Walking Distance from MBTA Transit Stops and Stations

To calculate the base coverage area, we first downloaded all GTFS MBTA transit stops for the fall of 2017 as a text file. Next, we converted the stops from the text file into points using Esri’s Display GTFS Stops tool. The GTFS stops cover all MBTA-operated bus routes, but do not include flag stops along privately operated routes subsidized by the MBTA, like the Lexpress bus in Lexington. We estimated that these flag stops are located at every road intersection along the subsidized routes. To do this, we first created a network dataset using the MassDOT Road Inventory 2017 file. The resulting network dataset included a streets layer and a road junctions layer. We estimated the location of flag stops by selecting the road junctions 50 feet or less from the subsidized service routes. Our final stop file for base coverage consisted of the GTFS stops merged with the selected road junctions. 
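The flag-stop selection was done with ArcGIS tools, but the underlying idea (keep the road junctions within 50 feet of a route line) can be sketched in plain Python. The sketch below is an illustration rather than our model: the function names are hypothetical, the coordinates are made up, and it assumes all points are already in a projected coordinate system measured in feet and that each route has at least two vertices.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_route(p, route):
    """Distance from p to the nearest segment of a route polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(route, route[1:]))


def select_flag_stops(junctions, route, max_feet=50.0):
    """Keep the junctions that lie within max_feet of the route."""
    return [j for j in junctions if distance_to_route(j, route) <= max_feet]


# Hypothetical route and road junctions, coordinates in feet.
route = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 800.0)]
junctions = [(500.0, 30.0), (500.0, 120.0), (1040.0, 400.0), (1000.0, 850.0)]
flag_stops = select_flag_stops(junctions, route)
```

Here the junction 120 feet from the route is dropped and the other three are kept as estimated flag stops.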

We used ArcGIS Network Analyst to calculate the area within a 0.5-mile walking distance, measured along Massachusetts roads, of all MBTA stops and stations. We used the Road Inventory network dataset mentioned previously as the network dataset for the analysis and input the stops as facilities into Network Analyst. Our network analysis resulted in a layer of dissolved polygons around every MBTA stop or station. This is the MBTA 2017 coverage area.
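To show why a network-based walkshed differs from a simple circular buffer, here is a simplified Python sketch of the idea behind a service-area calculation. It is not how Network Analyst is implemented: it runs a shortest-path search (Dijkstra's algorithm) on a tiny hypothetical street graph and reports which junctions lie within 0.5 miles of a stop along the roads. The node names and edge lengths are invented, and unlike a real service area it does not include partial coverage along edges or build polygons.

```python
import heapq


def reachable_within(graph, sources, max_dist):
    """Multi-source Dijkstra search, cut off at max_dist.

    graph maps each node to a list of (neighbor, edge_length) pairs.
    Returns {node: shortest network distance} for nodes within max_dist.
    """
    best = {}
    heap = [(0.0, source) for source in sources]
    heapq.heapify(heap)
    while heap:
        dist, node = heapq.heappop(heap)
        if node in best:
            continue  # already settled with a shorter distance
        best[node] = dist
        for neighbor, length in graph.get(node, []):
            new_dist = dist + length
            if neighbor not in best and new_dist <= max_dist:
                heapq.heappush(heap, (new_dist, neighbor))
    return best


# Hypothetical undirected street network, edge lengths in miles.
edges = [
    ("stop", "a", 0.2),
    ("a", "b", 0.2),
    ("b", "c", 0.2),
    ("stop", "d", 0.45),
    ("d", "c", 0.1),
]
graph = {}
for u, v, length in edges:
    graph.setdefault(u, []).append((v, length))
    graph.setdefault(v, []).append((u, length))

walkshed = reachable_within(graph, ["stop"], max_dist=0.5)
```

In this toy network, junction `c` is left out because both walking routes to it are longer than 0.5 miles, even though it might fall inside a straight-line buffer.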

Step 3: Finding the Population Living within the MBTA Base Coverage Area 

To find the total population in our coverage area, we clipped the block group polygons, with the population and area attributes created in step 1, to the coverage area created in step 2. We then recalculated the area of the clipped polygons to get the area of each block group covered by service. Next, we found the share of each block group's area that falls inside the coverage area by dividing the area covered by service by the total area of the block group. Assuming the population is evenly distributed within each block group, we estimated the covered population of each block group as that area share multiplied by the block group's total population. We then summed the covered population across all block groups. The final coverage percentage was calculated as the covered population divided by the total service area population. 
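The area-weighting arithmetic in this step can be sketched in a few lines of Python. The field names (`population`, `total_area`, `covered_area`) and the three block groups are hypothetical and only illustrate the calculation; the areas are assumed to be in square miles with water already erased.

```python
def covered_population(block_groups):
    """Estimate covered population, assuming residents are spread
    evenly across each block group's area."""
    total = 0.0
    for bg in block_groups:
        if bg["total_area"] <= 0:
            continue  # skip degenerate polygons to avoid dividing by zero
        area_share = bg["covered_area"] / bg["total_area"]
        total += area_share * bg["population"]
    return total


def coverage_percent(block_groups):
    """Covered population as a percent of the service area population."""
    service_area_population = sum(bg["population"] for bg in block_groups)
    if service_area_population == 0:
        return 0.0
    return 100.0 * covered_population(block_groups) / service_area_population


# Hypothetical block groups (illustrative values only).
block_groups = [
    {"population": 1000, "total_area": 0.5, "covered_area": 0.5},  # fully covered
    {"population": 2000, "total_area": 2.0, "covered_area": 0.5},  # one quarter covered
    {"population": 1000, "total_area": 4.0, "covered_area": 0.0},  # not covered
]
print(covered_population(block_groups), coverage_percent(block_groups))
```

In this example the covered area is a small share of the total area, but the covered population share is much larger because the fully covered block group is also the densest.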

The Base Coverage Model

The model to automate the coverage analysis is shown below.

Flow Chart from ArcMap ModelBuilder showing the coverage model

Map of MBTA Base Coverage

Map of MBTA base coverage

As seen in the base coverage map, we found the coverage area using dissolved walkshed polygons. The polygons are jagged because they follow the walkable roads near transit stops. The area covered by MBTA service is concentrated in the highest-density parts of the service area, so although service covers only around half of the service area, it covers around 80% of the total population. 

Conclusion: Importance in Service Planning

As the MBTA strives to improve the service provided to its constituents living within its service area, the coverage tool can evaluate how proposed bus stop and route changes will affect the number of people receiving service. Further, the T can use different inputs into the coverage metric to understand how it is performing on different types of coverage. For example, we can use stops receiving high frequency service to see what percentage of our population receives frequent service, or we can look at vulnerable populations, instead of total populations, to see the number of vulnerable people covered by MBTA service. Related to the Better Bus Project, the T can see the percentage of the population covered by varying levels of bus service to see what populations receive different types of service. The coverage tool is flexible, quick and more reliable than conducting manual analyses, and will allow the T to continue to evaluate the quality and impact of its service improvements.