Behind the scenes with MBTA data.

The MBTA is constantly working to improve its data quality, especially the data generated by our train tracking systems that affect our customer-facing feeds. Better data quality means more accurate real-time customer information and measurements of our performance that more accurately reflect passenger experiences. But this means that there will be discontinuities in our performance measures based on improvements to the underlying data, rather than changes in performance. This post explains changes made on September 12, 2018 to subway data that impact our performance measures.

In previous posts, we’ve explained how the MBTA tracks vehicles both in general and on the Green Line. We have vehicle tracking systems on almost all our vehicles, with different tracking systems for the different modes (heavy rail, light rail, bus, and commuter rail). These vehicle tracking systems produce real-time data feeds (some built by vendors, some built in-house) that are used to manage our service, measure our performance, view vehicle locations in real-time, and provide passengers with predictions of upcoming vehicle arrivals. We use a data fusion engine to combine the data feeds from each of these systems into one consolidated real-time feed to make it easier for our developers to work with our data. This consolidated feed is also the source of data for our performance tracking system that provides the data published on the MBTA Back on Track dashboard.
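As an illustration of what a fusion step like this involves, here is a minimal, hypothetical sketch in Python: it merges per-mode vehicle feeds into one consolidated list, keeping the newest record when a vehicle appears more than once. The field names and vehicle IDs are invented, not the actual schema of our feeds.

```python
# Hypothetical sketch of a feed-fusion step; field names and vehicle
# IDs are illustrative, not the actual schema of the MBTA feeds.
from typing import Dict, List


def fuse_feeds(*feeds: List[dict]) -> List[dict]:
    """Combine per-mode vehicle feeds into one consolidated feed.

    Each record has at least 'vehicle_id' and 'timestamp'; when the
    same vehicle appears more than once, keep the newest record.
    """
    latest: Dict[str, dict] = {}
    for feed in feeds:
        for record in feed:
            vid = record["vehicle_id"]
            if vid not in latest or record["timestamp"] > latest[vid]["timestamp"]:
                latest[vid] = record
    return sorted(latest.values(), key=lambda r: r["vehicle_id"])


heavy_rail = [{"vehicle_id": "R-1501", "route": "Red", "timestamp": 120}]
light_rail = [
    {"vehicle_id": "G-3673", "route": "Green-B", "timestamp": 95},   # stale update
    {"vehicle_id": "G-3673", "route": "Green-B", "timestamp": 100},  # fresh update
]
consolidated = fuse_feeds(heavy_rail, light_rail)  # two vehicles, newest records only
```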

The existing software that produced the real-time data feed for heavy and light rail vehicles was a legacy codebase that was built in-house. It was functional for the basic application of providing subway predictions, but design decisions made during the initial development made it difficult or impossible to add new features or improve existing data quality. We have been working to replace the software in order to add new features and improve the accuracy of our locations and predictions. We went live with the Green Line portion of the update on February 8, 2018 and went live with the software for heavy rail (Red, Orange, and Blue lines) on September 12, 2018.

Some of the new and improved features include:

  • Inclusion of location information for trains at terminal stations
  • The flexibility to handle different types of shuttle-bus diversions, including ones that are created on-the-fly in response to incidents
  • Improvements to the accuracy of predictions for trains that are at terminal stations
  • General improvements to the accuracy of locations and predictions throughout the lines

Our previous heavy rail data feed did not include location information for trains at the terminal stations, so the passenger-weighted metrics did not take into account the passengers who were traveling to or from the end of the line. With the inclusion of location information for trains at heavy rail terminal stations, we now have accurate arrival times at terminal stations and can include these passengers in our metrics. Passenger-weighted reliability metrics for the Red, Orange, and Blue lines will more accurately reflect the customer experience. This will result in a decrease of 0-2% in the reliability metrics for the heavy rail lines, depending on the line and the day.
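To illustrate why including terminal stations can lower the headline number, here is a simplified sketch of a passenger-weighted reliability calculation. The station names are real, but every count is invented, and the production metric is more involved than this.

```python
# Simplified sketch of a passenger-weighted reliability metric.
# All counts are invented; the production metric is more involved.
def passenger_weighted_reliability(stations):
    """Share of passengers whose wait was within the target
    (e.g., the scheduled headway), weighted by passenger volume.

    `stations` maps station name -> (passengers_ok, passengers_total).
    """
    ok = sum(p_ok for p_ok, _ in stations.values())
    total = sum(p_total for _, p_total in stations.values())
    return ok / total


mid_line = {"Downtown Crossing": (9500, 10000), "Central": (4800, 5000)}
# Adding a terminal station with somewhat worse reliability pulls the
# passenger-weighted number down.
with_terminal = dict(mid_line, Alewife=(2500, 3000))

print(round(passenger_weighted_reliability(mid_line), 3))      # 0.953
print(round(passenger_weighted_reliability(with_terminal), 3)) # 0.933
```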

In addition to the new data feed, we have built a new data fusion engine called Concentrate to combine the new real-time data feed for heavy rail and light rail with the feeds for commuter rail and bus into one consolidated feed for all modes. Concentrate enables higher-capacity, more frequent sharing of all MBTA real-time data. Concentrate went live for providing real-time information to third-party developers and customers in March 2018. We have been rolling it out for use as a source of data for internal systems over the last few months. We began using the data from Concentrate for the performance tracking system on September 12, 2018. It improves the update frequency of real-time information by up to 30 seconds in some cases and results in more accurate arrival and departure times throughout the lines. This was especially important for the Green Line, where many stations are close together and trains arrive frequently: even a few seconds of delay, replicated over the course of the day, could result in many missing events.

Missing events create more problems for the Green Line because we are not currently able to identify them and remove false long wait times on the Green Line (as we do for the Red, Orange, and Blue lines) due to complexities with the Green Line schedule and other data limitations (described more here). Therefore, improving the accuracy of stop events (arrivals and departures) on the Green Line is very important to improving the accuracy of our passenger wait time metric. With Concentrate, passenger-weighted reliability metrics for the Green Line will more accurately reflect the customer experience. This will result in an increase of 2-7% in the reliability metrics for the Green Line, depending on the branch and the day.
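To see how a missing stop event inflates a wait-time estimate, consider one standard way of estimating the average wait of randomly arriving passengers from observed headways (the numbers are illustrative, and this is not necessarily the exact formula behind our metric):

```python
# One standard estimator for the average wait of randomly arriving
# passengers: E[W] = sum(h_i^2) / (2 * sum(h_i)) over observed headways
# h_i. Illustrative numbers; not necessarily the exact MBTA formula.
def avg_passenger_wait(headways):
    """Longer headways count for more because more passengers
    accumulate while waiting through them."""
    return sum(h * h for h in headways) / (2 * sum(headways))


observed = [5, 5, 5, 5]          # minutes between consecutive trains
with_missing_event = [5, 10, 5]  # one arrival unrecorded: two 5-minute
                                 # headways look like one 10-minute gap

print(avg_passenger_wait(observed))            # 2.5
print(avg_passenger_wait(with_missing_event))  # 3.75
```

Merging two 5-minute headways into a single apparent 10-minute gap raises the estimated wait from 2.5 to 3.75 minutes, which is why accurate stop events matter so much on a frequent line.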

We will have to take these methodological changes into consideration when we are looking at heavy rail and light rail performance trends over time so that we can accurately attribute when increases in the wait-time measure are due to data improvements and when they are due to service improvements.

The MBTA usually evaluates bus routes at the “route” level, with the entire route considered as one unit. This makes sense for most MBTA bus routes, which function as “feeder” routes that take riders to a transfer connection at one end of the route. For these routes, treating the route as a single entity is reasonable, because ridership tends to be homogeneous, with the same group of riders boarding the vehicle and traveling the length of the route.

However, several bus routes are, by design, likely to have a lot of rider turnover. Long cross-town routes with multiple transfer/connection options are especially likely to carry different riders along different parts of the route, with relatively few people riding its entire length. In design and function, these routes operate more like several separate services than a coherent entity: for example, very few people travel the entire length of routes 1, 66, and 86.

We wanted to evaluate whether it makes sense to treat these routes as a single service, or if it would be more meaningful to evaluate them at a route segment level that would more closely resemble the service provided on our other bus routes. 

We took route 1 as an example, as its ridership is high and it travels a long distance from Cambridge to Dudley Square. The boardings and alightings are distributed fairly well along the route (not, for instance, with most of the riders boarding at one end and alighting at the other). Additionally, the neighborhoods through which this route runs are very distinct in terms of population demographics, and there are multiple transfer points to rapid transit lines. This led us to hypothesize that there are certain zones/stops on the route where there could be a clear shift in the composition of riders as people board and alight at different stops. 

We used Rider Census data to create a “line profile” of the demographics along the bus route. For a high-level overview of the data, all the stops on the route were categorized into three zones: north zone, transit core, and south zone. We thought we might be able to segment the route and establish different demographics for each “zone.” Each zone had a sufficient sample from our Rider Census to be evaluable separately (if each of these segments were its own route, we could compare them to each other and to other routes on our system).

We chose to look at minority status information because we had a high response rate for that question and it was easy to categorize riders as either “minority” or “nonminority” based on their response to the survey.  Route 1’s bus ridership overall is 37% minority.

We dug into the data by looking at the boarding/alighting behavior of riders in these zones. The graph below depicts the average daily load profile (from automated passenger counters) of Route 1, heading southbound on a weekday. The ‘load’ of a bus is defined as the number of people who are on the bus at a given stop. The stops of interest are those where we see large peaks and dips in the number of riders. Note the three large “alighting” stops, indicating turnover of riders at those locations.
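The load values in a profile like this come directly from the automated passenger counter data as a running sum of boardings minus alightings; a minimal sketch with invented counts:

```python
# Minimal sketch: reconstructing a load profile from automated
# passenger counter (APC) boardings and alightings. Counts are invented.
def load_profile(boardings, alightings):
    """Running passenger load along a route: at each stop, the load is
    the previous load plus boardings minus alightings at that stop."""
    load, profile = 0, []
    for on, off in zip(boardings, alightings):
        load += on - off
        profile.append(load)
    return profile


# Four stops on a one-direction trip
boardings = [20, 15, 5, 0]
alightings = [0, 5, 25, 10]
print(load_profile(boardings, alightings))  # [20, 30, 10, 0]
```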

We then proceeded to recreate a similar load profile chart for our two demographic groups of interest. We arrived at the visual below that shows the behavior of minority and non-minority riders on route 1 (click to enlarge).

As noted above, the ridership of Route 1 as a whole is 37% minority (excluding respondents who did not report their minority status); if we examine the graph, we can see that there are some stops where there are sudden dips and peaks in the number of riders in each demographic group. We also see that there is a small population of riders who fall in the ‘unknown’ group. This refers to the portion of survey respondents who chose not to disclose their Minority/Non-Minority status.

At the bus stop at Mass Ave and Pearl St in Cambridge, we see a small peak in the percentage of minority riders. At the stop at Mass Ave and Harrison Ave, we see a distinct increase in minority riders with a simultaneous decline in non-minority riders. This stop falls in the south zone, near the Lower Roxbury neighborhood of Boston. In the transit core, the trend line remains almost constant for both categories. This could be because the stops in the transit core are all major bus stops, with plenty of connections to other buses and light rail (at the Symphony and Hynes stations). Riders here are typically making transfers, either onto Route 1 or to some other service.

If we were to evaluate each of the segments separately, the North Segment would be similar to Route 57; the Transit Core Segment would be similar to the 39; and the South Segment would be similar to other routes that serve the area near Dudley Square (8, 15). Percent minority ridership is spatially distributed, so it is not surprising that on MBTA routes that cross multiple distinct neighborhoods and have many points of transfer or destination, percent minority ridership changes along the path of the route. In practice, all parts of Route 1 would likely still be classified as “minority” bus routes for Title VI purposes. However, this pattern provides further evidence that Route 1 functions as separate segments rather than as a single entity.

In terms of certain measures of service quality, riders may have different experiences on different segments of Route 1 (or any long route). For example, one section of a route may have frequent crowding as the route approaches a transfer point, while another segment would usually be less crowded. Since an individual rider is unlikely to travel across multiple segments, their experience is best reflected by the quality of service on the segment they do travel rather than on the route as a whole. Where there is sufficient rider turnover in the middle of a route and if data exists to make evaluation feasible, it may make sense to evaluate performance in segments.

Other routes that we expect to have distinct segments would be the CT1, CT2, and CT3, along with 39, 66, and 86. 

The next steps toward this reporting structure would be to analyze all the routes to see which ones should be subject to the segmentation treatment, and to investigate whether the measures of interest, like crowding, can be adjusted to work at the segment level.

MBTA passengers can sign up to receive “T-Alerts” to get information about service delays. The alerts arrive via email or text, and subscribers specify which segments of service are of interest to them. Because service disruptions can be complex and tend to unfold over time, there is a constant trade-off between the timeliness of sending the alert and the accuracy of the estimated length of impact to service. We discovered through customer feedback, including in our monthly satisfaction survey, that our passengers were relatively dissatisfied with our approach. In March 2018, two departments collaborated to design and implement a different alert system. Our job at the Office of Performance Management and Innovation (OPMI) was to evaluate these changes: did our passengers notice them, and if so, are we moving toward better communication at the critical points in our passengers’ experiences?


Every month, we send a survey to a few thousand passengers who have signed up to give us feedback on our system. Respondents report on their most recent trip and answer questions about various aspects of their experiences. We measure their satisfaction on a multitude of topics, consistently using the same 7-point scale, with 1 as “Extremely dissatisfied,” 4 as “Neutral,” and 7 as “Extremely satisfied.” This allows us to track big-picture trends over time and do deep dives into different topics while maintaining a level of comparability. From September 2017 through March 2018, three of our surveys included questions that specifically address T-Alert satisfaction.

In September 2017, of our panel respondents who received T-Alerts, 43% thought that T-Alerts did not arrive in time for them to make decisions, and 37% did not think the alerts accurately reflected the service they were experiencing. When asked about their overall satisfaction with T-Alerts, respondents reported an average satisfaction of 4.74, which is between “neutral” and “somewhat satisfied” on our scale. In December, responses to these questions were essentially unchanged. To put this in perspective, respondents were slightly more satisfied with T-Alerts than with MBTA communication overall, and less satisfied with T-Alerts than with most other channels of communication, including countdown signs in stations and the MBTA website.

Focus on Timeliness

Based on customer feedback through many channels, the Customer Technology Department collaborated with the Operations Control Center to redesign the alert system to fit an overarching philosophy that values timeliness and transparency. 

In early March 2018, the new T-Alert system was released with three substantive changes:

  • When a disruption unfolds and there is still uncertainty about how it will affect service, alerts are now issued as soon as possible with the best information we have, and relevant updates are sent out as more information is gathered.
  • Instead of classifying disruptions into the three ambiguous categories “Minor/Moderate/Severe,” the alerts use clear and straightforward language and provide estimates in minutes of how long the delay is expected to last.
  • On the back end, new processes ensure that alerts are cleared and closed out in a timely manner.

These changes reflect an emphasis on being proactive and on being willing to convey uncertainty. They prioritize speed over knowing the final outcome of the delay and place a high value on transparency.

Measuring Satisfaction

In mid-March, we asked our passengers to rate their satisfaction with T-Alerts once again. 

We kept the wording of the survey question the same in order to keep responses unbiased; part of what we were looking for was whether our passengers even noticed the change. But first, we had to isolate the effect of the change from all the other factors that could have influenced satisfaction with T-Alerts (for example, weather patterns generally have an effect on satisfaction, and this effect is likely to show up between December and March).

So, to mitigate possible confounding variables, we analyzed T-Alert satisfaction relative to other metrics collected in the same month. We chose to compare satisfaction with T-Alerts to satisfaction with communication overall, as we believe any changes in overall happiness would similarly affect satisfaction with individual channels. This gap between T-Alert satisfaction and overall communication satisfaction isolates our independent variable of interest.

Chart showing the change in satisfaction from the December to the March survey.

The chart above shows the difference between panel respondents’ satisfaction with various channels of communication and their satisfaction with MBTA communication overall. Positive values signify that people are more satisfied with that specific channel, and the larger the value, the larger the gap in satisfaction. In December, people who used third-party apps reported the highest level of relative satisfaction (0.38 points more satisfied than with communication overall). T-Alerts hovered around 0.1 points more satisfied.

The difference between the March “gaps” and the December “gaps” measures the relative change in satisfaction between December and March; larger differences in gaps signify larger improvements. T-Alerts showed the greatest improvement relative to communication satisfaction: passengers reported a 0.2-point increase in relative satisfaction, a value we consider both significant and meaningful. This increase disrupted the ordering of the channels; in December, passengers who used third-party apps were most satisfied with their respective channel, but in March, people who used text message T-Alerts reported the highest levels of satisfaction.
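The gap arithmetic described above can be written out directly. In this sketch, only the December T-Alert gap of roughly 0.1, the December third-party-app gap of 0.38, and the 0.2-point improvement come from the survey results; the raw scores are invented for illustration:

```python
# Sketch of the gap arithmetic. Only the December T-Alert gap (~0.1),
# the December third-party-app gap (0.38), and the 0.2-point T-Alert
# improvement come from the survey; every raw score is invented.
def relative_gap(channel_score, overall_score):
    """How much more satisfied riders are with a channel than with
    MBTA communication overall (7-point scale)."""
    return channel_score - overall_score


dec = {"overall": 4.60, "t_alerts": 4.70, "apps": 4.98}
mar = {"overall": 4.50, "t_alerts": 4.80, "apps": 4.75}

dec_gap = relative_gap(dec["t_alerts"], dec["overall"])  # ~0.1
mar_gap = relative_gap(mar["t_alerts"], mar["overall"])  # ~0.3
improvement = mar_gap - dec_gap                          # ~0.2
```

Comparing the change in gaps, rather than the change in raw scores, is what filters out month-to-month factors (like weather) that move all channels together.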

There were similar improvements in satisfaction with Twitter. We expected Twitter satisfaction to increase because the changes in alert strategy detailed above were also applied to the MBTA Twitter feed.

Given the unique increase in T-Alert satisfaction ratings, we believe passengers are more satisfied with T-alerts due to our intervention. 


There are two main takeaways for us from this experiment.

First, we successfully implemented a change that made our passengers measurably more satisfied with our communication. Currently over 70,000 of our passengers are subscribed to T-Alerts. We strive to match our communication to their preferences, especially in moments of service disruption. Prior analyses have repeatedly shown that quality communication is extremely valuable to our passengers and that it plays a significant role in their overall satisfaction with our system. These changes demonstrate that we can have a positive impact on our passengers’ experiences in the short term, while we conduct ongoing work system-wide. 

The second takeaway is that we are able to use data to form hypotheses about how to improve communication, implement a change, and then test whether it has the desired impact. We will continue using analysis from customer surveys to inform how we can improve and to confirm whether changes worked. If you would like to help us, please sign up for our monthly customer opinion panel.