Behind the scenes with MBTA data.

The Federal Transit Administration (FTA) Title VI Circular (C 4702.1B) requires large transit providers to collect demographic, travel, and fare payment data about their riders using passenger surveys at least every five years. The MBTA, working with the Central Transportation Planning Staff, has just completed a systemwide passenger survey to collect necessary passenger demographic data for bus routes and rail stations. This project updates the 2008-2009 dataset and will be used for service planning, ridership analysis, and Title VI equity analyses.   

The MBTA knows this data is useful for many other research projects, so we are releasing an interactive tool that allows you to compare the results for stations and bus routes. You can also download the associated datasets. 

Screenshot from the Rider Census online tool

Survey Methodology

The survey responses were obtained through a combination of an online form that was available from late October 2015 to May 2017 and a paper form with mail-in option, distributed at MBTA stations and on board MBTA vehicles from March 2016 to March 2017. Approximately half of all completed forms were submitted by each method. The English version is shown below.

The survey plan called for obtaining responses at the route level for bus and ferry routes and at the station or line segment level for all other modes. The goal was to obtain, at a minimum, enough responses from each route, station, or line segment to meet the statistical requirements for a confidence level of 90 percent with a confidence interval of 10 percent. In cases where the number of responses was insufficient to meet these standards, results from two or more routes, stations, or segments serving the same general area were combined.
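As a rough illustration of what the 90/10 target implies, the sketch below uses the standard sample size formula for a proportion. It assumes that the "confidence interval of 10 percent" means plus or minus 10 percentage points, that the characteristic is split evenly (p = 0.5), and that a finite population correction may be applied for small services. The function name and the 600-rider example are hypothetical and are not taken from the survey report.

```python
import math

Z_90 = 1.645  # approximate two-sided z-score for 90 percent confidence


def required_responses(margin=0.10, p=0.5, z=Z_90, population=None):
    """Responses needed to estimate a proportion within +/- margin.

    Assumption: simple random sampling; p=0.5 is the most demanding case.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction (an assumption, for small services).
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


print(required_responses())                # 68 for a very large ridership
print(required_responses(population=600))  # hypothetical 600-rider service
```

Under these assumptions, a service needs on the order of 70 valid responses per question, and somewhat fewer when its total ridership is small.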

To compensate for differences in response rates when comparing results from different lines or modes, the published results for each route, station, or segment are weighted in proportion to typical weekday total passenger boardings on the corresponding services based on recent count data. 
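The effect of this weighting can be sketched with a small example. The route names, boardings, response counts, and shares below are invented for illustration, and the functions are not part of any MBTA tool. The sketch only shows why weighting by boardings differs from pooling raw responses.

```python
def weighted_share(routes):
    """Combine route-level shares, weighting each route by weekday boardings."""
    total_boardings = sum(r["boardings"] for r in routes)
    return sum(r["share"] * r["boardings"] for r in routes) / total_boardings


def pooled_share(routes):
    """Pool raw responses, so routes with high response rates dominate."""
    total_responses = sum(r["responses"] for r in routes)
    return sum(r["share"] * r["responses"] for r in routes) / total_responses


# Hypothetical data: route B returned as many forms as route A
# despite carrying far fewer riders.
routes = [
    {"route": "A", "boardings": 9000, "responses": 150, "share": 0.40},
    {"route": "B", "boardings": 1000, "responses": 150, "share": 0.80},
]

print(round(weighted_share(routes), 2))  # weighted by boardings
print(round(pooled_share(routes), 2))    # unweighted pooling of responses
```

In this made-up case the boardings-weighted share is 0.44 while the pooled share is 0.60, because the low-ridership route is over-represented among the returned forms.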

More detailed methodology can be found in the survey report, available soon.

Data Considerations

Please consider the following in working with these data:

90/10 confidence and precision: Below the 90/10 level, data are not displayed (the tool shows “Insufficient Data”). The displayed routes and stations all meet at least this level of confidence and precision, but some services are near this threshold, and some are much more precise. The additional precision mostly comes from larger samples on high-ridership routes and stations. In addition, we assumed the “worst case” of evenly split characteristics in order to evaluate the confidence and precision levels. Since some characteristics are not expected to be split evenly among riders, even data that only just meet the 90/10 level are likely to be more reliable than 90/10 in practice.

However, because the confidence and precision levels can vary, it is important to take the possibly wide intervals into account when comparing routes or stations to each other. Conclusions about differences among services are likely to be more reliable for higher-ridership routes and stations, or for aggregations of services (e.g. at the mode level). To assist with this evaluation, the valid sample counts for each question are provided along with the weighted response data in the downloadable datasets.
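One way to use the valid sample counts is to compute an approximate margin of error for each result before comparing two services. The sketch below assumes simple random sampling and a normal approximation. The overlap check is a conservative rule of thumb rather than a formal significance test, and all names and numbers are hypothetical.

```python
import math

Z_90 = 1.645  # approximate two-sided z-score for 90 percent confidence


def margin_of_error(n, p=0.5, z=Z_90):
    """Approximate half-width of the interval around a share p from n responses."""
    return z * math.sqrt(p * (1 - p) / n)


def intervals_overlap(p1, n1, p2, n2, z=Z_90):
    """True when the two intervals overlap, so the gap may just be noise."""
    m1 = margin_of_error(n1, p1, z)
    m2 = margin_of_error(n2, p2, z)
    return (p1 - m1) <= (p2 + m2) and (p2 - m2) <= (p1 + m1)


# Hypothetical comparison of two low-sample routes and two high-sample routes.
print(intervals_overlap(0.45, 70, 0.55, 70))    # small samples
print(intervals_overlap(0.30, 400, 0.60, 400))  # larger samples, bigger gap
```

With roughly 70 responses each, a 10-point gap between two routes falls inside the overlapping intervals, while the second pair is clearly separated.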

Check all that apply: Questions that allowed respondents to check multiple responses will have answer options that total more than 100 percent. These questions are:

  • Do you sometimes make this trip another way? and
  • How do you self-identify by race?

Trip-specific information: Some questions have wording that is trip-specific and cannot be generalized to MBTA use overall, including:

  • “Fare payment” applies to the reported trip, not to fare payments overall
  • “Trip frequency” applies to the reported trip, not to the frequency of riding the MBTA
  • “Alternative modes” refers to alternative modes for the reported trip, not for alternative modes to the MBTA in general 

Survey response bias: Some groups of people are more likely to respond to surveys than others. Large disparities in the results for these groups likely reflect differences in response rates rather than equally large differences in the actual ridership population. Specifically, the gender disparities and English-speaking ability disparities are likely effects of response bias (women and English-speakers are more likely to respond to surveys) and not necessarily representative of the population. For these demographic elements, the reported values are likely to be biased, but the trends are likely reliable (i.e. a bus route with more women than another bus route likely does have more women, but the percentage of women on both bus routes is likely over-estimated).

Additionally, we believe that visitors to the region and the MBTA are less likely to fill out and return surveys than regular riders. This response bias reveals itself most in the fare payment data – the portion of survey respondents who reported using monthly or seven-day passes is higher than the portion recorded by our fare system paying with these passes. This and other biases may show up in other results as well.

English Proficiency: The “Ability to Understand English” results cannot be assumed to provide an accurate measure of the percent of MBTA riders with little or no English proficiency because 99 percent of the returned survey forms used the English version, and forms were available in a limited number of other languages (Spanish, Portuguese, Cape Verde Creole, traditional and simplified Chinese, Vietnamese, French).

The MBTA, working with the Central Transportation Planning Staff, has just completed a systemwide passenger survey to collect necessary passenger demographic data for bus routes and rail stations. This project updates the 2008-2009 dataset and will be used for service planning, ridership analysis, and Title VI equity analyses.   

The MBTA knows this data is useful for many other research projects, so we are releasing an interactive tool that allows you to compare the results for stations and bus routes. 

In collaboration with the Boston Area Research Initiative, the MBTA is holding a data challenge to see how students and researchers can creatively use the survey data to answer research questions. The winners of the data challenge will be invited to present their work at the BARI Spring 2018 conference on April 27th, 2018. 

Screenshot from the Rider Census tool

Data Challenge Logistics

The first rule is to read all the data caveats! After that, you are free to do whatever analysis interests you. To get you started, we have created a list of potential research questions (below). Feel free to combine this data with other datasets about Boston.

You may work on your submission as individuals or teams. Submissions are due at midnight at the end of April 16th, 2018. Please e-mail them along with your contact information to [email address omitted]. You may also contact us at this address with any data questions you have as you work on the challenge.

Winners will be notified on April 20th, 2018 and invited to attend the BARI Spring 2018 conference and present their results. Winning submissions will also be featured on this very prestigious data blog.

Your submission can be a map, written analysis, an interactive tool, or whatever you think best conveys the analysis you did. 

Data Challenge Criteria

Submissions will be judged on the following criteria:

  • Accuracy of the analysis: Did you use the data correctly? Were your analyses methodologically sound and well-documented? Did you account for the caveats?
  • How compelling the research question is: Does the analysis reveal something that was not apparent at first glance? Does it confirm something we believed but weren’t sure about? The results do not have to be surprising to be compelling.
  • Presentation of the analysis: Is the deliverable easy to understand? Are the graphics clear? Are the graphics and tables helpful in understanding the results?

Potential Research Questions

To get you started, one of our interns did an analysis of how the minority usage on our bus routes compares to demographics of the tracts the route passes through.

Other ideas:

  • Where do MBTA rider demographics match (or not match) resident populations? Extend the above analysis to other demographics or look at it another way.
  • Which route/station has the most representative demographics of Boston? (It’s up to you how you want to define Boston geographically and how you want to define representative.)
  • Do the access modes to the stations reflect land use around those stations? Also think about parking availability and transfers (available in the dataset).
  • How does household vehicle availability match (or not) with usage? Are there spatial or demographic explanations for any mismatch?

Links to other datasets that might be useful

BARI data portal

Hubway data 

City of Boston open data portal

MAPC Vehicle Census

How to measure equity on high ridership bus routes

Anna is an M.A. candidate in Tufts’ Urban and Environmental Policy and Planning program and is one of OPMI’s interns for this semester. The following is a post she wrote about a class GIS project she did using recently-available MBTA Rider Census data. We’re currently holding a Data Challenge using this data – see this post for more details.

The MBTA follows Title VI of the 1964 Civil Rights Act, which protects people from discrimination based on race, color, or national origin. This means that when the MBTA considers service improvements and changes in service, it must make sure that these changes do not have a disparate impact on minority populations. In order to evaluate service changes for impacts on minority populations, the MBTA must understand the proportion of riders who are minority on all of its services.

There are currently two main ways to calculate minority ridership. The MBTA could either conduct a rider census survey to collect demographic information from a sample of riders on each MBTA bus route or station, or use U.S. census data for people living within each service's service area to estimate minority ridership. Conducting a rider census survey is the preferred method; however, this method is resource- and time-intensive. U.S. census data is easily accessible and easily analyzed, so this method, if representative, would be useful.

In this project, I sought to understand if U.S. census minority data within MBTA bus service areas was representative of the minority bus ridership on MBTA buses. Essentially this is a question of whether the minority make-up of the people on bus routes matches the minority make-up of the neighborhoods the routes travel through.  

The MBTA recently conducted a Rider Census (the results are available here), so I compared the rider census survey data to U.S. census minority population data in MBTA bus service areas.

How did I analyze the difference between the minority population riding the bus and living in bus service areas?

I examined all MBTA bus routes with a daily ridership of over 600 passengers, excluding the Silver Line. The analysis used MBTA Rider Census data, MassGIS bus stop and bus route data and 2010 U.S. census block and population data. 

In order to find service areas for the key bus routes, I conducted a network analysis on the key routes’ bus stops using Massachusetts streets. I then selected the census blocks intersecting the service area polygons generated from the network analysis. 

For all non-key bus routes with daily ridership over 600 passengers, I conducted a simpler analysis to find the service area. For these 121 routes, the census blocks within a quarter-mile Euclidean radius of the bus routes’ bus stops were selected to create the routes’ service areas. I chose this simpler analysis for these routes because it was faster to conduct on a large number of bus routes. Network Analyst more accurately represents walking distances; however, the results of the simpler analysis varied little from the Network Analyst results, so I chose to use the more accurate Network Analyst method for key bus routes only.
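The simpler selection can be approximated in a few lines. This is a sketch under stated assumptions, not the GIS workflow used in the project. It assumes coordinates are in a projected system measured in meters, and it represents each census block by a single point rather than testing the full block polygon against the buffer. All IDs and coordinates are hypothetical.

```python
import math

QUARTER_MILE_M = 402.336  # 0.25 mile expressed in meters


def blocks_near_stops(stops, blocks, radius=QUARTER_MILE_M):
    """Return IDs of blocks whose representative point is within radius of any stop.

    Simplification: a block is reduced to one point, whereas a GIS selection
    would test the whole block polygon against the buffer.
    """
    selected = []
    for block_id, (bx, by) in blocks.items():
        if any(math.hypot(bx - sx, by - sy) <= radius for sx, sy in stops):
            selected.append(block_id)
    return selected


# Hypothetical stops and block points, in meters.
stops = [(0.0, 0.0), (500.0, 0.0)]
blocks = {
    "b1": (100.0, 100.0),    # close to the first stop
    "b2": (950.0, 0.0),      # 450 m from the nearest stop
    "b3": (500.0, 300.0),    # close to the second stop
    "b4": (2000.0, 2000.0),  # far from both stops
}

print(blocks_near_stops(stops, blocks))
```

A network-based service area would replace the straight-line distance with walking distance along streets, which is what the Network Analyst approach provides.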

After finding the service areas for key routes and for all bus routes, I calculated the average U.S. Census minority percentage living in each route’s service area. I then compared the U.S. Census minority percentage to the MBTA rider census minority percentage for each of these routes. For each bus route, I created a value by dividing the MBTA rider minority percentage by the U.S. Census minority percentage to show the rider census minority population as a percentage of the U.S. Census minority population living within the route service area. This ratio, referred to below as the difference value, shows how the percentage of riders who are minority compares with the percentage of the population living within each bus service area that is minority. I then mapped the difference values for each bus route, for several individual high-ridership bus routes, and for the bus routes with the highest difference values.
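The calculation itself is a simple ratio. The sketch below applies it to the Route 1 and Route 93 percentages reported later in this post and classifies routes with the 86% to 115% "comparable" band from the Figure 2 caption. The function names are hypothetical, and treating the band edges as inclusive is an assumption.

```python
def minority_ratio(rider_pct, census_pct):
    """Rider minority percentage as a percentage of the census minority percentage."""
    return 100.0 * rider_pct / census_pct


def classify(ratio, low=86.0, high=115.0):
    """Label a route relative to the 'comparable' band (edges assumed inclusive)."""
    if ratio < low:
        return "lower"
    if ratio > high:
        return "higher"
    return "comparable"


route_1 = minority_ratio(36.7, 28.4)   # Route 1 values from this post
route_93 = minority_ratio(30.3, 9.9)   # Route 93 values from this post

print(round(route_1, 1), classify(route_1))    # 129.2 higher
print(round(route_93, 1), classify(route_93))  # 306.1 higher
```

A value of 100 means riders and residents have the same minority share. Route 93's value above 300 matches the "more than three times" description given for the routes with the highest difference values.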

So, can U.S. Census data predict minority bus ridership?

The study found that the percentage minority riding MBTA buses is higher than the percentage minority living in MBTA bus service areas; t(115) = 12.38, p < .001. On 86% of bus routes, the minority percentage riding the bus was at least 115% of the minority percentage living in the service areas (Figure 1). Figure 2 below shows the 7% of MBTA bus routes in blue where the minority population percentage living in the bus service area is comparable to the minority population percentage taking the bus.
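The post does not say which t-test was used. The reported 115 degrees of freedom would be consistent with a paired test across 116 routes, so the sketch below shows that form as an assumption. The percentages are invented for illustration and are not Rider Census results.

```python
import math
import statistics


def paired_t(rider_pcts, census_pcts):
    """Paired t statistic and degrees of freedom for route-level differences."""
    diffs = [r - c for r, c in zip(rider_pcts, census_pcts)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # sample standard deviation
    return mean_diff / std_err, n - 1


# Hypothetical minority percentages for five routes.
rider = [40.0, 55.0, 30.0, 62.0, 48.0]
census = [30.0, 50.0, 12.0, 60.0, 41.0]

t_stat, dof = paired_t(rider, census)
print(round(t_stat, 2), dof)
```

A large positive t statistic indicates that rider minority percentages exceed service-area percentages by more than route-to-route variation would explain. Converting it to a p-value requires the t distribution, which is not in the Python standard library.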

Figure 1. Pie chart demonstrating the percentage of bus routes with minority bus ridership higher, lower and similar to the minority population living within bus service areas. 

Figure 2. This map shows, in blue, the eight MBTA bus routes where the percentage of minority bus riders is comparable (between 86% and 115%) to the minority population percentage living in the bus service area.

For example, on Route 1, the MBTA bus route with the fourth-highest weekday ridership, 36.7% of bus riders are minority; however, only 28.4% of people living in the Route 1 service area are minority. Figure 3 below shows the minority population percentage living within the service area census blocks adjacent to Route 1 bus stops.

This analysis not only demonstrates that minorities disproportionately ride the bus compared with the non-minority population, but also demonstrates where the difference between minority bus ridership and minority residents living in bus service areas is the highest. Figure 4 shows a map of routes where the minority percentage of bus riders is greater than the minority percentage of the population living in the bus service area (difference values between 100% and 360%).

Figure 4. This figure shows the bus routes with a percentage minority ridership higher than the percentage minority residents in the service area. Darker orange bus routes have a higher difference.

The routes with the highest difference between minority bus ridership and minority population in the service areas are bus routes 350, 134, 76, 230 and 93. On these bus routes, the percentage minority of bus riders is more than three times the percentage of minority residents living in the service areas. Route 93 has the highest difference (Figure 5): 30.3% of bus riders on Route 93 are minority, but only 9.9% of the population in the bus service area is minority. Route 93 runs through downtown between Haymarket and Sullivan Square.

Figure 5. Route 93 bus route and minority population distribution within its service area.

In Conclusion…

Based on this analysis, the minority population living within a bus route’s service area is not representative of minority ridership on buses travelling along the route, so the MBTA should continue to use rider census data, instead of U.S. Census data, to estimate minority bus ridership. 

This study’s methodology could be improved by automating the Network Analyst tool for creating bus service areas. This way, the analysis could be carried out quickly on all bus routes, not only high-ridership routes. Also, further analysis could explore alternative ways to select census block data within a 0.25-mile walkshed of a bus stop. The current analysis selects full census blocks that fall partly within the 0.25-mile walkshed, in an effort to avoid assuming an even distribution of the minority population. Future analysis could examine the distribution of minority populations within census blocks to improve accuracy.

Future studies may also generate a way to predict minority bus ridership using U.S. Census minority percentages, by weighting each U.S. census block minority percentage based on the number of boardings at the stop within the block. A future tool for predicting minority ridership based on U.S. Census minority population data could also identify and use other external factors that may affect minority ridership in order to generate a more accurate prediction.

Finally, this study gives the MBTA a general understanding of the minority bus usage in non-minority residential corridors. This could help the MBTA make service improvements to improve access for these riders.