Behind the scenes with MBTA data.

The MBTA is excited to be taking many steps to improve our bus services. You can read about the process used to analyze the changes proposed by the Better Bus Project in our previous post. We're also in the beginning stages of re-designing our overall bus network. And of course, we're continuing our work to partner with cities and towns to implement street-level changes in order to prioritize the movement of buses.

Over the last two+ years, the MBTA, in partnership with cities, towns and other stakeholders, has begun piloting and implementing bus-only lanes on some of our key corridors. We will continue to partner to improve bus service by implementing more bus-only lanes as well as other interventions like queue jumps and transit signal priority.

These lanes take various forms: some are operational only during peak times, some are all-day, and they are different lengths and affect different areas. Here at the blog, we will be presenting some analysis on these lanes and all our bus priority interventions as data becomes available. Read below for an overview of the bus-only lanes we’ve introduced so far. To get an in-depth look at some of the rationale for why these lanes are so important, take a look through our bus ridership report!

The basic idea behind the bus lanes is that there are 40 or more people on a bus during peak times, and usually just one in a private vehicle, so simple geometry says it makes sense to prioritize the travel of buses. In many cases, these lanes have taken the place of a parking lane that was only lightly used. Separating buses from mixed traffic can both improve the speed of bus travel along the corridor and decrease the variability of run times, which makes the bus more competitive with driving. Over time, this can improve the experience for current passengers and attract new passengers to the bus. We’ll take a look at how well these interventions have met these goals below.

1. Broadway Bus Lane, Everett

The Broadway Bus Lane in Everett operates in the inbound direction for about one mile from 4AM to 9AM, Monday through Friday. It has been in operation since December 5, 2016, though various improvements have been made since then to improve the operation of the lane. There are 5 bus routes that serve the 11 stops within the bus lane. Route 109 serves the entirety of the corridor, while routes 97, 104, 110, and 112 join the corridor at various points along the way (Figure 1). More information about the project is available on the MassDOT website.

Figure 1: Routes serving the Broadway Bus Lane in Everett

1.1 Run Time Changes

To evaluate changes in run time, we used data from the MBTA’s automated passenger counter (APC) system. APCs collect information about when the doors open and close at each stop and the number of passengers boarding or alighting. Dwell times, defined here as the time in between when the doors are opened and closed at each stop, are impacted by the number of passengers boarding and alighting, whether they pay in cash or prepaid fare media, the presence of wheelchairs and strollers, and other factors. In order to focus on run times (since other programs like AFC 2.0 are intended to help reduce dwell times), the run time analysis looked at the run time between when the doors close at one stop until they open at the next stop. 

To calculate the change in run time from the bus lane, we used data from weekdays between September 1 and November 15 in 2016 and 2018. The data from 2016 represent the before condition, while 2018 represents the after condition. It is important to compare times from similar seasons, because traffic patterns vary throughout the year due to school schedules, holidays, and weather. Fall is a good time to use for comparison because ridership and traffic are generally at their highest annual levels, and travel patterns are not generally impacted by holidays and/or bad weather. 

Because various routes enter the corridor at different points, we began by looking at route 109, the only route that covers the entire corridor. We first calculated the run time for each trip in the corridor by excluding the dwell time. Next, we grouped these observations by hour and calculated the median and 90th percentile run times by hour for the corridor. The median value represents the “normal” run time for each hour of the day, while the 90th percentile is useful for MBTA operations, because each trip’s scheduled run time and recovery time is based on the 90th percentile run time. Reductions in the 90th percentile run time will have an outsized impact on vehicle reliability and the frequency at which the MBTA can operate buses.
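As a rough illustration of the calculation described above, here is a minimal Python sketch. It is not the MBTA's actual code: the input layout (door-open and door-close times in seconds for each stop on a trip), the function names, and the linear-interpolation percentile are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def trip_run_time(stop_events):
    """Run time for one trip with dwell excluded.

    stop_events is a list of (door_open_s, door_close_s) tuples in stop
    order. Only the gaps from doors closing at one stop to doors opening
    at the next stop are counted.
    """
    return sum(
        next_open - prev_close
        for (_, prev_close), (next_open, _) in zip(stop_events, stop_events[1:])
    )


def run_time_by_hour(trips):
    """trips is a list of (hour_of_day, stop_events).

    Returns {hour: (median_run_time, p90_run_time)} in seconds.
    """
    by_hour = defaultdict(list)
    for hour, stop_events in trips:
        by_hour[hour].append(trip_run_time(stop_events))
    return {
        hour: (median(times), percentile(times, 90))
        for hour, times in sorted(by_hour.items())
    }
```

Running the same function on the before and after periods and subtracting the results hour by hour would give the kind of comparison shown in the figures.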

These results are shown in Figure 2 and Figure 3 below for the median and 90th percentile, respectively. We collected data throughout the day, but these figures show the results for peak hours when the greatest numbers of buses are in service and sufficient data was collected. While there was a clear reduction in run time between 2016 and 2018 between 5 and 9 AM, there was no discernible change during the 9-10 AM hour and the PM peak hours when the bus lane is not in operation. The savings were particularly strong from 7-8 AM, when buses saved almost 8 minutes at median and almost 11 minutes at the 90th percentile. Other routes on the corridor realized a smaller portion of the savings because they utilize a smaller portion of the bus lane. In the next section, this is considered in greater detail. 

Figure 2: Median corridor run time by hour

Figure 3: 90th percentile corridor run time by hour

1.2 Passenger Time Savings

After determining the run time savings, we then looked at passenger time savings, which is a function of both time saved and the number of passengers on the bus. The steps in the calculation are as follows:

  1. Calculate the median and 90th percentile run time by hour for each stop-to-stop pair in 2016 and 2018, using the methods described above. The time savings for each hour is the difference between 2016 and 2018.
  2. Calculate the average stop time and passenger load of each trip leaving each stop in 2018.
  3. Multiply the time savings times the load for each trip-stop and aggregate. 
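The steps above can be sketched roughly as follows. This is a simplified illustration rather than the actual analysis code: the dictionary keys, the function name, and the choice to skip stop pairs missing from either year are assumptions.

```python
def passenger_hours_saved(before_s, after_s, trip_loads):
    """Aggregate passenger time savings in hours.

    before_s / after_s: {(hour, from_stop, to_stop): run time in seconds},
        for example the median or 90th percentile for each period.
    trip_loads: list of (hour, from_stop, to_stop, load), one entry per
        trip leaving a stop in the "after" period.
    """
    total_seconds = 0.0
    for hour, from_stop, to_stop, load in trip_loads:
        key = (hour, from_stop, to_stop)
        if key not in before_s or key not in after_s:
            continue  # assumption: skip pairs without data in both years
        time_saved = before_s[key] - after_s[key]
        total_seconds += time_saved * load
    return total_seconds / 3600


# Two trips carrying 30 passengers each over a segment that got
# 120 seconds faster: 2 * 30 * 120 seconds = 2 passenger-hours.
example = passenger_hours_saved(
    {(7, "A", "B"): 300},
    {(7, "A", "B"): 180},
    [(7, "A", "B", 30), (7, "A", "B", 30)],
)
```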

In Fall 2018, on the median weekday morning, passengers saved 24 hours of travel time. On the 90th percentile "bad" day, passengers saved 65 hours altogether. The results by hour are shown in Figure 4 below. Passengers on routes 104 and 109 accounted for more than two-thirds of the total savings due to higher ridership and more utilization of the bus lane corridor on those routes. 

Figure 4: Passenger-hours saved per day

2. Mt. Auburn Street

The Mt. Auburn St. Bus Lane in Cambridge and Watertown operates in the inbound direction 24 hours a day, 7 days a week. It began operation on October 15, 2018, though there were various implementation and signal issues at first, and it reached normal operation by November 15, 2018. Routes 71 and 73 serve the bus lane. There are short bus-only lanes on Belmont Street and Mt. Auburn Street, served by the 73 and 71, respectively, just before the routes join at the intersection of Belmont Street and Mt. Auburn Street; signal priority and queue jumps were also installed. From there, the bus lane continues east to Fresh Pond Parkway, though it is not continuous throughout the corridor. More information about the project is available on the City of Cambridge’s website.

Figure 5: Map of the Mt. Auburn Bus Lane in Cambridge and Watertown

2.1 Run Time Changes

Routes 71 and 73 are served by Electric Trolley Buses (ETBs), which were the last portion of MBTA’s bus fleet to not have any vehicles equipped with Automated Passenger Counters (APCs). In July 2018, 5 of the 28 ETBs were equipped with APCs, and it took several months after installation before data was regularly collected due to configuration issues. As a result, we did not collect enough data before the bus lane installation to conduct a thorough year-over-year analysis. We used data from the automated vehicle locator (AVL) system as an alternative. It is less granular than APC data, so it is less useful for drilling down into bus lane performance, but all vehicles are tracked by AVL, offering comprehensive coverage.

To calculate the change in run time from the bus lane, we used data from weekdays between January 1 and March 31 in 2018 and 2019. The data from 2018 represent the before condition, while 2019 represents the after condition. Again, it is important to compare times from similar seasons, because traffic patterns vary throughout the year due to school schedules, holidays, and weather. Although data from Fall is preferable, the implementation and refinement of the bus lane occurred in various stages throughout October and November, making Fall 2018 a poor time for comparison. 

We then calculated the run time for the route segment between the intersection of Mt. Auburn and Belmont St and the eastern side of Mt. Auburn Hospital. This segment encompasses all of the bus lane shared by routes 71 and 73, and omits only the separate portions on Belmont St and Mt. Auburn St. before the intersection. Next, we grouped these observations by half-hour and calculated the median and 90th percentile run times by half-hour in both directions. The median value represents the “normal” run time for each half-hour of the day, while the 90th percentile captures what a typical “bad day” is like. Even if a day like this only happens 1 out of 10 times, bus riders likely need to plan for the full range of possible travel times when planning their trips. Similarly, the 90th percentile is useful for MBTA operations, because each trip’s scheduled run time and recovery time is based on the 90th percentile run time. Reductions in the 90th percentile run time will have an outsized impact on vehicle reliability and the frequency at which the MBTA can operate buses.

The inbound and outbound results are shown in Figure 6 and Figure 7 below. In the inbound direction, buses save 3-4 minutes at median and 5-8 minutes at the 90th percentile during the busiest time of the AM peak, and 0.5 to 2 minutes throughout the rest of the day. At the 90th percentile, this represents an 8-12% reduction of the maximum cycle time during the AM peak period. Despite there not being any bus lane in the outbound direction, run times were consistently shorter throughout the day, with buses saving 0.5-1.5 minutes. This is likely due in part to changes in signal timing that have helped move all vehicles through the project area. 

Figure 6: Year-over-year change in inbound segment run times on Mt. Auburn St.

Figure 7: Year-over-year change in outbound segment run times on Mt. Auburn St.

2.2 Next Steps

Next, we will work to update these calculations as more seasonal data is collected. We also plan to estimate the savings in terms of passenger hours saved by the bus lane, though as discussed above there are some data issues that complicate this calculation. 

3. Washington Street, Roslindale

The Washington Street bus lane operates in the inbound direction from Roslindale Village to the Forest Hills MBTA Station, a distance of about one mile, from 5AM-9AM Monday – Friday. After a pilot period in May 2018, permanent operation began on June 18, 2018. Nine MBTA bus routes—30, 34, 34E, 35, 36, 37, 40, 50, and 51—operate in the corridor. More information about the project is available on the City of Boston’s website.

3.1 Run Time Changes

To evaluate changes in run time, we again used data from MBTA’s automated passenger counter (APC) system. As previously explained, APCs collect information about when the doors open and close at each stop and the number of passengers boarding or alighting. The number of passengers boarding and alighting directly affects dwell times. Therefore, in order to focus on run times (since other programs like AFC 2.0 are intended to help reduce dwell times), the run time analysis looked at the time between when the doors close at one stop and when they open at the next stop, excluding dwell times.

Unlike the bus lanes discussed above, the Washington Street bus lane presents the ideal conditions to evaluate the bus with APC data. All buses that serve the corridor are equipped with APCs, providing a rich source of data.

To calculate the change in run time from the bus lane, we used data from weekdays between January 1 and March 15 in 2018 and 2019. The data from 2018 represent the before condition, while 2019 represents the after condition. Although data from the Fall is preferable, Fall 2017 could not be used because bus travel times were greatly impacted by road construction around Forest Hills, making it a poor choice for comparison. 

We first calculated the run time for each trip in the corridor by excluding the dwell time. Next, we grouped these observations by half-hour and calculated the median and 90th percentile run times by half-hour for the corridor. The median value represents the “normal” run time for each half-hour of the day, while the 90th percentile is useful for MBTA operations, because each trip’s scheduled run time and recovery time is based on the 90th percentile run time. Reductions in the 90th percentile run time will have an outsized impact on vehicle reliability and the frequency at which the MBTA can operate buses.

These results are shown in Figure 8 below. There was a clear reduction in run time between 6AM and 9AM, when the bus lane is in operation, while times were very similar during the rest of the day. Buses save 2 minutes at median and 5-7 minutes at the 90th percentile during the busiest time of the AM peak. In the following sections, we look at how this affected passenger travel time and total ridership.

Figure 8: Year-over-year change in inbound segment run times on Washington Street

3.2 Passenger-weighted Savings

After determining the run time savings, we then looked at passenger time savings, which is a function of both time saved and the number of passengers on the bus. The steps in the calculation are as follows:

  1. Calculate the median and 90th percentile run time by hour for each stop-to-stop pair in 2018 and 2019, using the methods described above. The time savings for each hour is the difference between 2018 and 2019.
  2. Calculate the average stop time and passenger load of each trip leaving each stop in 2019.
  3. Multiply the time savings times the load for each trip-stop and aggregate. 

Using this method, the incremental and cumulative savings in passenger travel times is shown in Figure 9 below. In total, MBTA riders save 41 total hours of travel time at the median and 176 hours at the 90th percentile due to the Washington St bus lane per weekday. It’s possible that some of the savings calculated here may be due in part to lingering construction impacts prior to the start of the bus lane, and we will continue to evaluate year-over-year changes to confirm these estimates. 

Figure 9: Reduction in passenger-hours of travel times due to decreased run times on Washington Street

3.3 Changes in Ridership

Finally, we looked at ridership in the corridor to determine whether there was any year-over-year change in ridership. During the hours of 5AM – 9AM, we found a 4% increase in ridership between Fall 2017 (3,181 arrivals at Forest Hills) and Fall 2018 (3,300 arrivals at Forest Hills). We found a similar increase of 4% between Winter 2018 (2,911 arrivals at Forest Hills) and Winter 2019 (3,034 arrivals at Forest Hills). Because there are many external factors that may impact ridership, changes in ridership cannot be solely attributed to the bus lane. However, we will continue to monitor ridership in the corridor.

3.4 Next Steps

Next, we will work to update these calculations as more seasonal data is collected. We will also evaluate the other bus lanes that are newly installed or planned for the future.

 

On Monday, December 3, 2018, the walkway that links Independence Avenue in Quincy to the Red Line’s Quincy Adams Station reopened, allowing for the adjacent neighborhood to have an easier and more direct access point to the station.

An image of the newly reopened walkway linking Independence Avenue in Quincy to the Red Line's Quincy Adams Station.

This gated entry point near Independence Avenue allows for a better pedestrian connection direct to the station. As we highlighted way back in the early days of the Data Blog, the walkshed around this station was severely limited when this gate was closed, and really the only way to access the station was via car or bus. You can see that the entire neighborhood, which is just steps from the station as the crow flies, was not accessible along the pedestrian network:

A map of the surrounding neighborhood and walkshed affected by the closing and reopening of the walkway linking Independence Avenue and Quincy Adams Station.

With the reopening of the Independence Avenue gate, would there be an increase in the number of riders boarding at the Quincy Adams station? To learn more about how the reopening of this entrance would impact our riders, we dug into the data. First we queried our Automated Fare Collection (AFC) database to get the tap count on fare gates at Quincy Adams station and on buses that stop on Independence Avenue.

We started by identifying the total number of taps on each of the fare gates at Quincy Adams station. The results are as shown below. The fare gates are numbered according to the ease of access from the station entrance; for example, fare gate 1 is numbered as such because it is closest to the station entrance. If more riders started commuting from Quincy Adams Station after the station entrance at Independence Avenue reopened, we expected the tap count to increase on each fare gate or on the one that is closest to the entrance. However, we observed that there was not much change in the tap count on the fare gates compared to the previous years. The tap count on fare gate 4 increased from December 2018 to January 2019 but this fare gate is the one in the middle and not very close to the station entrance. We concluded that this fare gate was probably used the most because the other fare gates were down (or perhaps this was just random noise) and the increase in tap count was not an impact of reopening the station entrance at Independence Avenue.

A chart tracking faregate 1 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 2 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 3 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 4 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 5 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

You can also see the entries visualized by day in the following chart. While we see a slight increase from normal in the daily entries on a few days soon after the new entrance opened, we also saw a high number of entries on November 29, 2018, the week before the new entrance opened. We also don’t see the trend continuing into December and January.

 

After looking into the fare gate data, we took a look at the bus ridership for the routes that travel through the newly-accessible neighborhood. We assumed that a few riders took the bus to Quincy Center and then transferred to the Red Line at the Quincy Center station before the Quincy Adams station entrance opened at Independence Avenue, so we looked at the weekday tap counts on bus route 230, which has various stops on Independence Avenue. Unfortunately, we didn't have good APC coverage on buses along this route over the whole time period, and the data we do have at the stop level from ODX is spotty. But here is what we found:

If passengers who previously took the bus to Quincy Center started walking to Quincy Adams once the station opened, there should have been a decrease in the weekday tap count at these stops after December 3rd 2018, but there is no significant decrease compared to the previous years in the data we have.

We then thought that passengers who lived near Quincy Adams, or perhaps in between the two stations, may have been walking, biking, or getting dropped off at Quincy Center before the Quincy Adams gate opened. If this hypothesis was correct, we thought we’d find a number of CharlieCards that typically tapped in at Quincy Center switching to tapping in at Quincy Adams instead. We decided to look at the number of CharlieCards/Tickets that were tapped at Quincy Center between September 1, 2018, and December 3, 2018, to check if these same cards were tapped at Quincy Adams between December 4, 2018, and May 31, 2019 when the station entry point on Independence Avenue was reopened. We found that about 9.22% of CharlieCards/Tickets that were tapped at Quincy Center station between September 1 and December 3, 2018, were also tapped at Quincy Adams station between December 4, 2018, and May 31, 2019. We wanted to see if this percentage of change was unique to that time period, or if a similar change has happened in previous years. Therefore, we looked at the number of CharlieCards/Tickets that were tapped at Quincy Center station the previous year (September 1, 2017 - December 3, 2017) and checked if the same cards were tapped at Quincy Adams station between December 4, 2017, and May 31, 2018. We found that 8.49% of the cards that were tapped at Quincy Center station between September 1 and December 3, 2017 were tapped at Quincy Adams station between December 4, 2017, and May 31, 2018. 
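The card comparison described above comes down to a set intersection. The sketch below is only an illustration: the function name and the idea of passing in plain lists of card identifiers are assumptions, and the two percentages at the end are the ones reported in this post.

```python
def switch_share(cards_at_center_before, cards_at_adams_after):
    """Share of cards seen at Quincy Center in the first window that were
    also seen at Quincy Adams in the second window."""
    center = set(cards_at_center_before)
    if not center:
        return 0.0
    return len(center & set(cards_at_adams_after)) / len(center)


# Toy example with made-up card IDs: 1 of 4 cards shows up at both stations.
toy_share = switch_share(["a", "b", "c", "d"], ["b", "x"])

# Comparing the reported shares for the reopening year and the prior year
# gives the change in percentage points.
difference_points = round(9.22 - 8.49, 2)
```

A difference of well under one percentage point between the two years is what led us to treat the change as inconclusive.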

This led us to conclude that even though some riders who used CharlieCards/Tickets at Quincy Center station before December 3, 2018, used them at Quincy Adams station after December 3, 2018, this is not necessarily a direct impact of the reopening of the Quincy Adams station entrance at Independence Avenue, since a similar share of cards made the same switch the previous year. Riders usually take the Red Line from either Quincy Center station or Quincy Adams station.

Conclusion

We looked at the data in a number of ways, but could not find a significant indicator of use of the pedestrian gate at Quincy Adams. Of course, we also don’t have any counters or sensors at that gate or on the path, so we don’t believe that no one is using it – just that it was not a big enough change to notice in the data, especially compared to the people who use the other entrance and parking garage to access the station.

Looking at the neighborhood near the new entrance, it is not particularly dense (mostly single- and two-family homes), so we would not expect a huge number of new riders compared to the thousands who were already accessing the station from the other side. It is also likely that since the entrance was closed historically, people whose usual trip was on the Red Line would not choose to live there in great numbers. This may change as time goes on and new people move to the neighborhood, or people living there change their travel patterns. 

This investigation also shows the limitations of our current data, especially on ridership. We don’t always have the level of detail needed to discern relatively small changes like this, especially since there is always normal variation in passenger behavior. We are always investigating new cost-effective ways to learn things about how our passengers travel while continuing to respect their privacy.

 

As an essential measure of the performance of the MBTA, we report our best estimates of ridership each month both on the MBTA Back on Track Dashboard and to the National Transit Database. As we have discussed on the blog, the source data for ridership comes from different systems and is measured in different ways. There are also many riders and trips that we are unable to measure from our equipment, and whose travel we need to estimate. This post will discuss the methods we use to count riders and trips, and to estimate those we can’t directly count. We will also discuss some of our future plans for improving these estimates and our reporting.

Recap: The Sources

We use different systems to collect the raw data depending on the technology available. The two main sources are Automated Passenger Counters (APCs), which are currently installed on most of the bus fleet, and the Automated Fare Collection system that counts CharlieCard taps and other payment methods on rail and bus services. APCs are also being installed on the Commuter Rail coaches and are being installed on the MBTA’s new Green, Red and Orange Line vehicles which are expected to come into service over the next few years.

For services where we do not have significant APC coverage, we use estimates based on data from the AFC system. The AFC system counts every interaction with a piece of fare equipment (for ridership purposes, these are faregates and fareboxes). We also conduct manual counts at various times and places to check against our automatically collected data, or in cases like Commuter Rail where we have limited automatic data.

Recap: The Measure

We report ridership as Unlinked Passenger Trips (UPT), which counts each boarding of each vehicle as one “unlinked” trip, even if it was part of a longer journey. While this gives additional credit to transfer trips, it is the industry standard and is required by the NTD, so we currently report ridership in this manner. We are investigating other measures of ridership and hope to be able to provide them along with UPT in the future.

How we estimate ridership from raw data

Bus: For our bus network, with a few small exceptions, we have enough APCs installed that we can use them to estimate ridership with minimal scaling and uncertainty. For each day type and route, we compare the boardings counted by APCs on trips with buses equipped with them to the total number of trips scheduled and scale the ridership up. We then scale the ridership back down to account for scheduled service that did not run. 
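A minimal sketch of this scale-up and scale-down for a single route and day type might look like the following. The function and parameter names are hypothetical, and the sketch assumes that average boardings per APC-counted trip can be applied to trips without a counter.

```python
def estimate_route_boardings(apc_boardings, apc_trips, scheduled_trips, operated_trips):
    """Estimate boardings for one route and day type.

    apc_boardings: boardings counted on trips run by APC-equipped buses
    apc_trips: number of trips with APC counts
    scheduled_trips: total trips scheduled
    operated_trips: scheduled trips that actually ran
    """
    if apc_trips == 0 or scheduled_trips == 0:
        raise ValueError("need at least one APC trip and one scheduled trip")
    boardings_per_trip = apc_boardings / apc_trips
    scaled_up = boardings_per_trip * scheduled_trips       # scale up to the full schedule
    return scaled_up * (operated_trips / scheduled_trips)  # scale down for dropped service


# 800 boardings counted on 40 trips, 50 trips scheduled, 48 operated.
example_boardings = estimate_route_boardings(800, 40, 50, 48)
```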

Rapid Transit: We currently have very limited coverage of APCs on the Rapid Transit system and need to use the AFC data to estimate ridership. We start with the raw validations (taps, ticket insertions, or cash payments) at each AFC location. From here we apply three different factors in order to estimate total ridership from the validations. These factors are explained below:

  • Non-Interaction: Non-Interaction factors account for people who entered the MBTA system without interacting with fare equipment. These are most often children, employees, people actively evading the fare or people who entered when the fare equipment was not functioning. These factors are calculated based on a sample of manual observations of people entering faregates, conducted each year.
  • Station Splits: We usually assume that every validation at a faregate at a station leads to a person boarding the line that serves that station. At stations that serve multiple lines, we do not directly know which line someone who validated there then boarded. For example, someone validating at Government Center could then board either a Green Line or Blue Line vehicle without any further interaction with fare equipment. To estimate these data, we apply a factor called a “station split” to “split” the boardings at such stations between the lines that serve each station. These factors are currently based on past surveys of passengers, but at the conclusion of this fiscal year we will update them using ODX.
  • Behind-the-Gate: As noted above, we report ridership as unlinked passenger trips, meaning every boarding of each vehicle. For trips where passengers transferred lines without passing through a faregate or an APC, we cannot directly measure their second trip, so we need to estimate it with a factor. Currently, we do this using the answers from surveys of passengers. We ask them as they are waiting for a train where they are going, and determine how many additional unlinked trips we can estimate for each boarding based on which line they boarded. For example, if our survey showed that there were 121 unlinked trips for 100 passengers surveyed, the “behind the gate” factor for that line would be 0.21, and we would multiply the count of boardings (after the other factors were applied) by 1.21 to estimate total unlinked trips. We are also updating this factor at the end of the fiscal year using the ODX algorithm.

Putting it all together

The following chart shows an example of how we calculate final ridership from raw faregate interactions, with all three factors applied. These numbers are rounded to the nearest thousand.

A chart depicting average Red Line weekday ridership, with examples of how non-interaction, station splits, and behind-the-gate activity affects our ridership estimates.

First, we sum all the interactions at all faregates at stations with Red Line service. This will over-count the riders at stations that serve multiple lines. Second, we apply the “split factors” to the total interactions at stations that serve multiple lines (there is a different factor for each station-line combination) and assign the split-off interactions to the other lines. This is represented by the -27 in the second column on the chart above. We then have a subtotal of 194,000 interactions that can be attributed to the Red Line.

Third, we apply the non-interaction factor to scale these taps to account for people who entered without interacting with the faregate. This brings our running total to 206,000.

Finally, we apply the additional trips from the other lines that could have behind-the-gate transfers to the Red Line (Green and Orange). These are counted in a similar calculation that is conducted on the interactions recorded at gates on those lines. This adds an additional 36,000 unlinked trips to our total, giving us our final ridership estimate of 242,000 average weekday UPT on the Red Line.
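The arithmetic in this walkthrough can be reproduced with a few lines of Python. Values are in thousands. The starting total of 221 and the non-interaction multiplier of 206/194 are not stated in the post; they are inferred here from the subtotals given above, so treat them as assumptions.

```python
def red_line_weekday_upt(raw_interactions, split_adjustment,
                         non_interaction_factor, behind_gate_trips):
    """Return the running totals (in thousands) after each step."""
    attributed = raw_interactions + split_adjustment  # remove other lines' share
    entries = attributed * non_interaction_factor     # add riders who did not tap
    total_upt = entries + behind_gate_trips           # add transfers from other lines
    return attributed, entries, total_upt


attributed, entries, total_upt = red_line_weekday_upt(
    raw_interactions=221,              # inferred: 194 + 27
    split_adjustment=-27,              # from the chart
    non_interaction_factor=206 / 194,  # inferred from the two subtotals
    behind_gate_trips=36,
)
print(round(attributed), round(entries), round(total_upt))
```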

Green Line Surface

The Green Line is the most extensive and complex light rail system in the country, and this complexity presents myriad data challenges, as we have detailed on the blog. For ridership reporting, the surface-running portion of the Green Line presents some unique issues that we must account for. First, there is a high level of non-interaction on the Green Line due to the operational practice of allowing passengers with passes to board at the back door. While we believe the revenue loss from this is relatively low, it does mean we have a large non-interaction factor that we use for Green Line. We continually monitor and improve this factor, and as the new Type 9 cars, equipped with APCs, come into service, we will be able to use these to better estimate non-interaction.

Second, the Green Line fareboxes are not hard-wired to the AFC central database. This means they must be manually “probed” to download their transaction data (cash payments into the fareboxes are collected through a different process). Since the AFC system was installed nearly 15 years ago, this is a much more difficult process than it might seem: data can only be probed in certain places in the train yard, and vehicles do not always have an operational reason to come to those places (by contrast, fareboxes on buses are probed much more regularly since probing is part of the nightly re-fueling process). In fact, a large portion of the data from surface AFC interactions is not downloaded to our database until weeks or sometimes months after the transactions occurred.

In order to account for this probing lag, we have developed a process to impute taps for which we do not have data yet, based on the amount of service we see that each vehicle has provided (measured by stations visited from our AVL system) and the number of taps per vehicle-stop visit that we have recorded in each month in the past.

This process consists of four steps: first, we evaluate how much AFC data is missing and likely to come in through a future probing. We conservatively estimate AFC data to be missing if a vehicle is seen to be in service during a particular date but did not record any AFC records. Next, we estimate what the missing data is likely to be based on the same month of the prior year (to account for seasonal ridership trends), in terms of taps per vehicle-stop visit we tend to see in that month. We then look at the number of stop visits that occur on the vehicles with currently missing AFC, and scale them up by this estimate. Finally, every month, as more probed data comes in, we replace the estimates with real data. 
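The imputation steps above could be sketched as follows. This is a simplified illustration, not our production process: the data layout, the single taps-per-visit rate, and the vehicle numbers in the example are all hypothetical.

```python
def impute_missing_taps(stop_visits, recorded_taps, taps_per_visit_prior_year):
    """Estimate taps per (vehicle, date), imputing where AFC data is missing.

    stop_visits: {(vehicle, date): stop visits seen in AVL}
    recorded_taps: {(vehicle, date): taps already downloaded by probing}
    taps_per_visit_prior_year: taps per vehicle-stop visit observed in the
        same month of the prior year
    """
    estimates = {}
    for key, visits in stop_visits.items():
        taps = recorded_taps.get(key, 0)
        if visits > 0 and taps == 0:
            # In service but no AFC records yet: treat the data as missing
            # and scale the stop visits by the historical rate.
            estimates[key] = visits * taps_per_visit_prior_year
        else:
            # Probed data is available, so use the real count.
            estimates[key] = taps
    return estimates


estimates = impute_missing_taps(
    stop_visits={("3801", "2019-03-04"): 200, ("3802", "2019-03-04"): 150},
    recorded_taps={("3802", "2019-03-04"): 310},
    taps_per_visit_prior_year=2.5,
)
```

Re-running the function each month with an updated `recorded_taps` replaces earlier estimates with real data as more fareboxes are probed.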

Ridership on the Dashboard

We put all of the above together into our ridership update six weeks after the end of each month. This is the earliest date we feel confident that we have enough Green Line surface data to estimate its ridership. After QA/QC, we combine the above calculated ridership with the ridership reporting we get from Commuter Rail, Ferry and the RIDE to display our average weekday ridership for each month. 

We are working on more detailed and granular ridership tools which will allow users of the Dashboard to explore our ridership data in different ways as data quality and availability improves. Look for these in a future update to the Dashboard.