Over the last fifteen years, railroads in the US, Europe and other countries have been using RFID devices on their locomotives and railcars. Typically, the scanned information is stored in traditional (i.e. mostly relational) databases. Each RFID scan identifies the railcar or locomotive number. The railcar number is then mapped to the existing railcar and train schedule. The timestamps on the scanned data also give us the sequence of cars on the train. Scans of the RFID tags on locomotives give us the number of locomotives and the total horsepower assigned to the train. They also tell us whether a locomotive is coupled at the front or at the rear of the train.
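To make this concrete, here is a minimal sketch in Python of how a consist could be derived from one train's scans at a single scanner. The record layout (`Scan`), the equipment numbers and the `HORSEPOWER` lookup table are assumptions made for illustration; they are not taken from any railroad's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Scan:
    equipment_id: str      # railcar or locomotive number read from the tag
    kind: str              # "car" or "loco" (assumed classification)
    scanned_at: datetime   # timestamp recorded by the scanner


# Hypothetical horsepower lookup keyed by locomotive number.
HORSEPOWER = {"LOCO1": 4400, "LOCO2": 4400, "LOCO3": 4000}


def summarize_consist(scans):
    """Order one train's scans by time and summarize its consist."""
    ordered = sorted(scans, key=lambda s: s.scanned_at)
    car_positions = [i for i, s in enumerate(ordered) if s.kind == "car"]
    locos = [(i, s.equipment_id) for i, s in enumerate(ordered) if s.kind == "loco"]
    # Without any cars, every locomotive counts as being at the front.
    first_car = car_positions[0] if car_positions else len(ordered)
    last_car = car_positions[-1] if car_positions else len(ordered)
    return {
        "sequence": [s.equipment_id for s in ordered],
        "loco_count": len(locos),
        "total_hp": sum(HORSEPOWER[eid] for _, eid in locos),
        # Front: scanned before the first car. Rear: scanned after the last car.
        "front_locos": [eid for i, eid in locos if i < first_car],
        "rear_locos": [eid for i, eid in locos if i > last_car],
    }


t0 = datetime(2024, 1, 1, 8, 0, 0)
scans = [  # deliberately out of order, as rows might arrive from a database
    Scan("CAR2", "car", t0 + timedelta(seconds=12)),
    Scan("LOCO1", "loco", t0),
    Scan("CAR1", "car", t0 + timedelta(seconds=8)),
    Scan("LOCO2", "loco", t0 + timedelta(seconds=4)),
    Scan("LOCO3", "loco", t0 + timedelta(seconds=16)),
]
summary = summarize_consist(scans)
print(summary)
```

Sorting by timestamp is the only step that the scan data itself dictates; how locomotives are classified as front or rear would depend on the railroad's own conventions.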
The scanned data requires cleansing. Often, the reading from a railcar's RFID tag is missing at a particular scanner. In this case, the missing arrival time is estimated from the readings at the scanners before and after the problematic one.
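One simple way to do this estimation is linear interpolation by distance, which assumes the train runs at roughly constant speed between the neighboring scanners. The sketch below is in Python; the function name `fill_missing_times`, the milepost values and the choice of linear interpolation are our assumptions for illustration, and a railroad may well use a more refined rule.

```python
from datetime import datetime


def fill_missing_times(mileposts, times):
    """Fill interior gaps in `times` by linear interpolation over distance.

    mileposts: scanner positions along the route, in route order.
    times: datetime of the reading at each scanner, or None if missing.
    Gaps at the very start or end cannot be interpolated and stay None.
    """
    filled = list(times)
    for i, t in enumerate(times):
        if t is not None:
            continue
        # Nearest scanners before and after position i that have a reading.
        prev = next((j for j in range(i - 1, -1, -1) if times[j] is not None), None)
        nxt = next((j for j in range(i + 1, len(times)) if times[j] is not None), None)
        if prev is None or nxt is None:
            continue
        # Share of the distance between the two known scanners already covered.
        frac = (mileposts[i] - mileposts[prev]) / (mileposts[nxt] - mileposts[prev])
        filled[i] = times[prev] + (times[nxt] - times[prev]) * frac
    return filled


mileposts = [0.0, 15.0, 30.0]  # hypothetical scanner locations
readings = [datetime(2024, 1, 1, 8, 0), None, datetime(2024, 1, 1, 9, 0)]
filled = fill_missing_times(mileposts, readings)
print(filled[1])
```

In this example the middle scanner sits halfway between its neighbors, so the estimated arrival time falls halfway between the two known readings.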
Major railroads have also defined their territory using links, where a link is the directional connection between two nodes. Railroads have placed RFID scanners on major links.
RFID scans give information on railcar sequence in the train, locomotive consist, and track in real time. Railroads store this real-time and historical data for analysis.
Figure 1: Use cases of Rail Time Series Data
Figure 1 above shows use cases of time series data in the railroad industry. We believe that all of these use cases are applicable to freight railroads. They can also be applied to passenger railroads with some changes. They involve the use of analytics and RFID.
Uses of Real-Time Time Series Data
Here are some ways that time series data is, or can be, used in railroads in real time.
- Dispatching: Scanner data has been used for dispatching decisions for many years. It is used to display the latest location of trains. Dispatchers use this information, together with track type, train type and timetable information, to determine the priority that should be assigned to various trains.
- Information for Passengers: Passengers can use train arrival and departure estimates for planning their journey.
Uses of Historical Time Series Data
Here are some ways that historical time series data is, or can be, used in railroads.
- Schedule Adherence – Identify trains that are consistently delayed: We can identify trains that are on schedule, delayed or early. We can also identify trains that consistently occupy tracks for longer than the schedule permits. These trains should be considered for a schedule change and are candidates for root cause analysis.
- Better Planning: We would be able to determine whether planned ‘sectional running times’ are accurate or need to be checked. Sectional running times are generally determined from experience and are network-level estimates that do not consider local infrastructure (signals, track type). Sectional running time is used in developing train schedules and maintenance schedules at the network and local level.
- Infrastructure Improvement – Track Utilization: We can identify the sections of track where trains have the highest occupancy. This would lead us to tracks that are being operated near or above track capacity. The assumption here is that utilization above track capacity results in delays. We can identify the sets of trains, tracks, times of day and days of the week for which occupancy is high or low. This would give us insight into train movement and perhaps suggest train schedule changes. We might be able to determine whether trains are held up at stations/yards or on the mainline. An in-depth and careful analysis can help us determine whether attention needs to be paid to yard operations or mainline operations.
- Simulation Studies: RFID scan data gives us the actual time of arrival and departure for every car (and hence every train). Modelers create hypothetical trains to feed into simulation studies. This information (actual train arrival/departure time at every scanner, train consist, locomotive consist) is used in infrastructure expansion projects.
- Maintenance Planning: Historical track occupancy would enable us to identify time windows in which future maintenance should be scheduled. Railroads use inspection cars to inspect and record track condition regularly. Some railroads face the challenge of getting the right geo-coordinates for a segment of track. Careful analysis of this geo and time series data can measure track health and deterioration. Satellite imagery is becoming available more frequently. A combination of these two sources can help to inspect tracks, schedule maintenance, predict track failures, and move maintenance gangs.
- Statistical Analysis of Railroad Behavior:
  - We can map train behavior to the train definition (train type, schedule, train speed, train length) and the track definition (signal type, track class, grade, curve, authority type) and identify patterns.
  - Passenger trains do affect the operations of freight trains. Scanner data can be used to determine the delay imposed on freight trains.
  - Time series information on railcars can be used to identify misrouted or lost cars.
  - Locomotive consist information and time-series-based performance data can be used together to determine the historically best locomotive consist (such as make and horsepower) for every track segment.
  - A locomotive is a costly asset for any railroad. Time series data can easily be used to determine locomotive utilization.
- Demand Forecasting: Demand for empty railroad cars is known as an indicator of a country’s economy. While demand for railroad cars varies with car type and macro-economic factors, it is worth the effort to gain insights from the historical perspective. The number of cars by car type can be estimated and forecast for every major origin-destination pair. The number of train starts and train ends at every origin and destination can be used to forecast the number of trains for a future month. The forecast number of trains would help a railroad determine the number of crews and locomotives it needs. It would also help the railroad determine the load that tracks would carry. The forecast number of trains can also be used in infrastructure studies.
- Safety: Safety is the most important feature of railroad culture. Track maintenance and track wear and tear (track utilization) are all related to safety. Time series data on railcars, together with signal type, track type, train type, accident type and train schedule, can be analyzed to identify potential relationships (if any) between the relevant factors.
- Train Performance Calculations: What is the unopposed running speed on a track with a given grade, curve, locomotive consist, car type, wind direction and speed? These factors were determined by Davis in 1926. Could time series data help us calibrate the coefficients of Davis’s equation for railcars of newer design?
- Planning and Optimization: All findings above can be used to develop smarter optimization models for train schedule, maintenance planning, locomotive planning, crew scheduling, and railcar assignment.
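Several of the historical use cases above start from the same basic step: comparing actual scanner times against the schedule. As a minimal sketch of the schedule adherence case, the Python snippet below classifies each run of a train as early, on schedule or delayed, and flags trains that are delayed in most of their runs. The five-minute tolerance, the "more than half of the runs" rule, the train IDs and the sample data are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

TOLERANCE = timedelta(minutes=5)  # assumed on-time window


def classify(scheduled, actual, tolerance=TOLERANCE):
    """Label one arrival as 'early', 'on schedule' or 'delayed'."""
    deviation = actual - scheduled
    if deviation > tolerance:
        return "delayed"
    if deviation < -tolerance:
        return "early"
    return "on schedule"


def consistently_delayed(runs, min_share=0.5):
    """Return train IDs delayed in more than `min_share` of their runs.

    runs: iterable of (train_id, scheduled_arrival, actual_arrival).
    """
    counts = defaultdict(lambda: [0, 0])  # train_id -> [delayed runs, all runs]
    for train_id, scheduled, actual in runs:
        counts[train_id][1] += 1
        if classify(scheduled, actual) == "delayed":
            counts[train_id][0] += 1
    return sorted(t for t, (late, total) in counts.items() if late / total > min_share)


def run(train_id, day, deviation_minutes):
    """Build one hypothetical run with a 10:00 scheduled arrival."""
    scheduled = datetime(2024, 1, day, 10, 0)
    return (train_id, scheduled, scheduled + timedelta(minutes=deviation_minutes))


runs = [
    run("Q101", 1, 20), run("Q101", 2, 12), run("Q101", 3, 2),
    run("M202", 1, -10), run("M202", 2, 0), run("M202", 3, 7),
]
flagged = consistently_delayed(runs)
print(flagged)
```

Here train Q101 is delayed in two of its three runs and is flagged, while M202 is delayed only once. The same per-run deviations could feed the sectional running time and track utilization analyses, with the thresholds tuned to each railroad's own definition of "on schedule".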
In this article, we have highlighted some use cases of time series data for railroads. There are many more factors that could be considered, especially in the use of technology for implementing these time series algorithms. In subsequent sections, we will show how some of these use cases could be implemented in the R programming language.
- Davis, W. J., Jr.: The tractive resistance of electric locomotives and cars, General Electric Review, vol. 29, October 1926.