It’s always fascinating to observe outliers and understand them. Outliers are the data points that don’t fit well with the rest of the data population. Ironically, although such points are called ‘outliers’, they can be found in almost every dataset you get your hands on. Identifying outliers should be one of the first things that anyone trying to understand or interpret the data does. I would argue that it is a sin to draw inferences from data without understanding the outliers in that dataset.


Detection of Outliers in Time Series Data from Control Systems

Outliers are observations that do not fit the tendency of the observed time series: they differ dramatically from the typical pattern of the trend and/or seasonal components.

Time series data often undergo sudden changes that alter the dynamics of the data. These changes are typically non-systematic and cannot be captured by standard time series models, which is why they are known as outlier effects. Detecting outliers is important because they affect the selection of the model, the estimation of parameters and, consequently, the forecasts. Hence, the approach described in Chen & Liu (1993), published in the Journal of the American Statistical Association, was followed: an automatic procedure for detecting outliers in time series, which is implemented in the package tsoutliers. The function tso is the main interface for the automatic procedure [1].
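To make the idea of flagging points that depart from the typical pattern concrete, here is a minimal Python sketch. It is not the Chen & Liu procedure and not the tsoutliers package; it is a much simpler stand-in that compares each point with a centered rolling median and scores the deviation using the median absolute deviation (MAD). The function name `flag_outliers`, the window size and the threshold are all assumptions chosen for illustration.

```python
from statistics import median


def flag_outliers(series, window=7, threshold=3.5):
    """Return indices of points far from a centered rolling median.

    Illustrative only: the score is the robust z-score
    0.6745 * |x - median| / MAD computed over the local window.
    """
    half = window // 2
    flagged = []
    for i, x in enumerate(series):
        # Window is truncated at both ends of the series.
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        neighbours = series[lo:hi]
        med = median(neighbours)
        mad = median(abs(v - med) for v in neighbours)
        if mad == 0:
            # Locally constant data: the score is undefined, so skip the point.
            continue
        if 0.6745 * abs(x - med) / mad > threshold:
            flagged.append(i)
    return flagged


readings = [10, 11, 10, 12, 11, 50, 10, 11, 12, 10, 11]
print(flag_outliers(readings))
```

A sketch like this only catches isolated spikes. Level shifts and temporary changes, which alter the series for more than one observation, need a model-based procedure such as the one cited above.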


Over the last fifteen years, railroads in the US, Europe and other countries have been using RFID devices on their locomotives and railcars. Typically, this information is stored in traditional (i.e. mostly relational) databases. The RFID scanner provides the railcar number and the locomotive number. The railcar number is then mapped to the existing railcar and train schedule. The timestamps on the scanned data also give the sequence of cars on the train. Scanning the RFID tags on locomotives gives the number of locomotives and the total horsepower assigned to the train. It also indicates whether a locomotive is coupled at the front or at the rear of the train.
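The derivations described above can be sketched in a few lines of Python. Everything here is hypothetical: the `Scan` record, its field names, and the rule that a locomotive scanned before the first railcar is at the front and one scanned after the last railcar is at the rear are assumptions made for illustration, not a description of any railroad's actual data model. The horsepower value is assumed to have been looked up from the locomotive number beforehand.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Scan:
    tag_id: str            # railcar or locomotive number read from the tag
    kind: str              # "railcar" or "locomotive" (assumed classification)
    scanned_at: datetime   # timestamp recorded by the scanner
    horsepower: int = 0    # assumed to be looked up for locomotives


def summarize_consist(scans):
    """Derive car sequence, locomotive count, horsepower and positions."""
    # Scan order by timestamp stands in for physical order on the train.
    ordered = sorted(scans, key=lambda s: s.scanned_at)
    car_times = [s.scanned_at for s in ordered if s.kind == "railcar"]
    locos = [s for s in ordered if s.kind == "locomotive"]

    positions = {}
    for loco in locos:
        if not car_times or loco.scanned_at < car_times[0]:
            positions[loco.tag_id] = "front"
        elif loco.scanned_at > car_times[-1]:
            positions[loco.tag_id] = "rear"
        else:
            positions[loco.tag_id] = "mid-train"

    return {
        "car_sequence": [s.tag_id for s in ordered if s.kind == "railcar"],
        "locomotive_count": len(locos),
        "total_horsepower": sum(s.horsepower for s in locos),
        "locomotive_positions": positions,
    }
```

In a relational database the same summary would typically be a join between the scan table and the equipment table, ordered by scan timestamp.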


Predictive maintenance might sound like a buzzword, but I have seen it in action. I can take credit for implementing it and making it useful for end users. Predicting unplanned failures can be beneficial to two user groups.
• Determine warranty cost
• Determine root cause of unplanned failures or inferior output
Warranty cost has long been a research subject in the insurance industry. A manufacturer will use predictive maintenance to calculate the right warranty cost to offer the customer when it provides warranty, extended maintenance and service contracts. The terms and conditions of a warranty and an extended warranty are designed based on insights from warranty models. …
