

Big data root cause analysis of rare events

The task

Fluctuating torque loads can lead to vibrations and jerks that affect passenger comfort and vehicle performance. As a rule, test drivers notice these events during gear changes and log them manually during their driving routines. In this use case, the customer was looking for methods to detect these events objectively and to gain statistical insights into their possible causes. The knowledge gained should support the development process and design decisions.


The challenge

The main challenge of this analysis lies in the rarity of the event, and thus in the amount of data required to collect a statistically meaningful number of "jerk" samples. To identify possible causes for the jerks, numerous recorded signals and their derived characteristics must be considered. This leads to a complex and computationally intensive analysis. Since the data is generated continuously, the analysis must also be carried out repeatedly, ideally automatically.


Our solution

The data was automatically converted from the individual measurement files into a big data format to ensure that the analysis scales. Instead of relying on subjective manual logging by the test driver, we used a big data search to find candidate "jerks" in the recorded data. The time intervals found were then classified using a detection algorithm based on spectral analysis, machine learning and deep learning.
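As a rough illustration of the spectral part of such a search, the following sketch flags time windows in which a signal carries unusually high power in a low-frequency band. It is not the detection algorithm used in the project: the function names, the 2 to 8 Hz band, the window length, the threshold factor and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np


def band_power(window, fs, f_lo, f_hi):
    """Mean spectral power of `window` between f_lo and f_hi (in Hz)."""
    window = window - window.mean()                      # remove the DC offset
    tapered = window * np.hanning(len(window))           # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(tapered)) ** 2
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return spectrum[mask].mean()


def find_jerk_candidates(signal, fs, win_s=1.0, f_lo=2.0, f_hi=8.0, factor=10.0):
    """Return (start_s, end_s) for windows whose band power exceeds
    `factor` times the median band power of all windows.

    All default values are hypothetical and would need tuning on real data.
    """
    n = int(win_s * fs)
    starts = range(0, len(signal) - n + 1, n)            # non-overlapping windows
    powers = np.array([band_power(signal[s:s + n], fs, f_lo, f_hi) for s in starts])
    threshold = factor * np.median(powers)
    return [(s / fs, (s + n) / fs) for s, p in zip(starts, powers) if p > threshold]


# Synthetic example: 20 s of low-level noise with a 5 Hz oscillation from 12 s to 13 s.
fs = 100
t = np.arange(20 * fs) / fs
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(t.size)
signal[1200:1300] += np.sin(2 * np.pi * 5 * t[1200:1300])

intervals = find_jerk_candidates(signal, fs)
print(intervals)
```

In a real pipeline the candidate intervals from such a search would only be a first filter; the case study describes a subsequent classification step based on machine learning and deep learning.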

In a further step, we searched for possible causes of the jerks by determining which signals show abnormal behavior during the classified events. We developed an app that identifies the signals and signal characteristics with the strongest statistical correlation to the event. The complete analysis results thus contain the time intervals of the detected jerks, combined with the identified correlated signals.
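The idea of ranking signals by their correlation with the event can be sketched as follows. This is a minimal example that assumes one feature value per time interval and a 0/1 event flag per interval, and that uses the Pearson correlation between the two (equivalent to a point-biserial correlation). The feature names and data are hypothetical, and the statistics used in the actual app are not described in the source.

```python
import numpy as np


def rank_signals_by_correlation(features, event_flags):
    """Rank features by the absolute Pearson correlation with a 0/1 event flag.

    features:    dict mapping a feature name to one value per time interval
    event_flags: sequence of 0/1 values, 1 where a jerk was detected
    """
    flags = np.asarray(event_flags, dtype=float)
    scores = {}
    for name, values in features.items():
        values = np.asarray(values, dtype=float)
        if values.std() == 0 or flags.std() == 0:
            scores[name] = 0.0                           # correlation is undefined
        else:
            scores[name] = float(np.corrcoef(values, flags)[0, 1])
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Synthetic example: 200 intervals, 10 of them flagged as events.
rng = np.random.default_rng(1)
flags = np.zeros(200)
flags[::20] = 1
features = {
    # hypothetical feature that rises during events
    "clutch_pressure_std": rng.standard_normal(200) + 5.0 * flags,
    # hypothetical feature that is unrelated to the events
    "oil_temperature_mean": rng.standard_normal(200),
}

ranking = rank_signals_by_correlation(features, flags)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

Because the events are rare, a correlation computed this way rests on few positive samples, which is one reason a large data set is needed before such a ranking becomes meaningful.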


The customer benefit

The solution enables engineers to run complex, scalable analyses on distributed data sets as a self-service. Because it is implemented as an app with a graphical user interface, big data analyses can be carried out even by users who are not familiar with the details of the analysis algorithm. As a result, fewer test drives are required, since relevant events can now be found in an already existing data set. This reduces costs considerably and increases overall productivity.



Our role

  • Support of the customer by data scientists and data engineers

Our activities

  • Development of a library for pattern-based extraction of defined events

  • Use of advanced analytics, machine learning and deep learning to tag anomalous events

  • Export of relevant data sections and provision for existing visualization tools

Technologies & methods

  • Applications: DaSense, Tableau

  • Databases / data formats: Hive, HBase, MF4, Parquet

  • Languages / Frameworks: Python (Anaconda Stack), Hadoop, Spark

  • Methods: time series analysis, spectral analysis, correlations, machine learning, deep learning
