Data Center Anomaly Detection using Machine Learning
A team of 50 AI engineers applied machine learning to detect the anomalies that lead to data center outages and to support the Root Cause Analysis (RCA) process.
By partnering with Omdena, Israel-based startup Sensai aims to create a robust process that automates the analysis of data center metrics for rapid anomaly detection (AD) and RCA, integrates smoothly into their current data management system, and contributes to their end goal of creating self-healing capabilities for data centers.
In data centers, once an anomaly is detected, Site Reliability Engineers (SREs) need to identify and understand its root cause as quickly as possible by analyzing the sequence of events that led to it. In most cases, identifying the root cause is done manually and takes significant time and effort.
Sensai’s target is to reduce the human effort required for root cause analysis by automating the whole process of identification and RCA, using state-of-the-art machine learning and causal inference mechanisms.
Causal inference is gaining considerable momentum. In 2021, the Nobel prize in economics was awarded in part to causal inference researchers “for their methodological contributions to the analysis of causal relationships.”
The project outcomes
The team built various ML models and applied data science techniques to develop unsupervised solutions that identify and analyze the “most likely” events leading to anomalies. The algorithms detect anomalies in real time in multivariate time series metrics data collected from different machines and applications in hybrid data centers.
The solution uses machine learning algorithms to (i) detect anomalies and (ii) analyze and identify the root cause, that is, the “most likely” event leading to the detected anomaly.
The dataset consists of multivariate time series data collected from different machines and applications in hybrid data centers.
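To make the two steps concrete, here is a minimal sketch of unsupervised detection followed by a simple root cause ranking. It is not the team's actual models, which are not described here. It assumes a rolling z-score as the detector and a "first metric to deviate" heuristic as a stand-in for causal analysis; the metric names, window size, threshold, and toy values are all hypothetical.

```python
from statistics import mean, stdev


def rolling_zscores(series, window):
    """Score each point against the mean/stdev of the preceding `window` points."""
    scores = [0.0] * len(series)
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # A flat history has zero variance; treat the point as normal.
        scores[i] = (series[i] - mu) / sigma if sigma > 0 else 0.0
    return scores


def rank_root_cause_candidates(metrics, window=5, threshold=3.0):
    """Return (metric, first_breach_step, z_score) tuples, earliest breach first."""
    candidates = []
    for name, series in metrics.items():
        scores = rolling_zscores(series, window)
        for step, z in enumerate(scores):
            if abs(z) > threshold:
                candidates.append((name, step, z))
                break  # keep only the first breach per metric
    # Heuristic (an assumption): earlier deviation ranks higher,
    # ties go to the larger absolute z-score.
    return sorted(candidates, key=lambda c: (c[1], -abs(c[2])))


# Hypothetical toy metrics: disk latency degrades first, CPU follows, memory stays normal.
metrics = {
    "cpu_percent": [50, 51, 49, 50, 52, 50, 51, 50, 49, 51, 50, 90, 95, 96],
    "disk_latency_ms": [5, 5.1, 4.9, 5, 5.2, 5, 5.1, 4.9, 5, 20, 22, 25, 24, 26],
    "memory_percent": [70, 71, 70, 69, 70, 71, 70, 70, 69, 70, 71, 70, 69, 70],
}

ranked = rank_root_cause_candidates(metrics)
for name, step, z in ranked:
    print(f"{name}: first deviation at step {step} (z = {z:.1f})")
```

With this toy data, disk latency deviates first (step 9) and CPU follows (step 11), so disk latency is ranked as the leading candidate, while memory is never flagged. Temporal precedence is only a rough proxy for causation; a real pipeline would likely replace the ranking step with a proper causal inference method.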
This challenge has been hosted with our friends at
"Great, smooth, and an amazing product manager to keep the team of AI engineers on track"