The challenge
Our client needed to examine their product test data thoroughly to detect faults arising in production. Data analysts monitored several parameters, types, and combinations of the two to identify faults in the equipment being sold. With 9200 device types and characteristics measured, the sheer volume of data made manual processing challenging. The client’s primary analysis tool displayed the measurement data over time and determined whether a product needed to be repaired or could be sent to packaging. However, the resulting charts were hard to work with, and analysts could only examine a fraction of the data.
Nothing indicated which data points pointed to faults in the system, which lengthened the analysis further. In addition, the manufacturer’s existing architecture consisted of a SQL database on an Azure server and on-prem solutions for analytics. It could not scale, collect data from various sources quickly, or offer convenient access for the analytics team, which made it hard to change the approach to analysis. Consequently, our client approached VirtusLab for help improving their semi-manual data inspection process.
The solution
Based on the experience of the client’s data analysis team and the database’s current location, VirtusLab suggested designing and implementing a cloud solution on Azure with Power BI extensions. At the core of the solution is a machine learning model that identifies issues in the measurement data, so that analysts no longer need to review everything manually. For the initial version of the model, VirtusLab opted for a statistical drift detection model.
The system learns from feedback provided by data analysts and adjusts future alerts accordingly. VirtusLab developed a visualization platform using Power BI, which presents time series data for a specific device type and highlights the points where drift has been detected. A data analyst can then decide whether the drift signifies a problem, and that feedback is automatically sent to the Azure platform to improve the model’s decision-making.
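The case study does not say which statistical test the drift detection model relies on. As a minimal sketch only, assuming a simple mean-shift test that compares a sliding window against a reference period, drift detection on one measurement series could look like the following. The function name, parameters, and sample values are all hypothetical and not taken from the client's system.

```python
from statistics import mean, stdev


def detect_drift(values, reference_size=20, window_size=5, z_threshold=3.0):
    """Return indices where the recent window mean departs from the reference mean.

    Assumption: the first `reference_size` points represent normal behaviour.
    """
    reference = values[:reference_size]
    ref_mean = mean(reference)
    # Guard against a perfectly flat reference period (zero spread).
    ref_std = stdev(reference) or 1e-9
    # Standard error of the mean of a window of `window_size` points.
    std_err = ref_std / window_size ** 0.5

    drift_points = []
    for end in range(reference_size + window_size, len(values) + 1):
        window = values[end - window_size:end]
        z_score = abs(mean(window) - ref_mean) / std_err
        if z_score > z_threshold:
            # Report the index of the newest point in the drifting window.
            drift_points.append(end - 1)
    return drift_points


# Hypothetical series: 30 stable readings, then a sustained shift upwards.
series = [10.0 if i % 2 == 0 else 10.2 for i in range(30)] + [12.0] * 10
drift_points = detect_drift(series)
print(drift_points)
```

In this sketch the flagged indices are the kind of points a dashboard would highlight for an analyst to confirm or reject.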
The human feedback drift detection process:
- Ingests data from the client database into Azure Storage using the Data Factory component.
- Trains models based on historical data and collected feedback from data analysts.
- Registers a new model.
- Creates a set of drift detection predictions for new data.
- Passes data to Power BI for visualization.
- Collects feedback from data analysts.
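The steps above can be sketched as one loop. This is an illustration under stated assumptions rather than the delivered implementation: the Azure components and Power BI are replaced by in-memory stand-ins, the model is a plain mean and standard deviation baseline, and every class and function name is hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean, stdev


@dataclass
class Model:
    version: int
    baseline_mean: float
    baseline_std: float


@dataclass
class Platform:
    """In-memory stand-in for storage, the model registry, and the feedback store."""
    storage: list = field(default_factory=list)   # ingested measurements
    registry: list = field(default_factory=list)  # registered models, newest last
    feedback: dict = field(default_factory=dict)  # index -> True if a real problem


def ingest(platform, measurements):
    """Step 1: copy measurements from the client database into storage."""
    platform.storage.extend(measurements)


def train(platform):
    """Step 2: fit a baseline on history, skipping points analysts confirmed as faulty."""
    history = [v for i, v in enumerate(platform.storage)
               if not platform.feedback.get(i, False)]
    return Model(len(platform.registry) + 1, mean(history), stdev(history))


def register(platform, model):
    """Step 3: store the new model so predictions use the latest version."""
    platform.registry.append(model)


def predict(platform, start, z_threshold=3.0):
    """Step 4: flag new measurements (index >= start) far from the baseline."""
    model = platform.registry[-1]
    limit = z_threshold * model.baseline_std
    return [i for i in range(start, len(platform.storage))
            if abs(platform.storage[i] - model.baseline_mean) > limit]


def visualize(platform, flagged):
    """Step 5: build the rows a dashboard could plot, with drift points marked."""
    return [{"index": i, "value": v, "drift": i in flagged}
            for i, v in enumerate(platform.storage)]


def collect_feedback(platform, verdicts):
    """Step 6: record analyst verdicts for use in the next training run."""
    platform.feedback.update(verdicts)


platform = Platform()
ingest(platform, [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 10.2])
register(platform, train(platform))
ingest(platform, [10.1, 13.5, 10.0])          # new data arrives
flagged = predict(platform, start=8)
rows = visualize(platform, flagged)
collect_feedback(platform, {9: True})         # analyst confirms a real problem
register(platform, train(platform))           # next cycle uses the feedback
```

Keeping each step as a separate function mirrors the list: any stage could be swapped for a managed service without changing the order of the loop.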
The results
The platform is scalable and adaptable to future data sources and features. It was widely adopted because it is built on Microsoft infrastructure components that were already familiar to the client. Our client also gained:
- Time savings: The model significantly reduces analysis time by suggesting the data that needs to be reviewed, eliminating the need for data analysts to go through every device type/characteristic combination.
- Constant improvement: The model keeps improving by learning which drift detections are worth alerting on, so the system becomes more valuable over time as it gathers new feedback.
- Ease of use: All data can be reviewed and compared in one place, with useful filtering options and data display modes.
- Low maintenance and cost: All machine learning processes are configured to operate without the need for continuous monitoring or manual intervention, enabling data analysts to focus on crucial tasks while benefiting from automated feedback.
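How the model learns which drift detections are worth an alert is not described in detail. One possible reading, sketched here purely as an assumption, is to track how often analysts confirmed past alerts for each device type and characteristic, and to stop alerting on combinations that are almost always rejected. The class, thresholds, and device names below are hypothetical.

```python
from collections import defaultdict


class AlertFilter:
    """Tracks, per (device_type, characteristic), how often alerts were confirmed."""

    def __init__(self, min_precision=0.2, min_reviews=5):
        self.min_precision = min_precision  # minimum share of confirmed alerts
        self.min_reviews = min_reviews      # reviews needed before suppressing
        self.confirmed = defaultdict(int)
        self.reviewed = defaultdict(int)

    def record_feedback(self, key, is_real_problem):
        """Store one analyst verdict for a drift alert."""
        self.reviewed[key] += 1
        if is_real_problem:
            self.confirmed[key] += 1

    def should_alert(self, key):
        """Alert unless enough reviews show the combination is rarely a real problem."""
        reviews = self.reviewed.get(key, 0)
        if reviews < self.min_reviews:
            return True  # too little evidence, keep alerting
        return self.confirmed.get(key, 0) / reviews >= self.min_precision


alerts = AlertFilter()
noisy = ("pump-A", "pressure")       # hypothetical combination, always rejected
useful = ("pump-B", "temperature")   # hypothetical combination, often confirmed

for _ in range(5):
    alerts.record_feedback(noisy, is_real_problem=False)
for verdict in (True, False, True, True, False):
    alerts.record_feedback(useful, is_real_problem=verdict)

print(alerts.should_alert(noisy), alerts.should_alert(useful))
```

Requiring a minimum number of reviews before suppressing anything keeps new or rarely seen combinations visible until analysts have had a chance to judge them.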
The tech stack
Languages and frameworks: Python, TensorFlow
Data visualization: Power BI
Infrastructure: Azure Data Factory, Azure Storage, Azure Machine Learning