The challenge
Choosing suitable machine learning methods was crucial to managing big data and meeting business requirements. The existing batch learning approach worked well for data that arrived at regular intervals. In our client’s situation, however, data was abundant and arrived with increasing frequency, so scalability and volume became critical factors.
Therefore, they required a solution with greater elasticity, and a hybrid model proved to be the best fit. At this point, our client reached out to VirtusLab.
The solution
VirtusLab provided the client with a manageable setup for data processing, utilizing Azure Machine Learning service, ETL pipelines, and feature engineering. The hybrid cloud and batch offline processing strategy enabled our client to train ML models on data from 5 million customers, with real-time adaptability to customer behavior.
Additionally, the incorporated feedback loop allowed for continuous learning, supporting the client’s sales optimization strategy by making real-time predictions based on retained and new customer behavior data.
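To make the hybrid idea concrete, here is a minimal sketch of batch pre-training followed by per-event updates from a feedback loop. It is not the client's implementation: the `OnlineLogisticModel` class, the `make_events` helper, and the synthetic data are all hypothetical, and a small standard-library logistic regression stands in for the Azure Machine Learning, TensorFlow, and scikit-learn models named in the tech stack.

```python
import math
import random


class OnlineLogisticModel:
    """Hypothetical stand-in model: logistic regression trained with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        """Online phase: one SGD step on a single labelled event."""
        error = self.predict_proba(features) - label
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error

    def fit_batch(self, rows, labels, epochs=20):
        """Offline phase: several passes over the historical data."""
        for _ in range(epochs):
            for features, label in zip(rows, labels):
                self.update(features, label)


def make_events(rng, n, flip=False):
    """Synthetic events (an assumption for illustration only).

    The label is 1 when the first feature exceeds the second;
    flip=True inverts the labels to simulate a shift in behavior.
    """
    rows, labels = [], []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] > x[1] else 0
        rows.append(x)
        labels.append(1 - y if flip else y)
    return rows, labels


def accuracy(model, rows, labels):
    hits = sum((model.predict_proba(x) >= 0.5) == bool(y)
               for x, y in zip(rows, labels))
    return hits / len(rows)


rng = random.Random(0)
model = OnlineLogisticModel(n_features=2)

# 1. Batch phase: train offline on retained (historical) data.
history_x, history_y = make_events(rng, 500)
model.fit_batch(history_x, history_y)
batch_accuracy = accuracy(model, history_x, history_y)

# 2. Streaming phase: behavior has shifted, so the batch model is stale.
stream_x, stream_y = make_events(rng, 2000, flip=True)
accuracy_before = accuracy(model, stream_x, stream_y)

# 3. Feedback loop: serve a prediction, then learn from the observed outcome.
for features, label in zip(stream_x, stream_y):
    model.predict_proba(features)  # prediction that would be served
    model.update(features, label)  # incremental update from feedback

# Measured on the same stream the model just learned from.
accuracy_after = accuracy(model, stream_x, stream_y)
print(batch_accuracy, accuracy_before, accuracy_after)
```

In this sketch the batch phase supplies a strong starting model, while the per-event `update` calls let it follow a change in behavior without a full retrain. A production setup would typically consume events from a stream such as Kafka rather than an in-memory list.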
The results
The new hybrid platform provided the retailer with numerous benefits, including:
- Overcoming limited computing resources and the shortcomings of an unscalable on-prem solution.
- Processing several hundred terabytes of client data.
- Creating a large dataset for model training.
- Accumulating real-time data for faster, cost-efficient learning.
- Supporting online sales with advanced real-time website customization.
- Adapting quickly to changes in customer behavior.
- Allowing the data team to adopt the approach for other projects, with manageable infrastructure from day one.
The tech stack
Languages: Python, Java
Databases: Hive
Eventing platform: Kafka, Flink
Infrastructure: Azure, Azure DevOps, Terraform, Terragrunt, Chef, GitHub Actions
Modeling: TensorFlow, scikit-learn, Azure Machine Learning