The challenge
Due to time constraints, a large storage volume, and the need to limit vendor lock-in, the logistics company ran its operations on CosmosDB with the MongoDB API. Although CosmosDB worked well initially, the client ran into a steadily growing number of issues as the data volume increased: unexpected costs, timeouts on full-scan queries, and even a risk of data corruption due to the schema-less nature of the database. The forwarder commissioned VirtusLab to create a more efficient storage solution.
The solution
VirtusLab suggested switching from CosmosDB to Microsoft Azure Storage and adopting the Apache Avro format. This change helped solve the storage challenges. Using Apache Avro in a way similar to how Kafka manages data gave the client controlled evolution and compatibility testing of data through schemas, which are templates defining the structure, format, and data types of stored and transmitted information. Instead of storing and sending the schema with every record, they stored and sent only the values, reducing storage needs and expenses.
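To illustrate why sending only values saves space, and how a schema with defaults keeps old data readable, here is a minimal sketch in plain Python. It is not the client's implementation: the `Shipment` schema, its field names, and the helper functions are hypothetical, and only three Avro primitive types (`long`, `string`, `double`) are handled. The byte layout follows the Avro binary encoding for those types, but a real project would use an Avro library and a schema registry rather than hand-written code.

```python
import json
import struct

# Hypothetical schema for illustration; the field names are not from the case study.
SHIPMENT_V1 = {
    "type": "record",
    "name": "Shipment",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "destination", "type": "string"},
        {"name": "weight_kg", "type": "double"},
    ],
}

# Version 2 adds a field with a default, so data written with V1 stays readable.
SHIPMENT_V2 = {
    "type": "record",
    "name": "Shipment",
    "fields": SHIPMENT_V1["fields"]
    + [{"name": "priority", "type": "string", "default": "standard"}],
}


def encode_long(n):
    """Avro long: zigzag encoding, then a variable-length base-128 integer."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Inverse of encode_long; returns (value, next position)."""
    shift, z = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_record(schema, record):
    """Write field values in schema order; no field names or types are stored."""
    out = bytearray()
    for field in schema["fields"]:
        value, kind = record[field["name"]], field["type"]
        if kind == "long":
            out += encode_long(value)
        elif kind == "string":
            raw = value.encode("utf-8")
            out += encode_long(len(raw)) + raw
        elif kind == "double":
            out += struct.pack("<d", value)  # 8 bytes, little-endian
        else:
            raise ValueError(f"unsupported type in this sketch: {kind}")
    return bytes(out)


def decode_record(writer_schema, reader_schema, buf):
    """Decode with the writer's schema, then resolve against the reader's schema."""
    pos, written = 0, {}
    for field in writer_schema["fields"]:
        kind = field["type"]
        if kind == "long":
            value, pos = decode_long(buf, pos)
        elif kind == "string":
            size, pos = decode_long(buf, pos)
            value = buf[pos:pos + size].decode("utf-8")
            pos += size
        elif kind == "double":
            (value,) = struct.unpack_from("<d", buf, pos)
            pos += 8
        else:
            raise ValueError(f"unsupported type in this sketch: {kind}")
        written[field["name"]] = value
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in written:
            result[name] = written[name]
        elif "default" in field:
            result[name] = field["default"]  # field added later: use its default
        else:
            raise ValueError(f"incompatible schemas: no value or default for {name!r}")
    return result


record = {"id": 42, "destination": "Hamburg", "weight_kg": 12.5}
payload = encode_record(SHIPMENT_V1, record)
as_json = json.dumps(record).encode("utf-8")
decoded = decode_record(SHIPMENT_V1, SHIPMENT_V2, payload)

print(len(payload), "bytes as schema-encoded values")
print(len(as_json), "bytes as self-describing JSON")
print(decoded)
```

The JSON document repeats every field name in every record, while the schema-encoded payload holds only the values and relies on a schema that is stored once. The final call reads data written with the old schema through the new one, which is the kind of compatibility a schema check can verify before a change is rolled out.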
The results
Our client rid themselves of storage issues with VirtusLab’s help. They gained:
- The ability to modify and update the data structure in an organized way
- Storage reduction by 63%
- Operational costs reduction by 77%
- Vendor lock-in avoidance
The tech stack
Languages: Scala
Infrastructure: Azure CosmosDB, Microsoft Azure
Framework: Kafka, Apache Avro, Confluent Cloud