
VirtusLab's Articles

Data Engineering|Oct 21, 2021

Scala 3 and Spark?

Spark 3.2.0’s support for Scala 2.13 technically allows Scala 3 Spark jobs, but it remains “an uphill path,” requiring workarounds for encoders and data shapes. Libraries like Iskra smooth the path, yet the setup is still experimental rather than production-ready.

Data Engineering|Aug 20, 2021

Table schemas in data pipelines Spark: How to handle large, nested & growing ones

In this post, we describe how we built a pipeline for this type of “incoming data” situation and how we eventually arrived at a good solution.
