Table schemas in Spark data pipelines: How to handle large, nested & growing ones
In this post, we describe how we built a pipeline for this kind of “incoming data” situation, and how we eventually arrived at a good solution.
Hadoop legacy
Navigating data lakes using Atlas
Diving in the data lake
The rapid growth of unstructured data is a serious business challenge for organizations.