19 September 2017 / Jan Paw

Hadoop legacy

, , ,


In the previous blog post I explained the basic concepts of data lakes. Some core problems which can occur in data lakes were defined and I gave some hints to avoid them. Most of these pitfalls are caused by the traits of data lakes. Unfortunately, current Hadoop distributions can’t resolve them entirely. Additionally, the architecture […]

Read more

7 August 2017 / Jan Paw

Diving in the data lake

, , ,


Rapid growth of unstructured data is a serious business challenge for organizations. Data repositories, known as data lakes, have a great chance to play an important role in extracting valuable business information from enormous amounts of data. Storing and processing data on such a scale is a very complex and demanding task. Existing RDBMS-based systems […]

Read more