Data Warehouse or Data Lake
20 May

Data lakes and data warehouses are buzzwords you hear whenever data retention in the context of big data comes up. In fact, they refer to two different approaches. "Lake" is an apt image for the data lake: a large basin filled with raw data that is stored there unstructured and without a specific purpose. A data warehouse, on the other hand, stores structured, filtered data in an organised manner. Which approach should be used for which purpose?

One place to find them all

Companies receive huge amounts of data, in a wide variety of forms and from different sources. The volume and variety often go beyond what conventional relational databases can handle, so companies need additional systems and tools to manage the data.

All these data stores share one goal: they house data for business reporting and analysis. But they differ in their intended use, structure, data types, origin and who has access to them.

The data in these stores usually originates in the systems that generate it: CRM, ERP, HR, financial and similar applications. The data records are entered and/or generated in these systems according to the rules defined there. Afterwards, they end up in a central repository. There they can be evaluated with analysis tools and interpreted in different contexts. In this way, new insights are gained and trends become visible, making it easier to make decisions without having to rely on gut feeling. Many companies use both a data lake and a data warehouse to cover the spectrum of their data storage requirements.

What is a data lake?

A data lake is a huge repository that stores raw data in its original format. The ability to hold very differently structured data is an essential feature and advantage. Each stored data element is given a unique identifier and tagged with metadata, so that it can be found again and attributed if necessary. The individual data records usually do not have a predefined purpose. Data is collected on a stockpiling principle: what you have, you have.
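The idea of raw payloads that are located through an identifier and metadata tags can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names TinyDataLake and LakeObject, the tag keys source and format, and the sample payloads are all hypothetical, and a real data lake would sit on object storage rather than an in-memory dictionary.

```python
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class LakeObject:
    """One raw item in the lake: untouched payload plus descriptive tags."""
    object_id: str
    payload: bytes                      # stored exactly as received
    metadata: dict = field(default_factory=dict)


class TinyDataLake:
    """In-memory stand-in for a data lake (hypothetical, illustration only)."""

    def __init__(self):
        self._objects = {}

    def put(self, payload: bytes, **metadata) -> str:
        """Store a payload as-is and return its generated unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = LakeObject(object_id, payload, dict(metadata))
        return object_id

    def get(self, object_id: str) -> LakeObject:
        """Retrieve one object by its identifier."""
        return self._objects[object_id]

    def find(self, **criteria) -> list:
        """Return all objects whose metadata matches every given tag."""
        return [
            obj for obj in self._objects.values()
            if all(obj.metadata.get(key) == value for key, value in criteria.items())
        ]


lake = TinyDataLake()

# Very different kinds of data go in without any common schema.
crm_id = lake.put(
    json.dumps({"customer": "ACME", "stage": "lead"}).encode(),
    source="crm", format="json",
)
lake.put(b"2023-05-20;INFO;login ok", source="erp", format="log")
lake.put(b"\x89PNG...", source="crm", format="image")

# The tags are what make the raw data findable again later.
crm_items = lake.find(source="crm")
```

Note that nothing is parsed or validated on the way in. Interpreting the payload is left to whoever reads it later, which is what keeps ingestion cheap and exploration open-ended.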

Collected this way, the data volume adds up, which is why many users are migrating to the large data stores in the cloud.
Data lakes are typically used by data scientists and engineers who prefer to explore data in its raw form to gain new, unique business insights.
They serve disciplines such as predictive analytics, machine learning, data visualisation, BI and big data analytics.

Storage is relatively cheap in a data lake compared to a data warehouse. Data lakes are also less time-consuming to manage, which reduces operating costs.

What is a data warehouse?

A data warehouse is a repository for data that business applications collect and/or generate for a given purpose. Such applications use a predefined schema to store the data. The data must be cleaned and organised before it is stored in the data warehouse.
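A minimal sketch of "clean first, then store under a predefined schema" might look as follows. It uses Python's built-in sqlite3 module purely as a stand-in for a warehouse; the sales table, its columns, the sample rows and the clean helper are assumptions made for illustration, not part of any particular product.

```python
import sqlite3

# The schema is fixed before any data arrives.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales (
        order_id   INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL,
        amount_eur REAL NOT NULL CHECK (amount_eur >= 0)
    )
    """
)

# Hypothetical raw export from a source system, with typical defects.
raw_rows = [
    {"order_id": "1", "customer": " ACME ", "amount": "100.50"},
    {"order_id": "2", "customer": "Globex", "amount": "n/a"},  # unusable amount
    {"order_id": "3", "customer": "", "amount": "20"},         # missing customer
    {"order_id": "4", "customer": "acme", "amount": "49.50"},
]


def clean(row):
    """Return a tuple matching the schema, or None if the row cannot be repaired."""
    customer = row["customer"].strip().upper()
    if not customer:
        return None
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    if amount < 0:
        return None
    return int(row["order_id"]), customer, amount


# Only rows that survive cleaning are loaded.
cleaned = [r for r in map(clean, raw_rows) if r is not None]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)
conn.commit()

loaded = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

In this sketch the two defective rows are simply dropped. Whether such rows are rejected, repaired or parked for review is a design decision each organisation has to make for itself.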

Since the data stored in a data warehouse is already structured, it is more suitable for high-level analyses. BI tools can easily handle the processed data from a data warehouse. This makes it easier for non-data experts to make sense of this data.

Data from a data warehouse supports historical analysis and reporting, which in turn inform decision making across all areas of an organisation's business.

Data from a data warehouse is usually accessed by managers and business users who want to gain insight into business KPIs. The data is already structured in such a way that it provides answers to predefined analysis questions. It is typically used for data visualisation, BI analyses and data analytics.
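Because the structure is known in advance, a predefined question such as "what was the revenue per month?" comes down to a single query. The following sketch again uses sqlite3 as a stand-in; the table layout and figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, month TEXT, amount_eur REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("ACME", "2023-04", 100.0),
        ("ACME", "2023-05", 50.0),
        ("GLOBEX", "2023-05", 75.0),
    ],
)

# A typical KPI: revenue per month, ready for a report or dashboard.
kpi_rows = conn.execute(
    "SELECT month, SUM(amount_eur) FROM sales GROUP BY month ORDER BY month"
).fetchall()

for month, revenue in kpi_rows:
    print(f"{month}: {revenue:.2f} EUR")
```

A BI tool issues queries of this kind behind the scenes, which is why business users can work with warehouse data without handling the raw sources themselves.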

Data warehouses cost more than data lakes and also require more time to manage, resulting in additional operating costs.
