[Basics] Data Engineering Fundamentals

What is a data engineer?

As per [1]:

“Data engineers prepare and transform data using pipelines. This involves extracting data from various data source systems, transforming it into the staging area, and loading it into a data warehouse system. This process is known as ETL (Extract, Transform, Load).”
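The three ETL steps in the quote above can be sketched in a few lines of Python. This is only an illustrative sketch, not a production pipeline: the sample CSV data, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are all assumptions made up for this example, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from files, APIs, or databases.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(source_text):
    """Extract: read raw rows from the source system (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: clean the rows in a staging step (skip missing amounts, fix types)."""
    staged = []
    for row in rows:
        if not row["amount"]:
            continue  # drop incomplete records
        staged.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return staged


def load(rows, conn):
    """Load: write the staged rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(source_text, conn):
    """Run extract, transform, and load in order."""
    load(transform(extract(source_text)), conn)


conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
run_pipeline(RAW_CSV, conn)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('Alice', 14.75)]
```

Keeping each step in its own function means the source, the cleaning rules, or the target can be swapped independently, which is the main reason pipelines are usually split this way.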

What is data engineering and why is it important?

As per [2]:

“Data engineering is the development, operation, and maintenance of data infrastructure. Either on the cloud (or hybrid or multi-cloud), using databases and pipelines to extract, transform, and load data. Today, companies and organizations are powered by data. Applications create data that we use data, analyze it, manipulate it, and create insights as output. This is where Data Engineering is critical in the process of Data Science. The more agile and responsive organizations becomes, the less effort will be spent on moving data around. Users and other teams members other than Data Engineers can then focus on the core business requests. This is why the pipeline is a critical component in the data-driven enterprise. The flow of data and the ease of the access becomes nearly as important as the data itself.”

The following videos are must-see material for understanding data engineering and the role of a data engineer.

Video Links:

  1. How Data Engineering Works - YouTube

  2. What is a Data Engineer? - YouTube

  3. What is ETL; What is Data Warehouse; OLTP vs OLAP - YouTube

References

[1] https://www.springboard.com/library/data-engineering/job-description/#what-is-a-data-engineer

[2] https://simonrenauld.github.io/dataeng.html