Data Engineering Fundamentals



Overview/Description

Data engineering is the area of data science that focuses on the practical application of data collection and analysis. This 12-video course covers distributed systems, batch versus in-memory processing, NoSQL uses, and the tools available for data management, big data, and the ETL (extract, transform, and load) process. Begin with an overview of distributed systems from a data perspective, then look at the differences between batch and in-memory processing. Learn about NoSQL stores and how they are used, and survey the tools available for data management. Explore ETL: what it is, how the process works, and which tools support it. Use Talend Open Studio to demonstrate the ETL concept. Next, examine data modeling and create a data model in Talend Open Studio. Explore the hierarchy of needs when working with AI and machine learning. In another tutorial, learn how to create a data partition. Then move on to data engineering best practices and approaches to building and using data reporting tools. Conclude with an exercise in which you create a data model.
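
As a rough illustration of the extract, transform, and load steps named above, here is a minimal Python sketch. It is not taken from the course and does not use Talend Open Studio. The sample data, the `payments` table, and the function names are all hypothetical, and it assumes a small CSV source and an in-memory SQLite target.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for an export from a source system.
RAW_CSV = """id,name,amount
1, alice ,10.50
2,BOB,3
3,carol,not_a_number
"""


def extract(text):
    """Extract: read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, parse amounts, skip rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount is not numeric
        clean.append((int(row["id"]), row["name"].strip().title(), amount))
    return clean


def load(records, conn):
    """Load: write the cleaned records into a target table (name is illustrative)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text=RAW_CSV):
    """Run the three stages in order and return what landed in the target."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT id, name, amount FROM payments ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

Keeping each stage in its own function means the source, the cleaning rules, or the target can be swapped independently, which is roughly the separation that graphical ETL tools expose as distinct components.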



Expected Duration (hours)
0.8

Lesson Objectives

Data Engineering Fundamentals

  • Course Overview
  • describe distributed systems from a data perspective
  • identify the differences between batch and in-memory processing
  • describe NoSQL stores and how they are used
  • identify different tools available for data management
  • describe the ETL process and different tools available
  • use Talend Open Studio to showcase the ETL concept
  • describe and create a data model
  • describe the hierarchy of needs when working with AI and machine learning
  • describe and create a data partition
  • identify data engineering best practices
  • describe data reporting tools
  • create a data model

Course Number
it_dsdefddj_01_enus

Expertise Level
Intermediate