Data Integration





Overview/Description
Data integration is the last step in the data wrangling process, where data is put into a usable, structured format for analysis. In this course, you'll explore examples of practical tools and techniques for data integration.

Target Audience
Individuals with some programming and math experience who are working toward implementing data science in their everyday work

Prerequisites
None

Expected Duration (hours)
0.7

Lesson Objectives

Data Integration

  • start the course
  • use csvjoin to concatenate CSV data
  • use the cat command to concatenate separate logs into a single file
  • sort lines in a text file
  • merge separate XML files into a single schema
  • aggregate data from a CSV file into a table of summarized values
  • normalize data from unstructured sources
  • denormalize data from a structured source
  • use pivot tables to cross tabulate data
  • insert missing values in a data set
  • use csvjoin to merge two compatible CSV documents into one

Course Number
df_dses_a06_it_enus

Expertise Level
Beginner