Ingesting Data and Computing for Batch Processing



Overview/Description
There are many considerations when designing and implementing big data analytics solutions with Microsoft Azure. This course covers data ingestion and storage, as well as the design and provisioning of compute clusters, and it aligns with exam 70-475.

Target Audience
Professionals who are preparing to take the 70-475: Designing and Implementing Big Data Analytics Solutions certification exam, and who are experienced in designing, programming, implementing, automating, and monitoring Microsoft Azure cloud platform solutions. Exam candidates should also be adept at using the development tools, techniques, and design methodologies associated with implementing cloud-based big data analytics solutions.

Prerequisites
None

Expected Duration (hours)
1.6

Lesson Objectives

Ingesting Data and Computing for Batch Processing

  • start the course
  • identify basic features of Microsoft big data solutions
  • recognize storage options for big data and identify methods to load data into Azure Blob storage
  • list key features of the Azure Data Factory and the Azure Data Lake Store
  • use Azure PowerShell with Azure Storage
  • recognize best practices and considerations for data collection and loading in HDInsight
  • recognize key features of Apache Storm and Apache Flume
  • recognize key features of Azure Cosmos DB and DocumentDB
  • store and access .NET web application data with Azure Cosmos DB
  • install and use the Microsoft Azure Storage Explorer
  • load data into an Azure SQL Data Warehouse
  • install and use PolyBase to query data in an Azure Storage account
  • recognize common methods for moving data from an on-premises SQL Server to an Azure Virtual Machine SQL Server
  • recognize features of Hadoop and HDInsight clusters
  • identify how Apache Spark is used with HDInsight
  • recognize the capabilities of HBase in HDInsight
  • identify how Apache Kafka is used with HDInsight
  • recognize the capabilities of Interactive Hive in HDInsight
  • identify how R is used with HDInsight
  • determine which tools to use and identify important security features
  • recognize key features and capabilities of various tools used with HDInsight

Course Number
df_dibd_a01_it_enus

Expertise Level
Intermediate