Implement a data engineering solution with Azure Databricks

Course ID: DP-3027
Exam Code: -
Duration: 1 Day
Private in-house training

Apart from public, instructor-led classes, we also offer private in-house trainings for organizations based on their needs. Call us at +852 2116 3328 or email us at [email protected] for more details.

What skills are covered
  • Perform incremental processing with Spark Structured Streaming
  • Implement streaming architecture patterns with Delta Live Tables
  • Optimize performance with Spark and Delta Live Tables
  • Implement CI/CD workflows in Azure Databricks
  • Automate workloads with Azure Databricks Jobs
  • Manage data privacy and governance with Azure Databricks
  • Use SQL Warehouses in Azure Databricks
  • Run Azure Databricks Notebooks with Azure Data Factory
Who should attend this course
  • Data Engineer
Course Modules

Module 1: Perform incremental processing with Spark Structured Streaming

You explore different features and tools that help you understand and perform incremental processing with Spark Structured Streaming.

Learning objectives

At the end of this module, you’re able to:

  • Understand Spark Structured Streaming.
  • Apply techniques to optimize structured streaming.
  • Handle late-arriving or out-of-order events.
  • Set up real-time sources for incremental processing.
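To give a feel for the late-event handling covered here, the sketch below mimics the watermarking idea Structured Streaming uses in plain Python (event data and the ten-minute watermark are hypothetical; a real pipeline would call `withWatermark` on a streaming DataFrame):

```python
from datetime import datetime, timedelta

def filter_late_events(events, watermark=timedelta(minutes=10)):
    """Drop events whose timestamp trails the maximum observed event
    time by more than the watermark, mimicking how Spark Structured
    Streaming discards data that arrives too late."""
    max_seen = None
    kept = []
    for ts, payload in events:  # events may arrive out of order
        max_seen = ts if max_seen is None else max(max_seen, ts)
        if ts >= max_seen - watermark:
            kept.append((ts, payload))  # still inside the watermark window
        # else: older than (max event time - watermark) -> dropped
    return kept

# Hypothetical out-of-order stream: the 08:40 reading arrives after 09:00.
t = lambda h, m: datetime(2024, 1, 1, h, m)
stream = [(t(9, 0), "a"), (t(8, 40), "late"), (t(8, 55), "b")]
print(filter_late_events(stream))  # the 08:40 event is dropped
```

In the module itself you work with the real streaming API, where the engine tracks the maximum event time across micro-batches for you.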

Prerequisites

  • Ability to navigate the Azure portal.
  • Understanding of the Azure Databricks workspace.
  • Experience with Spark languages and Notebooks.

 

Module 2: Implement streaming architecture patterns with Delta Live Tables

You explore different features and tools to help you develop architecture patterns with Azure Databricks Delta Live Tables.

Learning objectives

At the end of this module, you’re able to:

  • Use event-driven architectures with Delta Live Tables
  • Ingest streaming data
  • Achieve data consistency and reliability
  • Scale streaming workloads with Delta Live Tables
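The consistency-and-reliability objective largely comes down to exactly-once ingestion. A minimal pure-Python sketch of the idea, deduplicating on a hypothetical unique event ID (in Delta Live Tables, streaming tables and checkpointing provide this for you):

```python
class IdempotentSink:
    """Tiny in-memory stand-in for a streaming sink that achieves
    exactly-once semantics by deduplicating on a unique event ID,
    the way checkpointed streaming tables avoid double-writes."""
    def __init__(self):
        self.rows = []
        self._seen = set()

    def ingest(self, batch):
        for event_id, value in batch:
            if event_id in self._seen:
                continue  # replayed event: already committed, skip it
            self._seen.add(event_id)
            self.rows.append(value)

sink = IdempotentSink()
sink.ingest([(1, "a"), (2, "b")])
sink.ingest([(2, "b"), (3, "c")])  # batch replayed after a failure
print(sink.rows)  # each event lands exactly once
```

Because ingestion is idempotent, replaying a failed batch is safe, which is what makes at-least-once delivery from the source acceptable.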

Prerequisites

  • Ability to navigate the Azure portal.
  • Understanding of the Azure Databricks workspace.
  • Experience with Spark languages and Notebooks.

 

Module 3: Optimize performance with Spark and Delta Live Tables

Learn how to optimize performance with Spark and Delta Live Tables in Azure Databricks.

Learning objectives

In this module, you learn how to:

  • Use serverless compute and parallelism with Delta Live Tables
  • Perform cost-based optimization and tune query performance
  • Use Change Data Capture (CDC)
  • Apply enhanced autoscaling capabilities
  • Implement observability and enhance data quality metrics
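As a preview of the CDC objective, the sketch below simulates applying a change feed (inserts, updates, deletes) to a keyed target in plain Python; the feed and keys are hypothetical, and in Delta Live Tables the equivalent is `APPLY CHANGES INTO`:

```python
def apply_changes(target, changes):
    """Apply a CDC feed to a keyed target dict, keeping only the
    latest change per key by sequence number -- roughly what
    APPLY CHANGES INTO does in Delta Live Tables."""
    latest = {}
    for change in changes:
        key = change["id"]
        if key not in latest or change["seq"] > latest[key]["seq"]:
            latest[key] = change  # later sequence wins (out-of-order feed)
    for change in latest.values():
        if change["op"] == "delete":
            target.pop(change["id"], None)
        else:  # insert or update upserts the row
            target[change["id"]] = change["data"]
    return target

target = {1: "old"}
feed = [
    {"id": 1, "op": "update", "seq": 2, "data": "new"},
    {"id": 1, "op": "update", "seq": 1, "data": "stale"},  # arrives late
    {"id": 2, "op": "insert", "seq": 1, "data": "row2"},
]
print(apply_changes(target, feed))  # {1: 'new', 2: 'row2'}
```

Ordering by a sequence column is what keeps a late-arriving stale update from overwriting newer data.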

Prerequisites

Before starting this module, you should have fundamental knowledge of data analytics concepts. Consider completing the Azure Data Fundamentals certification first.

 

Module 4: Implement CI/CD workflows in Azure Databricks

Learn how to implement CI/CD workflows in Azure Databricks to automate the integration and delivery of code changes.

Learning objectives

In this module, you learn how to:

  • Implement version control and Git integration.
  • Perform unit testing and integration testing.
  • Maintain environment and configuration management.
  • Implement rollback and roll-forward strategies.
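A core CI/CD habit taught here is factoring notebook logic into plain functions so a pipeline can unit-test it without a cluster. A minimal sketch with a hypothetical transformation and test:

```python
def clean_readings(rows):
    """Hypothetical notebook transformation, pulled out into a plain
    function so a CI pipeline can unit-test it without a cluster:
    drop null readings and normalize sensor names."""
    return [
        {"sensor": r["sensor"].strip().lower(), "value": r["value"]}
        for r in rows
        if r["value"] is not None
    ]

def test_clean_readings():
    # A test like this runs in CI on every commit, before deployment.
    rows = [{"sensor": " Temp ", "value": 21.5},
            {"sensor": "humidity", "value": None}]
    out = clean_readings(rows)
    assert out == [{"sensor": "temp", "value": 21.5}]  # null row dropped

test_clean_readings()
print("tests passed")
```

In the module you wire tests like this into a Git-integrated pipeline, so failures block a deployment rather than surfacing in production.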

Prerequisites

Before starting this module, you should have fundamental knowledge of data analytics concepts. Consider completing the Azure Data Fundamentals certification first.

 

Module 5: Automate workloads with Azure Databricks Jobs

Learn how to orchestrate and schedule data workflows with Azure Databricks Jobs. Define and monitor complex pipelines, integrate with tools like Azure Data Factory and Azure DevOps, and reduce manual intervention, leading to improved efficiency, faster insights, and adaptability to business needs.

Learning objectives

In this module, you learn how to:

  • Implement job scheduling and automation.
  • Optimize workflows with parameters.
  • Handle dependency management.
  • Implement error handling and retry mechanisms.
  • Explore best practices and guidelines.
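The error-handling objective boils down to retries with backoff. A pure-Python sketch of the policy (the flaky task is hypothetical; Databricks Jobs expose the same idea declaratively through per-task retry settings):

```python
import time

def run_with_retries(task, max_retries=3, base_delay=0.1):
    """Run a task, retrying on failure with exponential backoff --
    the policy a Jobs task retry configuration applies for you."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure
            time.sleep(base_delay * (2 ** attempt))  # back off, then retry

# Hypothetical flaky task that succeeds on its third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"

print(run_with_retries(flaky))  # succeeds after two retries
```

Exponential backoff spaces out retries so a transient outage is not hammered while it recovers.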

Prerequisites

Before starting this module, you should have fundamental knowledge of data analytics concepts. Consider completing the Azure Data Fundamentals certification first.

 

Module 6: Manage data privacy and governance with Azure Databricks

In this module, you explore different features and approaches that help you secure and manage your data within Azure Databricks using tools such as Unity Catalog.

Learning objectives

At the end of this module, you’re able to:

  • Implement data encryption techniques
  • Manage access controls
  • Implement data masking and anonymization
  • Use compliance frameworks and secure data sharing
  • Use data lineage and metadata management
  • Roll out governance automation
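As a taste of the masking and anonymization objective, the sketch below replaces an email with a salted one-way hash in plain Python (column names and salt are hypothetical; in Azure Databricks you would typically express this as a Unity Catalog column mask or a masked view):

```python
import hashlib

def mask_email(email, salt="demo-salt"):
    """Replace an email with a salted one-way hash: deterministic, so
    joins on the masked column still work, but not reversible."""
    digest = hashlib.sha256((salt + email).encode()).hexdigest()
    return digest[:12] + "@masked"

def mask_rows(rows, masked_cols=("email",)):
    # Return copies with the sensitive columns masked for
    # non-privileged readers; other columns pass through untouched.
    return [
        {k: (mask_email(v) if k in masked_cols else v) for k, v in r.items()}
        for r in rows
    ]

rows = [{"name": "Ada", "email": "ada@example.com"}]
masked = mask_rows(rows)
print(masked[0]["email"])  # hashed value; original is unrecoverable
```

Deterministic masking is a deliberate trade-off: equal inputs mask to equal outputs, which preserves joinability at the cost of being vulnerable to frequency analysis without a well-guarded salt.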

Prerequisites

  • Ability to navigate the Azure portal
  • Familiarity with the Azure Databricks workspace
  • Knowledge of Spark languages and notebooks

 

Module 7: Use SQL Warehouses in Azure Databricks

Azure Databricks provides SQL Warehouses that enable data analysts to work with data using familiar relational SQL queries.

Learning objectives

In this module, you’ll learn how to:

  • Create and configure SQL Warehouses in Azure Databricks.
  • Create databases and tables.
  • Create queries and dashboards.

Prerequisites

Before starting this module, you should have a basic knowledge of Azure Databricks. Consider completing the Explore Azure Databricks module before this one.

 

Module 8: Run Azure Databricks Notebooks with Azure Data Factory

Using pipelines in Azure Data Factory to run notebooks in Azure Databricks enables you to automate data engineering processes at cloud scale.

Learning objectives

In this module, you’ll learn how to:

  • Describe how Azure Databricks notebooks can be run in a pipeline.
  • Create an Azure Data Factory linked service for Azure Databricks.
  • Use a Notebook activity in a pipeline.
  • Pass parameters to a notebook.
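To illustrate the parameter-passing objective, the hypothetical helper below builds the JSON body of a Data Factory Databricks Notebook activity in plain Python; the field names follow the ADF activity schema (`notebookPath`, `baseParameters`), and the linked service name and paths are placeholders:

```python
def notebook_activity(name, notebook_path, parameters,
                      linked_service="AzureDatabricksLS"):
    """Build an Azure Data Factory Databricks Notebook activity body.
    Hypothetical helper for illustration; values in baseParameters
    surface in the notebook via dbutils.widgets.get(<name>)."""
    return {
        "name": name,
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            "baseParameters": parameters,  # passed to the notebook at run time
        },
    }

activity = notebook_activity("RunIngest", "/Shared/ingest",
                             {"run_date": "2024-01-01"})
print(activity["typeProperties"]["baseParameters"]["run_date"])
```

In practice you author this activity in the ADF designer rather than by hand; the point is that `baseParameters` is the bridge between pipeline variables and notebook widgets.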

Prerequisites

Before starting this module, you should have a basic knowledge of Azure Databricks. Consider completing the Explore Azure Databricks module before this one.

Prerequisites

There are no prerequisites for attending this course.
