Data Engineering on Google Cloud Platform

This four-day instructor-led class gives participants a hands-on introduction to designing and building data processing systems on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants will learn how to design data processing systems, build end-to-end data pipelines, analyze data, and carry out machine learning. The course covers structured, unstructured, and streaming data.

Course Objectives

This course teaches participants the following skills:

  • Design and build data processing systems on Google Cloud Platform
  • Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
  • Derive business insights from extremely large datasets using Google BigQuery
  • Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML
  • Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
  • Enable instant insights from streaming data

Intended Audience

This class is intended for experienced developers who are responsible for managing big data transformations including:

  • Extracting, loading, transforming, cleaning, and validating data
  • Designing pipelines and architectures for data processing
  • Creating and maintaining machine learning and statistical models
  • Querying datasets, visualizing query results and creating reports


To get the most out of this course, participants should have:

  • Completed the Google Cloud Fundamentals: Big Data & Machine Learning course, or equivalent experience
  • Basic proficiency with a common query language such as SQL
  • Experience with data modeling and extract, transform, load (ETL) activities
  • Experience developing applications in a common programming language such as Python
  • Familiarity with machine learning and/or statistics

Delivery Method

  • Instructor-led, instructor-led online

Duration

  • 4 Days

Course Outline

The course includes presentations, demonstrations, and hands-on labs.

Leveraging Unstructured Data with Cloud Dataproc on Google Cloud Platform
Module 1: Google Cloud Dataproc Overview
Module 2: Running Dataproc Jobs
Module 3: Integrating Dataproc with Google Cloud Platform
Module 4: Making Sense of Unstructured Data with Google’s Machine Learning APIs

Serverless Data Analysis with Google BigQuery and Cloud Dataflow
Module 5: Serverless data analysis with BigQuery
Module 6: Serverless, autoscaling data pipelines with Dataflow

Serverless Machine Learning with TensorFlow on Google Cloud Platform
Module 7: Getting started with Machine Learning
Module 8: Building ML models with TensorFlow
Module 9: Scaling ML models with Cloud ML
Module 10: Feature Engineering

Building Resilient Streaming Systems on Google Cloud Platform
Module 11: Architecture of streaming analytics pipelines
Module 12: Ingesting Variable Volumes
Module 13: Implementing streaming pipelines
Module 14: Streaming analytics and dashboards
Module 15: High throughput and low-latency with Bigtable

Training Fee

  • HKD18,000
For private classes, please contact us at (852) 2116 3328 for more details.
