Demand for data engineers keeps growing, and certified data engineers are among the highest-paid certified professionals. Data engineers have a wide range of skills, including the ability to design systems that ingest large volumes of data, store data cost-effectively, and efficiently process and analyze data with tools ranging from reporting and visualization to machine learning. Earning a Google Cloud Professional Data Engineer certification demonstrates that you have the knowledge and skills to build, tune, and monitor high-performance data engineering systems.
This course is designed and developed by the author of the official Google Cloud Professional Data Engineer exam guide, a data architect with over 20 years of experience in databases, data architecture, and machine learning. The course combines lectures with quizzes and hands-on practical sessions to ensure you understand how to ingest data, create data processing pipelines in Cloud Dataflow, deploy relational databases, design highly performant Bigtable, BigQuery, and Cloud Spanner databases, query Firestore databases, and create Spark and Hadoop clusters using Cloud Dataproc.
The final portion of the course is dedicated to the most challenging part of the exam: machine learning. If you are not familiar with concepts like backpropagation, stochastic gradient descent, overfitting, underfitting, and feature engineering, then you are not ready to take the exam. Fortunately, this course is designed for you. We start machine learning from the beginning, introducing basic concepts such as the difference between supervised and unsupervised learning. We’ll build on the basics to understand how to design, train, and evaluate machine learning models. In the process, we’ll explain the essential concepts you need to understand to pass the Professional Data Engineer exam. We’ll also review Google Cloud machine learning services and infrastructure, such as BigQuery ML and tensor processing units.
The course includes a 50-question practice exam that tests your knowledge of data engineering concepts and helps you identify areas that need more study.
By the end of this course, you will be ready to use Google Cloud data engineering services to design, deploy, and monitor data pipelines, deploy advanced database systems, build data analysis platforms, and support production machine learning environments.
ARE YOU READY TO PASS THE EXAM? Join me and I’ll show you how!
Cloud Storage for Data Engineering
Relational Databases - Cloud SQL
3. Introduction to Object Storage
Understand what object storage is used for.
4. Options for Loading Data
Use gsutil, Transfer Service, and other methods to upload data.
5. Access Controls for Cloud Storage
Use IAM and access control lists to limit access to data in Cloud Storage.
6. Lifecycle Policy Management
Use policies to manage objects in Cloud Storage.
7. Using Cloud Storage Console
Use the GCP console to manage Cloud Storage.
8. Exercise: Cloud Storage
Check your understanding of Cloud Storage.
9. Solution: Cloud Storage
Know the solution to the exercise.
Relational Databases - Cloud Spanner
10. Introduction to Relational Databases
Effectively use GCP's managed relational database options.
11. When to use Cloud SQL
Use Cloud SQL for regional and zonal relational databases.
12. Creating a Cloud SQL Database
Create Cloud SQL databases.
13. Monitoring Cloud SQL
14. Exercise: Create a Cloud SQL Database
Check your knowledge of Cloud SQL.
15. Solution: Create a Cloud SQL Database
Review the correct way to deploy a Cloud SQL database.
NoSQL Databases: Cloud Firestore
16. When to use Cloud Spanner
Know when to use Cloud Spanner for multi-regional and global database applications.
17. Creating a Cloud Spanner Database
Create a Cloud Spanner instance.
18. Cloud Spanner Performance Considerations
Optimize Cloud Spanner I/O performance.
19. Check Your Knowledge: Choosing a Primary Key for a Spanner Table
NoSQL Databases: Bigtable
Analytical Databases: BigQuery Data Management
Migrating a Data Warehouse
35. Introduction to BigQuery and Analytical Databases
36. BigQuery Scalar Datatypes
37. BigQuery Nested and Repeated Fields
38. Querying Scalars, Nested and Repeated Fields
39. Exercise: Querying BigQuery Public Datasets
40. Solution: Querying BigQuery Public Datasets
41. Access Controls in BigQuery
42. Partitioning Tables
43. Clustering Partitioned Tables
44. Loading Data into BigQuery