Rating: 3.35 (130 reviews)

Google Cloud Certified Professional Data Engineer

Theory, hands-on labs, and 252 questions and answers with explanations. All hands-on labs in 1-click copy-paste style. PDF downloads.
Instructor: Deepak Dubey
16,659 students enrolled

Designing data processing systems

Selecting the appropriate storage technologies. Considerations include:

●  Mapping storage systems to business requirements

●  Data modeling

●  Trade-offs involving latency, throughput, transactions

●  Distributed systems

●  Schema design

Designing data pipelines. Considerations include:

●  Data publishing and visualization (e.g., BigQuery)

●  Batch and streaming data (e.g., Dataflow, Dataproc, Apache Beam, Apache Spark and Hadoop ecosystem, Pub/Sub, Apache Kafka)

●  Online (interactive) vs. batch predictions

●  Job automation and orchestration (e.g., Cloud Composer)

Designing a data processing solution. Considerations include:

●  Choice of infrastructure

●  System availability and fault tolerance

●  Use of distributed systems

●  Capacity planning

●  Hybrid cloud and edge computing

●  Architecture options (e.g., message brokers, message queues, middleware, service-oriented architecture, serverless functions)

●  Event processing guarantees (e.g., at-least-once, in-order, exactly-once)
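
As a rough illustration of the event processing guarantees named in the list above, the sketch below shows one common way to get an effectively-once result on top of at-least-once delivery: the consumer remembers event IDs and skips repeats. The class name, the in-memory set, and the sample events are all hypothetical; a real pipeline would keep the seen IDs in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentConsumer:
    """Applies each event once, even if it is delivered more than once."""

    seen_ids: set = field(default_factory=set)  # assumption: in-memory only
    total: int = 0

    def handle(self, event_id: str, amount: int) -> bool:
        # A redelivered event is recognized by its ID and skipped.
        if event_id in self.seen_ids:
            return False
        self.seen_ids.add(event_id)
        self.total += amount
        return True


# Simulated at-least-once delivery: event "e2" arrives twice.
deliveries = [("e1", 10), ("e2", 5), ("e2", 5), ("e3", 1)]
consumer = IdempotentConsumer()
applied = [consumer.handle(event_id, amount) for event_id, amount in deliveries]
print(consumer.total)  # the duplicate "e2" is counted only once
```

The duplicate is dropped because the handler is idempotent with respect to the event ID, not because the transport guarantees single delivery.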

Migrating data warehousing and data processing. Considerations include:

●  Awareness of current state and how to migrate a design to a future state

●  Migrating from on-premises to cloud (Data Transfer Service, Transfer Appliance, Cloud Networking)

●  Validating a migration

Building and operationalizing data processing systems

Building and operationalizing storage systems. Considerations include:

●  Effective use of managed services (Cloud Bigtable, Cloud Spanner, Cloud SQL, BigQuery, Cloud Storage, Datastore, Memorystore)

●  Storage costs and performance

●  Life cycle management of data

Building and operationalizing pipelines. Considerations include:

●  Data cleansing

●  Batch and streaming

●  Transformation

●  Data acquisition and import

●  Integrating with new data sources

Building and operationalizing processing infrastructure. Considerations include:

●  Provisioning resources

●  Monitoring pipelines

●  Adjusting pipelines

●  Testing and quality control

Operationalizing machine learning models

Leveraging pre-built ML models as a service. Considerations include:

●  ML APIs (e.g., Vision API, Speech API)

●  Customizing ML APIs (e.g., AutoML Vision, AutoML Text)

●  Conversational experiences (e.g., Dialogflow)

Deploying an ML pipeline. Considerations include:

●  Ingesting appropriate data

●  Retraining of machine learning models (AI Platform Prediction and Training, BigQuery ML, Kubeflow, Spark ML)

●  Continuous evaluation

Choosing the appropriate training and serving infrastructure. Considerations include:

●  Distributed vs. single machine

●  Use of edge compute

●  Hardware accelerators (e.g., GPU, TPU)

Measuring, monitoring, and troubleshooting machine learning models. Considerations include:

●  Machine learning terminology (e.g., features, labels, models, regression, classification, recommendation, supervised and unsupervised learning, evaluation metrics)

●  Impact of dependencies of machine learning models

●  Common sources of error (e.g., assumptions about data)
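
To make the evaluation metrics mentioned in the list above a little more concrete, here is a minimal sketch that computes precision, recall, and F1 for a binary classifier in plain Python. The function name and the sample labels and predictions are made up for illustration and do not come from the course.

```python
def classification_metrics(labels, predictions):
    """Return precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground-truth labels and model predictions.
labels = [1, 0, 1, 1, 0, 0]
predictions = [1, 0, 0, 1, 1, 0]
metrics = classification_metrics(labels, predictions)
print(metrics)
```

Precision answers "of the items predicted positive, how many were right?", while recall answers "of the actual positives, how many were found?"; F1 is their harmonic mean.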

Ensuring solution quality

Designing for security and compliance. Considerations include:

●  Identity and access management (e.g., Cloud IAM)

●  Data security (encryption, key management)

●  Ensuring privacy (e.g., Data Loss Prevention API)

●  Legal compliance (e.g., Health Insurance Portability and Accountability Act (HIPAA), Children’s Online Privacy Protection Act (COPPA), FedRAMP, General Data Protection Regulation (GDPR))

Ensuring scalability and efficiency. Considerations include:

●  Building and running test suites

●  Pipeline monitoring (e.g., Cloud Monitoring)

●  Assessing, troubleshooting, and improving data representations and data processing infrastructure

●  Resizing and autoscaling resources

Ensuring reliability and fidelity. Considerations include:

●  Performing data preparation and quality control (e.g., Dataprep)

●  Verification and monitoring

●  Planning, executing, and stress testing data recovery (fault tolerance, rerunning failed jobs, performing retrospective re-analysis)

●  Choosing between ACID, idempotent, eventually consistent requirements

Ensuring flexibility and portability. Considerations include:

●  Mapping to current and future business requirements

●  Designing for data and application portability (e.g., multicloud, data residency requirements)

●  Data staging, cataloging, and discovery

Migrating Data - Google Data Transfer Service, gsutil, Transfer Appliance
Cloud SQL - MySQL, PostgreSQL, Microsoft SQL Server in the Cloud
Cloud Spanner - Horizontally Scalable Distributed Multi Region SQL Database
Cloud Dataflow - Managed Apache Beam Service for Data Processing, Transformation
Cloud Bigtable - NoSQL Massively Distributed Extremely Performant Database
Cloud Dataprep - Data Preparation Using Point-and-Click on the Web - No Programming
Cloud Data Fusion - Point & Click Solution for Creating Data Pipelines & Connect
Cloud Data Catalog - Metadata Management for Data Movement, Lineage, Discovery
Cloud Memorystore - Managed Redis, Memcached Service on Cloud for Key Value Data
Cloud Data Loss Prevention (DLP) Service - redact, mask, tokenize & transform
Machine Learning Concepts
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don't have an internet connection, some instructors also let their students download course lectures. That's up to the instructor though, so make sure you get on their good side!
Rating breakdown (130 reviews)
5 stars: 78
4 stars: 32
3 stars: 10
2 stars: 4
1 star: 6
Course details
Video 25 hours
Certificate of Completion
Full lifetime access
Access on mobile and TV
