Exam Details

  • Exam Code: PROFESSIONAL-DATA-ENGINEER
  • Exam Name: Professional Data Engineer on Google Cloud Platform
  • Certification: Google Certifications
  • Vendor: Google
  • Total Questions: 331 Q&As
  • Last Updated: May 19, 2025

Google Certifications PROFESSIONAL-DATA-ENGINEER Questions & Answers

  • Question 251:

    You have terabytes of customer behavioral data streaming from Google Analytics into BigQuery daily. Your customers' information, such as their preferences, is hosted on a Cloud SQL for MySQL database. Your CRM database is hosted on a Cloud SQL for PostgreSQL instance. The marketing team wants to use your customers' information from the two databases and the customer behavioral data to create marketing campaigns for yearly active customers. You need to ensure that the marketing team can run the campaigns over 100 times a day on typical days and up to 300 times during sales. At the same time, you want to keep the load on the Cloud SQL databases to a minimum. What should you do?

    A. Create BigQuery connections to both Cloud SQL databases. Use BigQuery federated queries on the two databases and the Google Analytics data in BigQuery to run these queries.

    B. Create streams in Datastream to replicate the required tables from both Cloud SQL databases to BigQuery for these queries.

    C. Create a Dataproc cluster with Trino to establish connections to both Cloud SQL databases and BigQuery, to execute the queries.

    D. Create a job on Apache Spark with Dataproc Serverless to query both Cloud SQL databases and the Google Analytics data on BigQuery for these queries.
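For reference, option A's federated-query approach would look roughly like the sketch below: a single BigQuery query that joins the Google Analytics data with live reads from both Cloud SQL instances via EXTERNAL_QUERY. The connection IDs, dataset, table, and column names here are all hypothetical; only the EXTERNAL_QUERY pattern itself is the point. Note that each run of such a query hits the Cloud SQL instances directly, which is relevant to the "minimize load on Cloud SQL" requirement.

```python
# Sketch only: builds the SQL text for a BigQuery federated query.
# Connection IDs ("prefs_mysql_conn", "crm_pg_conn") and all table and
# column names are hypothetical placeholders.
def build_campaign_query(project: str, region: str) -> str:
    """Join GA behavioral data in BigQuery with two Cloud SQL databases
    via EXTERNAL_QUERY (the inner SQL is pushed down to Cloud SQL)."""
    mysql_conn = f"{project}.{region}.prefs_mysql_conn"
    pg_conn = f"{project}.{region}.crm_pg_conn"
    return f"""
    SELECT ga.customer_id, prefs.preference, crm.segment
    FROM `{project}.analytics.events` AS ga
    JOIN EXTERNAL_QUERY('{mysql_conn}',
         'SELECT customer_id, preference FROM preferences') AS prefs
      ON ga.customer_id = prefs.customer_id
    JOIN EXTERNAL_QUERY('{pg_conn}',
         'SELECT customer_id, segment FROM crm_customers') AS crm
      ON ga.customer_id = crm.customer_id
    """

query = build_campaign_query("my-project", "us")
```

By contrast, option B's Datastream replication copies the Cloud SQL tables into BigQuery once, so the repeated campaign queries never touch the source databases.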

  • Question 252:

    You orchestrate ETL pipelines by using Cloud Composer. One of the tasks in the Apache Airflow directed acyclic graph (DAG) relies on a third-party service. You want to be notified when the task does not succeed. What should you do?

    A. Configure a Cloud Monitoring alert on the sla_missed metric associated with the task at risk to trigger a notification.

    B. Assign a function with notification logic to the sla_miss_callback parameter for the operator responsible for the task at risk.

    C. Assign a function with notification logic to the on_retry_callback parameter for the operator responsible for the task at risk.

    D. Assign a function with notification logic to the on_failure_callback parameter for the operator responsible for the task at risk.
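The callback pattern in options B–D can be sketched as follows. In Airflow, the operator's on_failure_callback receives a context dictionary when the task fails; here that dictionary is simulated with a plain dict rather than importing Airflow, and send_notification is a hypothetical stand-in for whatever alerting integration you would actually use (email, Slack, PagerDuty, and so on).

```python
# Sketch of an Airflow-style on_failure_callback. The context dict is
# simulated (real Airflow passes richer objects such as the task
# instance), and send_notification is a hypothetical alerting hook.
def notify_on_failure(context: dict) -> str:
    message = (f"Task {context['task_id']} in DAG {context['dag_id']} "
               f"failed: {context['exception']}")
    send_notification(message)  # hypothetical alerting integration
    return message

sent = []
def send_notification(msg: str) -> None:
    sent.append(msg)  # stub: record the message instead of sending it

# Simulate what Airflow would pass when the task fails:
msg = notify_on_failure({"task_id": "call_third_party",
                         "dag_id": "etl_pipeline",
                         "exception": "HTTP 503 from upstream service"})
```

In a real DAG you would attach the function via on_failure_callback=notify_on_failure on the operator; sla_miss_callback and on_retry_callback fire for SLA misses and retries, not for terminal failure.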

  • Question 253:

    You are creating a data model in BigQuery that will hold retail transaction data. Your two largest tables, sales_transaction_header and sales_transaction_line, have a tightly coupled, immutable relationship. These tables are rarely modified after load and are frequently joined when queried. You need to model the sales_transaction_header and sales_transaction_line tables to improve the performance of data analytics queries. What should you do?

    A. Create a sales_transaction table that stores the sales_transaction_header and sales_transaction_line data as a JSON data type.

    B. Create a sales_transaction table that holds the sales_transaction_header information as rows and the sales_transaction_line rows as nested and repeated fields.

    C. Create a sales_transaction table that holds the sales_transaction_header and sales_transaction_line information as rows, duplicating the sales_transaction_header data for each line.

    D. Create separate sales_transaction_header and sales_transaction_line tables and, when querying, specify the sales_transaction_line table first in the WHERE clause.
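Option B's denormalized model can be sketched as a BigQuery JSON schema: header columns as top-level fields, with the line items as a nested, repeated RECORD so the join is pre-materialized. The field names below are illustrative, not taken from the question.

```python
# Sketch of a BigQuery schema (JSON-schema form) with line items as a
# nested, repeated RECORD. All field names are illustrative.
sales_transaction_schema = [
    {"name": "transaction_id", "type": "STRING", "mode": "REQUIRED"},
    {"name": "transaction_date", "type": "DATE", "mode": "NULLABLE"},
    {"name": "customer_id", "type": "STRING", "mode": "NULLABLE"},
    {"name": "lines", "type": "RECORD", "mode": "REPEATED",
     "fields": [
         {"name": "line_number", "type": "INTEGER", "mode": "REQUIRED"},
         {"name": "sku", "type": "STRING", "mode": "NULLABLE"},
         {"name": "quantity", "type": "INTEGER", "mode": "NULLABLE"},
         {"name": "amount", "type": "NUMERIC", "mode": "NULLABLE"},
     ]},
]
```

Queries then flatten the line items with UNNEST(lines) instead of joining two large tables, which is why this layout suits rarely modified, frequently joined data.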

  • Question 254:

    You have a network of 1000 sensors. The sensors generate time series data: one metric per sensor per second, along with a timestamp. You already have 1 TB of data, and expect the data to grow by 1 GB every day. You need to access this data in two ways. The first access pattern requires retrieving the metric from one specific sensor stored at a specific timestamp, with a median single-digit millisecond latency. The second access pattern requires running complex analytic queries on the data, including joins, once a day. How should you store this data?

    A. Store your data in Bigtable. Concatenate the sensor ID and timestamp, and use it as the row key. Perform an export to BigQuery every day.

    B. Store your data in BigQuery. Concatenate the sensor ID and timestamp, and use it as the primary key.

    C. Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day.

    D. Store your data in BigQuery. Use the metric as a primary key.
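Option A's row-key design can be sketched in a few lines: concatenating the sensor ID with a zero-padded timestamp makes the point lookup for (sensor, timestamp) a single-row read, while keeping each sensor's readings lexicographically contiguous. The separator and padding width below are illustrative choices, not part of the question.

```python
# Sketch of a Bigtable row-key scheme: sensor ID plus a zero-padded
# epoch timestamp. Zero-padding makes lexicographic (Bigtable) order
# match numeric timestamp order within a sensor.
def row_key(sensor_id: str, epoch_seconds: int) -> str:
    return f"{sensor_id}#{epoch_seconds:011d}"

keys = [row_key("sensor-0042", t)
        for t in (1700000000, 1700000001, 1700000002)]
```

The daily export to BigQuery then serves the once-a-day analytic queries with joins, which Bigtable itself does not support.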

  • Question 255:

    You are implementing workflow pipeline scheduling using open source-based tools and Google Kubernetes Engine (GKE). You want to use a Google managed service to simplify and automate the task. You also want to accommodate Shared VPC networking considerations. What should you do?

    A. Use Dataflow for your workflow pipelines. Use Cloud Run triggers for scheduling.

    B. Use Dataflow for your workflow pipelines. Use shell scripts to schedule workflows.

    C. Use Cloud Composer in a Shared VPC configuration. Place the Cloud Composer resources in the host project.

    D. Use Cloud Composer in a Shared VPC configuration. Place the Cloud Composer resources in the service project.

  • Question 256:

    You need to migrate a 2TB relational database to Google Cloud Platform. You do not have the resources to significantly refactor the application that uses this database and cost to operate is of primary concern.

    Which service do you select for storing and serving your data?

    A. Cloud Spanner

    B. Cloud Bigtable

    C. Cloud Firestore

    D. Cloud SQL

  • Question 257:

    You are designing a real-time system for a ride-hailing app that identifies areas with high demand for rides to effectively reroute available drivers to meet the demand. The system ingests data from multiple sources to Pub/Sub, processes the data, and stores the results for visualization and analysis in real-time dashboards. The data sources include driver location updates every 5 seconds and app-based booking events from riders. The data processing involves real-time aggregation of supply and demand data for the last 30 seconds, every 2 seconds, and storing the results in a low-latency system for visualization.

    What should you do?

    A. Group the data by using a tumbling window in a Dataflow pipeline, and write the aggregated data to Memorystore.

    B. Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to Memorystore.

    C. Group the data by using a session window in a Dataflow pipeline, and write the aggregated data to BigQuery.

    D. Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to BigQuery.
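The "last 30 seconds, every 2 seconds" requirement is a hopping (sliding) window, and its assignment logic can be sketched without a pipeline: each event belongs to window_size / period overlapping windows. This mirrors what Beam's sliding windows do in Dataflow, computed here directly in plain Python.

```python
# Sketch of hopping-window assignment: a 30-second window that slides
# every 2 seconds. Each event lands in size / period = 15 windows.
def hopping_windows(event_ts: float, size: float = 30.0,
                    period: float = 2.0) -> list[float]:
    """Return the start times of every window containing event_ts."""
    # The latest window containing the event starts at the most recent
    # period boundary at or before event_ts.
    last_start = (event_ts // period) * period
    starts = []
    start = last_start
    while start > event_ts - size:
        starts.append(start)
        start -= period
    return sorted(starts)

windows = hopping_windows(100.0)
```

A tumbling window, by contrast, would emit one non-overlapping 30-second aggregate every 30 seconds, which cannot refresh every 2 seconds; and Memorystore, not BigQuery, is the low-latency serving layer the dashboards need.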

  • Question 258:

    You have 100 GB of data stored in a BigQuery table. This data is outdated and will only be accessed one or two times a year for analytics with SQL. For backup purposes, you want to store this data to be immutable for 3 years. You want to minimize storage costs. What should you do?

    A. 1. Create a BigQuery table clone. 2. Query the clone when you need to perform analytics.

    B. 1. Create a BigQuery table snapshot. 2. Restore the snapshot when you need to perform analytics.

    C. 1. Perform a BigQuery export to a Cloud Storage bucket with the Archive storage class. 2. Enable versioning on the bucket. 3. Create a BigQuery external table on the exported files.

    D. 1. Perform a BigQuery export to a Cloud Storage bucket with the Archive storage class. 2. Set a locked retention policy on the bucket. 3. Create a BigQuery external table on the exported files.

  • Question 259:

    You are developing an application on Google Cloud that will automatically generate subject labels for users' blog posts. You are under competitive pressure to add this feature quickly, and you have no additional developer resources. No one on your team has experience with machine learning. What should you do?

    A. Call the Cloud Natural Language API from your application. Process the generated Entity Analysis as labels.

    B. Call the Cloud Natural Language API from your application. Process the generated Sentiment Analysis as labels.

    C. Build and train a text classification model using TensorFlow. Deploy the model using Cloud Machine Learning Engine. Call the model from your application and process the results as labels.

    D. Build and train a text classification model using TensorFlow. Deploy the model using a Kubernetes Engine cluster. Call the model from your application and process the results as labels.
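Option A's post-processing step can be sketched as follows: the Cloud Natural Language entity-analysis response lists entities with a name, a type, and a salience score between 0 and 1, and the application keeps the salient ones as labels. The response dict below is mocked rather than fetched from the API, and the 0.1 salience cutoff is an arbitrary assumption for illustration.

```python
# Sketch: turn a (mocked) Cloud Natural Language entity-analysis
# response into subject labels. The 0.1 salience cutoff is an
# arbitrary, illustrative threshold.
def entities_to_labels(response: dict, min_salience: float = 0.1) -> list[str]:
    return [e["name"].lower()
            for e in response["entities"]
            if e["salience"] >= min_salience]

mock_response = {"entities": [
    {"name": "Kubernetes", "type": "OTHER", "salience": 0.62},
    {"name": "Google Cloud", "type": "ORGANIZATION", "salience": 0.25},
    {"name": "blog", "type": "WORK_OF_ART", "salience": 0.03},
]}
labels = entities_to_labels(mock_response)
```

This is why A fits the constraints: entity analysis surfaces the subjects of a text (sentiment analysis only scores tone), and a pre-trained API needs no ML expertise or training pipeline.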

  • Question 260:

    You have uploaded 5 years of log data to Cloud Storage. A user reported that some data points in the log data are outside of their expected ranges, which indicates errors. You need to address this issue and be able to run the process again in the future, while keeping the original data for compliance reasons. What should you do?

    A. Import the data from Cloud Storage into BigQuery. Create a new BigQuery table, and skip the rows with errors.

    B. Create a Compute Engine instance and create a new copy of the data in Cloud Storage. Skip the rows with errors.

    C. Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to a new dataset in Cloud Storage.

    D. Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to the same dataset in Cloud Storage.
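The cleansing logic in options C and D can be sketched as a plain function; in a real pipeline this would be the body of a Dataflow/Beam DoFn. The record shape, range bounds, and default value are illustrative assumptions. The key point distinguishing C from D is that cleaned records are emitted as new objects while the originals are left untouched for compliance.

```python
# Sketch of the range-check-and-default transform. In Dataflow this
# logic would live inside a DoFn; record shape and bounds are
# illustrative assumptions.
def clean_record(record: dict, lo: float, hi: float, default: float) -> dict:
    cleaned = dict(record)  # copy: never mutate the original record
    if not (lo <= cleaned["value"] <= hi):
        cleaned["value"] = default
    return cleaned

raw = [{"ts": 1, "value": 42.0}, {"ts": 2, "value": -9999.0}]
cleaned = [clean_record(r, lo=0.0, hi=100.0, default=0.0) for r in raw]
```

Writing the output to a new Cloud Storage dataset (option C) keeps the raw logs intact and makes the job safely re-runnable; writing back to the same dataset (option D) would overwrite the compliance copy.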

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more and more important and are required by more and more enterprises when hiring. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Google exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your PROFESSIONAL-DATA-ENGINEER exam preparation or Google certification application, do not hesitate to visit Vcedump.com to find your solutions.