Exam Details

  • Exam Code: PROFESSIONAL-DATA-ENGINEER
  • Exam Name: Professional Data Engineer on Google Cloud Platform
  • Certification: Google Certifications
  • Vendor: Google
  • Total Questions: 331 Q&As
  • Last Updated: May 19, 2025

Google Google Certifications PROFESSIONAL-DATA-ENGINEER Questions & Answers

  • Question 261:

    You are testing a Dataflow pipeline to ingest and transform text files. The files are gzip-compressed, errors are written to a dead-letter queue, and you are using SideInputs to join data. You noticed that the pipeline is taking longer to complete than expected. What should you do to expedite the Dataflow job?

    A. Switch to compressed Avro files

    B. Reduce the batch size

    C. Retry records that throw an error

    D. Use CoGroupByKey instead of the SideInput
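    Below is a minimal Apache Beam (Python SDK) sketch of the CoGroupByKey approach named in option D of Question 261. The bucket paths, key extraction, and field names are illustrative assumptions, not part of the original question.

```python
# Hypothetical sketch: replacing a large side-input join with CoGroupByKey
# in an Apache Beam (Python SDK) pipeline. Paths and keys are illustrative.
import apache_beam as beam

def merge_joined(element):
    key, grouped = element
    # 'records' and 'lookup' are the tags used in the CoGroupByKey below.
    for record in grouped['records']:
        for lookup_row in grouped['lookup']:
            yield {'key': key, **record, **lookup_row}

with beam.Pipeline() as pipeline:
    records = (pipeline
               | 'ReadRecords' >> beam.io.ReadFromText('gs://my-bucket/input/*.gz')
               | 'KeyRecords' >> beam.Map(lambda line: (line.split(',')[0], {'line': line})))
    lookup = (pipeline
              | 'ReadLookup' >> beam.io.ReadFromText('gs://my-bucket/lookup/*.csv')
              | 'KeyLookup' >> beam.Map(lambda line: (line.split(',')[0], {'meta': line})))

    joined = ({'records': records, 'lookup': lookup}
              | 'JoinByKey' >> beam.CoGroupByKey()
              | 'Merge' >> beam.FlatMap(merge_joined))
```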

  • Question 262:

    You have a BigQuery table that contains customer data, including sensitive information such as names and addresses. You need to share the customer data with your data analytics and consumer support teams securely. The data analytics team needs to access the data of all the customers, but must not be able to access the sensitive data. The consumer support team needs access to all data columns, but must not be able to access customers that no longer have active contracts. You enforced these requirements by using an authorized dataset and policy tags. After implementing these steps, the data analytics team reports that they still have access to the sensitive columns. You need to ensure that the data analytics team does not have access to restricted data. What should you do?

    Choose 2 answers

    A. Create two separate authorized datasets; one for the data analytics team and another for the consumer support team.

    B. Ensure that the data analytics team members do not have the Data Catalog Fine-Grained Reader role for the policy tags.

    C. Enforce access control in the policy tag taxonomy.

    D. Remove the bigquery.dataViewer role from the data analytics team on the authorized datasets.

    E. Replace the authorized dataset with an authorized view. Use row-level security and apply a filter_expression to limit data access.
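    For option E of Question 262, here is a hedged sketch of BigQuery row-level security with a filter expression, issued through the Python client. The project, dataset, table, column, and group names are all hypothetical.

```python
# Hypothetical sketch: a BigQuery row access policy that limits the consumer
# support group to customers with active contracts. Names are illustrative.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

ddl = """
CREATE OR REPLACE ROW ACCESS POLICY active_contracts_only
ON `my-project.customer_data.customers`
GRANT TO ("group:consumer-support@example.com")
FILTER USING (contract_status = 'ACTIVE');
"""

client.query(ddl).result()  # runs the DDL; no rows are returned
```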

  • Question 263:

    You are running a Dataflow streaming pipeline with Streaming Engine and Horizontal Autoscaling enabled. You have set the maximum number of workers to 1000. The input of your pipeline is Pub/Sub messages with notifications from Cloud Storage. One of the pipeline transforms reads CSV files and emits an element for every CSV line. The job performance is low: the pipeline is using only 10 workers, and you notice that the autoscaler is not spinning up additional workers. What should you do to improve performance?

    A. Use Dataflow Prime, and enable Right Fitting to increase the worker resources.

    B. Update the job to increase the maximum number of workers.

    C. Enable Vertical Autoscaling to let the pipeline use larger workers.

    D. Change the pipeline code, and introduce a Reshuffle step to prevent fusion.
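    Below is a rough Apache Beam (Python SDK) sketch of option D in Question 263: inserting a Reshuffle step after the transform that fans out CSV lines so that the fan-out is not fused with downstream steps. The subscription name and the assumption that each Pub/Sub message carries a Cloud Storage file path are illustrative.

```python
# Hypothetical sketch: breaking fusion with a Reshuffle step after a
# high-fan-out transform. Resource names and message format are assumed.
import apache_beam as beam
from apache_beam.io.filesystems import FileSystems

def read_csv_lines(file_path):
    # Emit one element per CSV line in the notified file.
    with FileSystems.open(file_path) as handle:
        for line in handle:
            yield line.decode('utf-8').rstrip('\n')

with beam.Pipeline() as pipeline:  # streaming options omitted for brevity
    lines = (pipeline
             | 'ReadNotifications' >> beam.io.ReadFromPubSub(
                 subscription='projects/my-project/subscriptions/gcs-notifications')
             | 'ToFilePath' >> beam.Map(lambda msg: msg.decode('utf-8'))  # assumed gs:// path payload
             | 'EmitCsvLines' >> beam.FlatMap(read_csv_lines)
             # Reshuffle redistributes the fanned-out elements across workers,
             # breaking fusion between the file read and downstream transforms.
             | 'BreakFusion' >> beam.Reshuffle())
```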

  • Question 264:

    You are collecting IoT sensor data from millions of devices across the world and storing the data in BigQuery. Your access pattern is based on recent data filtered by location_id and device_version with the following query:

    You want to optimize your queries for cost and performance. How should you structure your data?

    A. Partition table data by create_date, location_id and device_version

    B. Partition table data by create_date; cluster table data by location_id and device_version

    C. Cluster table data by create_date, location_id, and device_version

    D. Cluster table data by create_date; partition by location_id and device_version
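    The following is a hedged sketch of the layout described in option B of Question 264, expressed as BigQuery DDL issued through the Python client. The project, dataset, table name, and schema are assumptions for illustration.

```python
# Hypothetical sketch: a table partitioned by create_date and clustered by
# location_id and device_version. Names and schema are illustrative.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

ddl = """
CREATE TABLE `my-project.iot.sensor_data`
(
  create_date DATE,
  location_id STRING,
  device_version STRING,
  reading FLOAT64
)
PARTITION BY create_date
CLUSTER BY location_id, device_version;
"""

client.query(ddl).result()
```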

  • Question 265:

    You are designing storage for two relational tables that are part of a 10-TB database on Google Cloud. You want to support transactions that scale horizontally. You also want to optimize data for range queries on nonkey columns. What should you do?

    A. Use Cloud SQL for storage. Add secondary indexes to support query patterns.

    B. Use Cloud SQL for storage. Use Cloud Dataflow to transform data to support query patterns.

    C. Use Cloud Spanner for storage. Add secondary indexes to support query patterns.

    D. Use Cloud Spanner for storage. Use Cloud Dataflow to transform data to support query patterns.
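    Below is a brief sketch of option C in Question 265: adding a Cloud Spanner secondary index to support range queries on a non-key column, using the Python client. The instance, database, table, and column names are hypothetical.

```python
# Hypothetical sketch: a Cloud Spanner secondary index on a non-key column.
# Instance, database, table, and column names are illustrative.
from google.cloud import spanner

client = spanner.Client(project="my-project")
instance = client.instance("my-instance")
database = instance.database("orders-db")

# Secondary index to optimize range queries on the non-key column order_date.
operation = database.update_ddl([
    "CREATE INDEX OrdersByOrderDate ON Orders(order_date)"
])
operation.result()  # waits for the schema change to complete
```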

  • Question 266:

    An online retailer has built their current application on Google App Engine. A new initiative at the company mandates that they extend their application to allow their customers to transact directly via the application.

    They need to manage their shopping transactions and analyze combined data from multiple datasets using a business intelligence (BI) tool. They want to use only a single database for this purpose. Which Google Cloud database should they choose?

    A. BigQuery

    B. Cloud SQL

    C. Cloud Bigtable

    D. Cloud Datastore

  • Question 267:

    You are building a streaming Dataflow pipeline that ingests noise level data from hundreds of sensors placed near construction sites across a city. The sensors measure noise level every ten seconds, and send that data to the pipeline when levels reach above 70 dBA. You need to detect the average noise level from a sensor when data is received for a duration of more than 30 minutes, but the window ends when no data has been received for 15 minutes. What should you do?

    A. Use session windows with a 30-minute gap duration.

    B. Use tumbling windows with a 15-minute window and a fifteen-minute .withAllowedLateness operator.

    C. Use session windows with a 15-minute gap duration.

    D. Use hopping windows with a 15-minute window, and a thirty-minute period.
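    Below is a minimal Apache Beam (Python SDK) sketch of option C in Question 267, applying session windows with a 15-minute gap and averaging readings per sensor. The topic name and message format are assumed.

```python
# Hypothetical sketch: session windows with a 15-minute gap, averaging noise
# readings per sensor. Topic name and payload format are illustrative.
import apache_beam as beam
from apache_beam import window

with beam.Pipeline() as pipeline:  # streaming options omitted for brevity
    averages = (pipeline
                | 'ReadSensorData' >> beam.io.ReadFromPubSub(
                    topic='projects/my-project/topics/noise-levels')
                | 'Parse' >> beam.Map(lambda msg: msg.decode('utf-8').split(','))  # assumed "sensor_id,dba"
                | 'KeyBySensor' >> beam.Map(lambda fields: (fields[0], float(fields[1])))
                # A session closes after 15 minutes without data from a sensor.
                | 'SessionWindow' >> beam.WindowInto(window.Sessions(15 * 60))
                | 'AveragePerSensor' >> beam.combiners.Mean.PerKey())
```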

  • Question 268:

    You have a query that filters a BigQuery table using a WHERE clause on timestamp and ID columns. By using bq query --dry_run, you learn that the query triggers a full scan of the table, even though the filter on timestamp and ID selects a tiny fraction of the overall data. You want to reduce the amount of data scanned by BigQuery with minimal changes to existing SQL queries. What should you do?

    A. Create a separate table for each ID.

    B. Use the LIMIT keyword to reduce the number of rows returned.

    C. Recreate the table with a partitioning column and clustering column.

    D. Use the bq query --maximum_bytes_billed flag to restrict the number of bytes billed.
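    The sketch below shows how the effect of option C in Question 268 could be verified with a dry run through the BigQuery Python client; the table, column names, and literal values are hypothetical.

```python
# Hypothetical sketch: checking bytes scanned with a dry run after the table
# has been recreated with partitioning and clustering. Names are illustrative.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
query = """
SELECT id, event_ts
FROM `my-project.analytics.events`
WHERE event_ts BETWEEN '2024-01-01' AND '2024-01-02'
  AND id = 'abc123'
"""

job = client.query(query, job_config=job_config)
print(f"This query would process {job.total_bytes_processed} bytes.")
```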

  • Question 269:

    You have an Oracle database deployed in a VM as part of a Virtual Private Cloud (VPC) network. You want to replicate and continuously synchronize 50 tables to BigQuery. You want to minimize the need to manage infrastructure. What should you do?

    A. Create a Datastream service from Oracle to BigQuery, use a private connectivity configuration to the same VPC network, and a connection profile to BigQuery.

    B. Create a Pub/Sub subscription to write to BigQuery directly. Deploy the Debezium Oracle connector to capture changes in the Oracle database, and sink them to the Pub/Sub topic.

    C. Deploy Apache Kafka in the same VPC network, use Kafka Connect Oracle Change Data Capture (CDC), and Dataflow to stream the Kafka topic to BigQuery.

    D. Deploy Apache Kafka in the same VPC network, use Kafka Connect Oracle Change Data Capture (CDC), and the Kafka Connect Google BigQuery Sink Connector.

  • Question 270:

    You are configuring networking for a Dataflow job. The data pipeline uses custom container images with the libraries that are required for the transformation logic preinstalled. The data pipeline reads the data from Cloud Storage and writes the data to BigQuery. You need to ensure cost-effective and secure communication between the pipeline and Google APIs and services. What should you do?

    A. Leave external IP addresses assigned to worker VMs while enforcing firewall rules.

    B. Disable external IP addresses and establish a Private Service Connect endpoint IP address.

    C. Disable external IP addresses from worker VMs and enable Private Google Access.

    D. Enable Cloud NAT to provide outbound internet connectivity while enforcing firewall rules.
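    Below is a hedged sketch of option C in Question 270: launching the Dataflow job with external IP addresses disabled, on a subnetwork where Private Google Access is assumed to be enabled. All resource names, the region, the container image path, and the output table are illustrative.

```python
# Hypothetical sketch: Dataflow pipeline options with external IPs disabled.
# Resource names, region, and image path are illustrative.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=my-project',
    '--region=us-central1',
    '--temp_location=gs://my-bucket/temp',
    '--sdk_container_image=us-central1-docker.pkg.dev/my-project/repo/pipeline:latest',
    '--no_use_public_ips',  # workers get no external IP addresses
    '--subnetwork=regions/us-central1/subnetworks/dataflow-subnet',
])

with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | 'ReadInput' >> beam.io.ReadFromText('gs://my-bucket/input/*.csv')
     | 'ToRow' >> beam.Map(lambda line: {'raw_line': line})  # assumed single-column schema
     | 'WriteToBigQuery' >> beam.io.WriteToBigQuery(
         'my-project:analytics.output_table'))  # table assumed to already exist
```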

Tips on How to Prepare for the Exams

Nowadays, certification exams have become more and more important and are required by more and more enterprises when hiring. But how do you prepare for the exam effectively? How do you prepare for the exam in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Google exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are confused about your PROFESSIONAL-DATA-ENGINEER exam preparation and Google certification application, do not hesitate to visit Vcedump.com to find your solutions.