Exam Details

  • Exam Code: PROFESSIONAL-DATA-ENGINEER
  • Exam Name: Professional Data Engineer on Google Cloud Platform
  • Certification: Google Certifications
  • Vendor: Google
  • Total Questions: 331 Q&As
  • Last Updated: Jun 05, 2025

Google / Google Certifications / PROFESSIONAL-DATA-ENGINEER Questions & Answers

  • Question 241:

    Flowlogistic is rolling out their real-time inventory tracking system. The tracking devices will all send package-tracking messages, which will now go to a single Google Cloud Pub/Sub topic instead of the Apache Kafka cluster. A subscriber application will then process the messages for real-time reporting and store them in Google BigQuery for historical analysis. You want to ensure the package data can be analyzed over time.

    Which approach should you take?

    A. Attach the timestamp on each message in the Cloud Pub/Sub subscriber application as they are received.

    B. Attach the timestamp and Package ID on the outbound message from each publisher device as they are sent to Cloud Pub/Sub.

    C. Use the NOW() function in BigQuery to record the event's time.

    D. Use the automatically generated timestamp from Cloud Pub/Sub to order the data.
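
    A rough publisher-side sketch of what option B could look like with the google-cloud-pubsub Python client; the project, topic, attribute names, and payload are illustrative assumptions, not part of the question.

    ```python
    # Hypothetical publisher running on behalf of each tracking device: stamp
    # every outbound message with its event time and package ID as Pub/Sub
    # attributes, so event-time analysis in BigQuery stays possible even when
    # messages arrive late or out of order.
    from datetime import datetime, timezone

    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "package-tracking")  # assumed names

    def publish_tracking_message(package_id: str, payload: bytes) -> None:
        future = publisher.publish(
            topic_path,
            data=payload,
            package_id=package_id,
            event_timestamp=datetime.now(timezone.utc).isoformat(),
        )
        future.result()  # block until accepted; fine for a sketch

    publish_tracking_message("PKG-12345", b'{"lat": 40.7, "lng": -74.0}')
    ```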

  • Question 242:

    Flowlogistic's CEO wants to gain rapid insight into their customer base so his sales team can be better informed in the field. This team is not very technical, so they’ve purchased a visualization tool to simplify the creation of BigQuery reports. However, they’ve been overwhelmed by all the data in the table, and are spending a lot of money on queries trying to find the data they need. You want to solve their problem in the most cost-effective way.

    What should you do?

    A. Export the data into a Google Sheet for visualization.

    B. Create an additional table with only the necessary columns.

    C. Create a view on the table to present to the visualization tool.

    D. Create identity and access management (IAM) roles on the appropriate columns, so only they appear in a query.
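
    As an illustration of option C, a view can expose only the columns the sales team needs without copying any data; the dataset, table, and column names below are assumptions.

    ```python
    from google.cloud import bigquery

    client = bigquery.Client()

    # A view stores no data; queries against it read only the referenced
    # columns of the underlying table, which keeps scan costs down.
    view = bigquery.Table("my-project.sales.customer_summary_view")  # assumed IDs
    view.view_query = """
        SELECT customer_id, customer_name, region, lifetime_value
        FROM `my-project.sales.customers`
    """
    client.create_table(view)
    ```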

  • Question 243:

    Flowlogistic wants to use Google BigQuery as their primary analysis system, but they still have Apache Hadoop and Spark workloads that they cannot move to BigQuery. Flowlogistic does not know how to store the data that is common to both workloads.

    What should they do?

    A. Store the common data in BigQuery as partitioned tables.

    B. Store the common data in BigQuery and expose authorized views.

    C. Store the common data encoded as Avro in Google Cloud Storage.

    D. Store the common data in HDFS storage on a Google Cloud Dataproc cluster.
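
    To illustrate option C, Avro files kept in Cloud Storage can be queried in place from BigQuery through an external table and read directly by Spark on Dataproc; the bucket, dataset, and table names are assumptions.

    ```python
    from google.cloud import bigquery

    client = bigquery.Client()

    # External (federated) table over Avro files in a shared Cloud Storage bucket.
    config = bigquery.ExternalConfig("AVRO")
    config.source_uris = ["gs://flowlogistic-shared/common/*.avro"]  # assumed URI

    table = bigquery.Table("my-project.analytics.common_data_ext")
    table.external_data_configuration = config
    client.create_table(table)

    # The same gs:// path remains readable from Spark on Dataproc, e.g.:
    #   spark.read.format("avro").load("gs://flowlogistic-shared/common/")
    ```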

  • Question 244:

    Flowlogistic's management has determined that the current Apache Kafka servers cannot handle the data volume for their real-time inventory tracking system. You need to build a new system on Google Cloud Platform (GCP) that will feed the proprietary tracking software. The system must be able to ingest data from a variety of global sources, process and query in real-time, and store the data reliably.

    Which combination of GCP products should you choose?

    A. Cloud Pub/Sub, Cloud Dataflow, and Cloud Storage

    B. Cloud Pub/Sub, Cloud Dataflow, and Local SSD

    C. Cloud Pub/Sub, Cloud SQL, and Cloud Storage

    D. Cloud Load Balancing, Cloud Dataflow, and Cloud Storage

  • Question 245:

    The CUSTOM tier for Cloud Machine Learning Engine allows you to specify the number of which types of cluster nodes?

    A. Workers

    B. Masters, workers, and parameter servers

    C. Workers and parameter servers

    D. Parameter servers
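
    For context, here is a hedged sketch of a CUSTOM-tier training configuration for the legacy Cloud ML Engine / AI Platform Training API; the machine types, counts, and bucket paths are placeholder values.

    ```python
    # Illustrative CUSTOM-tier trainingInput for a v1 projects.jobs.create request.
    # There is always exactly one master; the worker and parameter-server counts
    # are the quantities you choose.
    training_input = {
        "scaleTier": "CUSTOM",
        "masterType": "n1-highmem-8",
        "workerType": "n1-standard-8",
        "workerCount": 4,
        "parameterServerType": "n1-standard-4",
        "parameterServerCount": 2,
        "packageUris": ["gs://my-bucket/trainer-0.1.tar.gz"],
        "pythonModule": "trainer.task",
        "region": "us-central1",
    }
    ```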

  • Question 246:

    When you design a Google Cloud Bigtable schema, it is recommended that you _________.

    A. Avoid schema designs that are based on NoSQL concepts

    B. Create schema designs that are based on a relational database design

    C. Avoid schema designs that require atomicity across rows

    D. Create schema designs that require atomicity across rows
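
    As background, Bigtable commits mutations atomically only within a single row, so schemas are usually designed so that data which must change together lives in one row; a rough google-cloud-bigtable sketch with assumed instance, table, column-family, and row-key layout.

    ```python
    from google.cloud import bigtable

    client = bigtable.Client(project="my-project", admin=False)  # assumed project
    table = client.instance("tracking-instance").table("packages")

    # All set_cell mutations below target the SAME row, so they commit
    # atomically. Updates spanning multiple rows carry no such guarantee.
    row = table.direct_row("package#PKG-12345")
    row.set_cell("status", "state", b"IN_TRANSIT")
    row.set_cell("status", "last_seen", b"2025-06-05T12:00:00Z")
    row.commit()
    ```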

  • Question 247:

    Which of the following is NOT a valid use case to select HDD (hard disk drives) as the storage for Google Cloud Bigtable?

    A. You expect to store at least 10 TB of data.

    B. You will mostly run batch workloads with scans and writes, rather than frequently executing random reads of a small number of rows.

    C. You need to integrate with Google BigQuery.

    D. You will not use the data to back a user-facing or latency-sensitive application.
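
    For reference, SSD versus HDD is a per-cluster choice made when the Bigtable instance is created; a hedged admin-client sketch with assumed IDs, zone, and node count.

    ```python
    from google.cloud import bigtable
    from google.cloud.bigtable import enums

    client = bigtable.Client(project="my-project", admin=True)  # assumed project

    # HDD storage is selected at cluster-creation time; whether it is appropriate
    # depends on the workload (large, scan/write-heavy, not latency-sensitive),
    # not on any integration with BigQuery.
    instance = client.instance("archive-instance",
                               instance_type=enums.Instance.Type.PRODUCTION)
    cluster = instance.cluster(
        "archive-cluster-1",
        location_id="us-east1-b",
        serve_nodes=3,
        default_storage_type=enums.StorageType.HDD,
    )
    instance.create(clusters=[cluster])
    ```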

  • Question 248:

    You currently have a single on-premises Kafka cluster in a data center in the us-east region that is responsible for ingesting messages from IoT devices globally. Because large parts of the globe have poor internet connectivity, messages sometimes batch at the edge, come in all at once, and cause a spike in load on your Kafka cluster. This is becoming difficult to manage and prohibitively expensive. What is the Google-recommended cloud-native architecture for this scenario?

    A. Edge TPUs as sensor devices for storing and transmitting the messages.

    B. Cloud Dataflow connected to the Kafka cluster to scale the processing of incoming messages.

    C. An IoT gateway connected to Cloud Pub/Sub, with Cloud Dataflow to read and process the messages from Cloud Pub/Sub.

    D. A Kafka cluster virtualized on Compute Engine in us-east with Cloud Load Balancing to connect to the devices around the world.
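
    To sketch the Pub/Sub-plus-Dataflow pattern in option C: Pub/Sub absorbs bursty, globally distributed traffic, and a Dataflow (Apache Beam) pipeline reads and processes the backlog. The topic, output table, and parsing logic below are assumptions.

    ```python
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.transforms.window import FixedWindows

    # Minimal streaming pipeline; in practice run with
    # --runner=DataflowRunner --streaming plus project/region flags.
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/iot-messages")
            | "Parse" >> beam.Map(lambda raw: json.loads(raw.decode("utf-8")))
            | "Window" >> beam.WindowInto(FixedWindows(60))
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:iot.device_events",  # table assumed to exist
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )
    ```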

  • Question 249:

    You are on the data governance team and are implementing security requirements to deploy resources. You need to ensure that resources are limited to only the europe-west3 region. You want to follow Google-recommended practices. What should you do?

    A. Deploy resources with Terraform and implement a variable validation rule to ensure that the region is set to the europe-west3 region for all resources.

    B. Set the constraints/gcp.resourceLocations organization policy constraint to in:eu-locations.

    C. Create a Cloud Function to monitor all resources created and automatically destroy the ones created outside the europe-west3 region.

    D. Set the constraints/gcp.resourceLocations organization policy constraint to in:europe-west3-locations.
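
    For context, resource locations are normally restricted with the constraints/gcp.resourceLocations organization policy; a rough sketch that writes a legacy-format policy file which could then be applied with gcloud (the organization ID and file name are placeholders).

    ```python
    import yaml  # third-party PyYAML, assumed available

    # Legacy-format list policy restricting where resources may be created.
    policy = {
        "constraint": "constraints/gcp.resourceLocations",
        "listPolicy": {"allowedValues": ["in:europe-west3-locations"]},
    }

    with open("location_policy.yaml", "w") as f:
        yaml.safe_dump(policy, f)

    # Applied, for example, with:
    #   gcloud resource-manager org-policies set-policy location_policy.yaml \
    #       --organization=123456789012
    ```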

  • Question 250:

    You work for an airline, and you need to store weather data in a BigQuery table. The weather data will be used as input to a machine learning model. The model only uses the last 30 days of weather data. You want to avoid storing unnecessary data and minimize costs.

    What should you do?

    A. Create a BigQuery table where each record has an ingestion timestamp. Run a scheduled query to delete all the rows with an ingestion timestamp older than 30 days.

    B. Create a BigQuery table partitioned by ingestion time. Set up partition expiration to 30 days.

    C. Create a BigQuery table partitioned by the datetime value of the weather date. Set up partition expiration to 30 days.

    D. Create a BigQuery table with a datetime column for the day the weather data refers to. Run a scheduled query to delete rows with a datetime value older than 30 days.
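
    For illustration, option C can be expressed with the BigQuery client library by partitioning on the weather date and setting a 30-day partition expiration; the project, dataset, and schema are assumptions.

    ```python
    from google.cloud import bigquery

    client = bigquery.Client()

    # Partition on the date the weather refers to (not ingestion time) and let
    # BigQuery drop partitions older than 30 days automatically.
    table = bigquery.Table(
        "my-project.aviation.weather",  # assumed IDs
        schema=[
            bigquery.SchemaField("weather_date", "DATE"),
            bigquery.SchemaField("station_id", "STRING"),
            bigquery.SchemaField("temperature_c", "FLOAT"),
        ],
    )
    table.time_partitioning = bigquery.TimePartitioning(
        type_=bigquery.TimePartitioningType.DAY,
        field="weather_date",
        expiration_ms=30 * 24 * 60 * 60 * 1000,  # 30 days
    )
    client.create_table(table)
    ```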

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more and more important, and more and more enterprises require them when you apply for a job. But how do you prepare for the exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Google exam questions, answers and explanations but also complete assistance with your exam preparation and certification application. If you are confused about your PROFESSIONAL-DATA-ENGINEER exam preparation or your Google certification application, do not hesitate to visit Vcedump.com to find your solutions.