Exam Details

  • Exam Code: PROFESSIONAL-DATA-ENGINEER
  • Exam Name: Professional Data Engineer on Google Cloud Platform
  • Certification: Google Certifications
  • Vendor: Google
  • Total Questions: 331 Q&As
  • Last Updated: May 08, 2024

Google PROFESSIONAL-DATA-ENGINEER (Google Certifications) Questions & Answers

  • Question 21:

    You've migrated a Hadoop job from an on-prem cluster to Dataproc and GCS. Your Spark job is a complicated analytical workload with many shuffle operations, and the initial data are Parquet files (200-400 MB each, on average).

    You see some performance degradation after the migration to Dataproc, so you'd like to optimize for it. Keep in mind that your organization is very cost-sensitive, so you'd like to continue running Dataproc on preemptible VMs (with only 2 non-preemptible workers) for this workload.

    What should you do?

    A. Increase the size of your Parquet files so that each is at least 1 GB.

    B. Switch to the TFRecord format (approx. 200 MB per file) instead of Parquet files.

    C. Switch from HDDs to SSDs, copy initial data from GCS to HDFS, run the Spark job and copy results back to GCS.

    D. Switch from HDDs to SSDs, override the preemptible VMs configuration to increase the boot disk size.
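
    For reference, a minimal sketch (not an answer key) of how the preemptible secondary-worker disk configuration could be overridden when creating a Dataproc cluster with the Python client, in the spirit of option D. The project, region, cluster name, machine types, and disk sizes are placeholder assumptions.

    ```python
    # Hedged sketch: create a Dataproc cluster whose preemptible secondary
    # workers get larger pd-ssd boot disks, so Spark shuffle spills land on
    # faster storage. All names and sizes are illustrative assumptions.
    from google.cloud import dataproc_v1

    project_id = "my-project"   # placeholder
    region = "us-central1"      # placeholder

    client = dataproc_v1.ClusterControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )

    cluster = {
        "project_id": project_id,
        "cluster_name": "spark-shuffle-cluster",
        "config": {
            "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-8"},
            # Two non-preemptible primary workers, as described in the question.
            "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-8"},
            # Secondary workers are preemptible by default; override their disks.
            "secondary_worker_config": {
                "num_instances": 10,
                "disk_config": {"boot_disk_type": "pd-ssd", "boot_disk_size_gb": 500},
            },
        },
    }

    operation = client.create_cluster(
        request={"project_id": project_id, "region": region, "cluster": cluster}
    )
    print("Created:", operation.result().cluster_name)
    ```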

  • Question 22:

    A TensorFlow machine learning model on Compute Engine virtual machines (n2-standard-32) takes two days to complete training.

    The model has custom TensorFlow operations that must run partially on a CPU.

    You want to reduce the training time in a cost-effective manner. What should you do?

    A. Change the VM type to n2-highmem-32

    B. Change the VM type to e2-standard-32

    C. Train the model using a VM with a GPU hardware accelerator

    D. Train the model using a VM with a TPU hardware accelerator
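
    A hedged illustration of the pattern behind option C: attach a GPU to the training VM and let TensorFlow keep CPU-only custom ops on the CPU while the rest of the model runs on the GPU. The layer shapes and the stand-in CPU op below are assumptions, not part of the question.

    ```python
    # Hedged sketch: mixed CPU/GPU placement in TensorFlow. Dense layers run on
    # the GPU when one is attached; ops with no GPU kernel (stand-in here:
    # tf.math.log1p) are pinned to the CPU. Shapes are illustrative only.
    import tensorflow as tf

    print("GPUs visible:", tf.config.list_physical_devices("GPU"))

    # Fall back to CPU automatically for ops without a GPU kernel.
    tf.config.set_soft_device_placement(True)

    inputs = tf.keras.Input(shape=(128,))
    x = tf.keras.layers.Dense(256, activation="relu")(inputs)  # GPU if available

    with tf.device("/CPU:0"):
        # Placeholder for a custom op that only has a CPU implementation.
        x = tf.math.log1p(x)

    outputs = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    ```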

  • Question 23:

    An aerospace company uses a proprietary data format to store its flight data. You need to connect this new data source to BigQuery and stream the data into BigQuery. You want to efficiently import the data into BigQuery while consuming as few resources as possible.

    What should you do?

    A. Use a standard Dataflow pipeline to store the raw data in BigQuery, and then transform the format later when the data is used

    B. Write a shell script that triggers a Cloud Function that performs periodic ETL batch jobs on the new data source

    C. Use Apache Hive to write a Dataproc job that streams the data into BigQuery in CSV format

    D. Use an Apache Beam custom connector to write a Dataflow pipeline that streams the data into BigQuery in Avro format
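
    As a rough, hypothetical sketch of option D's idea: a Dataflow (Apache Beam) pipeline that decodes a proprietary record format in a custom step and streams rows into BigQuery. The Pub/Sub subscription, the parse_flight_record() decoder, and the table name are assumptions introduced for illustration only.

    ```python
    # Hedged sketch of a streaming Beam pipeline: custom parsing of a
    # proprietary format, then streaming inserts into BigQuery.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def parse_flight_record(raw_bytes):
        # Replace with the real decoder for the proprietary format (hypothetical).
        return {"tail_number": raw_bytes[:6].decode("utf-8", "ignore"),
                "payload": raw_bytes.hex()}

    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadRaw" >> beam.io.ReadFromPubSub(
                subscription="projects/my-project/subscriptions/flight-data")
            | "Parse" >> beam.Map(parse_flight_record)
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "my-project:aerospace.flight_data",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                method=beam.io.WriteToBigQuery.Method.STREAMING_INSERTS,
            )
        )
    ```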

  • Question 24:

    An online brokerage company requires a high-volume trade-processing architecture. You need to create a secure queuing system that triggers jobs. The jobs will run in Google Cloud and call the company's Python API to execute trades. You need to implement the solution efficiently. What should you do?

    A. Use Cloud Composer to subscribe to a Pub/Sub topic and call the Python API.

    B. Use a Pub/Sub push subscription to trigger a Cloud Function to pass the data to the Python API.

    C. Write an application that makes a queue in a NoSQL database

    D. Write an application hosted on a Compute Engine instance that makes a push subscription to the Pub/Sub topic
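
    A hedged sketch of the pattern in option B: a Pub/Sub-triggered Cloud Function (1st-gen background-function signature) that decodes the queued message and hands it to a trade API. The handle_trade() name and TRADE_API_URL endpoint are hypothetical.

    ```python
    # Hedged sketch: Pub/Sub-triggered Cloud Function that forwards an order to
    # a (hypothetical) trading API endpoint.
    import base64
    import json
    import requests

    TRADE_API_URL = "https://internal.example.com/trades"  # placeholder

    def handle_trade(event, context):
        """Triggered by a message published to the trades Pub/Sub topic."""
        payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
        # Hand the decoded order to the company's Python trade API (stand-in call).
        response = requests.post(TRADE_API_URL, json=payload, timeout=10)
        response.raise_for_status()
    ```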

  • Question 25:

    You want to optimize your queries for cost and performance. How should you structure your data?

    A. Partition table data by create_date, location_id and device_version

    B. Partition table data by create_date; cluster table data by location_id and device_version

    C. Cluster table data by create_date, location_id, and device_version

    D. Cluster table data by create_date; partition by location_id and device_version
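
    For context, a minimal sketch of the partition-then-cluster layout described in option B, using the BigQuery Python client. The project, dataset, table name, and schema are placeholders.

    ```python
    # Hedged sketch: create a table partitioned on create_date and clustered on
    # location_id and device_version. Names and schema are illustrative.
    from google.cloud import bigquery

    client = bigquery.Client()
    table_id = "my-project.telemetry.device_events"   # hypothetical

    schema = [
        bigquery.SchemaField("create_date", "DATE"),
        bigquery.SchemaField("location_id", "STRING"),
        bigquery.SchemaField("device_version", "STRING"),
        bigquery.SchemaField("reading", "FLOAT"),
    ]

    table = bigquery.Table(table_id, schema=schema)
    # Partition on the date column, then cluster within each partition.
    table.time_partitioning = bigquery.TimePartitioning(field="create_date")
    table.clustering_fields = ["location_id", "device_version"]

    client.create_table(table)
    ```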

  • Question 26:

    You need ads data to serve AI models and historical data for analytics. Longtail and outlier data points need to be identified.

    You want to cleanse the data in near-real time before running it through the AI models.

    What should you do?

    A. Use BigQuery to ingest, prepare, and then analyze the data, and then run queries to create views

    B. Use Cloud Storage as a data warehouse, shell scripts for processing, and BigQuery to create views for the desired datasets

    C. Use Dataflow to identify longtail and outlier data points programmatically, with BigQuery as a sink

    D. Use Cloud Composer to identify longtail and outlier data points, and then output a usable dataset to BigQuery
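
    A hedged illustration of option C's idea: a streaming Beam/Dataflow job that splits incoming ad events into typical and longtail/outlier branches before writing to BigQuery. The outlier rule, topic, and table names are made-up assumptions.

    ```python
    # Hedged sketch: tag/split outliers in a streaming Dataflow (Beam) pipeline
    # with BigQuery as the sink. Threshold and names are illustrative only.
    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def is_outlier(event):
        # Hypothetical rule; in practice this could be a z-score or quantile test.
        return event.get("spend", 0.0) > 1_000.0

    with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
        events = (
            p
            | "Read" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/ad-events")
            | "Decode" >> beam.Map(lambda b: json.loads(b.decode("utf-8")))
        )

        typical, outliers = events | "Split" >> beam.Partition(
            lambda e, _num: 1 if is_outlier(e) else 0, 2
        )

        typical | "WriteClean" >> beam.io.WriteToBigQuery(
            "my-project:ads.model_input",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
        outliers | "WriteOutliers" >> beam.io.WriteToBigQuery(
            "my-project:ads.outliers",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    ```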

  • Question 27:

    The Development and External teams have the Project Viewer Identity and Access Management (IAM) role in a folder named Visualization. You want the Development Team to be able to read data from both Cloud Storage and BigQuery, but the External Team should only be able to read data from BigQuery. What should you do?

    A. Remove Cloud Storage IAM permissions to the External Team on the acme-raw-data project

    B. Create Virtual Private Cloud (VPC) firewall rules on the acme-raw-data project that deny all ingress traffic from the External Team's CIDR range

    C. Create a VPC Service Controls perimeter containing both projects, with BigQuery as a restricted API. Add the External Team users to the perimeter's Access Level.

    D. Create a VPC Service Controls perimeter containing both projects, with Cloud Storage as a restricted API. Add the Development Team users to the perimeter's Access Level.
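
    A hedged sketch in the direction of option A: removing a group's Cloud Storage role binding on a bucket so the group retains only its BigQuery read access. The bucket name, group address, and role shown are placeholder assumptions.

    ```python
    # Hedged sketch: drop the external team's storage.objectViewer binding on a
    # bucket in the acme-raw-data project. Names are illustrative only.
    from google.cloud import storage

    client = storage.Client(project="acme-raw-data")
    bucket = client.get_bucket("acme-raw-data-landing")   # hypothetical bucket

    policy = bucket.get_iam_policy(requested_policy_version=3)
    member = "group:external-team@example.com"            # hypothetical group

    # Keep every binding except the external team's object-viewer grant.
    policy.bindings = [
        b for b in policy.bindings
        if not (b["role"] == "roles/storage.objectViewer"
                and member in b.get("members", set()))
    ]
    bucket.set_iam_policy(policy)
    ```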

  • Question 28:

    You are using BigQuery and Data Studio to design a customer-facing dashboard that displays large quantities of aggregated data. You expect a high volume of concurrent users. You need to optimize the dashboard to provide quick visualizations with minimal latency. What should you do?

    A. Use BigQuery BI Engine with materialized views

    B. Use BigQuery BI Engine with streaming data.

    C. Use BigQuery BI Engine with authorized views

    D. Use BigQuery BI Engine with logical views
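
    A minimal sketch of one ingredient of option A: creating a materialized view that pre-aggregates the dashboard data (BI Engine acceleration itself is configured separately as a reservation on the project). The project, dataset, table, and column names are placeholders.

    ```python
    # Hedged sketch: create a materialized view with pre-aggregated dashboard
    # data via a BigQuery DDL statement. All identifiers are illustrative.
    from google.cloud import bigquery

    client = bigquery.Client()

    ddl = """
    CREATE MATERIALIZED VIEW `my-project.dashboards.daily_sales_mv` AS
    SELECT
      order_date,
      region,
      SUM(amount) AS total_amount,
      COUNT(*)    AS order_count
    FROM `my-project.sales.orders`
    GROUP BY order_date, region
    """

    client.query(ddl).result()   # run the DDL and wait for completion
    ```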

  • Question 29:

    You're training a model to predict housing prices based on an available dataset of real estate properties. Your plan is to train a fully connected neural net, and you've discovered that the dataset contains the latitude and longitude of each property. Real estate professionals have told you that the location of a property is highly influential on its price, so you'd like to engineer a feature that incorporates this physical dependency.

    What should you do?

    A. Provide latitude and longitude as input vectors to your neural net.

    B. Create a numeric column from a feature cross of latitude and longitude.

    C. Create a feature cross of latitude and longitude, bucketize it at the minute level, and use L1 regularization during optimization.

    D. Create a feature cross of latitude and longitude, bucketize it at the minute level, and use L2 regularization during optimization.
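
    A hedged sketch of the bucketize-then-cross idea referenced in options C and D, using the classic tf.feature_column API. The bucket boundaries and hash bucket size are illustrative assumptions, not the "minute level" granularity from the question.

    ```python
    # Hedged sketch: bucketize latitude and longitude, then cross the buckets so
    # the model can learn a per-cell location effect on price.
    import tensorflow as tf

    lat = tf.feature_column.numeric_column("latitude")
    lon = tf.feature_column.numeric_column("longitude")

    # Bucketize each coordinate so nearby properties fall into the same bin.
    lat_buckets = tf.feature_column.bucketized_column(
        lat, boundaries=[round(25.0 + i * 0.25, 2) for i in range(100)])
    lon_buckets = tf.feature_column.bucketized_column(
        lon, boundaries=[round(-125.0 + i * 0.25, 2) for i in range(100)])

    # Cross the bucketized columns and one-hot encode the result for the model.
    lat_lon_cross = tf.feature_column.crossed_column(
        [lat_buckets, lon_buckets], hash_bucket_size=10_000)
    cross_one_hot = tf.feature_column.indicator_column(lat_lon_cross)
    ```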

  • Question 30:

    You are building a data pipeline on Google Cloud. You need to prepare data using a casual method for a machine-learning process. You want to support a logistic regression model. You also need to monitor and adjust for null values, which must remain real-valued and cannot be removed.

    What should you do?

    A. Use Cloud Dataprep to find null values in sample source data. Convert all nulls to 'none' using a Cloud Dataproc job.

    B. Use Cloud Dataprep to find null values in sample source data. Convert all nulls to 0 using a Cloud Dataprep job.

    C. Use Cloud Dataflow to find null values in sample source data. Convert all nulls to 'none' using a Cloud Dataprep job.

    D. Use Cloud Dataflow to find null values in sample source data. Convert all nulls to 0 using a custom script.
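
    A hedged, minimal illustration of the "keep nulls real-valued" idea from option B: replace nulls with 0 (optionally keeping a missing-value indicator column) so a logistic regression still receives numeric input. The column names and sample values are made up.

    ```python
    # Hedged sketch: impute nulls with 0 while preserving a was-null indicator,
    # so the logistic regression keeps real-valued features.
    import pandas as pd

    df = pd.DataFrame({"income": [52000.0, None, 61000.0], "label": [1, 0, 1]})

    # Track where values were missing before imputing, then impute with 0.
    df["income_was_null"] = df["income"].isna().astype(int)
    df["income"] = df["income"].fillna(0)

    print(df)
    ```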

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more and more important, and more and more enterprises require them when you apply for a job. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and where do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Google exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your PROFESSIONAL-DATA-ENGINEER exam preparation or your Google certification application, do not hesitate to visit Vcedump.com to find your solutions.