Exam Details

  • Exam Code: DP-203
  • Exam Name: Data Engineering on Microsoft Azure
  • Certification: Microsoft Certified: Azure Data Engineer Associate
  • Vendor: Microsoft
  • Total Questions: 380 Q&As
  • Last Updated: May 16, 2024

Microsoft Certified: Azure Data Engineer Associate DP-203 Questions & Answers

  • Question 21:

    You have an Azure subscription that contains a storage account named storage1 and an Azure Synapse Analytics dedicated SQL pool. The storage1 account contains a CSV file that requires an account key for access.

    You plan to read the contents of the CSV file by using an external table.

    You need to create an external data source for the external table.

    What should you create first?

    A. a database role

    B. a database scoped credential

    C. a database view

    D. an external file format
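
    Note: in a dedicated SQL pool, access to storage secured with an account key flows through a database scoped credential, which the external data source then references. A minimal T-SQL sketch, assuming a hypothetical container named container1 and placeholder secrets:

        -- A master key must exist before a database scoped credential can be created
        CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

        -- The identity string is arbitrary for storage account keys; the secret is the key itself
        CREATE DATABASE SCOPED CREDENTIAL Storage1Credential
        WITH IDENTITY = 'storage1user',
             SECRET = '<storage1 account key>';

        -- External tables in a dedicated SQL pool use the Hadoop external data source type
        CREATE EXTERNAL DATA SOURCE Storage1Source
        WITH (
            TYPE = HADOOP,
            LOCATION = 'abfss://container1@storage1.dfs.core.windows.net',
            CREDENTIAL = Storage1Credential
        );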

  • Question 22:

    You have an Azure Synapse Analytics workspace that contains an Apache Spark pool named SparkPool1. SparkPool1 contains a Delta Lake table named SparkTable1.

    You need to recommend a solution that supports Transact-SQL queries against the data referenced by SparkTable1. The solution must ensure that the queries can use partition elimination.

    What should you include in the recommendation?

    A. a partitioned table in a dedicated SQL pool

    B. a partitioned view in a dedicated SQL pool

    C. a partitioned index in a dedicated SQL pool

    D. a partitioned view in a serverless SQL pool
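
    Note: a serverless SQL pool can expose Delta Lake data to Transact-SQL through a view over OPENROWSET, and the Delta reader can eliminate partitions that a query filters out. A sketch, assuming a hypothetical storage account and container path for SparkTable1:

        -- A view over the Delta table folder; partition columns come from the Delta log
        CREATE VIEW dbo.SparkTable1View
        AS
        SELECT *
        FROM OPENROWSET(
            BULK 'https://storage1.dfs.core.windows.net/container1/SparkTable1/',
            FORMAT = 'DELTA'
        ) AS rows;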

  • Question 23:

    You have an Azure Databricks workspace and an Azure Data Lake Storage Gen2 account named storage1.

    New files are uploaded daily to storage1.

    You need to recommend a solution that configures storage1 as a structured streaming source. The solution must meet the following requirements:

    1. Incrementally process new files as they are uploaded to storage1.

    2. Minimize implementation and maintenance effort.

    3. Minimize the cost of processing millions of files.

    4. Support schema inference and schema drift.

    What should you include in the recommendation?

    A. COPY INTO

    B. Azure Data Factory

    C. Auto Loader

    D. Apache Spark FileStreamSource
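
    Note: Auto Loader is usually configured from a notebook with the cloudFiles source; in Databricks SQL, a streaming table over the read_files function uses Auto Loader under the hood. A rough sketch, assuming CSV files land in a hypothetical container named container1:

        -- Incrementally ingests new files; schema inference and evolution are handled by Auto Loader
        CREATE OR REFRESH STREAMING TABLE storage1_incoming
        AS SELECT *
        FROM STREAM read_files(
            'abfss://container1@storage1.dfs.core.windows.net/incoming/',
            format => 'csv'
        );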

  • Question 24:

    You are designing a solution that will use tables in Delta Lake on Azure Databricks. You need to minimize how long it takes to perform the following:

    1. Queries against non-partitioned tables

    2. Joins on non-partitioned columns

    Which two options should you include in the solution? Each correct answer presents part of the solution.

    NOTE: Each correct selection is worth one point.

    A. the clone command

    B. Z-Ordering

    C. Apache Spark caching

    D. dynamic file pruning (DFP)
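
    Note: Z-Ordering is worth trying hands-on, since it co-locates related values in the same data files so that data skipping works even on non-partitioned columns. A Delta Lake SQL sketch, with a hypothetical table and join column:

        -- Rewrite the table's files so rows with nearby CustomerId values sit together
        OPTIMIZE sales_delta
        ZORDER BY (CustomerId);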

  • Question 25:

    You have an Azure subscription that contains the resources shown in the following table.

    You need to read the TSV files by using ad-hoc queries and the OPENROWSET function. The solution must assign a name and override the inferred data type of each column.

    What should you include in the OPENROWSET function?

    A. the WITH clause

    B. the ROWSET_OPTIONS bulk option

    C. the DATAFILETYPE bulk option

    D. the DATA_SOURCE parameter
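
    Note: in a serverless SQL pool, the WITH clause of OPENROWSET is where you name columns and override their inferred types; the trailing ordinal maps each definition to a source column. A sketch with a hypothetical path and column names:

        SELECT *
        FROM OPENROWSET(
            BULK 'https://storage1.dfs.core.windows.net/container1/data/*.tsv',
            FORMAT = 'CSV',
            FIELDTERMINATOR = '\t',   -- TSV is CSV with a tab delimiter
            PARSER_VERSION = '2.0'
        )
        WITH (
            CustomerId   INT          1,  -- explicit name and type, mapped to source column 1
            CustomerName VARCHAR(200) 2
        ) AS rows;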

  • Question 26:

    You have an Azure Data Lake Storage Gen2 account named account1 that contains a container named container1.

    You plan to create lifecycle management policy rules for container1.

    You need to ensure that you can create rules that will move blobs between access tiers based on when each blob was last accessed.

    What should you do first?

    A. Configure object replication

    B. Create an Azure application

    C. Enable access time tracking

    D. Enable the hierarchical namespace

  • Question 27:

    You have an Azure Data Factory pipeline named pipeline1 that includes a Copy activity named Copy1. Copy1 has the following configurations:

    1. The source of Copy1 is a table in an on-premises Microsoft SQL Server instance that is accessed by using a linked service connected via a self-hosted integration runtime.

    2. The sink of Copy1 uses a table in an Azure SQL database that is accessed by using a linked service connected via an Azure integration runtime.

    You need to maximize the amount of compute resources available to Copy1. The solution must minimize administrative effort.

    What should you do?

    A. Scale out the self-hosted integration runtime.

    B. Scale up the data flow runtime of the Azure integration runtime and scale out the self-hosted integration runtime.

    C. Scale up the data flow runtime of the Azure integration runtime.

  • Question 28:

    You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a fact table named Table1.

    You need to identify the extent of the data skew in Table1.

    What should you do in Synapse Studio?

    A. Connect to Pool1 and run DBCC PDW_SHOWSPACEUSED.

    B. Connect to the built-in pool and run DBCC PDW_SHOWSPACEUSED.

    C. Connect to the built-in pool and run DBCC CHECKALLOC.

    D. Connect to the built-in pool and query sys.dm_pdw_sys_info.
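
    Note: the DBCC command above reports row and space usage per distribution when you are connected to the dedicated pool itself, which is how skew becomes visible:

        -- Uneven row counts across the 60 distributions indicate data skew
        DBCC PDW_SHOWSPACEUSED("dbo.Table1");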

  • Question 29:

    You have an Azure Synapse Analytics dedicated SQL pool that contains a table named DimSalesPerson. DimSalesPerson contains the following columns:

    1. RepSourceID

    2. SalesRepID

    3. FirstName

    4. LastName

    5. StartDate

    6. EndDate

    7. Region

    You are developing an Azure Synapse Analytics pipeline that includes a mapping data flow named Dataflow1. Dataflow1 will read sales team data from an external source and use a Type 2 slowly changing dimension (SCD) when loading the data into DimSalesPerson.

    You need to update the last name of a salesperson in DimSalesPerson.

    Which two actions should you perform? Each correct answer presents part of the solution.

    NOTE: Each correct selection is worth one point.

    A. Update three columns of an existing row.

    B. Update two columns of an existing row.

    C. Insert an extra row.

    D. Update one column of an existing row.
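
    Note: a Type 2 change never overwrites the attribute in place; the current row is closed out and a new version row is inserted. A hand-written T-SQL sketch of what Dataflow1 effectively performs, assuming EndDate is NULL on the active row and using hypothetical values:

        -- Close the current version of the salesperson
        UPDATE dbo.DimSalesPerson
        SET EndDate = GETDATE()
        WHERE RepSourceID = 1001 AND EndDate IS NULL;

        -- Insert a new version row carrying the updated last name
        INSERT INTO dbo.DimSalesPerson
            (RepSourceID, SalesRepID, FirstName, LastName, StartDate, EndDate, Region)
        VALUES
            (1001, 42, 'Dana', 'NewLastName', GETDATE(), NULL, 'West');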

  • Question 30:

    A company purchases IoT devices to monitor manufacturing machinery. The company uses an Azure IoT Hub to communicate with the IoT devices.

    The company must be able to monitor the devices in real time.

    You need to design the solution.

    What should you recommend?

    A. Azure Analysis Services using Azure PowerShell

    B. Azure Stream Analytics Edge application using Microsoft Visual Studio

    C. Azure Analysis Services using Microsoft Visual Studio

    D. Azure Data Factory instance using Azure Portal
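
    Note: for a feel of what real-time monitoring looks like, the Stream Analytics query language is a SQL dialect with windowing functions. A sketch assuming a hypothetical IoT Hub input named IoTHubInput whose events carry DeviceId and Temperature fields:

        -- Average machine temperature per device over 30-second tumbling windows
        SELECT
            DeviceId,
            AVG(Temperature) AS AvgTemperature
        INTO MonitoringOutput
        FROM IoTHubInput TIMESTAMP BY EventEnqueuedUtcTime
        GROUP BY DeviceId, TumblingWindow(second, 30)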

Tips on How to Prepare for the Exams

Nowadays, the certification exams become more and more important and required by more and more enterprises when applying for a job. But how to prepare for the exam effectively? How to prepare for the exam in a short time with less efforts? How to get a ideal result and how to find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provide not only Microsoft exam questions, answers and explanations but also complete assistance on your exam preparation and certification application. If you are confused on your DP-203 exam preparations and Microsoft certification application, do not hesitate to visit our Vcedump.com to find your solutions here.