Question 42:
You have two Azure Blob Storage accounts named account1 and account2.
You plan to create an Azure Data Factory pipeline that will use scheduled intervals to replicate newly created or modified blobs from account1 to account2.
You need to recommend a solution to implement the pipeline. The solution must meet the following requirements:
1. Ensure that the pipeline only copies blobs that were created or modified since the most recent replication event.
2. Minimize the effort to create the pipeline.
What should you recommend?
A. Run the Copy Data tool and select Metadata-driven copy task.
B. Create a pipeline that contains a Data Flow activity.
C. Create a pipeline that contains a flowlet.
D. Run the Copy Data tool and select Built-in copy task.
Correct Answer: A
Build large-scale data copy pipelines with a metadata-driven approach in the Copy Data tool
When you want to copy huge numbers of objects (for example, thousands of tables) or load data from a large variety of sources, the appropriate approach is to list the objects, along with their required copy behaviors, in a control table, and then use parameterized pipelines that read the control table and apply the behaviors to the copy jobs accordingly. That way, you can maintain the list of objects to be copied (for example, add or remove entries) simply by updating the object names in the control table instead of redeploying the pipelines. You also get a single place to check which objects are copied by which pipelines or triggers, and with which copy behaviors.
The Copy Data tool in ADF eases the journey of building such metadata-driven copy pipelines. After you go through an intuitive, wizard-based experience, the tool generates parameterized pipelines and the SQL scripts needed to create the external control tables. After you run the generated scripts to create the control table in your SQL database, your pipelines read the metadata from the control table and apply it to the copy jobs automatically.
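As a rough illustration of the control-table idea, a hand-written sketch follows. This is not the exact schema the Copy Data tool generates, and every name in it is hypothetical:

-- Hypothetical control table for a metadata-driven copy pipeline: one row per
-- object to copy, plus a watermark so only blobs created or modified since the
-- most recent replication event are copied on the next run.
CREATE TABLE dbo.CopyControlTable
(
    Id            INT IDENTITY(1, 1) PRIMARY KEY,
    SourceObject  NVARCHAR(400) NOT NULL,   -- e.g. account1 container/folder path
    SinkObject    NVARCHAR(400) NOT NULL,   -- e.g. account2 container/folder path
    CopyEnabled   BIT           NOT NULL DEFAULT 1,
    LastWatermark DATETIME2     NULL        -- timestamp of the last replication event
);

-- Adding an object to copy is a data change, not a pipeline redeployment:
INSERT INTO dbo.CopyControlTable (SourceObject, SinkObject)
VALUES (N'account1/container1', N'account2/container1');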
Incorrect:
Not C: A flowlet is a reusable container of activities that can be created from an existing mapping data flow or started from scratch. By reusing patterns, you can prevent logic duplication and apply the same logic across many mapping data flows.
With flowlets you can create logic to do things such as address cleaning or string trimming. You can then map the inputs and outputs to columns in the calling data flow for a dynamic code reuse experience.
Question 43:
You plan to use an Apache Spark pool in Azure Synapse Analytics to load data to an Azure Data Lake Storage Gen2 account.
You need to recommend which file format to use to store the data in the Data Lake Storage account. The solution must meet the following requirements:
1. Column names and data types must be defined within the files loaded to the Data Lake Storage account.
2. Data must be accessible by using queries from an Azure Synapse Analytics serverless SQL pool.
3. Partition elimination must be supported without having to specify a specific partition.
What should you recommend?
A. Delta Lake
B. JSON
C. CSV
D. ORC
Correct Answer: D
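No explanation was given for this answer, so as a hedged illustration of the partition-elimination requirement: serverless SQL pools can prune partition folders through the filepath() function when the query path contains wildcards. The storage path and partition folders below are hypothetical, and the sketch uses Parquet (the format most commonly documented for OPENROWSET); the pruning pattern itself is what the sketch shows.

-- Sketch of partition elimination in a serverless SQL pool using filepath().
SELECT
    r.filepath(1) AS [year],
    r.filepath(2) AS [month],
    COUNT(*)      AS row_count
FROM OPENROWSET(
        BULK 'https://myaccount.dfs.core.windows.net/data/sales/year=*/month=*/*.parquet',
        FORMAT = 'PARQUET'
     ) AS r
-- Only folders whose wildcard values satisfy the predicate are read:
WHERE r.filepath(1) = '2024'
GROUP BY r.filepath(1), r.filepath(2);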
Question 44:
You are deploying a lake database by using an Azure Synapse database template.
You need to add additional tables to the database. The solution must use the same grouping method as the template tables.
Which grouping method should you use?
A. business area
B. size
C. facts and dimensions
D. partition style
Correct Answer: A
Business area: This is how the Azure Synapse database templates group tables by default. Each template consists of one or more enterprise templates that contain tables grouped by business area. For example, the Retail template has business areas such as Customer, Product, Sales, and Store. Using the same grouping method as the template tables helps you maintain consistency and compatibility with the industry-specific data model.
Reference: https://techcommunity.microsoft.com/t5/azure-synapse-analytics-blog/database-templates-in-azure-synapse-analytics/ba-p/2929112
Question 45:
You have an Azure subscription that is linked to a tenant in Microsoft Azure Active Directory (Azure AD), part of Microsoft Entra. The tenant contains a security group named Group1. The subscription contains an Azure Data Lake Storage account named myaccount1. The myaccount1 account contains two containers named container1 and container2.
You need to grant Group1 read access to container1. The solution must use the principle of least privilege.
Which role should you assign to Group1?
A. Storage Table Data Reader for myaccount1
B. Storage Blob Data Reader for container1
C. Storage Blob Data Reader for myaccount1
D. Storage Table Data Reader for container1
Correct Answer: B
Storage Blob Data Reader
Read and list Azure Storage containers and blobs.
Incorrect:
Not A, not C: The scope of the role should be container1, not the account.
Not A, not D: Storage Table Data Reader allows read access to Azure Storage tables and entities, not blobs.
Question 46:
You are designing a dimension table in an Azure Synapse Analytics dedicated SQL pool.
You need to create a surrogate key for the table. The solution must provide the fastest query performance.
What should you use for the surrogate key?
A. a GUID column
B. a sequence object
C. an IDENTITY column
Correct Answer: C
Use IDENTITY to create surrogate keys using dedicated SQL pool in Azure Synapse Analytics.
Note: A surrogate key on a table is a column with a unique identifier for each row. The key is not generated from the table data. Data modelers like to create surrogate keys on their tables when they design data warehouse models. You can use the IDENTITY property to achieve this goal simply and effectively without affecting load performance.
Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-identity
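For illustration, here is a minimal sketch of a dimension table whose surrogate key is an IDENTITY column in a dedicated SQL pool; the table name, columns, and distribution choice are hypothetical:

-- Surrogate key generated by IDENTITY during load. Note that in dedicated SQL
-- pools, IDENTITY values are unique but not guaranteed to be sequential, which
-- is acceptable for a surrogate key.
CREATE TABLE dbo.DimCustomer
(
    CustomerKey  INT IDENTITY(1, 1) NOT NULL,  -- surrogate key
    CustomerId   NVARCHAR(20)       NOT NULL,  -- business (natural) key
    CustomerName NVARCHAR(100)      NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,                  -- a common choice for small dimensions
    CLUSTERED COLUMNSTORE INDEX
);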
Question 47:
You have an Azure data factory that connects to a Microsoft Purview account. The data factory is registered in Microsoft Purview.
You update a Data Factory pipeline.
You need to ensure that the updated lineage is available in Microsoft Purview.
What should you do first?
A. Locate the related asset in the Microsoft Purview portal.
B. Execute the pipeline.
C. Disconnect the Microsoft Purview account from the data factory.
D. Execute an Azure DevOps build pipeline.
Correct Answer: B
Run pipeline and push lineage data to Microsoft Purview
Step 1: Connect Data Factory to your Microsoft Purview account
Step 2: Run pipeline in Data Factory
You can create pipelines, Copy activities and Dataflow activities in Data Factory. You don't need any additional configuration for lineage data capture. The lineage data will automatically be captured during the activities execution.
Step 3: Monitor lineage reporting status
After you run the pipeline, in the pipeline monitoring view, you can check the lineage reporting status by clicking the Lineage status button.
Step 4: View lineage information in your Microsoft Purview account
On Microsoft Purview UI, you can browse assets and choose type "Azure Data Factory". You can also search the Data Catalog using keywords.
Question 48:
You have an Azure subscription that contains an Azure SQL database named DB1 and a storage account named storage1. The storage1 account contains a file named File1.txt. File1.txt contains the names of selected tables in DB1.
You need to use an Azure Synapse pipeline to copy data from the selected tables in DB1 to the files in storage1. The solution must meet the following requirements:
1. The Copy activity in the pipeline must be parameterized to use the data in File1.txt to identify the source and destination of the copy.
2. Copy activities must occur in parallel as often as possible.
Which two pipeline activities should you include in the pipeline? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
A. If Condition
B. ForEach
C. Lookup
D. Get Metadata
Correct Answer: BC
Lookup: This is a control activity that retrieves a dataset from any of the supported data sources and makes it available for use by subsequent activities in the pipeline. You can use a Lookup activity to read File1.txt from storage1 and store its content as an array variable.
ForEach: This is a control activity that iterates over a collection and executes specified activities in a loop. You can use a ForEach activity to loop over the array from the Lookup activity and pass each table name as a parameter to a Copy activity that copies data from DB1 to storage1. With the ForEach activity's default (non-sequential) setting, the iterations run in parallel, satisfying the second requirement.
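As a rough sketch of the parameterization, assuming a non-sequential ForEach iterates the Lookup output: the Copy activity inside the loop can build its source query from the current item. The @{...} interpolation syntax is Data Factory's; the property name TableName is hypothetical and depends on how File1.txt is parsed.

-- Hypothetical dynamic source query for the Copy activity inside the ForEach.
-- Data Factory replaces @{item().TableName} with the table name from the
-- current Lookup row before sending the query to DB1.
SELECT *
FROM [dbo].[@{item().TableName}];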
Question 49:
You have an Azure subscription that contains a Microsoft Purview account named MP1, an Azure data factory named DF1, and a storage account named storage1. MP1 is configured to scan storage1. DF1 is connected to MP1 and contains a dataset named DS1. DS1 references a file in storage1.
In DF1, you plan to create a pipeline that will process data from DS1.
You need to review the schema and lineage information in MP1 for the data referenced by DS1.
Which two features can you use to locate the information? Each correct answer presents a complete solution.
NOTE: Each correct answer is worth one point.
A. the Storage browser of storage1 in the Azure portal
B. the search bar in the Azure portal
C. the search bar in Azure Data Factory Studio
D. the search bar in the Microsoft Purview governance portal
Correct Answer: CD
The search bar in the Microsoft Purview governance portal: This feature lets you search for assets in your data estate using keywords, filters, and facets. You can use it to find the file in storage1 that is referenced by DS1, and then view its schema and lineage information on the asset details page.
The search bar in Azure Data Factory Studio: This feature lets you search for datasets, linked services, pipelines, and other resources in your data factory. You can use it to find DS1 in DF1 and view its properties and lineage; you can also click the Open in Purview button to open the corresponding asset in MP1.
Incorrect:
Not A: The Storage browser of storage1 in the Azure portal lets you view the files stored in the account, but it does not provide lineage or schema information for those files.
Not B: The search bar in the Azure portal lets you search for resources in the subscription, but it does not provide detailed information about the data assets themselves.
References: What is Azure Purview? Use Azure Data Factory Studio
Question 50:
You have a Microsoft Purview account. The Lineage view of a CSV file is shown in the following exhibit.
How is the data for the lineage populated?
A. manually
B. by scanning data stores
C. by executing a Data Factory pipeline
Correct Answer: C
The exhibit shows an activity named Copy_XferFolder, labeled "From Data Factory", which indicates that the lineage was captured from a Data Factory pipeline run.
The following example is a typical use case of data moving across multiple systems, where the Data Catalog connects to each of the systems for lineage:
1. Data Factory copies data from the on-premises/raw zone to a landing zone in the cloud.