Vcedump 100% Guaranteed DP-100 Questions and Answers. 100% Pass Guarantee. Latest Questions with Accurate Answers.

Exam Details

Exam Code: DP-100
Exam Name: Designing and Implementing a Data Science Solution on Azure
Certification: Microsoft Certifications
Vendor: Microsoft
Total Questions: 564 Q&As
Last Updated: Aug 15, 2025

Microsoft Microsoft Certifications DP-100 Questions & Answers

Question 251:

You define a datastore named ml-data for an Azure Storage blob container. In the container, you have a folder named train that contains a file named data.csv. You plan to use the file to train a model by using the Azure Machine Learning SDK.
You plan to train the model by using the Azure Machine Learning SDK to run an experiment on local compute.
You define a DataReference object by running the following code:
You need to load the training data. Which code segment should you use?
A. Option A
B. Option B
C. Option C
D. Option D
E. Option E

Correct Answer: E
Example:
import os
import pandas as pd
# args is the namespace returned by the script's argument parser
data_folder = args.data_folder
# Load the training data
train_data = pd.read_csv(os.path.join(data_folder, 'data.csv'))
Reference:
https://www.element61.be/en/resource/azure-machine-learning-services-complete-toolbox-ai
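As a rough illustration of the pattern in the example above, the script below parses a data-folder argument and reads data.csv from that folder. It is a minimal sketch that uses only the Python standard library instead of pandas. The argument name --data-folder and the helper names parse_args and load_rows are assumptions made for illustration and do not come from the exam material.

```python
import argparse
import csv
import os


def parse_args(argv=None):
    """Parse script arguments; --data-folder is a hypothetical argument name."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--data-folder", dest="data_folder", required=True)
    return parser.parse_args(argv)


def load_rows(data_folder, filename="data.csv"):
    """Read one CSV file from the data folder into a list of dictionaries."""
    path = os.path.join(data_folder, filename)
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

A training script would typically call parse_args() with no arguments so that the values are taken from the command line, then pass args.data_folder to load_rows.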
Question 252:

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:
1.
/data/2018/Q1.csv
2.
/data/2018/Q2.csv
3.
/data/2018/Q3.csv
4.
/data/2018/Q4.csv
5.
/data/2019/Q1.csv
All files store data in the following format:
id,f1,f2,I
1,1,2,0
2,1,1,1
3,2,1,0
4,2,2,1
You run the following code:
You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:
Solution: Run the following code:
Does the solution meet the goal?
A. Yes
B. No

Correct Answer: B
Define paths with two file paths instead.
Use Dataset.Tabular.from_delimited_files, as the data isn't cleansed.
Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets
Question 253:

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:
1.
/data/2018/Q1.csv
2.
/data/2018/Q2.csv
3.
/data/2018/Q3.csv
4.
/data/2018/Q4.csv
5.
/data/2019/Q1.csv
All files store data in the following format:
id,f1,f2,I
1,1,2,0
2,1,1,1
3,2,1,0
4,2,2,1
You run the following code:
You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:
Solution: Run the following code:
Does the solution meet the goal?
A. Yes
B. No

Correct Answer: A
Use two file paths.
Use Dataset.Tabular.from_delimited_files, as the data isn't cleansed.
Note:
A TabularDataset represents data in a tabular format by parsing the provided file or list of files. This provides you with the ability to materialize the data into a pandas or Spark DataFrame so you can work with familiar data preparation and training libraries without having to leave your notebook. You can create a TabularDataset object from .csv, .tsv, .parquet, .jsonl files, and from SQL query results.
Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets
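The "two file paths" idea from the explanation above can be illustrated without the Azure SDK. The following is only a sketch in plain Python: it reads every file matched by a list of path patterns and concatenates the rows into one table. The function name load_all_csv and the extra "source" column are assumptions made for illustration. This is not the Azure Machine Learning API, where Dataset.Tabular.from_delimited_files plays the corresponding role.

```python
import csv
from pathlib import Path


def load_all_csv(root, patterns):
    """Concatenate the rows of every CSV file matching any of the patterns."""
    root = Path(root)
    rows = []
    for pattern in patterns:
        # sorted() keeps the file order deterministic
        for path in sorted(root.glob(pattern)):
            with path.open(newline="") as handle:
                for row in csv.DictReader(handle):
                    # Record which file the row came from (illustrative only)
                    row["source"] = path.relative_to(root).as_posix()
                    rows.append(row)
    return rows
```

For the folder layout in the question, a call such as load_all_csv(root, ["data/2018/*.csv", "data/2019/*.csv"]) would cover all five files with two path patterns.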
Question 254:

You register a file dataset named csv_folder that references a folder. The folder includes multiple comma-separated values (CSV) files in an Azure storage blob container. You plan to use the following code to run a script that loads data from the file dataset. You create and instantiate the following variables:
You have the following code:
You need to pass the dataset to ensure that the script can read the files it references.
Which code segment should you insert to replace the code comment?
A. inputs=[file_dataset.as_named_input('training_files')],
B. inputs=[file_dataset.as_named_input('training_files').as_mount()],
C. inputs=[file_dataset.as_named_input('training_files').to_pandas_dataframe()],
D. script_params={'--training_files': file_dataset},

Correct Answer: B
Example:
from azureml.train.estimator import Estimator
script_params = {
    # to mount files referenced by mnist dataset
    '--data-folder': mnist_file_dataset.as_named_input('mnist_opendataset').as_mount(),
    '--regularization': 0.5
}
est = Estimator(source_directory=script_folder,
                script_params=script_params,
                compute_target=compute_target,
                environment_definition=env,
                entry_script='train.py')
Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-train-models-with-aml
Question 255:

You are creating a new Azure Machine Learning pipeline using the designer.
The pipeline must train a model using data in a comma-separated values (CSV) file that is published on a website. You have not created a dataset for this file.
You need to ingest the data from the CSV file into the designer pipeline using the minimal administrative effort.
Which module should you add to the pipeline in Designer?
A. Convert to CSV
B. Enter Data Manually
C. Import Data
D. Dataset

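Correct Answer: C
The Import Data module can read a file directly from a web URL via HTTP, so no dataset has to be created or registered first. That makes it the option with the least administrative effort when no dataset exists for the file.
When you build a pipeline with the SDK rather than the designer, the preferred way to provide data is a Dataset object. The Dataset object points to data that lives in or is accessible from a datastore or at a Web URL. The Dataset class is abstract, so you will create an instance of either a FileDataset (referring to one or more files) or a TabularDataset that's created from one or more files with delimited columns of data.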
Example:
from azureml.core import Dataset
iris_tabular_dataset = Dataset.Tabular.from_delimited_files([(def_blob_store, 'train-dataset/iris.csv')])
Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-your-first-pipeline
Question 256:

You are a lead data scientist for a project that tracks the health and migration of birds. You create a multi-class image classification deep learning model that uses a set of labeled bird photographs collected by experts. You have 100,000 photographs of birds. All photographs use the JPG format and are stored in an Azure blob container in an Azure subscription.
You need to access the bird photograph files in the Azure blob container from the Azure Machine Learning service workspace that will be used for deep learning model training. You must minimize data movement.
What should you do?
A. Create an Azure Data Lake store and move the bird photographs to the store.
B. Create an Azure Cosmos DB database and attach the Azure Blob containing bird photographs storage to the database.
C. Create and register a dataset by using TabularDataset class that references the Azure blob storage containing bird photographs.
D. Register the Azure blob storage containing the bird photographs as a datastore in Azure Machine Learning service.
E. Copy the bird photographs to the blob datastore that was created with your Azure Machine Learning service workspace.

Correct Answer: D
We recommend creating a datastore for an Azure Blob container. When you create a workspace, an Azure blob container and an Azure file share are automatically registered to the workspace.
Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-access-data
Question 257:

You use the Azure Machine Learning service to create a tabular dataset named training_data. You plan to use this dataset in a training script.
You create a variable that references the dataset using the following code:
training_ds = workspace.datasets.get("training_data")
You define an estimator to run the script.
You need to set the correct property of the estimator to ensure that your script can access the training_data dataset.
Which property should you set?
A. environment_definition = {"training_data":training_ds}
B. inputs = [training_ds.as_named_input('training_ds')]
C. script_params = {"--training_ds":training_ds}
D. source_directory = training_ds

Correct Answer: B
Example:
# Get the training dataset
diabetes_ds = ws.datasets.get("Diabetes Dataset")
# Create an estimator that uses the remote compute
hyper_estimator = SKLearn(source_directory=experiment_folder,
                          inputs=[diabetes_ds.as_named_input('diabetes')],  # Pass the dataset as an input
                          compute_target=cpu_cluster,
                          conda_packages=['pandas','ipykernel','matplotlib'],
                          pip_packages=['azureml-sdk','argparse','pyarrow'],
                          entry_script='diabetes_training.py')
Reference:
https://notebooks.azure.com/GraemeMalcolm/projects/azureml-primers/html/04%20-%20Optimizing%20Model%20Training.ipynb
Question 258:

You use Azure Machine Learning Studio to build a machine learning experiment.
You need to divide data into two distinct datasets.
Which module should you use?
A. Split Data
B. Load Trained Model
C. Assign Data to Clusters
D. Group Data into Bins

Correct Answer: A
The Split Data module divides a dataset into two distinct datasets, for example a training set and a test set. The Group Data into Bins module instead bins numeric values: you can customize how the bin edges are set and how values are apportioned into the bins.
References: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-data-into-bins
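Dividing data into two distinct datasets can be sketched in a few lines of plain Python. The function below is an illustration only, not the Studio module itself: the name split_rows, the default fraction of 0.7, and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_rows(rows, fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into two disjoint lists."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator makes the split reproducible
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * fraction))
    return shuffled[:cut], shuffled[cut:]
```

Every input row ends up in exactly one of the two outputs, which is the property that separates splitting from binning or clustering.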
Question 259:

You are solving a classification task.
You must evaluate your model on a limited data sample by using k-fold cross-validation. You start by configuring a k parameter as the number of splits.
You need to configure the k parameter for the cross-validation.
Which value should you use?
A. k=1
B. k=10
C. k=0.5
D. k=0.9

Correct Answer: B
Leave One Out (LOO) cross-validation
Setting K = n (the number of observations) yields n-fold and is called leave-one out cross-validation (LOO), a special case of the K-fold approach.
LOO CV is sometimes useful but typically doesn't shake up the data enough. The estimates from each fold are highly correlated and hence their average can have high variance. This is why the usual choice is K=5 or 10. It provides a good compromise for the bias-variance tradeoff.
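To see why k must be a whole number of splits, here is a minimal sketch of how k-fold cross-validation partitions sample indices. The function name k_fold_indices is hypothetical, and the sketch does no shuffling or stratification, which a real evaluation would usually add.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    if not isinstance(k, int) or k < 2 or k > n_samples:
        raise ValueError("k must be an integer with 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds
```

With k=10 each sample is used for testing exactly once and for training in the other nine folds. Values such as 1, 0.5, or 0.9 are rejected because they do not describe a valid number of splits.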
Question 260:

You are evaluating a completed binary classification machine learning model.
You need to use the precision as the evaluation metric.
Which visualization should you use?
A. violin plot
B. Gradient descent
C. Scatter plot
D. Receiver Operating Characteristic (ROC) curve

Correct Answer: D
Receiver operating characteristic (or ROC) is a plot of the correctly classified labels vs. the incorrectly classified labels for a particular model.
Incorrect Answers:
A: A violin plot is a visual that traditionally combines a box plot and a kernel density plot.
B: Gradient descent is a first-order iterative optimization algorithm for finding the minimum of a function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or approximate gradient) of the function at the current point.
C: A scatter plot graphs the actual values in your data against the values predicted by the model. The scatter plot displays the actual values along the X-axis, and displays the predicted values along the Y-axis. It also displays a line that illustrates the perfect prediction, where the predicted value exactly matches the actual value.
References: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml#confusion-matrix
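For reference, the quantities behind these plots can be computed by hand. The sketch below assumes binary labels (0 or 1) and a score per sample. It derives confusion counts at a threshold, computes precision as TP / (TP + FP), and lists the (false positive rate, true positive rate) points that an ROC curve plots. All function names are illustrative assumptions, not part of any Azure API.

```python
def confusion_counts(labels, scores, threshold):
    """Count (tp, fp, tn, fn) when scores >= threshold are predicted positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision(tp, fp):
    """Precision is the share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_points(labels, scores):
    """Return (fpr, tpr) points, one per distinct threshold, from strict to lenient."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((fpr, tpr))
    return points
```

Each ROC point corresponds to one decision threshold, and the confusion counts at that threshold are what precision is calculated from.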
