Which of the following describes concept drift?
A. Concept drift is when there is a change in the distribution of an input variable
B. Concept drift is when there is a change in the distribution of a target variable
C. Concept drift is when there is a change in the relationship between input variables and target variables
D. Concept drift is when there is a change in the distribution of the predicted target given by the model
E. None of these describe concept drift
A machine learning engineer is monitoring categorical input variables for a production machine learning application. The engineer believes that missing values are becoming more prevalent in more recent data for a particular value in one of the categorical input variables.
Which of the following tools can the machine learning engineer use to assess their theory?
A. Kolmogorov-Smirnov (KS) test
B. One-way Chi-squared Test
C. Two-way Chi-squared Test
D. Jensen-Shannon distance
E. None of these
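One way to compare the rate of missing values between an older and a more recent time window is a two-way chi-squared test of independence on a contingency table. The following is a minimal sketch using only the Python standard library; the counts are hypothetical, the function name is an illustration rather than a library API, and no continuity correction is applied.

```python
import math


def chi2_two_way_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table.

    table[i][j] is the count for time window i (0 = older, 1 = recent)
    and status j (0 = missing, 1 = present).
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if window and missingness were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: [missing, present] per time window.
counts = [
    [20, 980],  # older window
    [60, 940],  # recent window
]
statistic, p_value = chi2_two_way_2x2(counts)
print(f"chi2={statistic:.3f}, p={p_value:.2e}")
```

A small p-value suggests that the share of missing values differs between the two windows; the significance level is a choice left to the practitioner.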
Which of the following is a simple, low-cost method of monitoring numeric feature drift?
A. Jensen-Shannon test
B. Summary statistics trends
C. Chi-squared test
D. None of these can be used to monitor feature drift
E. Kolmogorov-Smirnov (KS) test
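Tracking summary statistics over time can be sketched as below. This is only an illustration with hypothetical weekly values and a hypothetical alert rule (mean shifted by more than two baseline standard deviations); real monitoring would pick its own windows and thresholds.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for one window of a numeric feature."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical feature values grouped by time window.
windows = {
    "week_1": [20.1, 21.4, 19.8, 22.0, 20.7],
    "week_2": [20.5, 21.0, 20.2, 21.8, 20.9],
    "week_3": [17.2, 16.8, 18.1, 17.5, 16.9],
}
summaries = {name: summarize(vals) for name, vals in windows.items()}

# Assumed rule: flag a window whose mean moved more than two baseline
# standard deviations away from the first window's mean.
baseline = summaries["week_1"]
flagged = [
    name
    for name, stats in summaries.items()
    if abs(stats["mean"] - baseline["mean"]) > 2 * baseline["stdev"]
]
print(flagged)
```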
A data scientist has developed a model to predict ice cream sales using the expected temperature and expected number of hours of sun in the day. However, the expected temperature is dropping beneath the range of the input variable on which the model was trained.
Which of the following types of drift is present in the above scenario?
A. Label drift
B. None of these
C. Concept drift
D. Prediction drift
E. Feature drift
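The scenario above, where an input falls outside the range seen during training, can be checked with a simple range comparison. The sketch below uses hypothetical temperature values and a hypothetical helper name; it is not taken from any particular monitoring library.

```python
def out_of_range_fraction(values, train_min, train_max):
    """Fraction of values that fall outside the training range."""
    if not values:
        return 0.0
    outside = sum(1 for v in values if v < train_min or v > train_max)
    return outside / len(values)


# Hypothetical expected temperatures (degrees Celsius).
train_temps = [15.0, 18.5, 22.0, 27.5, 31.0]
recent_temps = [16.0, 12.5, 9.0, 14.0]

fraction = out_of_range_fraction(
    recent_temps, min(train_temps), max(train_temps)
)
print(fraction)
```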
A data scientist wants to remove the star_rating column from the Delta table at the location path. To do this, they need to load in data and drop the star_rating column. Which of the following code blocks accomplishes this task?
A. spark.read.format("delta").load(path).drop("star_rating")
B. spark.read.format("delta").table(path).drop("star_rating")
C. Delta tables cannot be modified
D. spark.read.table(path).drop("star_rating")
E. spark.sql("SELECT * EXCEPT star_rating FROM path")
Which of the following operations in Feature Store Client fs can be used to return a Spark DataFrame of a data set associated with a Feature Store table?
A. fs.create_table
B. fs.write_table
C. fs.get_table
D. There is no way to accomplish this task with fs
E. fs.read_table
A machine learning engineer is in the process of implementing a concept drift monitoring solution. They are planning to use the following steps:
1. Deploy a model to production and compute predicted values
2. Obtain the observed (actual) label values
3. _____
4. Run a statistical test to determine if there are changes over time
Which of the following should be completed as Step #3?
A. Obtain the observed (actual) feature values
B. Measure the latency of the prediction time
C. Retrain the model
D. None of these should be completed as Step #3
E. Compute the evaluation metric using the observed and predicted values
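Computing an evaluation metric per time window from observed and predicted values can be sketched as follows. RMSE is used here only as an example metric, and the window labels and numbers are hypothetical; the statistical test on the resulting series is left out.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical (label, observed, predicted) triples per time window.
windows = [
    ("window_1", [10.0, 12.0, 11.0], [10.5, 11.5, 11.0]),
    ("window_2", [10.0, 13.0, 12.0], [12.0, 10.0, 14.0]),
]
rmse_by_window = {label: rmse(obs, pred) for label, obs, pred in windows}

for label, value in rmse_by_window.items():
    print(f"{label}: RMSE={value:.3f}")
```

A metric that degrades across windows while the input distributions stay stable is one signal that the relationship between inputs and target may have changed.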
Which of the following is a reason for using Jensen-Shannon (JS) distance over a Kolmogorov-Smirnov (KS) test for numeric feature drift detection?
A. All of these reasons
B. JS is not normalized or smoothed
C. None of these reasons
D. JS is more robust when working with large datasets
E. JS does not require any manual threshold or cutoff determinations
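For reference, Jensen-Shannon distance between two binned distributions can be computed as in the sketch below. It assumes the numeric feature has already been bucketed into the same bins for both windows, and it uses base-2 logarithms so the result lies between 0 and 1. The function name is illustrative, and a drift threshold still has to be chosen by the user.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs) between two histograms.

    p and q are non-negative counts or probabilities over the same bins.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    p_total, q_total = sum(p), sum(q)
    p = [x / p_total for x in p]
    q = [x / q_total for x in q]
    # Mixture distribution used as the common reference.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; bins with zero mass in `a` add nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


# Hypothetical bin counts for the same feature in two time windows.
baseline_bins = [10, 40, 30, 20]
recent_bins = [30, 30, 25, 15]
print(js_distance(baseline_bins, recent_bins))
```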
A data scientist has developed and logged a scikit-learn random forest model (model), and then they ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the feature_importances_ of the original model object.
Which of the following lines of code can be used to restore the model object so that feature_importances_ is available?
A. mlflow.load_model(model_uri)
B. client.list_artifacts(run_id)["feature-importances.csv"]
C. mlflow.sklearn.load_model(model_uri)
D. This can only be viewed in the MLflow Experiments UI
E. client.pyfunc.load_model(model_uri)
Which of the following is a simple statistic to monitor for categorical feature drift?
A. Mode
B. None of these
C. Mode, number of unique values, and percentage of missing values
D. Percentage of missing values
E. Number of unique values
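The statistics listed in the options above are cheap to compute per time window. A minimal sketch, assuming missing values are represented as None and using hypothetical category values:

```python
from collections import Counter


def categorical_summary(values):
    """Mode, number of unique values, and percentage of missing values."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    mode = counts.most_common(1)[0][0] if counts else None
    pct_missing = (
        100.0 * (len(values) - len(present)) / len(values) if values else 0.0
    )
    return {
        "mode": mode,
        "n_unique": len(counts),
        "pct_missing": pct_missing,
    }


# Hypothetical values of one categorical feature in two time windows.
older = ["red", "blue", "red", "green", "red", None, "blue", "red"]
recent = ["blue", "blue", None, "red", None, "blue", "purple", None]

older_summary = categorical_summary(older)
recent_summary = categorical_summary(recent)
print(older_summary)
print(recent_summary)
```

Comparing these summaries between windows gives a quick first look before running a formal test.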