Exam Details

  • Exam Code: DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK
  • Exam Name: Databricks Certified Associate Developer for Apache Spark 3.0
  • Certification: Databricks Certification
  • Vendor: Databricks
  • Total Questions: 180 Q&As
  • Last Updated: Apr 25, 2024

Databricks Certification DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK Questions & Answers

  • Question 1:

    The code block shown below should return a one-column DataFrame where the column storeId is converted to string type. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    transactionsDf.__1__(__2__.__3__(__4__))

    A. 1. select 2. col("storeId") 3. cast 4. StringType

    B. 1. select 2. col("storeId") 3. as 4. StringType

    C. 1. cast 2. "storeId" 3. as 4. StringType()

    D. 1. select 2. col("storeId") 3. cast 4. StringType()

    E. 1. select 2. storeId 3. cast 4. StringType()
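
    For reference, a minimal PySpark sketch of selecting a single column and casting it to string type (assuming a DataFrame transactionsDf with a storeId column exists):

    from pyspark.sql.functions import col
    from pyspark.sql.types import StringType

    # returns a one-column DataFrame in which storeId is string-typed
    transactionsDf.select(col("storeId").cast(StringType()))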

  • Question 2:

    Which of the following code blocks concatenates rows of DataFrames transactionsDf and transactionsNewDf, omitting any duplicates?

    A. transactionsDf.concat(transactionsNewDf).unique()

    B. transactionsDf.union(transactionsNewDf).distinct()

    C. spark.union(transactionsDf, transactionsNewDf).distinct()

    D. transactionsDf.join(transactionsNewDf, how="union").distinct()

    E. transactionsDf.union(transactionsNewDf).unique()
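
    For reference, a minimal sketch of concatenating two DataFrames and dropping duplicates (assuming transactionsDf and transactionsNewDf exist and share the same schema):

    # union appends rows by column position; distinct removes duplicate rows afterwards
    combinedDf = transactionsDf.union(transactionsNewDf).distinct()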

  • Question 3:

    The code block displayed below contains an error. The code block is intended to return all columns of DataFrame transactionsDf except for columns predError, productId, and value.

    Find the error.

    Code block:

    transactionsDf.select(~col("predError"), ~col("productId"), ~col("value"))

    A. The select operator should be replaced by the drop operator and the arguments to the drop operator should be column names predError, productId and value wrapped in the col operator so they should be expressed like drop(col(predError), col(productId), col(value)).

    B. The select operator should be replaced with the deselect operator.

    C. The column names in the select operator should not be strings and wrapped in the col operator, so they should be expressed like select(~col(predError), ~col(productId), ~col(value)).

    D. The select operator should be replaced by the drop operator.

    E. The select operator should be replaced by the drop operator and the arguments to the drop operator should be column names predError, productId and value as strings.
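
    For reference, a minimal sketch of returning all columns of a DataFrame except a few named ones (assuming transactionsDf exists):

    # drop accepts column names as plain strings and returns the remaining columns
    transactionsDf.drop("predError", "productId", "value")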

  • Question 4:

    The code block displayed below contains an error. The code block should return DataFrame transactionsDf, but with the column storeId renamed to storeNumber. Find the error.

    Code block:

    transactionsDf.withColumn("storeNumber", "storeId")

    A. Instead of withColumn, the withColumnRenamed method should be used.

    B. Arguments "storeNumber" and "storeId" each need to be wrapped in a col() operator.

    C. Argument "storeId" should be the first and argument "storeNumber" should be the second argument to the withColumn method.

    D. The withColumn operator should be replaced with the copyDataFrame operator.

    E. Instead of withColumn, the withColumnRenamed method should be used and argument "storeId" should be the first and argument "storeNumber" should be the second argument to that method.
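
    For reference, a minimal sketch of renaming a single column (assuming transactionsDf has a storeId column):

    # withColumnRenamed takes the existing column name first and the new name second
    transactionsDf.withColumnRenamed("storeId", "storeNumber")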

  • Question 5:

    The code block displayed below contains an error. The code block should use the Python method find_most_freq_letter to find the letter that occurs most frequently in column itemName of DataFrame itemsDf and return it in a new column most_frequent_letter. Find the error.

    Code block:

    find_most_freq_letter_udf = udf(find_most_freq_letter)
    itemsDf.withColumn("most_frequent_letter", find_most_freq_letter("itemName"))

    A. Spark is not using the UDF method correctly.

    B. The UDF method is not registered correctly, since the return type is missing.

    C. The "itemName" expression should be wrapped in col().

    D. UDFs do not exist in PySpark.

    E. Spark is not adding a column.
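
    For reference, a minimal sketch of registering and applying a Python UDF (the body of find_most_freq_letter below is a hypothetical implementation; itemsDf is assumed to exist):

    from pyspark.sql.functions import udf, col
    from pyspark.sql.types import StringType

    # hypothetical helper: return the character that occurs most often in a string
    def find_most_freq_letter(name):
        return max(set(name), key=name.count) if name else None

    # register the function as a UDF, then call the UDF (not the plain Python function) on the column
    find_most_freq_letter_udf = udf(find_most_freq_letter, StringType())
    itemsDf.withColumn("most_frequent_letter", find_most_freq_letter_udf(col("itemName")))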

  • Question 6:

    Which of the following describes characteristics of the Spark UI?

    A. Via the Spark UI, workloads can be manually distributed across executors.

    B. Via the Spark UI, stage execution speed can be modified.

    C. The Scheduler tab shows how jobs that are run in parallel by multiple users are distributed across the cluster.

    D. There is a place in the Spark UI that shows the property spark.executor.memory.

    E. Some of the tabs in the Spark UI are named Jobs, Stages, Storage, DAGs, Executors, and SQL.

  • Question 7:

    Which of the following code blocks performs an inner join between DataFrame itemsDf and DataFrame transactionsDf, using columns itemId and transactionId as join keys, respectively?

    A. itemsDf.join(transactionsDf, "inner", itemsDf.itemId == transactionsDf.transactionId)

    B. itemsDf.join(transactionsDf, itemId == transactionId)

    C. itemsDf.join(transactionsDf, itemsDf.itemId == transactionsDf.transactionId, "inner")

    D. itemsDf.join(transactionsDf, "itemsDf.itemId == transactionsDf.transactionId", "inner")

    E. itemsDf.join(transactionsDf, col(itemsDf.itemId) == col(transactionsDf.transactionId))
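
    For reference, a minimal sketch of an inner join on differently named key columns (assuming itemsDf and transactionsDf exist):

    # join(other, on, how): the join condition is a Column expression, the join type a string
    itemsDf.join(transactionsDf, itemsDf.itemId == transactionsDf.transactionId, "inner")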

  • Question 8:

    The code block shown below should return a DataFrame with all columns of DataFrame transactionsDf, but with at most 2 rows, all of which have a value of at least 2 in column productId. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    transactionsDf.__1__(__2__).__3__

    A. 1. where 2. "productId" > 2 3. max(2)

    B. 1. where 2. transactionsDf[productId] >= 2 3. limit(2)

    C. 1. filter 2. productId > 2 3. max(2)

    D. 1. filter 2. col("productId") >= 2 3. limit(2)

    E. 1. where 2. productId >= 2 3. limit(2)
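
    For reference, a minimal sketch of filtering on a column condition and capping the number of returned rows (assuming transactionsDf exists):

    from pyspark.sql.functions import col

    # filter keeps rows matching the condition; limit caps the result at two rows
    transactionsDf.filter(col("productId") >= 2).limit(2)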

  • Question 9:

    In which order should the code blocks shown below be run to return the number of records that are not empty in column value in the DataFrame resulting from an inner join of DataFrames transactionsDf and itemsDf on columns productId and itemId, respectively?

    1. .filter(~isnull(col('value')))

    2. .count()

    3. transactionsDf.join(itemsDf, col("transactionsDf.productId")==col("itemsDf.itemId"))

    4. transactionsDf.join(itemsDf, transactionsDf.productId==itemsDf.itemId, how='inner')

    5. .filter(col('value').isnotnull())

    6. .sum(col('value'))

    A. 4, 1, 2

    B. 3, 1, 6

    C. 3, 1, 2

    D. 3, 5, 2

    E. 4, 6
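
    For reference, a minimal sketch of how such blocks chain together (assuming transactionsDf and itemsDf exist with the key columns named in the question):

    from pyspark.sql.functions import col, isnull

    # inner join on the key columns, keep rows whose value is not null, then count them
    (transactionsDf
        .join(itemsDf, transactionsDf.productId == itemsDf.itemId, how='inner')
        .filter(~isnull(col('value')))
        .count())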

  • Question 10:

    The code block shown below should return a column that indicates through boolean variables whether rows in DataFrame transactionsDf have values greater than or equal to 20 and smaller than or equal to 30 in column storeId, and have the value 2 in column productId. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    transactionsDf.__1__((__2__.__3__) __4__ (__5__))

    A. 1. select 2. col("storeId") 3. between(20, 30) 4. and 5. col("productId")==2

    B. 1. where 2. col("storeId") 3. geq(20).leq(30) 4. & 5. col("productId")==2

    C. 1. select 2. "storeId" 3. between(20, 30) 4. && 5. col("productId")==2

    D. 1. select 2. col("storeId") 3. between(20, 30) 4. && 5. col("productId")=2

    E. 1. select 2. col("storeId") 3. between(20, 30) 4. & 5. col("productId")==2
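
    For reference, a minimal sketch of combining a between check with a second boolean column condition (assuming transactionsDf exists):

    from pyspark.sql.functions import col

    # between(20, 30) is inclusive on both ends; column conditions are combined with &, not the Python keyword and
    transactionsDf.select((col("storeId").between(20, 30)) & (col("productId") == 2))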

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more important and are required by more and more enterprises when you apply for a job. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you achieve an ideal result, and where do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Databricks exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK exam preparation or your Databricks certification application, do not hesitate to visit Vcedump.com to find your solutions.