Exam Details

  • Exam Code: DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK
  • Exam Name: Databricks Certified Associate Developer for Apache Spark 3.0
  • Certification: Databricks Certifications
  • Vendor: Databricks
  • Total Questions: 180 Q&As
  • Last Updated: Jul 02, 2025

Databricks Certifications: DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK Questions & Answers

  • Question 11:

    Which of the following code blocks produces the following output, given DataFrame transactionsDf?

    Output:

    1.root
    2. |-- transactionId: integer (nullable = true)
    3. |-- predError: integer (nullable = true)
    4. |-- value: integer (nullable = true)
    5. |-- storeId: integer (nullable = true)
    6. |-- productId: integer (nullable = true)
    7. |-- f: integer (nullable = true)

    DataFrame transactionsDf:

    1.+-------------+---------+-----+-------+---------+----+
    2.|transactionId|predError|value|storeId|productId|   f|
    3.+-------------+---------+-----+-------+---------+----+
    4.|            1|        3|    4|     25|        1|null|
    5.|            2|        6|    7|      2|        2|null|
    6.|            3|        3| null|     25|        3|null|
    7.+-------------+---------+-----+-------+---------+----+

    A. transactionsDf.schema.print()

    B. transactionsDf.rdd.printSchema()

    C. transactionsDf.rdd.formatSchema()

    D. transactionsDf.printSchema()

    E. print(transactionsDf.schema)

  • Question 12:

    The code block shown below should return a new 2-column DataFrame that shows one attribute from column attributes per row next to the associated itemName, for all suppliers in column supplier whose name includes Sports. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    Sample of DataFrame itemsDf:

    1.+------+----------------------------------+-----------------------------+-------------------+
    2.|itemId|itemName                          |attributes                   |supplier           |
    3.+------+----------------------------------+-----------------------------+-------------------+
    4.|1     |Thick Coat for Walking in the Snow|[blue, winter, cozy]         |Sports Company Inc.|
    5.|2     |Elegant Outdoors Summer Dress     |[red, summer, fresh, cooling]|YetiX              |
    6.|3     |Outdoors Backpack                 |[green, summer, travel]      |Sports Company Inc.|
    7.+------+----------------------------------+-----------------------------+-------------------+

    Code block:

    itemsDf.__1__(__2__).select(__3__, __4__)

    A. 1. filter
       2. col("supplier").isin("Sports")
       3. "itemName"
       4. explode(col("attributes"))

    B. 1. where
       2. col("supplier").contains("Sports")
       3. "itemName"
       4. "attributes"

    C. 1. where
       2. col(supplier).contains("Sports")
       3. explode(attributes)
       4. itemName

    D. 1. where
       2. "Sports".isin(col("Supplier"))
       3. "itemName"
       4. array_explode("attributes")

    E. 1. filter
       2. col("supplier").contains("Sports")
       3. "itemName"
       4. explode("attributes")

  • Question 13:

    Which of the following code blocks returns the number of unique values in column storeId of DataFrame transactionsDf?

    A. transactionsDf.select("storeId").dropDuplicates().count()

    B. transactionsDf.select(count("storeId")).dropDuplicates()

    C. transactionsDf.select(distinct("storeId")).count()

    D. transactionsDf.dropDuplicates().agg(count("storeId"))

    E. transactionsDf.distinct().select("storeId").count()

  • Question 14:

    Which of the following code blocks returns a 2-column DataFrame that shows the distinct values in column productId and the number of rows with that productId in DataFrame transactionsDf?

    A. transactionsDf.count("productId").distinct()

    B. transactionsDf.groupBy("productId").agg(col("value").count())

    C. transactionsDf.count("productId")

    D. transactionsDf.groupBy("productId").count()

    E. transactionsDf.groupBy("productId").select(count("value"))

  • Question 15:

    Which of the following code blocks sorts DataFrame transactionsDf both by column storeId in ascending and by column productId in descending order, in this priority?

    A. transactionsDf.sort("storeId", asc("productId"))

    B. transactionsDf.sort(col(storeId)).desc(col(productId))

    C. transactionsDf.order_by(col(storeId), desc(col(productId)))

    D. transactionsDf.sort("storeId", desc("productId"))

    E. transactionsDf.sort("storeId").sort(desc("productId"))

  • Question 16:

    The code block shown below should show information about the data type that column storeId of DataFrame transactionsDf contains. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    Code block:

    transactionsDf.__1__(__2__).__3__

    A. 1. select
       2. "storeId"
       3. print_schema()

    B. 1. limit
       2. 1
       3. columns

    C. 1. select
       2. "storeId"
       3. printSchema()

    D. 1. limit
       2. "storeId"
       3. printSchema()

    E. 1. select
       2. storeId
       3. dtypes

  • Question 17:

    Which of the following describes Spark actions?

    A. Writing data to disk is the primary purpose of actions.

    B. Actions are Spark's way of exchanging data between executors.

    C. The driver receives data upon request by actions.

    D. Stage boundaries are commonly established by actions.

    E. Actions are Spark's way of modifying RDDs.

  • Question 18:

    Which of the elements in the labeled panels represent the operation performed for broadcast variables?

    (Diagram with labeled panels not reproduced here.)

    A. 2, 5

    B. 3

    C. 2, 3

    D. 1, 2

    E. 1, 3, 4

  • Question 19:

    The code block displayed below contains an error. The code block should save DataFrame transactionsDf at path path as a parquet file, appending to any existing parquet file. Find the error.

    Code block:

    transactionsDf.format("parquet").option("mode", "append").save(path)

    A. The code block is missing a reference to the DataFrameWriter.

    B. save() is evaluated lazily and needs to be followed by an action.

    C. The mode option should be omitted so that the command uses the default mode.

    D. The code block is missing a bucketBy command that takes care of partitions.

    E. Given that the DataFrame should be saved as a parquet file, path is being passed to the wrong method.

  • Question 20:

    Which of the following code blocks applies the Python function to_limit on column predError in table transactionsDf, returning a DataFrame with columns transactionId and result?

    A. 1.spark.udf.register("LIMIT_FCN", to_limit)
       2.spark.sql("SELECT transactionId, LIMIT_FCN(predError) AS result FROM transactionsDf")

    B. 1.spark.udf.register("LIMIT_FCN", to_limit)
       2.spark.sql("SELECT transactionId, LIMIT_FCN(predError) FROM transactionsDf AS result")

    C. 1.spark.udf.register("LIMIT_FCN", to_limit)
       2.spark.sql("SELECT transactionId, to_limit(predError) AS result FROM transactionsDf")

    D. spark.sql("SELECT transactionId, udf(to_limit(predError)) AS result FROM transactionsDf")

    E. 1.spark.udf.register(to_limit, "LIMIT_FCN")
       2.spark.sql("SELECT transactionId, LIMIT_FCN(predError) AS result FROM transactionsDf")

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more important, and more and more enterprises require them when you apply for a job. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and where do you find the most reliable resources? Here on Vcedump.com you will find all the answers. Vcedump.com provides not only Databricks exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK exam preparation or your Databricks certification application, do not hesitate to visit Vcedump.com to find your solutions here.