Exam Details

  • Exam Code: DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK
  • Exam Name: Databricks Certified Associate Developer for Apache Spark 3.0
  • Certification: Databricks Certifications
  • Vendor: Databricks
  • Total Questions: 180 Q&As
  • Last Updated: Jul 02, 2025

Databricks Certifications: DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK Questions & Answers

  • Question 71:

    Which of the following code blocks returns a copy of DataFrame itemsDf where the column supplier has been renamed to manufacturer?

    A. itemsDf.withColumn(["supplier", "manufacturer"])

    B. itemsDf.withColumn("supplier").alias("manufacturer")

    C. itemsDf.withColumnRenamed("supplier", "manufacturer")

    D. itemsDf.withColumnRenamed(col("manufacturer"), col("supplier"))

    E. itemsDf.withColumnsRenamed("supplier", "manufacturer")
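
    For reference, withColumnRenamed() takes two strings, the existing column name and the new one, and returns a new DataFrame without modifying the original. A minimal sketch, assuming a SparkSession named spark and a small stand-in for itemsDf:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical stand-in for itemsDf
    itemsDf = spark.createDataFrame([(1, "Sports Company Inc.")], ["itemId", "supplier"])

    # Returns a copy of the DataFrame with "supplier" renamed to "manufacturer"
    renamedDf = itemsDf.withColumnRenamed("supplier", "manufacturer")
    renamedDf.printSchema()  # the column now appears as "manufacturer"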

  • Question 72:

The code block shown below should write DataFrame transactionsDf to disk at path csvPath as a single CSV file, using tabs (\t characters) as separators between columns, expressing missing values as the string n/a, and omitting a header row with column names. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    Code block:

    transactionsDf.__1__.write.__2__(__3__, "\t").__4__.__5__(csvPath)

A. 1. coalesce(1) 2. option 3. "sep" 4. option("header", True) 5. path

    B. 1. coalesce(1) 2. option 3. "colsep" 4. option("nullValue", "n/a") 5. path

    C. 1. repartition(1) 2. option 3. "sep" 4. option("nullValue", "n/a") 5. csv

    D. 1. csv 2. option 3. "sep" 4. option("emptyValue", "n/a") 5. path

    E. 1. repartition(1) 2. mode 3. "sep" 4. mode("nullValue", "n/a") 5. csv
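
    For orientation, a minimal sketch of the DataFrameWriter calls involved, assuming transactionsDf and csvPath exist as in the question:

    # coalesce(1) (or repartition(1)) collapses the DataFrame to one partition,
    # so only a single part file is written; the CSV writer omits the header
    # by default ("header" defaults to false).
    (transactionsDf
        .coalesce(1)
        .write
        .option("sep", "\t")          # tab as the column separator
        .option("nullValue", "n/a")   # write missing values as the string n/a
        .csv(csvPath))                # csv() sets the format and writes to the path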

  • Question 73:

The code block shown below should set the number of partitions that Spark uses when shuffling data for joins or aggregations (the spark.sql.shuffle.partitions property) to 100. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    Code block:

    __1__.__2__.__3__(__4__, 100)

A. 1. spark 2. conf 3. set 4. "spark.sql.shuffle.partitions"

    B. 1. pyspark 2. config 3. set 4. spark.shuffle.partitions

    C. 1. spark 2. conf 3. get 4. "spark.sql.shuffle.partitions"

    D. 1. pyspark 2. config 3. set 4. "spark.sql.shuffle.partitions"

    E. 1. spark 2. conf 3. set 4. "spark.sql.aggregate.partitions"

  • Question 74:

    Which of the following DataFrame operators is never classified as a wide transformation?

    A. DataFrame.sort()

    B. DataFrame.aggregate()

    C. DataFrame.repartition()

    D. DataFrame.select()

    E. DataFrame.join()
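
    As background, narrow transformations such as select() operate within each partition, while sorts, joins, repartitioning, and aggregations can move rows between partitions. The distinction is visible in the physical plan, where wide transformations introduce an Exchange (shuffle) operator; a small sketch, assuming a SparkSession named spark:

    df = spark.range(10)
    df.select("id").explain()  # no Exchange operator in the plan
    df.sort("id").explain()    # the plan contains an Exchange (shuffle)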

  • Question 75:

    Which of the following describes a shuffle?

    A. A shuffle is a process that is executed during a broadcast hash join.

    B. A shuffle is a process that compares data across executors.

    C. A shuffle is a process that compares data across partitions.

    D. A shuffle is a Spark operation that results from DataFrame.coalesce().

    E. A shuffle is a process that allocates partitions to executors.
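
    As an illustration, a grouped aggregation needs all rows with the same key in the same partition, so Spark redistributes rows across partitions; coalesce(), by contrast, only merges existing partitions locally. A small sketch, assuming a SparkSession named spark:

    # The plan for a groupBy/count shows an Exchange hashpartitioning step,
    # i.e. a shuffle that moves rows between partitions.
    spark.range(100).groupBy("id").count().explain()

    # coalesce() reduces the partition count without triggering a shuffle.
    print(spark.range(100, numPartitions=8).coalesce(2).rdd.getNumPartitions())  # -> 2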

  • Question 76:

    The code block shown below should return a copy of DataFrame transactionsDf without columns value and productId and with an additional column associateId that has the value 5. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    transactionsDf.__1__(__2__, __3__).__4__(__5__, 'value')

A. 1. withColumn 2. 'associateId' 3. 5 4. remove 5. 'productId'

    B. 1. withNewColumn 2. associateId 3. lit(5) 4. drop 5. productId

    C. 1. withColumn 2. 'associateId' 3. lit(5) 4. drop 5. 'productId'

    D. 1. withColumnRenamed 2. 'associateId' 3. 5 4. drop 5. 'productId'

    E. 1. withColumn 2. col(associateId) 3. lit(5) 4. drop 5. col(productId)
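
    For reference, withColumn() expects a column name plus a Column expression, so a constant has to be wrapped in lit(); drop() accepts several column names at once. A minimal sketch, assuming transactionsDf as in the question:

    from pyspark.sql.functions import lit

    # Add a constant column, then drop two existing columns in one call
    resultDf = transactionsDf.withColumn('associateId', lit(5)).drop('productId', 'value')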

  • Question 77:

    The code block shown below should return all rows of DataFrame itemsDf that have at least 3 items in column itemNameElements. Choose the answer that correctly fills the blanks in the code block to accomplish this.

    Example of DataFrame itemsDf:

+------+----------------------------------+-------------------+------------------------------------------+
    |itemId|itemName                          |supplier           |itemNameElements                          |
    +------+----------------------------------+-------------------+------------------------------------------+
    |1     |Thick Coat for Walking in the Snow|Sports Company Inc.|[Thick, Coat, for, Walking, in, the, Snow]|
    |2     |Elegant Outdoors Summer Dress     |YetiX              |[Elegant, Outdoors, Summer, Dress]        |
    |3     |Outdoors Backpack                 |Sports Company Inc.|[Outdoors, Backpack]                      |
    +------+----------------------------------+-------------------+------------------------------------------+

    Code block:

    itemsDf.__1__(__2__(__3__)__4__)

A. 1. select 2. count 3. col("itemNameElements") 4. >3

    B. 1. filter 2. count 3. itemNameElements 4. >=3

    C. 1. select 2. count 3. "itemNameElements" 4. >3

    D. 1. filter 2. size 3. "itemNameElements" 4. >=3

    E. 1. select 2. size 3. "itemNameElements" 4. >3
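
    As background, size() returns the number of elements in an array column, and filter() keeps only the rows where a boolean Column is true (select() would merely project a column). A minimal sketch, assuming itemsDf as in the question:

    from pyspark.sql.functions import size

    # Keep rows whose itemNameElements array has at least 3 entries
    itemsDf.filter(size("itemNameElements") >= 3).show()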

  • Question 78:

    In which order should the code blocks shown below be run in order to create a DataFrame that shows the mean of column predError of DataFrame transactionsDf per column storeId and productId, where productId should be either 2 or 3 and the returned DataFrame should be sorted in ascending order by column storeId, leaving out any nulls in that column?

    DataFrame transactionsDf:

+-------------+---------+-----+-------+---------+----+
    |transactionId|predError|value|storeId|productId|   f|
    +-------------+---------+-----+-------+---------+----+
    |            1|        3|    4|     25|        1|null|
    |            2|        6|    7|      2|        2|null|
    |            3|        3| null|     25|        3|null|
    |            4|     null| null|      3|        2|null|
    |            5|     null| null|   null|        2|null|
    |            6|        3|    2|     25|        2|null|
    +-------------+---------+-----+-------+---------+----+

1. .mean("predError")

    2. .groupBy("storeId")

    3. .orderBy("storeId")

    4. transactionsDf.filter(transactionsDf.storeId.isNotNull())

    5. .pivot("productId", [2, 3])

    A. 4, 5, 2, 3, 1

    B. 4, 2, 1

    C. 4, 1, 5, 2, 3

    D. 4, 2, 5, 1, 3

    E. 4, 3, 2, 5, 1
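
    As orientation for chaining these fragments: pivot() is only defined on the GroupedData object returned by groupBy(), an aggregation such as mean() must follow the pivot, and sorting applies to the final DataFrame. A sketch of one valid chain, assuming transactionsDf as in the question:

    (transactionsDf
        .filter(transactionsDf.storeId.isNotNull())  # drop rows with null storeId
        .groupBy("storeId")                          # group per store
        .pivot("productId", [2, 3])                  # one column per productId 2 and 3
        .mean("predError")                           # mean of predError per cell
        .orderBy("storeId"))                         # ascending by storeId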

  • Question 79:

    Which of the following describes slots?

    A. Slots are dynamically created and destroyed in accordance with an executor's workload.

    B. To optimize I/O performance, Spark stores data on disk in multiple slots.

    C. A Java Virtual Machine (JVM) working as an executor can be considered as a pool of slots for task execution.

D. A slot is always limited to a single core.

    E. Slots are the communication interface for executors and are used for receiving commands and sending results to the driver.

  • Question 80:

    Which of the following describes Spark's standalone deployment mode?

    A. Standalone mode uses a single JVM to run Spark driver and executor processes.

    B. Standalone mode means that the cluster does not contain the driver.

    C. Standalone mode is how Spark runs on YARN and Mesos clusters.

    D. Standalone mode uses only a single executor per worker per application.

    E. Standalone mode is a viable solution for clusters that run multiple frameworks, not only Spark.

Tips on How to Prepare for the Exams

Nowadays, certification exams have become increasingly important and are required by more and more enterprises when applying for a job. But how can you prepare for an exam effectively? How can you prepare in a short time and with less effort? How can you achieve an ideal result, and where can you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Databricks exam questions, answers, and explanations, but also complete assistance with your exam preparation and certification application. If you are unsure about your DATABRICKS-CERTIFIED-ASSOCIATE-DEVELOPER-FOR-APACHE-SPARK exam preparation or your Databricks certification application, do not hesitate to visit Vcedump.com to find your solutions.