In which order should the code blocks shown below be run to assign to articlesDf a DataFrame that lists all items in column attributes, ordered by the number of times these items occur, from most to least often?
1. articlesDf = articlesDf.groupby("col")
2. articlesDf = articlesDf.select(explode(col("attributes")))
3. articlesDf = articlesDf.orderBy("count").select("col")
4. articlesDf = articlesDf.sort("count", ascending=False).select("col")
5. articlesDf = articlesDf.groupby("col").count()
A. 4, 5
B. 2, 5, 3
C. 5, 2
D. 2, 3, 4
E. 2, 5, 4
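For reference, here is a minimal sketch (assuming a hypothetical articlesDf with an array column attributes) of how the explode, groupby/count, and sort blocks above combine to produce the described result. Note that explode() names its output column "col" by default, which is why the blocks group and select on "col":

from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, col

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample data: each row carries an array of attribute strings.
articlesDf = spark.createDataFrame([(["wool", "red"],), (["wool"],)], ["attributes"])

articlesDf = articlesDf.select(explode(col("attributes")))            # block 2
articlesDf = articlesDf.groupby("col").count()                        # block 5
articlesDf = articlesDf.sort("count", ascending=False).select("col")  # block 4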
Which of the following code blocks creates a new DataFrame with two columns season and wind_speed_ms where column season is of data type string and column wind_speed_ms is of data type double?
A. spark.DataFrame({"season": ["winter","summer"], "wind_speed_ms": [4.5, 7.5]})
B. spark.createDataFrame([("summer", 4.5), ("winter", 7.5)], ["season", "wind_speed_ms"])
C. from pyspark.sql import types as T
   spark.createDataFrame((("summer", 4.5), ("winter", 7.5)),
   T.StructType([T.StructField("season", T.CharType()), T.StructField("season", T.DoubleType())]))
D. spark.newDataFrame([("summer", 4.5), ("winter", 7.5)], ["season", "wind_speed_ms"])
E. spark.createDataFrame({"season": ["winter","summer"], "wind_speed_ms": [4.5, 7.5]})
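As a quick sanity check, createDataFrame infers the requested types from a list of tuples plus a list of column names; a minimal sketch, assuming an active SparkSession named spark:

df = spark.createDataFrame([("summer", 4.5), ("winter", 7.5)], ["season", "wind_speed_ms"])
df.printSchema()
# root
#  |-- season: string (nullable = true)
#  |-- wind_speed_ms: double (nullable = true)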
Which of the following code blocks creates a new DataFrame with 3 columns, productId, highest, and
lowest, that shows the biggest and smallest values of column value per value in column
productId from DataFrame transactionsDf?
Sample of DataFrame transactionsDf:
+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            1|        3|    4|     25|        1|null|
|            2|        6|    7|      2|        2|null|
|            3|        3| null|     25|        3|null|
|            4|     null| null|      3|        2|null|
|            5|     null| null|   null|        2|null|
|            6|        3|    2|     25|        2|null|
+-------------+---------+-----+-------+---------+----+
A. transactionsDf.max('value').min('value')
B. transactionsDf.agg(max('value').alias('highest'), min('value').alias('lowest'))
C. transactionsDf.groupby(col(productId)).agg(max(col(value)).alias("highest"), min(col(value)).alias("lowest"))
D. transactionsDf.groupby('productId').agg(max('value').alias('highest'), min('value').alias('lowest'))
E. transactionsDf.groupby("productId").agg({"highest": max("value"), "lowest": min("value")})
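For reference, a sketch of the groupby/agg pattern with aliased aggregates, assuming transactionsDf as sampled above. Note that agg()'s dictionary form maps a column name to an aggregate function name (e.g. {"value": "max"}) and cannot assign aliases, and that pyspark's max/min are imported under aliases here to avoid shadowing Python's built-ins:

from pyspark.sql.functions import max as max_, min as min_

transactionsDf.groupby("productId").agg(
    max_("value").alias("highest"),
    min_("value").alias("lowest"),
)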
Which of the following code blocks stores DataFrame itemsDf in executor memory and, if insufficient memory is available, serializes it and saves it to disk?
A. itemsDf.persist(StorageLevel.MEMORY_ONLY)
B. itemsDf.cache(StorageLevel.MEMORY_AND_DISK)
C. itemsDf.store()
D. itemsDf.cache()
E. itemsDf.write.option('destination', 'memory').save()
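For context, DataFrame.cache() takes no arguments, while persist() accepts an explicit StorageLevel; a minimal sketch, assuming itemsDf exists:

from pyspark import StorageLevel

# cache() uses the default storage level, which keeps partitions in memory
# and spills them (serialized) to disk when executor memory runs short.
itemsDf.cache()

# The same behavior can be requested explicitly:
# itemsDf.persist(StorageLevel.MEMORY_AND_DISK)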
The code block displayed below contains an error. The code block should produce a DataFrame with color
as the only column and three rows with color values of red, blue, and green, respectively.
Find the error.
Code block:
spark.createDataFrame([("red",), ("blue",), ("green",)], "color")
A. Instead of calling spark.createDataFrame, just DataFrame should be called.
B. The commas in the tuples with the colors should be eliminated.
C. The colors red, blue, and green should be expressed as a simple Python list, and not a list of tuples.
D. Instead of color, a data type should be specified.
E. The "color" expression needs to be wrapped in brackets, so it reads ["color"].
Which of the following code blocks returns a DataFrame showing the mean value of column "value" of DataFrame transactionsDf, grouped by its column storeId?
A. transactionsDf.groupBy(col(storeId).avg())
B. transactionsDf.groupBy("storeId").avg(col("value"))
C. transactionsDf.groupBy("storeId").agg(avg("value"))
D. transactionsDf.groupBy("storeId").agg(average("value"))
E. transactionsDf.groupBy("value").average()
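For reference, a minimal sketch, assuming transactionsDf as before. pyspark.sql.functions provides avg() but no average(), and GroupedData's shortcut methods such as avg() accept column names as strings rather than Column objects:

from pyspark.sql.functions import avg

transactionsDf.groupBy("storeId").agg(avg("value"))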
The code block shown below should return a copy of DataFrame transactionsDf with an added column
cos. This column should have the values in column value converted to degrees and having the cosine of
those converted values taken, rounded to two decimals. Choose the answer that correctly fills the blanks in
the code block to accomplish this.
Code block:
transactionsDf.__1__(__2__, round(__3__(__4__(__5__)),2))
A. 1. withColumn 2. col("cos") 3. cos 4. degrees 5. transactionsDf.value
B. 1. withColumnRenamed 2. "cos" 3. cos 4. degrees 5. "transactionsDf.value"
C. 1. withColumn 2. "cos" 3. cos 4. degrees 5. transactionsDf.value
D. 1. withColumn 2. col("cos") 3. cos 4. degrees 5. col("value")
E. 1. withColumn 2. "cos" 3. degrees 4. cos 5. col("value")
The code block displayed below contains multiple errors. The code block should return a DataFrame that
contains only columns transactionId, predError, value and storeId of DataFrame
transactionsDf. Find the errors.
Code block:
transactionsDf.select([col(productId), col(f)])
Sample of transactionsDf:
+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            1|        3|    4|     25|        1|null|
|            2|        6|    7|      2|        2|null|
|            3|        3| null|     25|        3|null|
+-------------+---------+-----+-------+---------+----+
A. The column names should be listed directly as arguments to the operator and not as a list.
B. The select operator should be replaced by a drop operator, the column names should be listed directly as arguments to the operator and not as a list, and all column names should be expressed as strings without being wrapped in a col() operator.
C. The select operator should be replaced by a drop operator.
D. The column names should be listed directly as arguments to the operator and not as a list and following the pattern of how column names are expressed in the code block, columns productId and f should be replaced by transactionId, predError, value and storeId.
E. The select operator should be replaced by a drop operator, the column names should be listed directly as arguments to the operator and not as a list, and all col() operators should be removed.
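For reference, two working variants, assuming transactionsDf as sampled above; both select() and drop() accept plain column-name strings as direct arguments:

# Keep the wanted columns explicitly:
transactionsDf.select("transactionId", "predError", "value", "storeId")

# Or drop the unwanted ones:
transactionsDf.drop("productId", "f")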
Which of the following code blocks returns all unique values across all values in columns value and productId in DataFrame transactionsDf in a one-column DataFrame?
A. transactionsDf.select('value').join(transactionsDf.select('productId'), col('value')==col('productId'), 'outer')
B. transactionsDf.select(col('value'), col('productId')).agg({'*': 'count'})
C. transactionsDf.select('value', 'productId').distinct()
D. transactionsDf.select('value').union(transactionsDf.select('productId')).distinct()
E. transactionsDf.agg({'value': 'collect_set', 'productId': 'collect_set'})
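For reference, a minimal sketch of the union/distinct pattern, assuming transactionsDf as before. union() stacks two single-column DataFrames by position, and distinct() then removes duplicates across both:

(transactionsDf.select("value")
    .union(transactionsDf.select("productId"))
    .distinct())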
The code block shown below should store DataFrame transactionsDf on two different executors, utilizing the executors' memory as much as possible, but not writing anything to disk. Choose the answer that correctly fills the blanks in the code block to accomplish this.
Code block:
from pyspark import StorageLevel
transactionsDf.__1__(StorageLevel.__2__).__3__
A. 1. cache 2. MEMORY_ONLY_2 3. count()
B. 1. persist 2. DISK_ONLY_2 3. count()
C. 1. persist 2. MEMORY_ONLY_2 3. select()
D. 1. cache 2. DISK_ONLY_2 3. count()
E. 1. persist 2. MEMORY_ONLY_2 3. count()
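For reference, a sketch of the filled-in code block. The _2 suffix on a storage level replicates each cached partition on two executors, MEMORY_ONLY never spills to disk, and because persist() is lazy an action such as count() is needed to actually materialize the cache:

from pyspark import StorageLevel

transactionsDf.persist(StorageLevel.MEMORY_ONLY_2).count()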