Exam Details

  • Exam Code: CCA175
  • Exam Name: CCA Spark and Hadoop Developer Exam
  • Certification: Cloudera Certified Associate (CCA)
  • Vendor: Cloudera
  • Total Questions: 95 Q&As
  • Last Updated: May 12, 2024

Cloudera Certified Associate (CCA) CCA175 Questions & Answers

  • Question 11:

    Problem Scenario 13: You have been given the following MySQL database details as well as other info.

    user=retail_dba
    password=cloudera
    database=retail_db
    jdbc URL = jdbc:mysql://quickstart:3306/retail_db

    Please accomplish the following.

    1. Create a table in retail_db with the following definition:

    CREATE TABLE departments_export (department_id int(11), department_name varchar(45), created_date TIMESTAMP DEFAULT NOW());

    2. Now import the data from the following directory into the departments_export table:

    /user/cloudera/departments_new
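    A hedged sketch of one possible solution (a sketch, not the official answer key): the table is created from the mysql client first, then sqoop export loads the HDFS directory into it. The connection details come from the scenario; the --batch flag is a judgment call.

    # Step 1: create the target table (inside the mysql client:
    # mysql -u retail_dba -pcloudera retail_db)
    CREATE TABLE departments_export (department_id int(11), department_name varchar(45), created_date TIMESTAMP DEFAULT NOW());

    # Step 2: export the HDFS directory into the new table
    sqoop export \
      --connect jdbc:mysql://quickstart:3306/retail_db \
      --username retail_dba \
      --password cloudera \
      --table departments_export \
      --export-dir /user/cloudera/departments_new \
      --batch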

  • Question 12:

    Problem Scenario 38: You have been given the RDD below.

    val rdd: RDD[Array[Byte]]

    Now you have to save this RDD as a SequenceFile; below is the code snippet.

    import org.apache.hadoop.io.compress.GzipCodec

    rdd.map(bytesArray => (A.get(), new B(bytesArray))).saveAsSequenceFile("/output/path", classOf[GzipCodec])

    What would be the correct replacement for A and B in the above snippet?
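    A hedged sketch: for an RDD of raw byte arrays, A is typically NullWritable (there is no meaningful key) and B is BytesWritable. Note that in Spark's Scala API the codec parameter of saveAsSequenceFile is an Option, so it is wrapped in Some below.

    import org.apache.hadoop.io.{BytesWritable, NullWritable}
    import org.apache.hadoop.io.compress.GzipCodec

    // A = NullWritable, B = BytesWritable
    rdd.map(bytesArray => (NullWritable.get(), new BytesWritable(bytesArray)))
       .saveAsSequenceFile("/output/path", Some(classOf[GzipCodec]))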

  • Question 13:

    Problem Scenario 43: You have been given the following code snippet.

    val grouped = sc.parallelize(Seq(((1, "two"), List((3, 4), (5, 6)))))

    val flattened = grouped.flatMap { A =>
      groupValues.map { value => B }
    }

    You need to generate the following output; hence replace A and B.

    Array((1,two,3,4), (1,two,5,6))
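    A hedged sketch, run in the spark-shell where grouped is defined as above: A destructures each element into its key pair and its list, and B glues the key fields onto each value pair.

    // A = ((key, num), groupValues); B = (key, num, value._1, value._2)
    val flattened = grouped.flatMap { case ((key, num), groupValues) =>
      groupValues.map { value => (key, num, value._1, value._2) }
    }

    flattened.collect  // Array((1,two,3,4), (1,two,5,6))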

  • Question 14:

    Problem Scenario 75: You have been given a MySQL DB with the following details.

    user=retail_dba
    password=cloudera
    database=retail_db
    table=retail_db.orders
    table=retail_db.order_items
    jdbc URL = jdbc:mysql://quickstart:3306/retail_db

    Please accomplish the following activities (a sketch of one solution follows the column list).

    1. Copy the "retail_db.order_items" table to HDFS into its respective directory, p90_order_items.

    2. Compute the sum of the entire revenue in this table using pyspark.

    3. Find the maximum and minimum revenue as well.

    4. Calculate the average revenue.

    Columns of the order_items table: (order_item_id, order_item_order_id, order_item_product_id, order_item_quantity, order_item_subtotal, order_item_product_price)
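    A hedged sketch, assuming "revenue" means the order_item_subtotal column (field index 4 in the comma-delimited import) and that the relative target directory resolves under the user's HDFS home.

    sqoop import \
      --connect jdbc:mysql://quickstart:3306/retail_db \
      --username retail_dba \
      --password cloudera \
      --table order_items \
      --target-dir p90_order_items

    # Then, in the pyspark shell:
    orderItems = sc.textFile("p90_order_items")
    revenue = orderItems.map(lambda line: float(line.split(",")[4]))
    print(revenue.sum())   # total revenue
    print(revenue.max())   # maximum revenue
    print(revenue.min())   # minimum revenue
    print(revenue.mean())  # average revenue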

  • Question 15:

    Problem Scenario 60: You have been given the below code snippet.

    val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)
    val b = a.keyBy(_.length)
    val c = sc.parallelize(List("dog", "cat", "gnu", "salmon", "rabbit", "turkey", "wolf", "bear", "bee"), 3)
    val d = c.keyBy(_.length)

    operation1

    Write a correct code snippet for operation1 which will produce the desired output, shown below.

    Array[(Int, (String, String))] = Array((6,(salmon,salmon)), (6,(salmon,rabbit)),
    (6,(salmon,turkey)), (6,(salmon,salmon)), (6,(salmon,rabbit)),
    (6,(salmon,turkey)), (3,(dog,dog)), (3,(dog,cat)), (3,(dog,gnu)), (3,(dog,bee)), (3,(rat,dog)),
    (3,(rat,cat)), (3,(rat,gnu)), (3,(rat,bee)))
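    A hedged sketch: the output pairs every element of b with every element of d that shares the same length key, which is exactly an inner join of the two keyed RDDs.

    // operation1: inner join on the length key, then collect
    b.join(d).collect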

  • Question 16:

    Problem Scenario 86: In continuation of the previous question, please accomplish the following activities (a sketch of one solution follows this list).

    1. Select the maximum, minimum, average, standard deviation, and total quantity.

    2. Select the minimum and maximum price for each product code.

    3. Select the maximum, minimum, average, standard deviation, and total quantity for each product code; however, make sure the average and standard deviation have at most two decimal places.

    4. Select all the product codes and the average price, only where the product count is greater than or equal to 3.

    5. Select the maximum, minimum, average, and total of all the products for each code. Also produce the same across all the products.
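    A hedged sketch, assuming (since the previous question is not shown here) a registered temp table named products with columns productID, productCode, name, quantity, and price, and a Hive-backed sqlContext; the table name, column names, and the choice of price for item 5 are all assumptions.

    // 1. Aggregates over the whole table
    sqlContext.sql("SELECT MAX(quantity), MIN(quantity), AVG(quantity), STDDEV(quantity), SUM(quantity) FROM products").show()

    // 2. Min and max price per product code
    sqlContext.sql("SELECT productCode, MIN(price), MAX(price) FROM products GROUP BY productCode").show()

    // 3. Per-code aggregates, rounding average and stddev to two decimals
    sqlContext.sql("SELECT productCode, MAX(quantity), MIN(quantity), ROUND(AVG(quantity), 2), ROUND(STDDEV(quantity), 2), SUM(quantity) FROM products GROUP BY productCode").show()

    // 4. Average price only for codes with at least three products
    sqlContext.sql("SELECT productCode, AVG(price) FROM products GROUP BY productCode HAVING COUNT(productID) >= 3").show()

    // 5. Per-code aggregates plus a grand-total row across all products
    sqlContext.sql("SELECT productCode, MAX(price), MIN(price), AVG(price), SUM(price) FROM products GROUP BY productCode WITH ROLLUP").show()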

  • Question 17:

    Problem Scenario 92: You have been given a Spark Scala application, which is bundled in a jar named hadoopexam.jar.

    Your application class name is com.hadoopexam.MyTask.

    You want that, while submitting, your application should launch a driver on one of the cluster nodes.

    Please complete the following command to submit the application.

    spark-submit XXX --master yarn \
    YYY $SPARK_HOME/lib/hadoopexam.jar 10
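    A hedged sketch: launching the driver on a cluster node means cluster deploy mode, so XXX supplies the main class and YYY sets the deploy mode.

    spark-submit --class com.hadoopexam.MyTask \
      --master yarn \
      --deploy-mode cluster \
      $SPARK_HOME/lib/hadoopexam.jar 10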

  • Question 18:

    Problem Scenario 94: You have to run your Spark application on YARN with each executor having 20GB of memory, and the number of executors should be 50. Please replace XXX, YYY, ZZZ.

    export HADOOP_CONF_DIR=XXX
    ./bin/spark-submit \
    --class com.hadoopexam.MyTask \
    XXX \
    --deploy-mode cluster \  # can be client for client mode
    YYY \
    ZZZ \
    /path/to/hadoopexam.jar \
    1000
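    A hedged sketch: XXX is --master yarn (with HADOOP_CONF_DIR pointing at the cluster's Hadoop configuration; /etc/hadoop/conf below is an assumed path), YYY is --num-executors 50, and ZZZ is --executor-memory 20G.

    # Assumed location of the Hadoop configuration directory
    export HADOOP_CONF_DIR=/etc/hadoop/conf

    ./bin/spark-submit \
      --class com.hadoopexam.MyTask \
      --master yarn \
      --deploy-mode cluster \
      --num-executors 50 \
      --executor-memory 20G \
      /path/to/hadoopexam.jar \
      1000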

  • Question 19:

    Problem Scenario 73: You have been given data in JSON format as below.

    {"first_name":"Ankit", "last_name":"Jain"}

    {"first_name":"Amir", "last_name":"Khan"}

    {"first_name":"Rajesh", "last_name":"Khanna"}

    {"first_name":"Priynka", "last_name":"Chopra"}

    {"first_name":"Kareena", "last_name":"Kapoor"}

    {"first_name":"Lokesh", "last_name":"Yadav"}

    Do the following activities (a sketch of one solution follows this list).

    1. Create an employee.json file locally.

    2. Load this file onto HDFS.

    3. Register this data as a temp table in Spark using Python.

    4. Write a select query and print this data.

    5. Now save this selected data back in JSON format.
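    A hedged sketch in the Spark 1.x pyspark shell (SQLContext-style API); the HDFS paths, including the output directory, are assumptions.

    # Local shell: save the six records above into employee.json, then push it to HDFS
    hdfs dfs -put employee.json /user/cloudera/employee.json

    # In pyspark:
    employee = sqlContext.read.json("/user/cloudera/employee.json")
    employee.registerTempTable("employee")

    result = sqlContext.sql("SELECT first_name, last_name FROM employee")
    for row in result.collect():
        print(row)

    # Save the selected data back in JSON format
    result.write.json("/user/cloudera/employee_json_out")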

  • Question 20:

    Problem Scenario 10: You have been given the following MySQL database details as well as other info.

    user=retail_dba
    password=cloudera
    database=retail_db
    jdbc URL = jdbc:mysql://quickstart:3306/retail_db

    Please accomplish the following.

    1. Create a database named hadoopexam and then create a table named departments in it, with the following fields: department_id int, department_name string.

    e.g. the location should be hdfs://quickstart.cloudera:8020/user/hive/warehouse/hadoopexam.db/departments

    2. Import data into the existing table created above, from retail_db.departments into the Hive table hadoopexam.departments.

    3. Import data into a non-existing table, meaning that while importing you create a Hive table named hadoopexam.departments_new.
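    A hedged sketch: the Hive DDL followed by two sqoop imports, one into the pre-created table and one that creates the Hive table on the fly.

    # Step 1: in the hive shell
    CREATE DATABASE hadoopexam;
    CREATE TABLE hadoopexam.departments (department_id int, department_name string);

    # Step 2: import into the existing Hive table
    sqoop import \
      --connect jdbc:mysql://quickstart:3306/retail_db \
      --username retail_dba --password cloudera \
      --table departments \
      --hive-import --hive-table hadoopexam.departments

    # Step 3: import while creating hadoopexam.departments_new
    sqoop import \
      --connect jdbc:mysql://quickstart:3306/retail_db \
      --username retail_dba --password cloudera \
      --table departments \
      --hive-import --create-hive-table --hive-table hadoopexam.departments_new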

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more and more important and are required by more and more enterprises when applying for a job. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Cloudera exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are confused about your CCA175 exam preparation or your Cloudera certification application, do not hesitate to visit Vcedump.com to find your solutions.