Exam Details

  • Exam Code: CCA175
  • Exam Name: CCA Spark and Hadoop Developer Exam
  • Certification: Cloudera Certified Associate (CCA)
  • Vendor: Cloudera
  • Total Questions: 95 Q&As
  • Last Updated: May 12, 2024

Cloudera Certified Associate (CCA) CCA175 Questions & Answers

  • Question 21:

    Problem Scenario 88 : You have been given the three files below (create them in HDFS).

    product.csv
    productID,productCode,name,quantity,price,supplierid
    1001,PEN,Pen Red,5000,1.23,501
    1002,PEN,Pen Blue,8000,1.25,501
    1003,PEN,Pen Black,2000,1.25,501
    1004,PEC,Pencil 2B,10000,0.48,502
    1005,PEC,Pencil 2H,8000,0.49,502
    1006,PEC,Pencil HB,0,9999.99,502
    2001,PEC,Pencil 3B,500,0.52,501
    2002,PEC,Pencil 4B,200,0.62,501
    2003,PEC,Pencil 5B,100,0.73,501
    2004,PEC,Pencil 6B,500,0.47,502

    supplier.csv
    supplierid,name,phone
    501,ABC Traders,88881111
    502,XYZ Company,88882222
    503,QQ Corp,88883333

    products_suppliers.csv
    productID,supplierID
    2001,501
    2002,501
    2003,501
    2004,502
    2001,503

    Now accomplish all the queries given in the solution.

    1. The same product can be supplied by multiple suppliers. For each product, find its price according to each supplier.

    2. Find all the supplier names who are supplying 'Pencil 3B'.

    3. Find all the products which are supplied by ABC Traders.
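
    A possible solution sketch in Spark SQL (the HDFS paths, the SparkSession named `spark`, and the header-based CSV reader are assumptions; equivalent RDD or Hive approaches also work):

    ```scala
    // Load the three files (paths are assumptions) with their header rows
    val products = spark.read.option("header", "true").csv("/user/cloudera/product.csv")
    val suppliers = spark.read.option("header", "true").csv("/user/cloudera/supplier.csv")
    val ps = spark.read.option("header", "true").csv("/user/cloudera/products_suppliers.csv")

    products.createOrReplaceTempView("products")
    suppliers.createOrReplaceTempView("suppliers")
    ps.createOrReplaceTempView("products_suppliers")

    // 1. Each product with its price, per supplier
    spark.sql("""SELECT p.name, p.price, s.name AS supplier
                 FROM products_suppliers ps
                 JOIN products p ON ps.productID = p.productID
                 JOIN suppliers s ON ps.supplierID = s.supplierid""").show()

    // 2. Suppliers of 'Pencil 3B'
    spark.sql("""SELECT s.name
                 FROM products_suppliers ps
                 JOIN products p ON ps.productID = p.productID
                 JOIN suppliers s ON ps.supplierID = s.supplierid
                 WHERE p.name = 'Pencil 3B'""").show()

    // 3. Products supplied by 'ABC Traders'
    spark.sql("""SELECT p.name
                 FROM products_suppliers ps
                 JOIN products p ON ps.productID = p.productID
                 JOIN suppliers s ON ps.supplierID = s.supplierid
                 WHERE s.name = 'ABC Traders'""").show()
    ```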

  • Question 22:

    Problem Scenario 59 : You have been given the code snippet below.

    val x = sc.parallelize(1 to 20)

    val y = sc.parallelize(10 to 30)

    operation1

    z.collect

    Write a correct code snippet for operation1 which will produce the desired output, shown below.

    Array[Int] = Array(16, 12, 20, 13, 17, 14, 18, 10, 19, 15, 11)
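
    One operation that yields the shown output is RDD `intersection` (the ordering of the collected array is not deterministic):

    ```scala
    val x = sc.parallelize(1 to 20)
    val y = sc.parallelize(10 to 30)
    val z = x.intersection(y)  // elements present in both RDDs, deduplicated
    z.collect                  // the values 10 through 20, in arbitrary order
    ```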

  • Question 23:

    Problem Scenario 95 : You have to run your Spark application on YARN with each executor's maximum heap size set to 512MB, the number of processor cores to allocate on each executor set to 1, and your main application requiring three input arguments V1 V2 V3. Please replace XXX, YYY, ZZZ.

    ./bin/spark-submit --class com.hadoopexam.MyTask --master yarn-cluster --num-executors 3 --driver-memory 512m XXX YYY lib/hadoopexam.jar ZZZ
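
    A likely completion, using standard spark-submit options: XXX is the executor heap size, YYY the per-executor core count, and ZZZ the three application arguments.

    ```shell
    # --executor-memory sets the executor heap, --executor-cores the core count,
    # and the trailing V1 V2 V3 are the application arguments.
    ./bin/spark-submit --class com.hadoopexam.MyTask \
      --master yarn-cluster \
      --num-executors 3 \
      --driver-memory 512m \
      --executor-memory 512m \
      --executor-cores 1 \
      lib/hadoopexam.jar V1 V2 V3
    ```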

  • Question 24:

    Problem Scenario 93 : You have to run your Spark application locally with 8 threads, i.e. locally on 8 cores. Replace XXX with the correct value.

    spark-submit --class com.hadoopexam.MyTask XXX \
      --deploy-mode cluster $SPARK_HOME/lib/hadoopexam.jar 10
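
    Running locally on 8 threads means XXX should be `--master local[8]`, one of the standard spark-submit master URLs; a sketch of the resulting command:

    ```shell
    # local[8] runs the application in-process with 8 worker threads
    spark-submit --class com.hadoopexam.MyTask \
      --master local[8] \
      $SPARK_HOME/lib/hadoopexam.jar 10
    ```

    Note that `--deploy-mode cluster` is not compatible with a `local` master, so it is omitted here.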

  • Question 25:

    Problem Scenario 89 : You have been given the patient data below in CSV format.

    patientID,name,dateOfBirth,lastVisitDate
    1001,Ah Teck,1991-12-31,2012-01-20
    1002,Kumar,2011-10-29,2012-09-20
    1003,Ali,2011-01-30,2012-10-21

    Accomplish the following activities.

    1. Find all the patients whose lastVisitDate is between the current date and '2012-09-15'.

    2. Find all the patients who were born in 2011.

    3. Find each patient's age.

    4. List patients whose last visit was more than 60 days ago.

    5. Select patients 18 years old or younger.
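
    A sketch using Spark SQL date functions, assuming the CSV has been loaded into a temporary view named `patients` with `dateOfBirth` and `lastVisitDate` parsed as DATE columns (the view name and column types are assumptions):

    ```scala
    // 1. lastVisitDate between '2012-09-15' and the current date
    spark.sql("""SELECT * FROM patients
                 WHERE lastVisitDate BETWEEN '2012-09-15' AND current_date()""").show()

    // 2. Patients born in 2011
    spark.sql("SELECT * FROM patients WHERE year(dateOfBirth) = 2011").show()

    // 3. Age of each patient (approximate, via day count)
    spark.sql("""SELECT name, floor(datediff(current_date(), dateOfBirth) / 365.25) AS age
                 FROM patients""").show()

    // 4. Last visit more than 60 days ago
    spark.sql("""SELECT name FROM patients
                 WHERE datediff(current_date(), lastVisitDate) > 60""").show()

    // 5. Patients 18 years old or younger
    spark.sql("""SELECT * FROM patients
                 WHERE datediff(current_date(), dateOfBirth) <= 18 * 365.25""").show()
    ```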

  • Question 26:

    Problem Scenario 58 : You have been given the code snippet below.

    val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 2)

    val b = a.keyBy(_.length)

    operation1

    Write a correct code snippet for operation1 which will produce the desired output, shown below.

    Array[(Int, Seq[String])] = Array((4,ArrayBuffer(lion)), (6,ArrayBuffer(spider)),
    (3,ArrayBuffer(dog, cat)), (5,ArrayBuffer(tiger, eagle)))
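
    Grouping the keyed RDD's values by key produces exactly this shape, so operation1 is `groupByKey`:

    ```scala
    val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 2)
    val b = a.keyBy(_.length)  // key each word by its length: (3,dog), (5,tiger), ...
    b.groupByKey.collect       // collect all words sharing the same length under one key
    ```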

  • Question 27:

    Problem Scenario 14 : You have been given the following MySQL database details as well as other info.

    user=retail_dba
    password=cloudera
    database=retail_db
    jdbc URL = jdbc:mysql://quickstart:3306/retail_db

    Please accomplish the following activities.

    1. Create a CSV file named updated_departments.csv with the following contents in the local file system.

    updated_departments.csv
    2,fitness
    3,footwear
    12,fathematics
    13,fcience
    14,engineering
    1000,management

    2. Upload this CSV file to the HDFS filesystem.

    3. Now export this data from HDFS to the MySQL retail_db.departments table. During the export, make sure existing departments are just updated and new departments are inserted.

    4. Now update the updated_departments.csv file with the content below.

    2,Fitness
    3,Footwear
    12,Fathematics
    13,Science
    14,Engineering
    1000,Management
    2000,Quality Check

    5. Now upload this file to HDFS.

    6. Now export this data from HDFS to the MySQL retail_db.departments table. During the export, make sure existing departments are just updated and no new departments are inserted.
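
    A sketch using standard Sqoop export flags (the HDFS paths and the `department_id` update key are assumptions based on the usual retail_db schema). Step 3 needs `--update-mode allowinsert` (update existing rows, insert new ones); step 6 needs `--update-mode updateonly` (no inserts):

    ```shell
    # Step 2: put the file in HDFS (path is an assumption)
    hdfs dfs -put updated_departments.csv /user/cloudera/updated_departments

    # Step 3: upsert -- update matching rows, insert the rest
    sqoop export \
      --connect jdbc:mysql://quickstart:3306/retail_db \
      --username retail_dba --password cloudera \
      --table departments \
      --export-dir /user/cloudera/updated_departments \
      --update-key department_id \
      --update-mode allowinsert \
      --fields-terminated-by ','

    # Step 6: same command after re-uploading the new file,
    # but with --update-mode updateonly so no new rows are inserted
    ```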

  • Question 28:

    Problem Scenario 61 : You have been given the code snippet below.

    val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)

    val b = a.keyBy(_.length)

    val c = sc.parallelize(List("dog", "cat", "gnu", "salmon", "rabbit", "turkey", "wolf", "bear", "bee"), 3)

    val d = c.keyBy(_.length)

    operation1

    Write a correct code snippet for operation1 which will produce the desired output, shown below.

    Array[(Int, (String, Option[String]))] = Array((6,(salmon,Some(salmon))),
    (6,(salmon,Some(rabbit))), (6,(salmon,Some(turkey))), (6,(salmon,Some(salmon))),
    (6,(salmon,Some(rabbit))), (6,(salmon,Some(turkey))), (3,(dog,Some(dog))),
    (3,(dog,Some(cat))), (3,(dog,Some(gnu))), (3,(dog,Some(bee))), (3,(rat,Some(dog))),
    (3,(rat,Some(cat))), (3,(rat,Some(gnu))), (3,(rat,Some(bee))), (8,(elephant,None)))
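
    The output pairs every element of `b` with each matching element of `d` by key, and keeps the unmatched key 8 as `None`, which is a left outer join:

    ```scala
    val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)
    val b = a.keyBy(_.length)
    val c = sc.parallelize(List("dog", "cat", "gnu", "salmon", "rabbit", "turkey", "wolf", "bear", "bee"), 3)
    val d = c.keyBy(_.length)
    b.leftOuterJoin(d).collect  // (key, (left, Some(right))), or (key, (left, None)) when no match
    ```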

  • Question 29:

    Problem Scenario 53 : You have been given the code snippet below.

    val a = sc.parallelize(1 to 10, 3)

    operation1

    b.collect

    Output 1 : Array[Int] = Array(2, 4, 6, 8, 10)

    operation2

    Output 2 : Array[Int] = Array(1, 2, 3)

    Write a correct code snippet for operation1 and operation2 which will produce the desired output, shown above.
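
    A sketch: operation1 filters the even numbers; for operation2, one snippet that returns Array(1, 2, 3) is `take(3)`, which already yields an Array with no `collect` needed (the question leaves operation2 ambiguous, so this is one plausible answer):

    ```scala
    val a = sc.parallelize(1 to 10, 3)
    val b = a.filter(_ % 2 == 0)  // operation1: keep only the even numbers
    b.collect                     // Array(2, 4, 6, 8, 10)
    a.take(3)                     // operation2: first three elements, Array(1, 2, 3)
    ```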

  • Question 30:

    Problem Scenario 33 : You have been given the files below.

    spark5/EmployeeName.csv (id,name)
    spark5/EmployeeSalary.csv (id,salary)

    Data is given below:

    EmployeeName.csv
    E01,Lokesh
    E02,Bhupesh
    E03,Amit
    E04,Ratan
    E05,Dinesh
    E06,Pavan
    E07,Tejas
    E08,Sheela
    E09,Kumar
    E10,Venkat

    EmployeeSalary.csv
    E01,50000
    E02,50000
    E03,45000
    E04,45000
    E05,50000
    E06,45000
    E07,50000
    E08,10000
    E09,10000
    E10,10000

    Now write Spark code in Scala which will load these two files from HDFS, join them, and produce the (name, salary) values. Then save the data in multiple files grouped by salary (meaning each file will have the names of employees with the same salary). Make sure the file name includes the salary as well.
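
    A sketch using pair RDDs (the output directory layout is an assumption); each salary group is written to its own directory whose name carries the salary:

    ```scala
    // Load both files and key each record by employee id
    val names = sc.textFile("spark5/EmployeeName.csv")
      .map(_.split(",")).map(f => (f(0), f(1)))           // (id, name)
    val salaries = sc.textFile("spark5/EmployeeSalary.csv")
      .map(_.split(",")).map(f => (f(0), f(1)))           // (id, salary)

    // Join on id, then re-key by salary
    val bySalary = names.join(salaries)
      .map { case (_, (name, salary)) => (salary, name) }

    // One output directory per distinct salary, salary embedded in the name
    bySalary.groupByKey().collect.foreach { case (salary, empNames) =>
      sc.parallelize(empNames.toSeq)
        .saveAsTextFile("spark5/output/salary_" + salary)
    }
    ```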

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more and more important and are required by more and more enterprises when hiring. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Cloudera exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are confused about your CCA175 exam preparation or Cloudera certification application, do not hesitate to visit Vcedump.com to find your solutions.