Exam Details

  • Exam Code: CCA-500
  • Exam Name: Cloudera Certified Administrator for Apache Hadoop (CCAH)
  • Certification: CCAH
  • Vendor: Cloudera
  • Total Questions: 60 Q&As
  • Last Updated:

Cloudera CCAH CCA-500 Questions & Answers

  • Question 1:

    Which two are features of Hadoop's rack topology? (Choose two)

    A. Configuration of rack awareness is accomplished using a configuration file. You cannot use a rack topology script.

    B. Hadoop gives preference to intra-rack data transfer in order to conserve bandwidth

    C. Rack location is considered in the HDFS block placement policy

    D. HDFS is rack aware but the MapReduce daemons are not

    E. Even for small clusters on a single rack, configuring rack awareness will improve performance
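    Rack awareness is normally enabled by pointing Hadoop at a topology script. A minimal sketch, assuming a hypothetical script path in core-site.xml:

    ```shell
    # Hypothetical core-site.xml excerpt (the script path is a placeholder):
    # <property>
    #   <name>net.topology.script.file.name</name>
    #   <value>/etc/hadoop/conf/rack-topology.sh</value>
    # </property>
    #
    # Once the script is configured, the rack resolved for each DataNode
    # can be verified from a running cluster with:
    hdfs dfsadmin -printTopology
    ```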

  • Question 2:

    Your company stores user profile records in an OLTP database. You want to join these records with web server logs you have already ingested into the Hadoop file system. What is the best way to obtain and ingest these user records?

    A. Ingest with Hadoop streaming

    B. Ingest using Hive's LOAD DATA command

    C. Ingest with sqoop import

    D. Ingest with Pig's LOAD command

    E. Ingest using the HDFS put command
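    Pulling relational records into HDFS is Sqoop's core use case. A minimal sketch, in which the JDBC URL, credentials, table name, and target directory are all hypothetical placeholders:

    ```shell
    # Import the user_profiles table from an OLTP database into HDFS.
    # -P prompts for the password interactively instead of putting it on the command line.
    sqoop import \
      --connect jdbc:mysql://db.example.com/crm \
      --username etl_user -P \
      --table user_profiles \
      --target-dir /data/user_profiles
    ```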

  • Question 3:

    A user comes to you, complaining that when she attempts to submit a Hadoop job, it fails. There is a Directory in HDFS named /data/input. The Jar is named j.jar, and the driver class is named DriverClass.

    She runs the command:

    hadoop jar j.jar DriverClass /data/input /data/output

    The error message returned includes the line: PriviledgedActionException as:training (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/data/input

    What is the cause of the error?

    A. The user is not authorized to run the job on the cluster

    B. The output directory already exists

    C. The name of the driver has been spelled incorrectly on the command line

    D. The directory name is misspelled in HDFS

    E. The Hadoop configuration files on the client do not point to the cluster
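    The file:/data/input scheme in the error message indicates the client resolved the path against the local filesystem rather than HDFS. One quick way to check this from the client machine, as a sketch:

    ```shell
    # Print the default filesystem the client-side configuration resolves.
    # A client pointed at the cluster prints an hdfs:// URI (host and port
    # vary by deployment); file:/// means the client configuration does not
    # point at the cluster.
    hdfs getconf -confKey fs.defaultFS
    ```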

  • Question 4:

    You have a cluster running with the FIFO Scheduler enabled. You submit a large job A to the cluster, which you expect to run for one hour. Then, you submit job B to the cluster, which you expect to run for only a couple of minutes. You submit both jobs with the same priority.

    Which two best describe how the FIFO Scheduler arbitrates the cluster resources for jobs and their tasks? (Choose two)

    A. Because there is more than a single job on the cluster, the FIFO Scheduler will enforce a limit on the percentage of resources allocated to a particular job at any given time

    B. Tasks are scheduled in the order of their job submission

    C. The order of execution of jobs may vary

    D. Given jobs A and B submitted in that order, all tasks from job A are guaranteed to finish before all tasks from job B

    E. The FIFO Scheduler will give, on average, an equal share of the cluster resources over the job lifecycle

    F. The FIFO Scheduler will pass an exception back to the client when job B is submitted, since all slots on the cluster are in use

  • Question 5:

    You are running a Hadoop cluster with MapReduce version 2 (MRv2) on YARN. You consistently see that map tasks on your cluster are running slowly because of excessive JVM garbage collection. Which property do you set to increase the JVM heap size to 3 GB and optimize performance?

    A. yarn.application.child.java.opts=-Xsx3072m

    B. yarn.application.child.java.opts=-Xmx3072m

    C. mapreduce.map.java.opts=-Xms3072m

    D. mapreduce.map.java.opts=-Xmx3072m
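    The map-task heap can also be overridden per job at submission time via generic options, provided the driver uses ToolRunner. A sketch, in which the jar name, driver class, and paths are hypothetical placeholders:

    ```shell
    # Raise the map-task JVM maximum heap to 3 GB (3072 MB) for a single run.
    # -Xmx sets the maximum heap; -Xms (option C above) only sets the initial heap.
    hadoop jar myjob.jar MyDriver \
      -D mapreduce.map.java.opts=-Xmx3072m \
      /data/input /data/output
    ```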

  • Question 6:

    Your Hadoop cluster contains nodes in three racks. You have not configured the dfs.hosts property in the NameNode's configuration file. What results?

    A. The NameNode will update the dfs.hosts property to include machines running the DataNode daemon on the next NameNode reboot or with the command hdfs dfsadmin -refreshNodes

    B. No new nodes can be added to the cluster until you specify them in the dfs.hosts file

    C. Any machine running the DataNode daemon can immediately join the cluster

    D. Presented with a blank dfs.hosts property, the NameNode will permit DataNodes specified in mapred.hosts to join the cluster
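    When dfs.hosts is set, only the DataNodes listed in the referenced file may join the cluster. A sketch of what that configuration looks like, with a hypothetical include-file path:

    ```shell
    # Hypothetical hdfs-site.xml excerpt restricting DataNode membership:
    # <property>
    #   <name>dfs.hosts</name>
    #   <value>/etc/hadoop/conf/allowed-datanodes.txt</value>
    # </property>
    #
    # After editing the include file, re-read it without restarting the NameNode:
    hdfs dfsadmin -refreshNodes
    ```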

  • Question 7:

    You have recently converted your Hadoop cluster from a MapReduce 1 (MRv1) architecture to a MapReduce 2 (MRv2) on YARN architecture. Your developers are accustomed to specifying the number of map and reduce tasks (resource allocation) when they run jobs. A developer wants to know how to specify the number of reduce tasks when a specific job runs. Which method should you tell that developer to implement?

    A. MapReduce version 2 (MRv2) on YARN abstracts resource allocation away from the idea of "tasks" into memory and virtual cores, thus eliminating the need for a developer to specify the number of reduce tasks, and indeed preventing the developer from specifying the number of reduce tasks.

    B. In YARN, resource allocation is a function of megabytes of memory in multiples of 1024 MB. Thus, they should specify the amount of memory resources they need by executing -D mapreduce.reduce.memory.mb=2048

    C. In YARN, the ApplicationMaster is responsible for requesting the resources required for a specific launch. Thus, executing -D yarn.applicationmaster.reduce.tasks=2 will specify that the ApplicationMaster launch two task containers on the worker nodes.

    D. Developers specify reduce tasks in the exact same way for both MapReduce version 1 (MRv1) and MapReduce version 2 (MRv2) on YARN. Thus, executing -D mapreduce.job.reduces=2 will specify two reduce tasks.

    E. In YARN, resource allocation is a function of virtual cores specified by the ApplicationMaster making requests to the NodeManager, where a reduce task is handled by a single container (and thus a single virtual core). Thus, the developer needs to specify the number of virtual cores to the NodeManager by executing -D yarn.nodemanager.cpu-vcores=2
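    Setting the number of reduce tasks at submission time is unchanged between MRv1 and MRv2. A sketch, in which the jar name, driver class, and paths are hypothetical placeholders (the driver must use ToolRunner for -D options to be parsed):

    ```shell
    # Request two reduce tasks for this job run.
    hadoop jar myjob.jar MyDriver \
      -D mapreduce.job.reduces=2 \
      /data/input /data/output
    ```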

  • Question 8:

    You are running a Hadoop cluster with a NameNode on host mynamenode. What are two ways to determine available HDFS space in your cluster?

    A. Run hdfs dfs -du / and locate the DFS Remaining value

    B. Run hdfs dfsadmin -report and locate the DFS Remaining value

    C. Run hdfs dfs / and subtract NDFS Used from configured Capacity

    D. Connect to http://mynamenode:50070/dfshealth.jsp and locate the DFS remaining value
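    For reference, remaining HDFS capacity can be read from a running cluster at the command line; a sketch:

    ```shell
    # Print the cluster-wide capacity summary and pick out the remaining space.
    # The same figure also appears on the NameNode web UI (dfshealth page).
    hdfs dfsadmin -report | grep 'DFS Remaining'
    ```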

  • Question 9:

    In CDH4 and later, which file contains a serialized form of all the directory and files inodes in the filesystem, giving the NameNode a persistent checkpoint of the filesystem metadata?

    A. fstime

    B. VERSION

    C. fsimage_N (where N reflects transactions up to transaction ID N)

    D. edits_N-M (where N-M specifies the range of transactions between transaction ID N and transaction ID M)
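    The serialized checkpoint can be inspected with the Offline Image Viewer; a sketch, in which the fsimage path and transaction ID are hypothetical placeholders:

    ```shell
    # Dump a NameNode checkpoint to XML for inspection of the directory
    # and file inodes it contains.
    hdfs oiv -p XML \
      -i /dfs/nn/current/fsimage_0000000000000000042 \
      -o fsimage.xml
    ```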

  • Question 10:

    You have just run a MapReduce job to filter user messages to only those of a selected geographical region. The output for this job is in a directory named westUsers, located just below your home directory in HDFS. Which command gathers these into a single file on your local file system?

    A. hadoop fs -getmerge -R westUsers.txt

    B. hadoop fs -getmerge westUsers westUsers.txt

    C. hadoop fs -cp westUsers/* westUsers.txt

    D. hadoop fs -get westUsers westUsers.txt
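    For reference, getmerge takes an HDFS source directory and a local destination file; a sketch using the westUsers directory from the question:

    ```shell
    # Concatenate all part files under the HDFS directory westUsers
    # (relative to the user's home directory) into one file on the
    # local filesystem.
    hadoop fs -getmerge westUsers westUsers.txt
    ```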

Tips on How to Prepare for the Exams

Nowadays, certification exams have become increasingly important and are required by more and more enterprises when applying for a job. But how do you prepare for the exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Cloudera exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are confused about your CCA-500 exam preparation or Cloudera certification application, do not hesitate to visit Vcedump.com to find your solutions.