In a MapReduce job, you want each of your input files processed by a single map task. How do you configure a MapReduce job so that a single map task processes each input file regardless of how many blocks the input file occupies?
A. Increase the parameter that controls minimum split size in the job configuration.
B. Write a custom MapRunner that iterates over all key-value pairs in the entire file.
C. Set the number of mappers equal to the number of input files you want to process.
D. Write a custom FileInputFormat and override the method isSplitable to always return false.
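For reference, option D looks like the following in the newer org.apache.hadoop.mapreduce API; this is a minimal sketch, and the class name WholeFileTextInputFormat is illustrative:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Sketch of an input format that refuses to split its input, so each
// input file is processed by exactly one map task, however many HDFS
// blocks the file occupies.
public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false; // never split this file into multiple input splits
    }
}

A job would then select it with job.setInputFormatClass(WholeFileTextInputFormat.class).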
Which two of the following statements are true about HDFS? Choose 2 answers.
A. An HDFS file that is larger than dfs.block.size is split into blocks
B. Blocks are replicated to multiple datanodes
C. HDFS works best when storing a large number of relatively small files
D. Block sizes for all files must be the same size
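As a way to observe A and B directly, here is a minimal sketch using the HDFS Java client; the file path /data/big.log is hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: list each block of a file and the DataNodes holding its replicas.
public class ShowBlocks {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/data/big.log")); // hypothetical path
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            // A file larger than dfs.block.size yields several entries here,
            // and each block reports multiple replica hosts.
            System.out.println("offset=" + loc.getOffset()
                + " length=" + loc.getLength()
                + " hosts=" + String.join(",", loc.getHosts()));
        }
    }
}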
Which one of the following statements is false about HCatalog?
A. Provides a shared schema mechanism
B. Designed to be used by other programs such as Pig, Hive and MapReduce
C. Stores HDFS data in a database for performing SQL-like ad-hoc queries
D. Exists as a subproject of Hive
You want to ingest log files into HDFS. Which tool would you use?
A. HCatalog
B. Flume
C. Sqoop
D. Ambari
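For context, log ingestion with Flume (option B) is driven by an agent configuration along these lines; this is a hedged sketch, and the agent name, source command, and HDFS path are all hypothetical:

# Hypothetical Flume agent: tail a local log file and deliver events to HDFS.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/logs
a1.sinks.k1.channel = c1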
Which Hadoop component is responsible for managing the distributed file system metadata?
A. NameNode
B. Metanode
C. DataNode
D. NameSpaceManager
Which HDFS command displays the contents of the file x in the user's HDFS home directory?
A. hadoop fs -ls x
B. hdfs fs -get x
C. hadoop fs -cat x
D. hadoop fs -cp x
What types of algorithms are difficult to express in MapReduce v1 (MRv1)?
A. Algorithms that require applying the same mathematical function to large numbers of individual binary records.
B. Relational operations on large amounts of structured and semi-structured data.
C. Algorithms that require global, shared state.
D. Large-scale graph algorithms that require one-step link traversal.
E. Text analysis algorithms on large collections of unstructured text (e.g., Web crawls).
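To see why options such as C and D are awkward in MRv1, note that tasks share no mutable global state, so an iterative graph algorithm has to be driven as a chain of independent jobs. A minimal sketch of such a driver follows; the class name, job names, and paths are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Sketch: each iteration ("superstep") is a whole MapReduce job that
// re-reads and re-writes HDFS; no in-memory state survives between steps.
public class IterativeDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        int maxIterations = 10; // hypothetical bound
        for (int i = 0; i < maxIterations; i++) {
            Job job = Job.getInstance(conf, "graph-step-" + i);
            // Mapper and Reducer classes omitted; step i's output feeds step i+1.
            FileInputFormat.addInputPath(job, new Path("/graph/step-" + i));
            FileOutputFormat.setOutputPath(job, new Path("/graph/step-" + (i + 1)));
            if (!job.waitForCompletion(true)) break;
        }
    }
}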
In the reducer, the MapReduce API provides you with an iterator over Writable values. What does calling the next() method return?
A. It returns a reference to a different Writable object each time.
B. It returns a reference to a Writable object from an object pool.
C. It returns a reference to the same Writable object each time, but populated with different data.
D. It returns a reference to a Writable object. The API leaves unspecified whether this is a reused object or a new object.
E. It returns a reference to the same Writable object if the next value is the same as the previous value, or a new Writable object otherwise.
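The practical consequence of C (the framework reuses one Writable instance and refills it) is that any value kept past the current iteration must be copied first. A minimal sketch, shown with the newer Iterable-based reducer API, where the same reuse behavior applies; the class name and output value are illustrative:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch: defensively copy reused Writable values before storing them.
public class CollectingReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        List<Text> seen = new ArrayList<>();
        for (Text value : values) {
            // seen.add(value) would be a bug: every entry would point at the
            // same reused object and end up holding only the last value.
            seen.add(new Text(value)); // copy before storing
        }
        context.write(key, new Text("count=" + seen.size()));
    }
}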
Given the following Pig command:
logevents = LOAD 'input/my.log' AS (date:chararray, level:string, code:int, message:string);
Which one of the following statements is true?
A. The logevents relation represents the data from the my.log file, using a comma as the parsing delimiter
B. The logevents relation represents the data from the my.log file, using a tab as the parsing delimiter
C. The first field of logevents must be a properly-formatted date string, or the load will return an error
D. The statement is not a valid Pig command
Which of the following best describes when the reduce method is first called in a MapReduce job?
A. Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The programmer can configure in the job what percentage of the intermediate data should arrive before the reduce method begins.
B. Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The reduce method is called only after all intermediate data has been copied and sorted.
C. Reduce methods and map methods all start at the beginning of a job, in order to provide optimal performance for map-only or reduce-only jobs.
D. Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The reduce method is called as soon as the intermediate key-value pairs start to arrive.
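The percentage mentioned in option A does exist, but it controls when reducers may begin copying map output, not when reduce() is first called; the call still waits until all intermediate data has been copied and sorted, as option B states. A hedged sketch of setting that knob, using the property name from Hadoop 2 and an illustrative value:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Sketch: let reducers start fetching once 80% of map tasks have finished.
// reduce() itself still runs only after all intermediate data is copied
// and merge-sorted.
public class SlowstartExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 0.80f);
        Job job = Job.getInstance(conf, "slowstart-demo"); // hypothetical job name
        // ... mapper, reducer, and input/output paths would be set here ...
    }
}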