What are the characteristics of Big Data?
A. Data volume, processing complexity, and data structure variety.
B. Data volume, business importance, and data structure variety.
C. Data type, processing complexity, and data structure variety.
D. Data volume, processing complexity, and business importance.
You are analyzing data in order to build a classifier model. You discover non-linear data and discontinuities that will affect the model. Which analytical method would you recommend?
A. Decision Trees
B. Logistic Regression
C. ARIMA
D. Linear Regression
You are studying the behavior of a population, and you are provided with multidimensional data at the individual level. You have identified four specific individuals who are valuable to your study, and would like to find all users who are most similar to each individual. Which algorithm is the most appropriate for this study?
A. K-means clustering
B. Linear regression
C. Association rules
D. Decision trees
Which R data structure allows elements to have different data types?
A. List
B. Vector
C. Matrix
D. Array
Which key role for a successful analytic project can consult and advise the project team on the value of end results and how these will be used on a day-to-day basis?
A. Business User
B. Project Manager
C. Data Scientist
D. Business Intelligence Analyst
A disk drive manufacturer has a defect rate of less than 1.0% with 98% confidence. A quality assurance team samples 1000 disk drives and finds 14 defective units. Which action should the team recommend?
A. The manufacturing process should be inspected for problems.
B. A larger sample size should be taken to determine if the plant is functioning properly.
C. A smaller sample size should be taken to determine if the plant is functioning properly.
D. The manufacturing process is functioning properly and no further action is required.
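The arithmetic behind this question can be sketched as a one-sided test of a proportion. This is a minimal sketch, not the exam's reference solution: it assumes a normal approximation to the binomial, assumes the null hypothesis is a defect rate of exactly 1.0%, and assumes "98% confidence" corresponds to a significance level of 0.02. The function name `one_sided_proportion_z_test` is hypothetical.

```python
import math


def one_sided_proportion_z_test(defects, sample_size, claimed_rate):
    """Upper-tailed z-test: is the observed defect rate above the claimed rate?"""
    observed_rate = defects / sample_size
    # Standard error of the sample proportion under the claimed rate.
    standard_error = math.sqrt(claimed_rate * (1 - claimed_rate) / sample_size)
    z = (observed_rate - claimed_rate) / standard_error
    # Upper-tail probability of the standard normal, via the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return observed_rate, z, p_value


# Values from the question: 14 defective units in a sample of 1000, claimed rate 1.0%.
observed_rate, z, p_value = one_sided_proportion_z_test(14, 1000, 0.01)
alpha = 0.02  # assumption: 98% confidence read as a 2% significance level

print(f"observed rate = {observed_rate:.3%}, z = {z:.3f}, p-value = {p_value:.3f}")
print("exceeds claimed rate at this level" if p_value < alpha else "no significant excess at this level")
```

The sketch only shows the calculation; a different choice of test (for example an exact binomial test) or a different reading of the confidence statement could change the numbers.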
Consider the example of an analysis for fraud detection on credit card usage. You will need to ensure higher-risk transactions that may indicate fraudulent credit card activity are retained in your data for analysis, and not dropped as outliers during pre-processing. What will be your approach for loading data into the analytical sandbox for this analysis?
A. ELT
B. ETL
C. EDW
D. OLTP
Trend, seasonal, and cyclical are components of a time series. What is another component?
A. Irregular
B. Linear
C. Quadratic
D. Exponential
Which word or phrase completes the statement? Unix is to bash as Hadoop is to:
A. Pig
B. HDFS
C. Sqoop
D. NameNode
A call center for a large electronics company handles an average of 35,000 support calls a day. The head of the call center would like to optimize the staffing of the call center during the rollout of a new product due to recent customer complaints of long wait times. You have been asked to create a model to optimize call center costs and customer wait times. The goals for this project include:
1. Relative to the release of a product, how does the call volume change over time?
2. How to best optimize staffing based on the call volume for the newly released product, relative to old products.
3. Historically, what time of day does the call center need to be most heavily staffed?
4. Determine the frequency of calls by both product type and customer language.
Which goals are suitable to be completed with MapReduce?
A. Goals 2 and 4
B. Goals 1 and 3
C. Goals 1, 2, 3, and 4
D. Goals 2, 3, and 4
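To make the MapReduce pattern in this question concrete, here is a minimal in-memory sketch of a frequency count keyed by product type and customer language, as described in goal 4. It is illustrative only: the records, field names, and helper functions (`map_call`, `shuffle`, `reduce_counts`) are hypothetical, and it runs in plain Python rather than on Hadoop.

```python
from collections import defaultdict

# Hypothetical call records; a real job would read these from distributed storage.
call_records = [
    {"product": "phone", "language": "en"},
    {"product": "phone", "language": "es"},
    {"product": "tablet", "language": "en"},
    {"product": "phone", "language": "en"},
]


def map_call(record):
    """Map step: emit a ((product, language), 1) pair for each call."""
    yield (record["product"], record["language"]), 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts for one key."""
    return key, sum(values)


mapped = [pair for record in call_records for pair in map_call(record)]
counts = dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())

for (product, language), total in sorted(counts.items()):
    print(product, language, total)
```

Counting and aggregating over independent records is the kind of work that splits cleanly into map and reduce steps, which is why the sketch uses it as the example.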