Exam Details

  • Exam Code: DAS-C01
  • Exam Name: AWS Certified Data Analytics - Specialty (DAS-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 285 Q&As
  • Last Updated: Apr 27, 2025

Amazon DAS-C01 Questions & Answers

  • Question 261:

    A financial institution is building an Amazon QuickSight business intelligence (BI) dashboard to show financial performance and analyze trends. The development team is using an Amazon Redshift database in the development environment and is having difficulty validating the accuracy of the metrics calculation algorithm because of a lack of quality data. The production Redshift database is 500 TB and is in a different AWS account in the same AWS Region as the development environment account. The company needs to use up-to-date production data for development purposes.

    Which solution MOST cost-effectively meets these requirements?

    A. Set up data streaming with Amazon Kinesis Data Streams from the production environment Redshift database to replicate the data to the development environment Redshift database.

    B. Create a Redshift datashare to share the production environment data with the development team.

    C. Upload the data from Amazon Redshift to Amazon S3. Then load the data directly from Amazon S3 to the development environment Redshift cluster using the COPY command.

    D. Create Redshift views that are configured to share all the data between the production and development clusters.
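
    For context on Question 261, the datashare in option B is created with a few SQL statements on the producer cluster. Below is a minimal sketch that issues those statements through the Redshift Data API; the cluster name, database, user, share name, and consumer account ID are all hypothetical placeholders, not values from the question.

        import boto3

        # Hypothetical cluster, database, user, and account identifiers.
        rsd = boto3.client("redshift-data", region_name="us-east-1")

        statements = [
            "CREATE DATASHARE finance_share;",
            "ALTER DATASHARE finance_share ADD SCHEMA public;",
            "ALTER DATASHARE finance_share ADD ALL TABLES IN SCHEMA public;",
            # Cross-account sharing: grant the development account access;
            # the share must also be authorized from the producer account.
            "GRANT USAGE ON DATASHARE finance_share TO ACCOUNT '111122223333';",
        ]
        for sql in statements:
            rsd.execute_statement(
                ClusterIdentifier="prod-cluster",
                Database="prod_db",
                DbUser="admin",
                Sql=sql,
            )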

  • Question 262:

    A company has a fitness tracker application that generates data from subscribers. The company needs real-time reporting on this data. The data is sent immediately, and the processing latency must be less than 1 second. The company wants to perform anomaly detection on the data as the data is collected. The company also requires a solution that minimizes operational overhead.

    Which solution meets these requirements?

    A. Amazon EMR cluster with Apache Spark streaming, Spark SQL, and Spark's machine learning library (MLlib)

    B. Amazon Kinesis Data Firehose with Amazon S3 and Amazon Athena

    C. Amazon Kinesis Data Firehose with Amazon QuickSight

    D. Amazon Kinesis Data Streams with Amazon Kinesis Data Analytics
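
    For Question 262, the pipeline in option D starts with producers writing to a Kinesis data stream. Here is a minimal producer sketch; the stream name and payload fields are hypothetical. A Kinesis Data Analytics application reading this stream can then apply its built-in RANDOM_CUT_FOREST function for sub-second anomaly scores without any servers to manage.

        import json
        import boto3

        kinesis = boto3.client("kinesis", region_name="us-east-1")

        # Hypothetical stream name and payload fields.
        record = {"subscriber_id": "u-123", "heart_rate": 182, "ts": 1714200000}
        kinesis.put_record(
            StreamName="fitness-tracker-stream",
            Data=json.dumps(record).encode("utf-8"),
            PartitionKey=record["subscriber_id"],
        )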

  • Question 263:

    A media company has decided to use Amazon OpenSearch Service to analyze real-time data about popular musical artists and songs. The company expects to ingest millions of new data events daily that will flow through an Amazon Kinesis data stream. The data must be converted to a format that is compatible with the OpenSearch Service index.

    Which method should a data analytics specialist use to ingest the data with the LEAST amount of operational overhead?

    A. Use Amazon Kinesis Data Firehose with an AWS Lambda function for data transformation and delivery to OpenSearch Service.

    B. Use a Logstash pipeline with prebuilt filters for data transformation and delivery to OpenSearch Service.

    C. Use the Kinesis agent with an AWS Lambda function for data transformation and delivery to OpenSearch Service.

    D. Use the Kinesis Client Library (KCL) for data transformation and delivery to OpenSearch Service.
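
    For Question 263, the transformation step in option A is a Lambda function that follows the Kinesis Data Firehose record-transformation contract: base64-decode each record, reshape it, re-encode it, and return a per-record result status. A minimal sketch follows; the document field names are hypothetical.

        import base64
        import json

        def lambda_handler(event, context):
            """Firehose transformation handler: reshape each streamed event
            into a document shape the OpenSearch Service index expects.
            Field names here are hypothetical."""
            output = []
            for record in event["records"]:
                payload = json.loads(base64.b64decode(record["data"]))
                doc = {
                    "artist": payload.get("artist"),
                    "song": payload.get("song"),
                    "plays": int(payload.get("play_count", 0)),
                }
                output.append({
                    "recordId": record["recordId"],
                    "result": "Ok",
                    "data": base64.b64encode(
                        json.dumps(doc).encode("utf-8")).decode("utf-8"),
                })
            return {"records": output}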

  • Question 264:

    A company uses an Amazon Kinesis data stream to ingest application monitoring data from its mobile apps. Thousands of users use the mobile apps. The company also has an application deployed in Amazon Kinesis Data Analytics for Apache Flink to gain insights from the streaming data. The Flink application uses the Kinesis data stream as the source.

    The company is experiencing increases in the number of mobile app users and application activities. As a result, the throughput of the Flink application is decreasing. The source Kinesis data stream is showing increasingly high values for the MillisBehindLatest metric.

    Which combination of steps should a data analytics specialist take to improve the throughput of the Flink application? (Choose three.)

    A. Increase the Flink application's parallelism.

    B. Use checkpoints and savepoints for the Flink application.

    C. Reduce the amount of logging generated from the Flink application. Log records only when necessary.

    D. Check the Flink application's CPU metric. If this metric is above 75%, configure auto scaling for the Flink application.

    E. Increase the number of shards in the Kinesis data stream.

    F. Restart the Flink application frequently.
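
    For Question 264, the source stream's shard count and the Flink application's parallelism are both adjustable through the AWS APIs. A minimal boto3 sketch follows; the stream name, application name, and target counts are hypothetical and purely illustrative.

        import boto3

        # Raise the stream's read-throughput ceiling by adding shards.
        boto3.client("kinesis").update_shard_count(
            StreamName="app-monitoring-stream",
            TargetShardCount=16,
            ScalingType="UNIFORM_SCALING",
        )

        # Raise the Flink application's parallelism to match.
        kda = boto3.client("kinesisanalyticsv2")
        app = kda.describe_application(ApplicationName="monitoring-flink-app")
        kda.update_application(
            ApplicationName="monitoring-flink-app",
            CurrentApplicationVersionId=app["ApplicationDetail"]["ApplicationVersionId"],
            ApplicationConfigurationUpdate={
                "FlinkApplicationConfigurationUpdate": {
                    "ParallelismConfigurationUpdate": {
                        "ConfigurationTypeUpdate": "CUSTOM",
                        "ParallelismUpdate": 16,
                        "AutoScalingEnabledUpdate": True,
                    }
                }
            },
        )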

  • Question 265:

    A data analyst at a fast-growing retail company needs to store data coming in from several dozen marketing campaigns. Each source will write its output to a CSV file that is stored in Amazon S3. The data will later be analyzed by individual campaign managers using Amazon Athena to roughly track the number of daily unique visits to their specific campaign websites over time. The company wants to minimize the cost of data analysis.

    Which combination of actions would lead to the MOST efficient one-time analysis of the data? (Choose two.)

    A. Use an AWS Glue job to convert all files to Apache ORC format. Use the COUNT(DISTINCT column) function to obtain a count of unique visitors.

    B. Create one S3 bucket for all the data. Partition the data by date.

    C. Create a separate S3 bucket for each campaign. Partition the data by date.

    D. Create a separate S3 bucket for each month. Partition the data by campaign.

    E. Use an AWS Glue job to convert all files to Apache Parquet format. Use the approx_distinct() function to obtain a count of unique visitors.
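
    For Question 265, the query side of option E looks like the sketch below: approx_distinct() trades a small, bounded error for a much cheaper scan than COUNT(DISTINCT), which fits a rough daily-visits count. The database, table, column, and bucket names are hypothetical.

        import boto3

        athena = boto3.client("athena")

        athena.start_query_execution(
            QueryString="""
                SELECT dt, approx_distinct(visitor_id) AS unique_visits
                FROM campaigns_parquet
                WHERE campaign = 'spring_sale' AND dt >= '2025-04-01'
                GROUP BY dt
            """,
            QueryExecutionContext={"Database": "marketing"},
            ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
        )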

  • Question 266:

    A company needs to launch an Amazon EMR cluster in a VPC. The EMR cluster must not have any access to the internet. Additionally, the EMR cluster's access to other AWS services must not traverse the internet.

    Which solution will meet these requirements?

    A. Launch the EMR cluster in a private subnet. Configure a NAT gateway for access to other AWS services.

    B. Launch the EMR cluster in a private subnet. Configure a NAT instance for access to other AWS services.

    C. Launch the EMR cluster in a private subnet. Configure a VPC endpoint for access to other AWS services.

    D. Launch the EMR cluster in a public subnet. Configure a VPC endpoint for access to other AWS services.
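
    For Question 266, the endpoints in option C can be created with boto3. The sketch below shows a gateway endpoint for Amazon S3 and an interface endpoint for the EMR service; the VPC, subnet, route-table, and security-group IDs are hypothetical.

        import boto3

        ec2 = boto3.client("ec2", region_name="us-east-1")

        # Gateway endpoint: S3 traffic stays on the AWS network.
        ec2.create_vpc_endpoint(
            VpcEndpointType="Gateway",
            VpcId="vpc-0abc1234",
            ServiceName="com.amazonaws.us-east-1.s3",
            RouteTableIds=["rtb-0def5678"],
        )

        # Interface endpoint for the EMR service API itself.
        ec2.create_vpc_endpoint(
            VpcEndpointType="Interface",
            VpcId="vpc-0abc1234",
            ServiceName="com.amazonaws.us-east-1.elasticmapreduce",
            SubnetIds=["subnet-0aaa1111"],
            SecurityGroupIds=["sg-0bbb2222"],
            PrivateDnsEnabled=True,
        )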

  • Question 267:

    A company needs to build a data lake on AWS. The company must provide row-level data access and column-level data access to relevant teams on a need-to-know basis. The teams will access the data by using Amazon Athena, Amazon Redshift Spectrum, and Apache Hive on Amazon EMR.

    Which solution will meet these requirements MOST cost-effectively?

    A. Use Amazon S3 for data lake storage. Use S3 access policies to restrict data access by rows and columns. Provide data access through Amazon S3.

    B. Use Amazon S3 for data lake storage. Use Apache Ranger on Amazon EMR to restrict data access by rows and columns. Provide data access by using Apache Pig.

    C. Use Amazon Redshift for data lake storage. Use Redshift security policies to restrict data access by rows and columns. Provide data access by using Apache Spark and Amazon Athena federated queries.

    D. Use Amazon S3 for data lake storage. Use AWS Lake Formation to restrict data access by rows and columns. Provide data access through Lake Formation.
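
    For Question 267, the controls in option D map to Lake Formation API calls: a column-scoped grant for column-level access and a data cells filter for row-level access. A minimal sketch follows; the database, table, column, filter, and role names are hypothetical.

        import boto3

        lf = boto3.client("lakeformation")

        # Column-level access: the team may SELECT only two columns.
        lf.grant_permissions(
            Principal={"DataLakePrincipalIdentifier":
                       "arn:aws:iam::111122223333:role/analytics-team"},
            Resource={"TableWithColumns": {
                "DatabaseName": "datalake",
                "Name": "sales",
                "ColumnNames": ["order_date", "amount"],
            }},
            Permissions=["SELECT"],
        )

        # Row-level access: a data cells filter restricting visible rows.
        lf.create_data_cells_filter(TableData={
            "TableCatalogId": "111122223333",
            "DatabaseName": "datalake",
            "TableName": "sales",
            "Name": "us_rows_only",
            "RowFilter": {"FilterExpression": "region = 'us'"},
            "ColumnNames": ["order_date", "amount", "region"],
        })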

  • Question 268:

    A company's data science team is designing a shared dataset repository on a Windows server. The data repository will store a large amount of training data that the data science team commonly uses in its machine learning models. The data scientists create a varying number of new datasets each day.

    The company needs a solution that provides persistent, scalable file storage and high levels of throughput and IOPS. The solution also must be highly available and must integrate with Active Directory for access control.

    Which solution will meet these requirements with the LEAST development effort?

    A. Store datasets as files in an Amazon EMR cluster. Set the Active Directory domain for authentication.

    B. Store datasets as files in Amazon FSx for Windows File Server. Set the Active Directory domain for authentication.

    C. Store datasets as tables in a multi-node Amazon Redshift cluster. Set the Active Directory domain for authentication.

    D. Store datasets as global tables in Amazon DynamoDB. Build an application to integrate authentication with the Active Directory domain.
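
    For Question 268, option B amounts to a single FSx for Windows File Server API call; the Active Directory integration is just the ActiveDirectoryId field. The subnet IDs, directory ID, and capacity figures below are hypothetical.

        import boto3

        fsx = boto3.client("fsx")

        # MULTI_AZ_1 provides high availability; ActiveDirectoryId wires
        # in AD-based access control. All identifiers are placeholders.
        fsx.create_file_system(
            FileSystemType="WINDOWS",
            StorageCapacity=2048,          # GiB
            StorageType="SSD",
            SubnetIds=["subnet-0aaa1111", "subnet-0bbb2222"],
            WindowsConfiguration={
                "ActiveDirectoryId": "d-1234567890",
                "DeploymentType": "MULTI_AZ_1",
                "PreferredSubnetId": "subnet-0aaa1111",
                "ThroughputCapacity": 512,  # MB/s
            },
        )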

  • Question 269:

    A startup company runs its data processing and machine learning workloads on Amazon EMR. To increase productivity, the company granted permissions to engineers and analysts to create EMR clusters. A recent security review showed that some EMR clusters had ports open to the public internet.

    A data analytics specialist must make changes so that any new EMR clusters have public access blocked when they are created.

    Which solution will meet these requirements with the LEAST development effort?

    A. Create an AWS Lambda function that checks whether the Amazon EMR security group is open to the public internet. Invoke a Lambda function when an EMR cluster is created. Integrate the function with Amazon Simple Notification Service (Amazon SNS) to send notification email.

    B. Turn on the block public access feature by using the Amazon EMR console.

    C. Use AWS Config to track the security on the EMR cluster. Use Amazon EventBridge to send notifications when an open cluster is detected. Create an AWS Lambda function that the notification invokes to block public access.

    D. Update security groups to remove inbound traffic from IPv4 0.0.0.0/0 or IPv6 ::/0.
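
    For Question 269, option B is a single account-and-Region-level setting; the console toggle corresponds to the API call sketched below. The port-22 exception mirrors the feature's usual default and is shown only for illustration.

        import boto3

        emr = boto3.client("emr")

        # Once enabled, new clusters reject security groups that allow
        # inbound traffic from 0.0.0.0/0 or ::/0, except on listed ports.
        emr.put_block_public_access_configuration(
            BlockPublicAccessConfiguration={
                "BlockPublicSecurityGroupRules": True,
                "PermittedPublicSecurityGroupRuleRanges": [
                    {"MinRange": 22, "MaxRange": 22},
                ],
            }
        )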

  • Question 270:

    A company creates daily and monthly business metrics from data that partners provide. Each day, the partners deliver JSON data files to an Amazon S3 bucket that the company owns. The S3 object keys use Apache Hive style date partitions. The company uses an Amazon EventBridge rule to invoke an AWS Lambda function that reads all objects in the S3 bucket to aggregate the daily and monthly metrics.

    The company performs occasional analysis that requires access to historical data. As more data has accumulated, the Lambda function has been timing out frequently. A data analytics specialist must prevent these Lambda function timeouts.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Update the EventBridge rule to invoke AWS Step Functions to retry the Lambda function if the function fails.

    B. Modify the Lambda function to delete older S3 objects during the daily processing.

    C. Modify the Lambda function to query the S3 objects by using Amazon Athena with date filters.

    D. Create an AWS Glue job to invoke the Lambda function. Update the EventBridge rule to invoke the AWS Glue job.
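
    For Question 270, option C replaces the full-bucket scan with an Athena query whose WHERE clause prunes down to the Hive-style date partitions, so the Lambda function touches only one day's objects. A minimal sketch follows; the database, table, partition columns, and results bucket are hypothetical.

        import boto3

        athena = boto3.client("athena")

        def lambda_handler(event, context):
            """Push the aggregation to Athena; the date predicates let
            partition pruning skip historical objects entirely."""
            return athena.start_query_execution(
                QueryString="""
                    SELECT partner_id, SUM(amount) AS daily_total
                    FROM partner_metrics
                    WHERE year = '2025' AND month = '04' AND day = '27'
                    GROUP BY partner_id
                """,
                QueryExecutionContext={"Database": "partners"},
                ResultConfiguration={
                    "OutputLocation": "s3://example-metrics-results/"},
            )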
