DATA-ENGINEER-ASSOCIATE Exam Details

  • Exam Code: DATA-ENGINEER-ASSOCIATE
  • Exam Name: AWS Certified Data Engineer - Associate (DEA-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 403 Q&As
  • Last Updated: May 29, 2026

Amazon DATA-ENGINEER-ASSOCIATE Online Questions & Answers

  • Question 321:

    A company needs a solution to store and query product data that has variable attributes. The solution must support unpredictable and high-volume queries with single-digit millisecond latency, even during sudden traffic spikes. The solution must retrieve items by a primary identifier named Product ID. The solution must allow flexible queries by secondary attributes named Category and Brand.

    Which solution will meet these requirements?

    A. Use an Amazon DynamoDB table with on-demand capacity to store product data. Store products by primary key. Use global secondary indexes (GSIs) to store secondary attributes.
    B. Use Amazon Aurora with a Multi-AZ deployment to store product data. Use read replicas. Create indexes for primary and secondary attributes.
    C. Use an Amazon OpenSearch Serverless cluster with dynamic scaling to store product data. Index product data by primary and secondary attributes.
    D. Use Amazon ElastiCache (Redis OSS) and Amazon S3 to store product data. Use Amazon Athena to run flexible secondary attribute queries.
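For illustration only, the sketch below shows what a DynamoDB table definition with on-demand capacity and global secondary indexes might look like as a parameter dictionary for boto3's `create_table`. The table name `Products`, the index names, and the attribute spellings `ProductID`, `Category`, and `Brand` are assumptions made for this example. The code only builds the dictionary and does not call AWS.

```python
from typing import Any, Dict


def build_product_table_spec(table_name: str = "Products") -> Dict[str, Any]:
    """Build create_table parameters for a product table (hypothetical names)."""

    def gsi(index_name: str, key_attribute: str) -> Dict[str, Any]:
        # Each GSI is keyed on one secondary attribute and projects all attributes.
        return {
            "IndexName": index_name,
            "KeySchema": [{"AttributeName": key_attribute, "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "ALL"},
        }

    return {
        "TableName": table_name,
        # Only attributes used in a key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "ProductID", "AttributeType": "S"},
            {"AttributeName": "Category", "AttributeType": "S"},
            {"AttributeName": "Brand", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "ProductID", "KeyType": "HASH"}],
        # On-demand capacity: no ProvisionedThroughput block is supplied.
        "BillingMode": "PAY_PER_REQUEST",
        "GlobalSecondaryIndexes": [
            gsi("CategoryIndex", "Category"),
            gsi("BrandIndex", "Brand"),
        ],
    }


spec = build_product_table_spec()
print(spec["BillingMode"], [g["IndexName"] for g in spec["GlobalSecondaryIndexes"]])
```

With credentials configured, a dictionary like this could be passed as `boto3.client("dynamodb").create_table(**spec)`.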

  • Question 322:

    A media company uploads large video files to Amazon S3 for processing. After processing, the company needs to keep the original files for 90 days in case the files require reprocessing. After 90 days, the company can delete the files to reduce storage costs. The company stores the processed videos in a different S3 bucket.

    Which S3 Lifecycle configuration will meet these requirements for the original files MOST cost-effectively?

    A. Store the files in S3 Standard for 90 days. Transition the files to S3 Glacier Flexible Retrieval for long-term storage. Then expire the files.
    B. Store the files in S3 Standard for 90 days. Enable versioning. Enable Object Lock on the files for 90 days. Then expire the files.
    C. Store the files in S3 Standard for 90 days. Implement S3 Lifecycle management to expire the files.
    D. Store the files in S3 Intelligent-Tiering for 90 days. Enable versioning. Add S3 Lifecycle management to expire the files.

  • Question 323:

    A company wants to combine data from multiple software as a service (SaaS) applications for analysis.

    A data engineering team needs to use Amazon QuickSight to perform the analysis and build dashboards.

    A data engineer needs to extract the data from the SaaS applications and make the data available for QuickSight queries.

    Which solution will meet these requirements in the MOST operationally efficient way?

    A. Create AWS Lambda functions that call the required APIs to extract the data from the applications. Store the data in an Amazon S3 bucket. Use AWS Glue to catalog the data in the S3 bucket. Create a data source and a dataset in QuickSight.
    B. Use AWS Lambda functions as Amazon Athena data source connectors to run federated queries against the SaaS applications. Create an Athena data source and a dataset in QuickSight.
    C. Use Amazon AppFlow to create a flow for each SaaS application. Set an Amazon S3 bucket as the destination. Schedule the flows to extract the data to the bucket. Use AWS Glue to catalog the data in the S3 bucket. Create a data source and a dataset in QuickSight.
    D. Export the data from the SaaS applications as Microsoft Excel files. Create a data source and a dataset in QuickSight by uploading the Excel files.

  • Question 324:

    A data engineer needs to optimize the performance of a data pipeline that handles retail orders. Data about the orders is ingested daily into an Amazon S3 bucket.

    The data engineer runs queries once each week to extract metrics from the orders data based on the order date for multiple date ranges. The data engineer needs an optimization solution that ensures the query performance will not degrade when the volume of data increases.

    Which solution will meet this requirement MOST cost-effectively?

    A. Partition the data based on order date. Use Amazon Athena to query the data.
    B. Partition the data based on order date. Use Amazon Redshift to query the data.
    C. Partition the data based on load date. Use Amazon EMR to query the data.
    D. Partition the data based on load date. Use Amazon Aurora to query the data.
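To make "partition the data based on order date" concrete, the following sketch generates Hive-style S3 key prefixes of the form `order_date=YYYY-MM-DD`. The base prefix `orders/` and the partition column name `order_date` are assumptions for this example. A query engine that understands this layout can skip every prefix outside the requested date range.

```python
from datetime import date, timedelta
from typing import List


def partition_prefix(order_date: date, base: str = "orders/") -> str:
    """Return the Hive-style key prefix for one order date."""
    return f"{base}order_date={order_date.isoformat()}/"


def prefixes_for_range(start: date, end: date, base: str = "orders/") -> List[str]:
    """Return the prefixes for every day from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    days = (end - start).days
    return [partition_prefix(start + timedelta(days=i), base) for i in range(days + 1)]


print(partition_prefix(date(2024, 1, 5)))
print(prefixes_for_range(date(2024, 1, 30), date(2024, 2, 1)))
```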

  • Question 325:

    A company stores objects in an Amazon S3 bucket. The company crawls the objects so that Amazon Athena can query the data.

    A data engineer manually moved all objects from the partition with a path prefix of status=01 to the prefix status=02. The status=01 partition location is now empty. However, the status=01 partition location still appears in the AWS Glue Data Catalog metadata.

    Which Athena command should the data engineer run to resolve the metadata discrepancy?

    A. MSCK REPAIR TABLE
    B. ALTER TABLE DROP PARTITION
    C. ALTER TABLE SET TBLPROPERTIES
    D. ALTER TABLE CHANGE COLUMN
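For context, `MSCK REPAIR TABLE` only adds partitions found in Amazon S3 to the table metadata and does not remove entries for locations that no longer exist, whereas `ALTER TABLE ... DROP PARTITION` removes a named partition from the metadata. The sketch below builds both statements as strings. The table name `objects_table` is hypothetical, and the helper assumes simple identifier and value strings without quotes.

```python
def msck_repair_statement(table: str) -> str:
    """Statement that registers partitions present in S3 but missing from metadata."""
    return f"MSCK REPAIR TABLE {table}"


def drop_partition_statement(table: str, column: str, value: str) -> str:
    """Statement that removes one partition entry from the table metadata."""
    if "'" in value:
        # Assumption: partition values are plain strings, so quoting is not handled.
        raise ValueError("partition value must not contain a single quote")
    return f"ALTER TABLE {table} DROP IF EXISTS PARTITION ({column} = '{value}')"


print(msck_repair_statement("objects_table"))
print(drop_partition_statement("objects_table", "status", "01"))
```

Either string could then be submitted through the Athena console or an API client.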

  • Question 326:

    A retail company uses an Amazon Redshift data warehouse and an Amazon S3 bucket. The company ingests retail order data into the S3 bucket every day.

    The company stores all order data at a single path within the S3 bucket. The data has more than 100 columns. The company ingests the order data from a third-party application that generates more than 30 files in CSV format every day. Each CSV file is between 50 and 70 MB in size. The company uses Amazon Redshift Spectrum to run queries that select sets of columns. Users aggregate metrics based on daily orders. Recently, users have reported that the performance of the queries has degraded. A data engineer must resolve the performance issues for the queries.

    Which combination of steps will meet this requirement with LEAST developmental effort? (Choose Two.)

    A. Configure the third-party application to create the files in a columnar format.
    B. Develop an AWS Glue ETL job to convert the multiple daily CSV files to one file for each day.
    C. Partition the order data in the S3 bucket based on order date.
    D. Configure the third-party application to create the files in JSON format.
    E. Load the JSON data into the Amazon Redshift table in a SUPER type column.

  • Question 327:

    A company has an application that is deployed on AWS. The application uses Amazon Simple Notification Service (Amazon SNS) with multiple topics. The company's security team needs to be able to audit all Publish and PublishBatch API actions for all the SNS topics. The company's application team and security team must also be able to query the audit data. The company has already established an event data store in AWS CloudTrail Lake to collect all events.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Enable management events for the SNS topics. Create a table in AWS Glue Data Catalog. Query the data by using Amazon Athena.
    B. Enable management events for the SNS topics. Use CloudTrail Lake to query the audit data.
    C. Enable data events for the SNS topics. Use CloudTrail Lake to query the audit data.
    D. Enable data events for the SNS topics. Create a table in AWS Glue Data Catalog. Query the data by using Amazon Athena.

  • Question 328:

    A lab uses IoT sensors to monitor humidity, temperature, and pressure for a project. The sensors send 100 KB of data every 10 seconds. A downstream process will read the data from an Amazon S3 bucket every 30 seconds.

    Which solution will deliver the data to the S3 bucket with the LEAST latency?

    A. Use Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose to deliver the data to the S3 bucket. Use the default buffer interval for Kinesis Data Firehose.
    B. Use Amazon Kinesis Data Streams to deliver the data to the S3 bucket. Configure the stream to use 5 provisioned shards.
    C. Use Amazon Kinesis Data Streams and call the Kinesis Client Library to deliver the data to the S3 bucket. Use a 5 second buffer interval from an application.
    D. Use Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) and Amazon Kinesis Data Firehose to deliver the data to the S3 bucket. Use a 5 second buffer interval for Kinesis Data Firehose.

  • Question 329:

    A data engineer manages Athena external tables over Hive-style partitions in Amazon S3. New partitions are added daily, and occasionally old partition locations are manually deleted.

    Which statements are correct? (Choose two.)

    A. MSCK REPAIR TABLE can add compatible new partitions that exist in S3 but are missing from table metadata.
    B. ALTER TABLE DROP PARTITION can remove stale partition metadata after partition data is deleted from S3.
    C. MSCK REPAIR TABLE automatically removes partition metadata for deleted S3 prefixes.
    D. S3 Lifecycle rules update AWS Glue Data Catalog partition metadata automatically.
    E. DynamoDB TTL is required before Athena can query partitioned S3 data.

  • Question 330:

    An ecommerce company wants to use AWS to migrate data pipelines from an on-premises environment into the AWS Cloud. The company currently uses a third-party tool in the on-premises environment to orchestrate data ingestion processes.

    The company wants a migration solution that does not require the company to manage servers. The solution must be able to orchestrate Python and Bash scripts. The solution must not require the company to refactor any code.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. AWS Lambda
    B. Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
    C. AWS Step Functions
    D. AWS Glue
