DATA-ENGINEER-ASSOCIATE Exam Details

  • Exam Code: DATA-ENGINEER-ASSOCIATE
  • Exam Name: AWS Certified Data Engineer - Associate (DEA-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 403 Q&As
  • Last Updated: May 29, 2026

Amazon DATA-ENGINEER-ASSOCIATE Online Questions & Answers

  • Question 231:

    A company uses AWS Glue ETL pipelines to process data. The company uses Amazon Athena to analyze data in an Amazon S3 bucket.

    To better understand shipping timelines, the company decides to collect and store shipping and delivery dates in addition to order data. The company adds a data quality check to ensure that shipping date is greater than order date and that delivery date is greater than shipping date. Orders that fail the quality check must be stored in a second S3 bucket.

    Which solution will meet these requirements MOST cost-effectively?

    A. Use the AWS Glue DataBrew DATEDIFF function to create two additional columns. Check the new columns.
    B. Use Athena to query all three date columns, and compare the columns.
    C. Use AWS Glue Data Quality to create a custom rule that uses the three date columns.
    D. Use an AWS Glue crawler to populate an AWS Glue Data Catalog. Use the three date columns to create a filter.
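
The date-ordering check described in Question 231 can be illustrated with a minimal Python sketch. It is not tied to any answer option or to an AWS API. The field names (`order_date`, `shipping_date`, `delivery_date`) and the helpers `dates_in_order` and `split_by_date_order` are hypothetical, chosen only to show how passing and failing orders might be separated before the failures are written to the second bucket.

```python
from datetime import date

def dates_in_order(order):
    """Return True when order_date < shipping_date < delivery_date."""
    return order["order_date"] < order["shipping_date"] < order["delivery_date"]

def split_by_date_order(orders):
    """Split orders into (passed, failed) lists based on the date check."""
    passed, failed = [], []
    for order in orders:
        (passed if dates_in_order(order) else failed).append(order)
    return passed, failed

# Hypothetical sample data: order 2 ships before it is placed, so it fails.
orders = [
    {"id": 1, "order_date": date(2024, 1, 1),
     "shipping_date": date(2024, 1, 2), "delivery_date": date(2024, 1, 5)},
    {"id": 2, "order_date": date(2024, 1, 3),
     "shipping_date": date(2024, 1, 2), "delivery_date": date(2024, 1, 6)},
]
passed, failed = split_by_date_order(orders)
```

In a real pipeline, the `failed` list would correspond to the records routed to the second S3 bucket.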

  • Question 232:

    A media company wants to build a real-time analytics pipeline to process customer activity events across the company's website and mobile app. The company wants to build a solution to ingest millions of events with minimum latency. The solution must be scalable and durable enough so that no data is lost.

    Which solution will meet these requirements in the MOST cost-effective way?

    A. Set up an Amazon Kinesis Data Streams pipeline to ingest data, process the data by using AWS Lambda functions, and store the results in Amazon Redshift for analytics.
    B. Schedule an AWS Glue job to fetch user interaction logs every 10 minutes from Amazon S3. Configure the AWS Glue job to transform and store the data in Amazon Redshift for analytics.
    C. Configure Amazon S3 Event Notifications to invoke an AWS Lambda function to process every new interaction log file. Store the result in Amazon Redshift for analytics.
    D. Deploy an Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster. Use self-managed consumers to process and distribute data in real-time. Integrate with Amazon Redshift for enhanced analytics.

  • Question 233:

    A company must copy files from an on-premises NFS server to Amazon S3 every night. The solution must transfer only changed data after the first run and must avoid custom file-transfer scripts.

    Which service should a data engineer use?

    A. AWS DataSync
    B. Amazon AppFlow
    C. Amazon Kinesis Data Streams
    D. Amazon Redshift Spectrum

  • Question 234:

    A company is building a new application that ingests CSV files into Amazon Redshift. The company has developed the frontend for the application.

    The files are stored in an Amazon S3 bucket. Files are no larger than 5 MB.

    A data engineer is developing the extract, transform, and load (ETL) pipeline for the CSV files. The data engineer configured a Redshift cluster and an AWS Lambda function that copies the data out of the files into the Redshift cluster.

    Which additional steps should the data engineer perform to meet these requirements?

    A. Configure the bucket to send S3 event notifications to Amazon EventBridge. Configure an EventBridge rule that matches S3 new object created events. Set the Lambda function as the target.
    B. Configure the S3 bucket to send S3 event notifications to an Amazon Simple Queue Service (Amazon SQS) queue. Configure the Lambda function to process the queue.
    C. Configure AWS Database Migration Service (AWS DMS) to stream new S3 objects to a data stream in Amazon Kinesis Data Streams. Set the Lambda function as the target of the data stream.
    D. Configure an Amazon EventBridge rule that matches S3 new object created events. Set an Amazon Simple Queue Service (Amazon SQS) queue as the target of the rule. Configure the Lambda function to process the queue.
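
As a rough illustration of the copy step in Question 234, the sketch below shows how a Lambda function might turn an incoming S3 "Object Created" event into a Redshift `COPY` statement. It assumes the event arrives in the EventBridge shape (`detail.bucket.name`, `detail.object.key`); an event delivered through an SQS queue would be wrapped differently. The table name, IAM role ARN, bucket, and the `build_copy_statement` helper are all hypothetical, and the sketch only builds the SQL text without running it.

```python
def build_copy_statement(event, table, iam_role_arn):
    """Build a Redshift COPY statement for the CSV object named in the event.

    Assumes an EventBridge-style S3 event; names are illustrative only.
    """
    detail = event["detail"]
    bucket = detail["bucket"]["name"]
    key = detail["object"]["key"]
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role_arn}' FORMAT AS CSV;"
    )

# Hypothetical event, trimmed to the fields the helper reads.
sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "example-ingest-bucket"},
        "object": {"key": "uploads/orders.csv", "size": 1024},
    },
}
statement = build_copy_statement(
    sample_event,
    table="orders",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
```

A production handler would also validate the object key before interpolating it into SQL.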

  • Question 235:

    A company needs to collect logs for an Amazon RDS for MySQL database and make the logs available for audits. The logs must track each user that modifies data in the database or makes changes to the database instance.

    Which solution will meet these requirements?

    A. Enable Amazon CloudWatch Logs. Create metric filters to monitor database changes and instance-level changes. Configure automated notification systems to send near real-time alerts for suspicious database operations.
    B. Configure an Amazon EventBridge rule to monitor database activity. Create an AWS Lambda function to process EventBridge events and store them in Amazon OpenSearch Service.
    C. Configure AWS CloudTrail to log API calls. Use Amazon CloudWatch Logs for basic monitoring. Use IAM policies to control access to the logs. Set up scheduled reporting for log audits.
    D. Enable and configure native Amazon RDS database audit logging. Enable Amazon CloudWatch Logs. Configure metric filters and alarms. Configure AWS CloudTrail audit logging.

  • Question 236:

    A company extracts approximately 1 TB of data every day from data sources such as SAP HANA, Microsoft SQL Server, MongoDB, Apache Kafka, and Amazon DynamoDB. Some of the data sources have undefined data schemas or data schemas that change.

    A data engineer must implement a solution that can detect the schema for these data sources. The solution must extract, transform, and load the data to an Amazon S3 bucket. The company has a service level agreement (SLA) to load the data into the S3 bucket within 15 minutes of data creation.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Use Amazon EMR to detect the schema and to extract, transform, and load the data into the S3 bucket. Create a pipeline in Apache Spark.
    B. Use AWS Glue to detect the schema and to extract, transform, and load the data into the S3 bucket. Create a pipeline in Apache Spark.
    C. Create a PySpark program in AWS Lambda to extract, transform, and load the data into the S3 bucket.
    D. Create a stored procedure in Amazon Redshift to detect the schema and to extract, transform, and load the data into a Redshift Spectrum table. Access the table from Amazon S3.

  • Question 237:

    A company uses Apache Airflow DAGs for complex data pipeline orchestration. The company wants a managed AWS service for Airflow so the data engineering team does not operate Airflow servers, schedulers, and web servers.

    Which service should the company use?

    A. Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
    B. Amazon Managed Streaming for Apache Kafka (Amazon MSK)
    C. Amazon Managed Grafana
    D. AWS Database Migration Service

  • Question 238:

    A company is planning to upgrade its Amazon Elastic Block Store (Amazon EBS) General Purpose SSD storage from gp2 to gp3. The company wants to prevent any interruptions in its Amazon EC2 instances that will cause data loss during the migration to the upgraded storage.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Create snapshots of the gp2 volumes. Create new gp3 volumes from the snapshots. Attach the new gp3 volumes to the EC2 instances.
    B. Create new gp3 volumes. Gradually transfer the data to the new gp3 volumes. When the transfer is complete, mount the new gp3 volumes to the EC2 instances to replace the gp2 volumes.
    C. Change the volume type of the existing gp2 volumes to gp3. Enter new values for volume size, IOPS, and throughput.
    D. Use AWS DataSync to create new gp3 volumes. Transfer the data from the original gp2 volumes to the new gp3 volumes.
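
For reference, changing the type of an existing EBS volume in place (the operation that option C describes) maps to the EC2 `ModifyVolume` API. The sketch below only builds the request parameters. The volume ID is a placeholder, the helper names are hypothetical, and `ec2_client` is assumed to be a boto3 EC2 client created elsewhere. The defaults of 3,000 IOPS and 125 MiB/s are the gp3 baseline values.

```python
def gp3_modify_request(volume_id, iops=3000, throughput=125):
    """Build keyword arguments for an EC2 ModifyVolume call that targets gp3."""
    return {
        "VolumeId": volume_id,
        "VolumeType": "gp3",
        "Iops": iops,              # gp3 baseline is 3,000 IOPS
        "Throughput": throughput,  # gp3 baseline is 125 MiB/s
    }

def migrate_volume(ec2_client, volume_id):
    """Send the request; ec2_client is assumed to be boto3.client('ec2')."""
    return ec2_client.modify_volume(**gp3_modify_request(volume_id))

# Placeholder volume ID, for illustration only.
request = gp3_modify_request("vol-0123456789abcdef0")
```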

  • Question 239:

    A team is implementing data quality checks in AWS Glue ETL. The team needs fixed checks for required columns and wants anomaly detection for row count trends after historical statistics are available.

    Which capabilities should the team use? (Choose two.)

    A. Define AWS Glue Data Quality rules with DQDL, such as completeness checks for required columns.
    B. Enable AWS Glue Data Quality anomaly detection for supported metrics such as row count.
    C. Use S3 Versioning as the primary way to detect missing column values.
    D. Use Redshift VACUUM to detect row count anomalies in S3 files.
    E. Use AWS Transfer Family user mappings as data quality rules.
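
To make the terminology in Question 239 concrete, the sketch below assembles a small ruleset in Data Quality Definition Language (DQDL) as a Python string. It assumes the commonly documented layout of a `Rules` block containing `IsComplete` checks and an `Analyzers` block that gathers `RowCount` statistics for later anomaly detection; verify the exact syntax against the AWS Glue Data Quality documentation before use. The column names and the `build_ruleset` helper are hypothetical.

```python
def build_ruleset(required_columns):
    """Return a DQDL ruleset string with one IsComplete rule per column.

    The Analyzers block is an assumption about how row-count statistics
    are collected; it is not taken from the question text.
    """
    rules = ",\n".join(f'    IsComplete "{col}"' for col in required_columns)
    return f"Rules = [\n{rules}\n]\nAnalyzers = [\n    RowCount\n]"

# Hypothetical required columns.
ruleset = build_ruleset(["order_id", "customer_id"])
```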

  • Question 240:

    A company has an on-premises PostgreSQL database that contains customer data. The company wants to migrate the customer data to an Amazon Redshift data warehouse. The company has established a VPN connection between the on-premises database and AWS.

    The on-premises database is continuously updated. The company must ensure that the data in Amazon Redshift is updated as quickly as possible.

    Which solution will meet these requirements?

    A. Use the pg_dump utility to generate a backup of the PostgreSQL database. Use the AWS Schema Conversion Tool (AWS SCT) to upload the backup to Amazon Redshift. Set up a cron job to perform a backup. Upload the backup to Amazon Redshift every night.
    B. Create an AWS Database Migration Service (AWS DMS) full-load task. Set Amazon Redshift as the target. Configure the task to use the change data capture (CDC) feature.
    C. Use the pg_dump utility to generate a backup of the PostgreSQL database. Upload the backup to an Amazon S3 bucket. Use the COPY command to import the data into Amazon Redshift.
    D. Create an AWS Database Migration Service (AWS DMS) full-load task. Set Amazon Redshift as the target. Configure the task to perform a full load of the database to Amazon Redshift every night.

Tips on How to Prepare for the Exams

Certification exams are becoming more important, and more employers now require them of job applicants. How can you prepare for an exam effectively, in a short time, and with less effort? How can you get an ideal result and find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides Amazon exam questions, answers, and explanations, along with complete assistance with exam preparation and certification applications. If you are unsure how to prepare for the DATA-ENGINEER-ASSOCIATE exam or how to apply for Amazon certification, visit Vcedump.com to find a solution.