DATA-ENGINEER-ASSOCIATE Exam Details

  • Exam Code: DATA-ENGINEER-ASSOCIATE
  • Exam Name: AWS Certified Data Engineer - Associate (DEA-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 403 Q&As
  • Last Updated: May 29, 2026

Amazon DATA-ENGINEER-ASSOCIATE Online Questions & Answers

  • Question 281:

    A data engineer wants to orchestrate a set of extract, transform, and load (ETL) jobs that run on AWS. The ETL jobs contain tasks that must run Apache Spark jobs on Amazon EMR, make API calls to Salesforce, and load data into Amazon Redshift.

    The ETL jobs need to handle failures and retries automatically. The data engineer needs to use Python to orchestrate the jobs.

    Which service will meet these requirements?

    A. Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
    B. AWS Step Functions
    C. AWS Glue
    D. Amazon EventBridge

  • Question 282:

    A data engineer needs to use AWS Step Functions to design an orchestration workflow. The workflow must process a large collection of data files in parallel and apply a specific transformation to each file.

    Which Step Functions state should the data engineer use to meet these requirements?

    A. Parallel state
    B. Choice state
    C. Map state
    D. Wait state
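For reference, the sketch below builds an Amazon States Language definition in which a Map state runs the same sub-workflow once per item of an input array. It is a minimal illustration, not a production workflow: the Lambda ARN, the state names, and the `$.files` input path are all hypothetical.

```python
import json

# Hypothetical Lambda ARN, used only for illustration.
TRANSFORM_FUNCTION_ARN = (
    "arn:aws:lambda:us-east-1:123456789012:function:transform-file"
)


def build_map_workflow(max_concurrency: int = 10) -> dict:
    """Return a state machine definition with a single Map state.

    The Map state iterates over the array found at $.files in the
    execution input and runs the ItemProcessor once per element.
    """
    return {
        "Comment": "Apply the same transformation to every file",
        "StartAt": "TransformFiles",
        "States": {
            "TransformFiles": {
                "Type": "Map",
                "ItemsPath": "$.files",  # assumed shape of the input
                "MaxConcurrency": max_concurrency,
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "TransformOneFile",
                    "States": {
                        "TransformOneFile": {
                            "Type": "Task",
                            "Resource": TRANSFORM_FUNCTION_ARN,
                            "End": True,
                        }
                    },
                },
                "End": True,
            }
        },
    }


definition = build_map_workflow()
print(json.dumps(definition, indent=2))
```

By contrast, a Parallel state runs a fixed set of different branches against the same input. For very large collections, the Map state can also run in `DISTRIBUTED` mode instead of `INLINE`; that variant is not shown here.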

  • Question 283:

    A company uses AWS Step Functions to orchestrate a data pipeline. The company has configured the Step Functions logs to push to Amazon CloudWatch Logs when the log level is FATAL.

    The company has enabled logs for all AWS services in the pipeline.

    A state named "preprocessing" invokes an AWS Lambda function named "preprocessing." The Lambda function preprocesses data before proceeding to the next state. The company needs to find error details if an error occurs during the data preprocessing.

    Which CloudWatch Logs log group should the company check to find the error details?

    A. The Step Functions TaskFailed event in the /aws/vendedlogs/states log group
    B. The AWS CloudTrail logs SendTaskFailure event in the CloudTrail/logs/preprocessing log group
    C. The Lambda logs in the /aws/lambda/preprocessing log group
    D. The Step Functions TaskSucceeded event in the /aws/vendedlogs/states log group
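As background for the log group names in the options: by default, Lambda writes a function's logs to a CloudWatch Logs log group named `/aws/lambda/<function-name>`. The helper below is a small sketch of that naming convention only. It assumes the function uses the default log group and has not been configured with a custom one.

```python
LAMBDA_LOG_GROUP_PREFIX = "/aws/lambda/"


def default_lambda_log_group(function_name: str) -> str:
    """Return the default CloudWatch Logs log group for a Lambda function.

    Assumes the function has not been configured with a custom log group.
    """
    if not function_name:
        raise ValueError("function_name must not be empty")
    return f"{LAMBDA_LOG_GROUP_PREFIX}{function_name}"


print(default_lambda_log_group("preprocessing"))
```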

  • Question 284:

    A company is setting up a new Amazon SageMaker Unified Studio domain. Each of the company's business units needs isolated control over its own assets, projects, and metadata. Specific datasets must be shareable with other business units upon approval. The company also requires centralized user authentication and identity mapping.

    Which solution will meet these requirements?

    A. Configure each business unit as a domain unit with delegated ownership and fine-grained permissions policies. Give users the ability to share assets across domain units with explicit access control. Assign API keys to users for authentication to access the domain portal.
    B. Configure business units as separate domain units with owner permissions. Restrict projects exclusively to owners to prevent data sharing between domains. Configure AWS IAM Identity Center for centralized authentication. Map user profiles to their respective domain units.
    C. Configure business units to be represented as separate domains. Establish isolated environments with no shared administrative policies. Configure AWS IAM Identity Center for centralized authentication. Delegate administration at the domain level.
    D. Configure each business unit as a separate domain unit to manage permissions on assets, projects, and metadata. Configure AWS IAM Identity Center for centralized authentication. Map user profiles to their respective domain units. Enable cross-business unit sharing through access requests. Instruct domain unit owners to approve or deny the requests.

  • Question 285:

    A data engineer finished testing an Amazon Redshift stored procedure that processes and inserts data into a table that is not mission critical. The engineer wants to automatically run the stored procedure on a daily basis.

    Which solution will meet this requirement in the MOST cost-effective way?

    A. Create an AWS Lambda function to schedule a cron job to run the stored procedure.
    B. Schedule and run the stored procedure by using the Amazon Redshift Data API in an Amazon EC2 Spot Instance.
    C. Use query editor v2 to run the stored procedure on a schedule.
    D. Schedule an AWS Glue Python shell job to run the stored procedure.

  • Question 286:

    A data engineer needs to build an extract, transform, and load (ETL) job. The ETL job will process daily incoming .csv files that users upload to an Amazon S3 bucket. The size of each S3 object is less than 100 MB.

    Which solution will meet these requirements MOST cost-effectively?

    A. Write a custom Python application. Host the application on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
    B. Write a PySpark ETL script. Host the script on an Amazon EMR cluster.
    C. Write an AWS Glue PySpark job. Use Apache Spark to transform the data.
    D. Write an AWS Glue Python shell job. Use pandas to transform the data.

  • Question 287:

    A data engineer needs to onboard a new data producer into AWS. The data producer needs to migrate data products to AWS.

    The data producer maintains many data pipelines that support a business application. Each pipeline must have service accounts and their corresponding credentials. The data engineer must establish a secure connection from the data producer's on-premises data center to AWS. The data engineer must not use the public internet to transfer data from an on-premises data center to AWS.

    Which solution will meet these requirements?

    A. Instruct the new data producer to create Amazon Machine Images (AMIs) on Amazon Elastic Container Service (Amazon ECS) to store the code base of the application. Create security groups in a public subnet that allow connections only to the on-premises data center.
    B. Create an AWS Direct Connect connection to the on-premises data center. Store the service account credentials in AWS Secrets Manager.
    C. Create a security group in a public subnet. Configure the security group to allow only connections from the CIDR blocks that correspond to the data producer. Create Amazon S3 buckets that contain presigned URLs that have one-day expiration dates.
    D. Create an AWS Direct Connect connection to the on-premises data center. Store the application keys in AWS Secrets Manager. Create Amazon S3 buckets that contain presigned URLs that have one-day expiration dates.

  • Question 288:

    A company wants to implement real-time analytics capabilities. The company wants to use Amazon Kinesis Data Streams and Amazon Redshift to ingest and process streaming data at the rate of several gigabytes per second. The company wants to derive near real-time insights by using existing business intelligence (BI) and analytics tools.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Use Kinesis Data Streams to stage data in Amazon S3. Use the COPY command to load data from Amazon S3 directly into Amazon Redshift to make the data immediately available for real-time analysis.
    B. Access the data from Kinesis Data Streams by using SQL queries. Create materialized views directly on top of the stream. Refresh the materialized views regularly to query the most recent stream data.
    C. Create an external schema in Amazon Redshift to map the data from Kinesis Data Streams to an Amazon Redshift object. Create a materialized view to read data from the stream. Set the materialized view to auto refresh.
    D. Connect Kinesis Data Streams to Amazon Kinesis Data Firehose. Use Kinesis Data Firehose to stage the data in Amazon S3. Use the COPY command to load the data from Amazon S3 to a table in Amazon Redshift.
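The external schema and auto-refreshing materialized view mentioned in the options correspond to Amazon Redshift streaming ingestion. The sketch below only assembles the two SQL statements as strings so their shape is visible. The schema, stream, view, and role names are hypothetical, and the string formatting is for illustration rather than for use with untrusted input.

```python
def streaming_ingestion_sql(schema, stream_name, view_name, iam_role_arn):
    """Build the two statements used for Redshift streaming ingestion.

    Returns [create external schema, create materialized view].
    All identifiers are supplied by the caller and are not validated.
    """
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM KINESIS\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )
    create_view = (
        f"CREATE MATERIALIZED VIEW {view_name} AUTO REFRESH YES AS\n"
        f"SELECT approximate_arrival_timestamp,\n"
        f"       JSON_PARSE(kinesis_data) AS payload\n"
        f'FROM {schema}."{stream_name}";'
    )
    return [create_schema, create_view]


# Hypothetical names, for illustration only.
statements = streaming_ingestion_sql(
    schema="kds",
    stream_name="clickstream",
    view_name="clickstream_mv",
    iam_role_arn="arn:aws:iam::123456789012:role/redshift-streaming",
)
for statement in statements:
    print(statement, end="\n\n")
```

With this pattern the materialized view reads directly from the stream, so no intermediate staging in Amazon S3 or separate COPY step is involved.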

  • Question 289:

    A company needs a solution to manage costs for an existing Amazon DynamoDB table. The company also needs to control the size of the table. The solution must not disrupt any ongoing read or write operations.

    The company wants to use a solution that automatically deletes data from the table after 1 month.

    Which solution will meet these requirements with the LEAST ongoing maintenance?

    A. Use the DynamoDB TTL feature to automatically expire data based on timestamps.
    B. Configure a scheduled Amazon EventBridge rule to invoke an AWS Lambda function to check for data that is older than 1 month. Configure the Lambda function to delete old data.
    C. Configure a stream on the DynamoDB table to invoke an AWS Lambda function. Configure the Lambda function to delete data in the table that is older than 1 month.
    D. Use an AWS Lambda function to periodically scan the DynamoDB table for data that is older than 1 month. Configure the Lambda function to delete old data.
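DynamoDB TTL works from a per-item attribute of type Number that holds the expiry time as a Unix epoch timestamp in seconds. The sketch below shows how an application might stamp that attribute when it writes an item. The attribute name `expires_at` is hypothetical, and "1 month" is approximated as 30 days.

```python
from datetime import datetime, timedelta, timezone

TTL_ATTRIBUTE = "expires_at"    # hypothetical attribute name
RETENTION = timedelta(days=30)  # "1 month" approximated as 30 days


def with_ttl(item: dict, created_at: datetime) -> dict:
    """Return a copy of item with an epoch-seconds expiry attribute."""
    expires = created_at + RETENTION
    return {**item, TTL_ATTRIBUTE: int(expires.timestamp())}


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
item = with_ttl({"pk": "order#1"}, created)
print(item)
```

Once TTL is enabled on the table for that attribute, DynamoDB removes expired items in the background without consuming write throughput. Deletion is not instantaneous, so queries that must exclude expired items may still need to filter on the attribute.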

  • Question 290:

    A company stores Apache Parquet files in an Amazon S3 data lake. The data lake receives thousands of files from multiple sources every hour. The files range in size from 50 KB to 100 KB.

    The company is evaluating the implementation of Apache Iceberg tables for the data lake. The company is using AWS Glue Data Catalog as part of the evaluation. The company needs a solution to optimize query performance in Iceberg. The solution must ensure that Iceberg table performance does not degrade when more files are added over time.

    Which solution will meet these requirements?

    A. Use an AWS Glue job to compact the files into a standard size of 512 MB at the end of each day. Run an AWS Glue crawler to update the Data Catalog.
    B. Configure the Data Catalog to automatically compact the files every minute.
    C. Configure Iceberg table properties to enable automatic compaction based on thresholds for file size and the number of files.
    D. Implement a partition strategy in Amazon S3. Run an AWS Glue crawler to update the Data Catalog every 5 minutes.
