DATA-ENGINEER-ASSOCIATE Exam Details

  • Exam Code: DATA-ENGINEER-ASSOCIATE
  • Exam Name: AWS Certified Data Engineer - Associate (DEA-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 403 Q&As
  • Last Updated: May 29, 2026

Amazon DATA-ENGINEER-ASSOCIATE Online Questions & Answers

  • Question 151:

    A data engineer is optimizing query performance in Amazon Athena notebooks that use Apache Spark to analyze large datasets that are stored in Amazon S3. The data is partitioned. An AWS Glue crawler updates the partitions.

    The data engineer wants to minimize the amount of data that is scanned to improve efficiency of Athena queries.

    Which solution will meet these requirements?

    A. Apply partition filters in the queries.
    B. Increase the frequency of AWS Glue crawler invocations to update the data catalog more often.
    C. Organize the data that is in Amazon S3 by using a nested directory structure.
    D. Configure Spark to use in-memory caching for frequently accessed data.
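
Option A mentions partition filters. The toy sketch below uses plain Python, not the Spark or Athena API, to show the idea: when object keys carry Hive-style `name=value` partition segments, a filter on those values excludes whole directories before any file is read. The object keys and the `parse_partitions` and `prune` helpers are hypothetical and exist only for illustration.

```python
from typing import Dict, List


def parse_partitions(key: str) -> Dict[str, str]:
    """Extract Hive-style partition values (name=value) from an object key."""
    parts = {}
    for segment in key.split("/"):
        if "=" in segment:
            name, value = segment.split("=", 1)
            parts[name] = value
    return parts


def prune(keys: List[str], filters: Dict[str, str]) -> List[str]:
    """Keep only the keys whose partition values match every filter."""
    return [
        key
        for key in keys
        if all(parse_partitions(key).get(n) == v for n, v in filters.items())
    ]


# Hypothetical object keys, for illustration only.
keys = [
    "sales/year=2024/month=01/part-0000.parquet",
    "sales/year=2024/month=02/part-0000.parquet",
    "sales/year=2025/month=01/part-0000.parquet",
]

# Only objects under year=2024/month=02 remain candidates for scanning.
selected = prune(keys, {"year": "2024", "month": "02"})
print(selected)
```

A query engine performs the equivalent pruning itself when the query filters on partition columns; the sketch only makes the effect on the set of scanned objects visible.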

  • Question 152:

    A company is building data processing pipelines by using AWS Glue. The pipelines access data stored in Amazon S3. The company has organized the data into folders with prefixes that represent different classification levels. The company needs to restrict AWS Glue jobs to access only specific prefixes based on the data classification. The company must also restrict access to business hours (9 AM to 5 PM).

    Which elements must the company include in a custom IAM policy to meet these requirements?

    A. A Resource element with S3 object Amazon Resource Name (ARN) patterns that use wildcards for each prefix and a Condition element that uses the $util.time variable with TimeGreaterThan and TimeLessThan operators
    B. A Resource element with S3 object Amazon Resource Name (ARN) patterns that use wildcards for each prefix and a Condition element that uses the aws:CurrentTime condition key with DateGreaterThan and DateLessThan operators
    C. A Condition element that uses the s3:prefix condition key to restrict folder access and aws:CurrentTime with DateGreaterThanEquals and DateLessThanEquals to restrict hours of operation
    D. A Condition element that uses the s3:ResourceAccount condition key to restrict bucket access and a Deny statement that applies outside of business hours
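
Several options refer to `Resource` ARN patterns and the `aws:CurrentTime` condition key. As a minimal sketch, the policy below combines per-prefix object ARNs with a date-based time window. The bucket name, prefix names, and timestamps are hypothetical. Note that date condition operators compare against absolute timestamps, so this sketch covers a single day's window; it does not express a recurring daily schedule.

```python
import json

# Hypothetical bucket and prefix names; substitute your own.
BUCKET = "example-data-bucket"
ALLOWED_PREFIXES = ["confidential", "internal"]

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            # One object ARN pattern per allowed prefix.
            "Resource": [f"arn:aws:s3:::{BUCKET}/{p}/*" for p in ALLOWED_PREFIXES],
            # Assumed single-day window, 09:00 to 17:00 UTC.
            "Condition": {
                "DateGreaterThan": {"aws:CurrentTime": "2026-06-01T09:00:00Z"},
                "DateLessThan": {"aws:CurrentTime": "2026-06-01T17:00:00Z"},
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```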

  • Question 153:

    A company uses Amazon Athena for one-time queries against data that is in Amazon S3. The company has several use cases. The company must implement permission controls to separate query processes and access to query history among users, teams, and applications that are in the same AWS account.

    Which solution will meet these requirements?

    A. Create an S3 bucket for each use case. Create an S3 bucket policy that grants permissions to appropriate individual IAM users. Apply the S3 bucket policy to the S3 bucket.
    B. Create an Athena workgroup for each use case. Apply tags to the workgroup. Create an IAM policy that uses the tags to apply appropriate permissions to the workgroup.
    C. Create an IAM role for each use case. Assign appropriate permissions to the role for each use case. Associate the role with Athena.
    D. Create an AWS Glue Data Catalog resource policy that grants permissions to appropriate individual IAM users for each use case. Apply the resource policy to the specific tables that Athena uses.
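
Option B describes tag-based permissions on Athena workgroups. A rough sketch of such an IAM policy follows, assuming a hypothetical tag key `team` and value `analytics`; the action list is an illustrative subset, not a complete set of permissions for running queries.

```python
import json


def workgroup_policy(tag_key: str, tag_value: str) -> dict:
    """Build an IAM policy that allows query actions only on workgroups
    carrying the given tag (tag key and value are assumptions)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:ListQueryExecutions",
                ],
                "Resource": "arn:aws:athena:*:*:workgroup/*",
                # Match on the tag attached to the workgroup resource.
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


tag_policy = workgroup_policy("team", "analytics")
print(json.dumps(tag_policy, indent=2))
```

Because query history is scoped to a workgroup, restricting `athena:ListQueryExecutions` this way also limits which history a principal can list.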

  • Question 154:

    A data engineer needs to join data from multiple sources to perform a one-time analysis job. The data is stored in Amazon DynamoDB, Amazon RDS, Amazon Redshift, and Amazon S3.

    Which solution will meet this requirement MOST cost-effectively?

    A. Use an Amazon EMR provisioned cluster to read from all sources. Use Apache Spark to join the data and perform the analysis.
    B. Copy the data from DynamoDB, Amazon RDS, and Amazon Redshift into Amazon S3. Run Amazon Athena queries directly on the S3 files.
    C. Use Amazon Athena Federated Query to join the data from all data sources.
    D. Use Redshift Spectrum to query data from DynamoDB, Amazon RDS, and Amazon S3 directly from Redshift.

  • Question 155:

    A company stores time-series data that is collected from streaming services in an Amazon S3 bucket. The company must ensure that only workloads that are deployed within the company's VPC can access the data.

    Which solution will meet this requirement?

    A. Create an S3 bucket policy that uses a condition to allow access only to traffic that originates from the company's VPC.
    B. Apply a security group to the S3 bucket that allows connections only from the company's VPC CIDR block.
    C. Define an IAM policy that denies access to all users unless the request originates from within the company's VPC.
    D. Use a network ACL on the VPC subnets to allow only specific resources to access the S3 bucket.
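
Option A refers to a bucket policy condition on the originating VPC. One way to sketch this is a `Deny` statement keyed on `aws:SourceVpc`. The bucket name and VPC ID below are hypothetical, and the sketch assumes requests reach S3 through a VPC endpoint, since that key is only present on such requests.

```python
import json


def vpc_only_bucket_policy(bucket: str, vpc_id: str) -> dict:
    """Deny every S3 action on the bucket unless the request comes from vpc_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideVpc",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # Requests without a matching source VPC are denied.
                "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }


# Hypothetical identifiers, for illustration only.
bucket_policy = vpc_only_bucket_policy("example-timeseries-bucket", "vpc-0123456789abcdef0")
print(json.dumps(bucket_policy, indent=2))
```

A blanket deny like this also blocks console and administrative access from outside the VPC, so a real policy would usually need carefully scoped exceptions.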

  • Question 156:

    A retail company uses Amazon Aurora PostgreSQL to process and store live transactional data. The company uses an Amazon Redshift cluster for a data warehouse.

    An extract, transform, and load (ETL) job runs every morning to update the Redshift cluster with new data from the PostgreSQL database. The company has grown rapidly and needs to cost optimize the Redshift cluster.

    A data engineer needs to create a solution to archive historical data. The data engineer must be able to run analytics queries that effectively combine data from live transactional data in PostgreSQL, current data in Redshift, and archived historical data. The solution must keep only the most recent 15 months of data in Amazon Redshift to reduce costs.

    Which combination of steps will meet these requirements? (Choose two.)

    A. Configure the Amazon Redshift Federated Query feature to query live transactional data that is in the PostgreSQL database.
    B. Configure Amazon Redshift Spectrum to query live transactional data that is in the PostgreSQL database.
    C. Schedule a monthly job to copy data that is older than 15 months to Amazon S3 by using the UNLOAD command. Delete the old data from the Redshift cluster. Configure Amazon Redshift Spectrum to access historical data in Amazon S3.
    D. Schedule a monthly job to copy data that is older than 15 months to Amazon S3 Glacier Flexible Retrieval by using the UNLOAD command. Delete the old data from the Redshift cluster. Configure Redshift Spectrum to access historical data from S3 Glacier Flexible Retrieval.
    E. Create a materialized view in Amazon Redshift that combines live, current, and historical data from different sources.
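
Options C and D both rely on the Redshift `UNLOAD` command. The helper below is a minimal sketch that assembles such a statement for rows older than a cutoff. The table name, date column, S3 prefix, and role ARN are hypothetical, and the inner query must not contain single quotes because `UNLOAD` takes it as a quoted string.

```python
def build_unload_sql(
    table: str,
    date_column: str,
    s3_prefix: str,
    iam_role_arn: str,
    months: int = 15,
) -> str:
    """Return an UNLOAD statement that exports rows older than `months` months."""
    inner = (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} < DATEADD(month, -{months}, CURRENT_DATE)"
    )
    return (
        f"UNLOAD ('{inner}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS PARQUET;"
    )


# Hypothetical names, for illustration only.
unload_sql = build_unload_sql(
    table="sales",
    date_column="sale_date",
    s3_prefix="s3://example-archive/sales/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleUnloadRole",
)
print(unload_sql)
```

A scheduled job would run this statement first and delete the same rows from the cluster only after the export succeeds.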

  • Question 157:

    A technology company currently uses Amazon Kinesis Data Streams to collect log data in real time. The company wants to use Amazon Redshift for downstream real-time queries and to enrich the log data.

    Which solution will ingest data into Amazon Redshift with the LEAST operational overhead?

    A. Set up an Amazon Data Firehose delivery stream to send data to a Redshift provisioned cluster table.
    B. Set up an Amazon Data Firehose delivery stream to send data to Amazon S3. Configure a Redshift provisioned cluster to load data every minute.
    C. Configure Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) to send data directly to a Redshift provisioned cluster table.
    D. Use Amazon Redshift streaming ingestion from Kinesis Data Streams and present the data as a materialized view.
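
Option D mentions Redshift streaming ingestion with a materialized view. As a hedged sketch, the helper below produces the two SQL statements such a setup typically involves: an external schema mapped to Kinesis and an auto-refreshing materialized view over the stream. The schema, stream, view, and role names are hypothetical, and the exact parsing expression for `kinesis_data` may differ depending on the payload format.

```python
from typing import List


def streaming_ingestion_sql(
    schema: str, stream: str, view: str, iam_role_arn: str
) -> List[str]:
    """Return the SQL statements for a Kinesis-backed materialized view."""
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM KINESIS\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )
    create_view = (
        f"CREATE MATERIALIZED VIEW {view} AUTO REFRESH YES AS\n"
        f"SELECT approximate_arrival_timestamp,\n"
        f"       JSON_PARSE(kinesis_data) AS payload\n"
        f'FROM {schema}."{stream}";'
    )
    return [create_schema, create_view]


# Hypothetical names, for illustration only.
statements = streaming_ingestion_sql(
    schema="kds",
    stream="log-stream",
    view="log_events_mv",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleStreamingRole",
)
for statement in statements:
    print(statement, end="\n\n")
```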

  • Question 158:

    A data governance team must reduce the risk of exposing sensitive customer data in an S3 data lake. The team needs to discover potential PII and enforce fine-grained access to the governed tables.

    Which services should the team use? (Choose two.)

    A. Amazon Macie
    B. AWS Lake Formation
    C. AWS Budgets
    D. Amazon Route 53
    E. AWS CodeDeploy

  • Question 159:

    A data engineer maintains custom Python scripts that perform a data formatting process that many AWS Lambda functions use. When the data engineer needs to modify the Python scripts, the data engineer must manually update all the Lambda functions.

    The data engineer requires a less manual way to update the Lambda functions.

    Which solution will meet this requirement?

    A. Store the custom Python scripts in a shared Amazon S3 bucket. Store a pointer to the custom scripts in the execution context object.
    B. Package the custom Python scripts into Lambda layers. Apply the Lambda layers to the Lambda functions.
    C. Store the custom Python scripts in a shared Amazon S3 bucket. Store a pointer to the custom scripts in environment variables.
    D. Assign the same alias to each Lambda function. Call each Lambda function by specifying the function's alias.
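
Option B refers to Lambda layers. For Python runtimes, a layer archive is expected to place modules under a top-level `python/` directory. The sketch below builds such an archive in memory; the module name `formatting.py` and its contents are hypothetical stand-ins for the shared scripts.

```python
import io
import zipfile
from typing import Dict


def build_layer_zip(modules: Dict[str, str]) -> bytes:
    """Package source files under python/ so a Python runtime can import them."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for filename, source in modules.items():
            archive.writestr(f"python/{filename}", source)
    return buffer.getvalue()


# Hypothetical shared module, for illustration only.
layer_bytes = build_layer_zip(
    {"formatting.py": "def normalize(value):\n    return value.strip().lower()\n"}
)

# Inspect the archive layout.
layer_names = zipfile.ZipFile(io.BytesIO(layer_bytes)).namelist()
print(layer_names)
```

The resulting bytes could then be published as a new layer version, and each function updated to reference that version, so the script change is made in one place.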

  • Question 160:

    A data engineer must orchestrate a data pipeline that consists of one AWS Lambda function and one AWS Glue job. The solution must integrate with AWS services.

    Which solution will meet these requirements with the LEAST management overhead?

    A. Use an AWS Step Functions workflow that includes a state machine. Configure the state machine to run the Lambda function and then the AWS Glue job.
    B. Use an Apache Airflow workflow that is deployed on an Amazon EC2 instance. Define a directed acyclic graph (DAG) in which the first task is to call the Lambda function and the second task is to call the AWS Glue job.
    C. Use an AWS Glue workflow to run the Lambda function and then the AWS Glue job.
    D. Use an Apache Airflow workflow that is deployed on Amazon Elastic Kubernetes Service (Amazon EKS). Define a directed acyclic graph (DAG) in which the first task is to call the Lambda function and the second task is to call the AWS Glue job.
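
Option A describes a Step Functions state machine that runs the Lambda function and then the AWS Glue job. A minimal sketch of such a definition in Amazon States Language follows, built as a Python dictionary. The function and job names are hypothetical; the `.sync` suffix on the Glue integration makes the state wait for the job run to finish.

```python
import json


def build_state_machine(function_name: str, glue_job_name: str) -> dict:
    """Return a two-step definition: invoke Lambda, then run the Glue job."""
    return {
        "Comment": "Run a Lambda function, then an AWS Glue job",
        "StartAt": "RunLambda",
        "States": {
            "RunLambda": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": function_name},
                "Next": "RunGlueJob",
            },
            "RunGlueJob": {
                "Type": "Task",
                # .sync waits for the Glue job run to complete.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                "End": True,
            },
        },
    }


# Hypothetical resource names, for illustration only.
definition = build_state_machine("example-format-function", "example-glue-job")
print(json.dumps(definition, indent=2))
```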
