DATA-ENGINEER-ASSOCIATE Exam Details

  • Exam Code: DATA-ENGINEER-ASSOCIATE
  • Exam Name: AWS Certified Data Engineer - Associate (DEA-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 403 Q&As
  • Last Updated: May 29, 2026

Amazon DATA-ENGINEER-ASSOCIATE Online Questions & Answers

  • Question 181:

    A company needs to store semi-structured transactional data for an application in a database. The database must be serverless. The application writes the data infrequently, but it reads the data frequently.

    The application must retrieve the data within milliseconds.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Store the data in an Amazon S3 Standard bucket. Enable S3 Transfer Acceleration.
    B. Store the data in an Amazon S3 Apache Iceberg table. Enable S3 Transfer Acceleration.
    C. Store the data in an Amazon RDS for MySQL cluster. Configure RDS Optimized Reads for the cluster.
    D. Store the data in an Amazon DynamoDB table. Configure a DynamoDB Accelerator cache.

  • Question 182:

    A financial company recently added more features to its mobile app. The new features required the company to create a new topic in an existing Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster.

    A few days after the company added the new topic, Amazon CloudWatch raised an alarm on the RootDiskUsed metric for the MSK cluster.

    How should the company address the CloudWatch alarm?

    A. Expand the storage of the MSK broker. Configure the MSK cluster storage to expand automatically.
    B. Expand the storage of the Apache ZooKeeper nodes.
    C. Update the MSK broker instance to a larger instance type. Restart the MSK cluster.
    D. Specify the Target-Volume-in-GiB parameter for the existing topic.

  • Question 183:

    A company has a gaming application that stores data in Amazon DynamoDB tables. A data engineer needs to ingest the game data into an Amazon OpenSearch Service cluster. Data updates must occur in near real time.

    Which solution will meet these requirements?

    A. Use AWS Step Functions to periodically export data from the Amazon DynamoDB tables to an Amazon S3 bucket. Use an AWS Lambda function to load the data into Amazon OpenSearch Service.
    B. Configure an AWS Glue job to have a source of Amazon DynamoDB and a destination of Amazon OpenSearch Service to transfer data in near real time.
    C. Use Amazon DynamoDB Streams to capture table changes. Use an AWS Lambda function to process and update the data in Amazon OpenSearch Service.
    D. Use a custom OpenSearch plugin to sync data from the Amazon DynamoDB tables.
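
    Option C describes a pipeline from DynamoDB Streams through AWS Lambda to OpenSearch Service. The following is a minimal sketch of what the transformation step in such a handler might look like. It assumes the stream is configured to include new images, and the index name and the `player_id` key attribute used in the example are hypothetical. Sending the payload to the cluster is deliberately left out.

```python
import json
from decimal import Decimal

INDEX_NAME = "game-data"  # hypothetical index name, not from the question


def from_dynamodb_json(value):
    """Convert one DynamoDB-typed attribute value into a plain Python value."""
    type_tag, inner = next(iter(value.items()))
    if type_tag == "S":
        return inner
    if type_tag == "N":
        number = Decimal(inner)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == "BOOL":
        return inner
    if type_tag == "NULL":
        return None
    if type_tag == "M":
        return {k: from_dynamodb_json(v) for k, v in inner.items()}
    if type_tag == "L":
        return [from_dynamodb_json(v) for v in inner]
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def build_bulk_lines(event):
    """Turn stream records into NDJSON lines in the shape of the _bulk API."""
    lines = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        keys = {k: from_dynamodb_json(v) for k, v in change["Keys"].items()}
        # Derive a stable document ID from the table's key attributes.
        doc_id = "|".join(str(keys[k]) for k in sorted(keys))
        action = {"_index": INDEX_NAME, "_id": doc_id}
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": action}))
        else:
            # INSERT and MODIFY both carry NewImage when the stream view
            # type includes new images (an assumption of this sketch).
            doc = {k: from_dynamodb_json(v) for k, v in change["NewImage"].items()}
            lines.append(json.dumps({"index": action}))
            lines.append(json.dumps(doc))
    return lines


def handler(event, context):
    """Lambda entry point; the HTTP call to the cluster is omitted here."""
    lines = build_bulk_lines(event)
    return {"bulk_lines": len(lines)}
```

    Indexing by the table key means that an INSERT followed by a MODIFY for the same item overwrites a single document rather than creating a duplicate.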

  • Question 184:

    A company stores customer data that contains personally identifiable information (PII) in an Amazon Redshift cluster. The company's marketing, claims, and analytics teams need to be able to access the customer data.

    The marketing team should have access to obfuscated claim information but should have full access to customer contact information.

    The claims team should have access to customer information for each claim that the team processes.

    The analytics team should have access only to obfuscated PII data.

    Which solution will enforce these data access requirements with the LEAST administrative overhead?

    A. Create a separate Redshift cluster for each team. Load only the required data for each team. Restrict access to clusters based on the teams.
    B. Create views that include required fields for each of the data requirements. Grant the teams access only to the view that each team requires.
    C. Create a separate Amazon Redshift database role for each team. Define masking policies that apply for each team separately. Attach appropriate masking policies to each team role.
    D. Move the customer data to an Amazon S3 bucket. Use AWS Lake Formation to create a data lake. Use fine-grained security capabilities to grant each team appropriate permissions to access the data.

  • Question 185:

    A company processes 500 GB of audience and advertising data daily, storing CSV files in Amazon S3 with schemas registered in AWS Glue Data Catalog. They need to convert these files to Apache Parquet format and store them in an S3 bucket.

    The solution requires a long-running workflow with 15 GiB memory capacity to process the data concurrently, followed by a correlation process that begins only after the first two processes complete.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to orchestrate the workflow by using AWS Glue. Configure AWS Glue to begin the third process after the first two processes have finished.
    B. Use Amazon EMR to run each process in the workflow. Create an Amazon Simple Queue Service (Amazon SQS) queue to handle messages that indicate the completion of the first two processes. Configure an AWS Lambda function to process the SQS queue by running the third process.
    C. Use AWS Glue workflows to run the first two processes in parallel. Ensure that the third process starts after the first two processes have finished.
    D. Use AWS Step Functions to orchestrate a workflow that uses multiple AWS Lambda functions. Ensure that the third process starts after the first two processes have finished.
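
    The scenario's ordering constraint (two processes run concurrently, and a third starts only after both finish) is a fan-in dependency. The sketch below illustrates only that control flow with Python's standard library. The three function names and their return values are hypothetical stand-ins, not part of the question, and no AWS service is called.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_audience():
    """Hypothetical stand-in for the first conversion process."""
    return "audience.parquet"


def convert_advertising():
    """Hypothetical stand-in for the second conversion process."""
    return "advertising.parquet"


def correlate(inputs):
    """Hypothetical stand-in for the correlation process."""
    return f"correlated({', '.join(inputs)})"


def run_workflow():
    # Start both conversions at once, then block until both have finished.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(convert_audience), pool.submit(convert_advertising)]
        outputs = [future.result() for future in futures]
    # The third step runs only after both results are available.
    return correlate(outputs)
```

    Calling `future.result()` on each future is what enforces the wait. If either conversion raised an exception, it would be re-raised here and the correlation step would never start.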

  • Question 186:

    An analytics workload in Athena scans several terabytes of CSV files each day. Most queries read only a few columns and filter by event_date. The team wants to reduce scanned data and improve query performance.

    Which data layout should the data engineer choose?

    A. Convert the data to Apache Parquet and partition the S3 layout by event_date.
    B. Combine all CSV files into one uncompressed file in a single S3 prefix.
    C. Convert the data to JSON and remove all partition prefixes.
    D. Store the data in plain text files with random object key prefixes only.
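
    Option A mentions partitioning the S3 layout by event_date. As a rough sketch, a Hive-style layout puts the partition value in the object key as `event_date=<value>/`, so a query that filters on a date range only needs to read the matching prefixes. The bucket path below is a hypothetical example, and the helper names are illustrative.

```python
from datetime import date, timedelta


def partition_prefix(base, event_date):
    """Build the Hive-style prefix for one event_date partition."""
    return f"{base.rstrip('/')}/event_date={event_date.isoformat()}/"


def prefixes_for_range(base, start, end):
    """List the partition prefixes a filter on [start, end] would touch."""
    days = (end - start).days
    return [partition_prefix(base, start + timedelta(days=i)) for i in range(days + 1)]


if __name__ == "__main__":
    base = "s3://example-bucket/events"  # hypothetical location
    for prefix in prefixes_for_range(base, date(2024, 1, 14), date(2024, 1, 16)):
        print(prefix)
```

    A three-day filter resolves to three prefixes regardless of how many other days are stored. A columnar format such as Parquet then narrows the read further to the columns a query actually selects.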

  • Question 187:

    A company has an Amazon Redshift data warehouse that users access by using a variety of IAM roles.

    More than 100 users access the data warehouse every day.

    The company wants to control user access to the objects based on each user's job role, permissions, and how sensitive the data is.

    Which solution will meet these requirements?

    A. Use the role-based access control (RBAC) feature of Amazon Redshift.
    B. Use the row-level security (RLS) feature of Amazon Redshift.
    C. Use the column-level security (CLS) feature of Amazon Redshift.
    D. Use dynamic data masking policies in Amazon Redshift.

  • Question 188:

    A company aggregates high-frequency sensor telemetry into an Amazon S3 data lake. Each sensor stream emits structured records every hour. The records include metadata such as sensor category, unit ID, operational state, event timestamp, and site location. The data scales up to millions of records each day. The company runs complex queries each day to uncover performance insights specific to sensor categories.

    Which solution will meet these requirements with the FASTEST query execution time?

    A. Persist the data in Apache ORC format. Partition the data by date. Sort the data by sensor category.
    B. Persist the data in CSV format. Partition the data by date. Sort the data by operational status.
    C. Persist the data in Parquet format. Partition the data by sensor category. Sort the data by date.
    D. Persist the data in CSV format. Partition the data by date. Sort the data by sensor category.

  • Question 189:

    A company uses an Amazon Redshift cluster as a data warehouse that is shared across two departments.

    To comply with a security policy, each department must have unique access permissions.

    Department A must have access to tables and views for Department

    A. Group tables and views for each department into dedicated schemas. Manage permissions at the schema level.
    B. Group tables and views for each department into dedicated databases. Manage permissions at the database level.
    C. Update the names of the tables and views to follow a naming convention that contains the department names. Manage permissions based on the new naming convention.
    D. Create an IAM user group for each department. Use identity-based IAM policies to grant table and view permissions based on the IAM user group.

  • Question 190:

    An insurance company stores transaction data that the company compressed with gzip.

    The company needs to query the transaction data for occasional audits.

    Which solution will meet this requirement in the MOST cost-effective way?

    A. Store the data in Amazon S3 Glacier Flexible Retrieval. Use Amazon S3 Glacier Select to query the data.
    B. Store the data in Amazon S3. Use Amazon S3 Select to query the data.
    C. Store the data in Amazon S3. Use Amazon Athena to query the data.
    D. Store the data in Amazon S3 Glacier Instant Retrieval. Use Amazon Athena to query the data.
