DATA-ENGINEER-ASSOCIATE Exam Details

  • Exam Code: DATA-ENGINEER-ASSOCIATE
  • Exam Name: AWS Certified Data Engineer - Associate (DEA-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 403 Q&As
  • Last Updated: May 29, 2026

Amazon DATA-ENGINEER-ASSOCIATE Online Questions & Answers

  • Question 211:

    A company analyzes data in a data lake every quarter to perform inventory assessments. A data engineer uses AWS Glue DataBrew to detect any personally identifiable information (PII) about customers within the data. The company's privacy policy considers some custom categories of information to be PII. However, the categories are not included in standard DataBrew data quality rules.

    The data engineer needs to modify the current process to scan for the custom PII categories across multiple datasets within the data lake.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Manually review the data for custom PII categories.
    B. Implement custom data quality rules in DataBrew. Apply the custom rules across datasets.
    C. Develop custom Python scripts to detect the custom PII categories. Call the scripts from DataBrew.
    D. Implement regex patterns to extract PII information from fields during extract, transform, and load (ETL) operations into the data lake.
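
    To make the idea of a "custom PII category" concrete, here is a minimal sketch of pattern-based detection in plain Python. The category names and patterns (`loyalty_id`, `internal_account`) are hypothetical and do not come from the question. The sketch only illustrates the matching logic that any custom rule has to express, not how DataBrew implements it.

```python
import re

# Hypothetical custom PII categories; names and patterns are illustrative only.
CUSTOM_PII_PATTERNS = {
    "loyalty_id": re.compile(r"\bLOY-\d{8}\b"),
    "internal_account": re.compile(r"\bACCT[0-9A-F]{6}\b"),
}


def find_custom_pii(value):
    """Return the names of the custom categories whose pattern matches value."""
    return [
        name
        for name, pattern in CUSTOM_PII_PATTERNS.items()
        if pattern.search(value)
    ]


def scan_rows(rows):
    """Map each column name to the sorted custom PII categories found in it."""
    hits = {}
    for row in rows:
        for column, value in row.items():
            for category in find_custom_pii(str(value)):
                hits.setdefault(column, set()).add(category)
    return {column: sorted(categories) for column, categories in hits.items()}
```

    Calling `scan_rows` on each dataset's rows yields a per-column summary, which is the kind of result a reusable rule applied across datasets would also produce.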

  • Question 212:

    A finance company receives data from third-party data providers and stores the data as objects in an Amazon S3 bucket.

    The company ran an AWS Glue crawler on the objects to create a data catalog. The AWS Glue crawler created multiple tables. However, the company expected that the crawler would create only one table.

    The company needs a solution that will ensure the AWS Glue crawler creates only one table.

    Which combination of solutions will meet this requirement? (Choose two.)

    A. Ensure that the object format, compression type, and schema are the same for each object.
    B. Ensure that the object format and schema are the same for each object. Do not enforce consistency for the compression type of each object.
    C. Ensure that the schema is the same for each object. Do not enforce consistency for the file format and compression type of each object.
    D. Ensure that the structure of the prefix for each S3 object name is consistent.
    E. Ensure that all S3 object names follow a similar pattern.

  • Question 213:

    A company is developing an application that runs on Amazon EC2 instances. Currently, the data that the application generates is temporary. However, the company needs to persist the data, even if the EC2 instances are terminated.

    A data engineer must launch new EC2 instances from an Amazon Machine Image (AMI) and configure the instances to preserve the data.

    Which solution will meet this requirement?

    A. Launch new EC2 instances by using an AMI that is backed by an EC2 instance store volume that contains the application data. Apply the default settings to the EC2 instances.
    B. Launch new EC2 instances by using an AMI that is backed by a root Amazon Elastic Block Store (Amazon EBS) volume that contains the application data. Apply the default settings to the EC2 instances.
    C. Launch new EC2 instances by using an AMI that is backed by an EC2 instance store volume. Attach an Amazon Elastic Block Store (Amazon EBS) volume to contain the application data. Apply the default settings to the EC2 instances.
    D. Launch new EC2 instances by using an AMI that is backed by an Amazon Elastic Block Store (Amazon EBS) volume. Attach an additional EC2 instance store volume to contain the application data. Apply the default settings to the EC2 instances.

  • Question 214:

    A data engineer creates an AWS Lambda function that an Amazon EventBridge event will invoke. When the data engineer tries to invoke the Lambda function by using an EventBridge event, an AccessDeniedException message appears.

    How should the data engineer resolve the exception?

    A. Ensure that the trust policy of the Lambda function execution role allows EventBridge to assume the execution role.
    B. Ensure that both the IAM role that EventBridge uses and the Lambda function's resource-based policy have the necessary permissions.
    C. Ensure that the subnet where the Lambda function is deployed is configured to be a private subnet.
    D. Ensure that EventBridge schemas are valid and that the event mapping configuration is correct.
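
    For reference, a Lambda resource-based policy statement that lets an EventBridge rule invoke a function generally has the shape sketched below. The ARNs in the usage note are placeholders, and the helper name is hypothetical. With boto3, a comparable statement is typically added through the Lambda client's `add_permission` call rather than written by hand.

```python
import json


def eventbridge_invoke_statement(function_arn, rule_arn):
    """Build a resource-based policy statement for a Lambda function.

    The statement allows the EventBridge service principal to invoke the
    function, but only when the call originates from the given rule.
    """
    return {
        "Sid": "AllowEventBridgeInvoke",
        "Effect": "Allow",
        "Principal": {"Service": "events.amazonaws.com"},
        "Action": "lambda:InvokeFunction",
        "Resource": function_arn,
        "Condition": {"ArnLike": {"AWS:SourceArn": rule_arn}},
    }


# Placeholder ARNs, used for illustration only.
statement = eventbridge_invoke_statement(
    "arn:aws:lambda:us-east-1:111122223333:function:example-function",
    "arn:aws:events:us-east-1:111122223333:rule/example-rule",
)
policy_json = json.dumps({"Version": "2012-10-17", "Statement": [statement]}, indent=2)
```

    Note that this is a permission on the function itself. It is separate from the function's execution role, which governs what the function may do once it is running.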

  • Question 215:

    A company uses Amazon DataZone as a data governance and business catalog solution. The company stores data in an Amazon S3 data lake.

    The company uses AWS Glue with an AWS Glue Data Catalog.

    A data engineer needs to publish AWS Glue Data Quality scores to the Amazon DataZone portal.

    Which solution will meet this requirement?

    A. Create a data quality ruleset with Data Quality Definition Language (DQDL) rules that apply to a specific AWS Glue table. Schedule the ruleset to run daily. Configure the Amazon DataZone project to have an Amazon Redshift data source. Enable the data quality configuration for the data source.
    B. Configure AWS Glue ETL jobs to use an Evaluate Data Quality transform. Define a data quality ruleset inside the jobs. Configure the Amazon DataZone project to have an AWS Glue data source. Enable the data quality configuration for the data source.
    C. Create a data quality ruleset with Data Quality Definition Language (DQDL) rules that apply to a specific AWS Glue table. Schedule the ruleset to run daily. Configure the Amazon DataZone project to have an AWS Glue data source. Enable the data quality configuration for the data source.
    D. Configure AWS Glue ETL jobs to use an Evaluate Data Quality transform. Define a data quality ruleset inside the jobs. Configure the Amazon DataZone project to have an Amazon Redshift data source. Enable the data quality configuration for the data source.

  • Question 216:

    A data engineer needs to design a data pipeline that invokes an AWS Glue job. After the AWS Glue job finishes successfully, the pipeline needs to invoke three AWS Lambda functions. The pipeline must be serverless. The data engineer wants to see the entire pipeline lineage in a single interface.

    Which solution will meet these requirements?

    A. Configure a workflow in AWS Step Functions that invokes the AWS Glue job and the Lambda functions. View the lineage in Step Functions Workflow Studio.
    B. Deploy an Apache Airflow workflow that invokes the AWS Glue job and the Lambda functions. View the lineage in the Airflow UI.
    C. Build the pipeline in the AWS Glue job. Invoke the Lambda functions after the AWS Glue job runs. Use Amazon CloudWatch Logs Insights to view the lineage.
    D. Deploy a workflow in AWS Step Functions to invoke the AWS Glue job. In the job code, invoke the Lambda functions before the job finishes. View the lineage from the AWS Glue UI.
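
    As an illustration of what a Step Functions workflow for this kind of pipeline could look like, the sketch below builds an Amazon States Language definition as a Python dictionary. It assumes the three Lambda functions can run in parallel, which the question does not state, and the job name and function ARNs are placeholders. The `.sync` suffix on the Glue integration makes the state wait for the job run to complete before moving on.

```python
import json


def build_pipeline_definition(glue_job_name, lambda_arns):
    """Return a state machine definition: run a Glue job, then fan out to Lambdas."""
    branches = [
        {
            "StartAt": f"InvokeLambda{index}",
            "States": {
                f"InvokeLambda{index}": {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::lambda:invoke",
                    "Parameters": {"FunctionName": arn},
                    "End": True,
                }
            },
        }
        for index, arn in enumerate(lambda_arns, start=1)
    ]
    return {
        "StartAt": "RunGlueJob",
        "States": {
            "RunGlueJob": {
                "Type": "Task",
                # Optimized Glue integration; .sync waits for the job run to finish.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                "Next": "InvokeLambdas",
            },
            "InvokeLambdas": {"Type": "Parallel", "Branches": branches, "End": True},
        },
    }


# Placeholder names and ARNs, used for illustration only.
definition = build_pipeline_definition(
    "example-glue-job",
    [f"arn:aws:lambda:us-east-1:111122223333:function:fn-{n}" for n in (1, 2, 3)],
)
definition_json = json.dumps(definition, indent=2)
```

    If the functions must run in order instead, the `Parallel` state can be replaced by three `Task` states chained with `Next`.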

  • Question 217:

    A company uses an AWS Lambda function to transfer files from a legacy SFTP environment to Amazon S3 buckets. The Lambda function is VPC enabled to ensure that all communications between the Lambda function and other AWS services that are in the same VPC environment will occur over a secure network.

    The Lambda function is able to connect to the SFTP environment successfully. However, when the Lambda function attempts to upload files to the S3 buckets, the Lambda function returns timeout errors. A data engineer must resolve the timeout issues in a secure way.

    Which solution will meet these requirements in the MOST cost-effective way?

    A. Create a NAT gateway in the public subnet of the VPC. Route network traffic to the NAT gateway.
    B. Create a VPC gateway endpoint for Amazon S3. Route network traffic to the VPC gateway endpoint.
    C. Create a VPC interface endpoint for Amazon S3. Route network traffic to the VPC interface endpoint.
    D. Use a VPC internet gateway to connect to the internet. Route network traffic to the VPC internet gateway.
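
    For context on the endpoint options, the sketch below assembles the parameters that a gateway-type VPC endpoint for Amazon S3 would need. The helper name and the IDs in the usage note are hypothetical. The resulting dictionary could be passed to the boto3 EC2 client as `create_vpc_endpoint(**params)`, and the sketch stops short of making that call. A gateway endpoint is attached to route tables, whereas an interface endpoint is attached to subnets and security groups instead.

```python
def s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build keyword arguments for creating an S3 gateway endpoint in a VPC."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # S3 service name for the given Region.
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Gateway endpoints add routes to these route tables.
        "RouteTableIds": list(route_table_ids),
    }


# Placeholder identifiers, used for illustration only.
params = s3_gateway_endpoint_params(
    "vpc-0example", "us-east-1", ["rtb-0example"]
)
```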

  • Question 218:

    A retail company uses AWS Glue for extract, transform, and load (ETL) operations on a dataset that contains information about customer orders. The company wants to implement specific validation rules to ensure data accuracy and consistency.

    Which solution will meet these requirements?

    A. Use AWS Glue job bookmarks to track the data for accuracy and consistency.
    B. Create custom AWS Glue Data Quality rulesets to define specific data quality checks.
    C. Use the built-in AWS Glue Data Quality transforms for standard data quality validations.
    D. Use AWS Glue Data Catalog to maintain a centralized data schema and metadata repository.
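
    To show what validation rules for an orders dataset might look like in AWS Glue Data Quality, the sketch below assembles a ruleset string in Data Quality Definition Language (DQDL). The column names (`order_id`, `quantity`, `status`) and the allowed status values are assumptions made for illustration. The question does not describe the dataset's schema.

```python
def build_dqdl_ruleset(rules):
    """Join individual DQDL rules into a single 'Rules = [ ... ]' ruleset string."""
    body = ",\n".join(f"    {rule}" for rule in rules)
    return f"Rules = [\n{body}\n]"


# Hypothetical rules for a customer orders table.
ORDER_RULES = [
    'IsComplete "order_id"',
    'IsUnique "order_id"',
    'ColumnValues "quantity" > 0',
    'ColumnValues "status" in ["PENDING", "SHIPPED", "DELIVERED"]',
]

ruleset = build_dqdl_ruleset(ORDER_RULES)
```

    The resulting string is the form a ruleset takes when it is attached to a Data Catalog table or supplied to an Evaluate Data Quality transform in a job.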

  • Question 219:

    A data engineer has implemented data quality rules in 1,000 AWS Glue Data Catalog tables. Because of a recent change in business requirements, the data engineer must edit the data quality rules.

    How should the data engineer meet this requirement with the LEAST operational overhead?

    A. Create a pipeline in AWS Glue ETL to edit the rules for each of the 1,000 Data Catalog tables. Use an AWS Lambda function to call the corresponding AWS Glue job for each Data Catalog table.
    B. Create an AWS Lambda function that makes an API call to AWS Glue Data Quality to make the edits.
    C. Create an Amazon EMR cluster. Run a pipeline on Amazon EMR that edits the rules for each Data Catalog table. Use an AWS Lambda function to run the EMR pipeline.
    D. Use the AWS Management Console to edit the rules within the Data Catalog.
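
    As background for the API-based option, the sketch below loops over ruleset names and applies the same edited DQDL ruleset to each one. It assumes a client object that exposes `update_data_quality_ruleset(Name=..., Ruleset=...)`, as the boto3 AWS Glue client does. The helper name, the ruleset names, and the DQDL text are hypothetical. The client is passed in as a parameter so that the logic can be exercised without AWS credentials.

```python
def update_rulesets(glue_client, ruleset_names, new_ruleset):
    """Apply the same DQDL ruleset text to every named ruleset.

    Returns the list of ruleset names that were updated.
    """
    updated = []
    for name in ruleset_names:
        glue_client.update_data_quality_ruleset(Name=name, Ruleset=new_ruleset)
        updated.append(name)
    return updated


class RecordingClient:
    """Stand-in for a Glue client that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def update_data_quality_ruleset(self, **kwargs):
        self.calls.append(kwargs)
        return {}


# Hypothetical ruleset names and rule text, used for illustration only.
client = RecordingClient()
names = [f"orders-ruleset-{n}" for n in range(3)]
result = update_rulesets(client, names, 'Rules = [ IsComplete "order_id" ]')
```

    In real use, `glue_client` would be `boto3.client("glue")`, and the same loop could run inside a Lambda function handler.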

  • Question 220:

    A hotel management company receives daily data files from each of its hotels. The company wants to upload its data to AWS. The company plans to use Amazon Athena to access the files. The company needs to protect the files from accidental deletion. The company will develop an application on its on-premises servers to automatically forward the files to a fully managed AWS ingestion service.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Use AWS DataSync to replicate data from the on-premises servers to Amazon Elastic File System (Amazon EFS). Configure automatic backups in AWS Backup.
    B. Use the Amazon Kinesis Agent on the on-premises servers to send data to Amazon Data Firehose. Store the data in an Amazon S3 bucket that has versioning enabled.
    C. Use AWS Glue jobs to ingest data from the on-premises servers into Amazon RDS. Enable automated backups for data protection.
    D. Use a self-managed Apache Kafka agent on the on-premises servers to stream data to Amazon Managed Streaming for Apache Kafka (Amazon MSK). Store the data in an Amazon S3 bucket with versioning enabled.

Tips on How to Prepare for the Exams

Certification exams are increasingly important, and more and more employers require them of job applicants. How can you prepare for the exam effectively, in a short time and with less effort? How can you get an ideal result, and where can you find the most reliable resources? Vcedump.com answers these questions. Vcedump.com provides not only Amazon exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your DATA-ENGINEER-ASSOCIATE exam preparation or Amazon certification application, visit Vcedump.com to find solutions.