A company needs to store semi-structured transactional data for an application in a database. The database must be serverless. The application writes the data infrequently, but it reads the data frequently.
The application must retrieve the data within milliseconds.
Which solution will meet these requirements with the LEAST operational overhead?
A. Store the data in an Amazon S3 Standard bucket. Enable S3 Transfer Acceleration.A financial company recently added more features to its mobile app. The new features required the company to create a new topic in an existing Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster.
A few days after the company added the new topic, Amazon CloudWatch raised an alarm on the RootDiskUsed metric for the MSK cluster.
How should the company address the CloudWatch alarm?
A. Expand the storage of the MSK broker. Configure the MSK cluster storage to expand automatically.A company has a gaming application that stores data in Amazon DynamoDB tables. A data engineer needs to ingest the game data into an Amazon OpenSearch Service cluster. Data updates must occur in near real time.
Which solution will meet these requirements?
A. Use AWS Step Functions to periodically export data from the Amazon DynamoDB tables to an Amazon S3 bucket. Use an AWS Lambda function to load the data into Amazon OpenSearch Service.A company stores customer data that contains personally identifiable information (PII) in an Amazon Redshift cluster. The company's marketing, claims, and analytics teams need to be able to access the customer data.
The marketing team should have access to obfuscated claim information but should have full access to customer contact information.
The claims team should have access to customer information for each claim that the team processes.
The analytics team should have access only to obfuscated PII data.
Which solution will enforce these data access requirements with the LEAST administrative overhead?
A. Create a separate Redshift cluster for each team. Load only the required data for each team. Restrict access to clusters based on the teams.A company processes 500 GB of audience and advertising data daily, storing CSV files in Amazon S3 with schemas registered in AWS Glue Data Catalog. They need to convert these files to Apache Parquet format and store them in an S3 bucket.
The solution requires a long-running workflow with 15 GiB memory capacity to process the data concurrently, followed by a correlation process that begins only after the first two processes complete.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to orchestrate the workflow by using AWS Glue. Configure AWS Glue to begin the third process after the first two processes have finished.An analytics workload in Athena scans several terabytes of CSV files each day. Most queries read only a few columns and filter by event_date. The team wants to reduce scanned data and improve query performance.
Which data layout should the data engineer choose?
A. Convert the data to Apache Parquet and partition the S3 layout by event_date.A company has an Amazon Redshift data warehouse that users access by using a variety of IAM roles.
More than 100 users access the data warehouse every day.
The company wants to control user access to the objects based on each user's job role, permissions, and how sensitive the data is.
Which solution will meet these requirements?
A. Use the role-based access control (RBAC) feature of Amazon Redshift.A company aggregates high-frequency sensor telemetry into an Amazon S3 data lake. Each sensor stream emits structured records every hour. The records include metadata such as sensor category, unit ID, operational state, event timestamp, and site location. The data scales up to millions of records each day. The company runs complex queries each day to uncover performance insights specific to sensor categories.
Which solution will meet these requirements with the FASTEST query execution time?
A. Persist the data in Apache ORC format. Partition the data by date. Sort the data by sensor category.A company uses an Amazon Redshift cluster as a data warehouse that is shared acro two departments.
To comply with a security policy, each department must have unique acce permiions.
Department A must have acce to tables and views for Department
A. Group tables and views for each department into dedicated schemas. Manage permiions at the schema level.An insurance company stores transaction data that the company compressed with gzip.
The company needs to query the transaction data for occasional audits.
Which solution will meet this requirement in the MOST cost-effective way?
A. Store the data in Amazon Glacier Flexible Retrieval. Use Amazon S3 Glacier Select to query the data.Nowadays, the certification exams become more and more important and required by more and more enterprises when applying for a job. But how to prepare for the exam effectively? How to prepare for the exam in a short time with less efforts? How to get a ideal result and how to find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provide not only Amazon exam questions, answers and explanations but also complete assistance on your exam preparation and certification application. If you are confused on your DATA-ENGINEER-ASSOCIATE exam preparations and Amazon certification application, do not hesitate to visit our Vcedump.com to find your solutions here.