A retail company's ecommerce website recently experienced performance issues during a one-day sale. The site reliability engineer wants to query all of the web logs from the time of the sale to troubleshoot the performance issues. The web logs are stored in an Amazon S3 bucket.
Which solution MOST cost-effectively meets these requirements?
A. Create an Amazon Redshift cluster and use Amazon Redshift Spectrum to query the web logs.
B. Use Amazon S3 Select to query the web logs.
C. Load the web logs from Amazon S3 to an Amazon DynamoDB table and query the table.
D. Use Amazon Athena to query the web logs.
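For reference, a minimal sketch of the Athena approach described in option D, assuming a Glue Data Catalog table already exists over the S3 log prefix. The database, table, and column names, the results bucket, and the sale date below are hypothetical.

import boto3

# Assumes a catalog table "web_logs" already covers the S3 log prefix;
# database, bucket, columns, and the sale date are placeholders.
athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="""
        SELECT request_time, uri, status, response_time_ms
        FROM web_logs
        WHERE request_time BETWEEN TIMESTAMP '2023-11-24 00:00:00'
                               AND TIMESTAMP '2023-11-24 23:59:59'
        ORDER BY response_time_ms DESC
        LIMIT 100
    """,
    QueryExecutionContext={"Database": "weblogs_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print("Query execution ID:", response["QueryExecutionId"])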
A retail company stores order invoices in an Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster. Indices on the cluster are created monthly. Once a new month begins, no new writes are made to any of the indices from the previous months. The company has been expanding the storage on the Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster to avoid running out of space, but the company wants to reduce costs. Most searches on the cluster are on the most recent 3 months of data, while the audit team requires infrequent access to older data to generate periodic reports. The most recent 3 months of data must be quickly available for queries, but the audit team can tolerate slower queries if the solution saves on cluster costs.
Which of the following is the MOST operationally efficient solution to meet these requirements?
A. Archive indices that are older than 3 months by using Index State Management (ISM) to create a policy to store the indices in Amazon S3 Glacier Instant Retrieval. When the audit team requires the archived data, restore the archived indices back to the Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster.
B. Archive indices that are older than 3 months by taking manual snapshots and storing the snapshots in Amazon S3. When the audit team requires the archived data, restore the archived indices back to the Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster.
C. Archive indices that are older than 3 months by using Index State Management (ISM) to create a policy to migrate the indices to Amazon OpenSearch Service (Amazon Elasticsearch Service) UltraWarm storage.
D. Archive indices that are older than 3 months by using Index State Management (ISM) to create a policy to migrate the indices to Amazon OpenSearch Service (Amazon Elasticsearch Service) UltraWarm storage. When the audit team requires the older data, migrate the indices in UltraWarm storage back to hot storage.
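For reference, a rough sketch of an Index State Management policy along the lines of options C and D, submitted over the domain's REST API. The domain endpoint, credentials, index pattern, and ages are placeholders; UltraWarm is assumed to already be enabled on the domain, and older Elasticsearch-based domains use the _opendistro/_ism path instead of _plugins/_ism.

import json

import requests
from requests.auth import HTTPBasicAuth

# Placeholder domain endpoint and master-user credentials.
DOMAIN = "https://search-invoices-example.us-east-1.es.amazonaws.com"

policy = {
    "policy": {
        "description": "Move indices older than 90 days to UltraWarm",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [],
                "transitions": [
                    {"state_name": "warm", "conditions": {"min_index_age": "90d"}}
                ],
            },
            {
                "name": "warm",
                "actions": [{"warm_migration": {}}],
                "transitions": [],
            },
        ],
        # Automatically attach the policy to newly created monthly indices;
        # existing indices would need the policy applied to them separately.
        "ism_template": {"index_patterns": ["invoices-*"], "priority": 100},
    }
}

resp = requests.put(
    f"{DOMAIN}/_plugins/_ism/policies/archive-invoices",
    auth=HTTPBasicAuth("admin", "example-password"),
    headers={"Content-Type": "application/json"},
    data=json.dumps(policy),
)
resp.raise_for_status()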
An online advertising company wants to perform sentiment analysis of social media data to measure the success of online advertisements. The company wants to implement an end-to-end streaming solution to continuously ingest data from various social networks, clean and transform the streaming data in near-real time, and make the data available for analytics and visualization with Amazon QuickSight. The company wants a solution that is easy to implement and manage so that it can focus on designing better analytics solutions instead of provisioning and maintaining infrastructure.
Which solution meets these requirements with the LEAST amount of operational effort?
A. Use Amazon Kinesis Data Firehose to ingest the data. Author an AWS Glue streaming ETL job to transform the ingested data. Load the transformed data into an Amazon Redshift table.
B. Use Apache Kafka running on Amazon EC2 instances to ingest the data. Create an Amazon EMR Spark job to transform the ingested data. Use the COPY command to load the transformed data into an Amazon Redshift table.
C. Use Amazon Managed Streaming for Apache Kafka (Amazon MSK) to ingest the data. Create an Amazon EMR Spark job to transform the ingested data. Use the COPY command to load the transformed data into an Amazon Redshift table.
D. Use Amazon Kinesis Data Streams to ingest the data. Author an AWS Glue streaming ETL job to transform the ingested data. Load the transformed data into an Amazon Redshift table.
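For reference, a condensed sketch of the Glue streaming ETL step that options A and D describe, assuming a Data Catalog table is defined over the Kinesis source and a Glue connection to the Redshift cluster already exists. All database, table, column, connection, and S3 names are placeholders.

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the stream through its Data Catalog definition (a Kinesis source).
stream_df = glue_context.create_data_frame.from_catalog(
    database="social_db",
    table_name="social_stream",
    additional_options={"startingPosition": "TRIM_HORIZON", "inferSchema": "true"},
)

def process_batch(batch_df, batch_id):
    # Example near-real-time cleanup: drop empty posts, keep needed columns.
    cleaned = batch_df.filter("post_text IS NOT NULL").select(
        "post_id", "network", "post_text", "created_at"
    )
    dyf = DynamicFrame.fromDF(cleaned, glue_context, "cleaned")
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=dyf,
        catalog_connection="redshift-conn",
        connection_options={"dbtable": "social_posts", "database": "analytics"},
        redshift_tmp_dir="s3://example-temp-bucket/redshift/",
    )

# Process the stream in 60-second micro-batches.
glue_context.forEachBatch(
    frame=stream_df,
    batch_function=process_batch,
    options={
        "windowSize": "60 seconds",
        "checkpointLocation": "s3://example-temp-bucket/checkpoints/",
    },
)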
A company is using a single master node in an Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster to provide a search API to its front-end applications. The company configured an automated process on AWS that monitors the OpenSearch Service cluster and automatically adds data nodes to scale the cluster when needed. During initial load testing, the system reacted to scaling events properly by adding data nodes, but every time a new data node was added, the company experienced a blue/green deployment that disrupted service. The company wants to create a highly available solution that will prevent these service disruptions.
Which solution meets these requirements?
A. Increase the number of OpenSearch Service master nodes from one to two.
B. Configure multi-zone awareness on the OpenSearch Service cluster.
C. Configure the OpenSearch Service cluster to use three dedicated master nodes.
D. Disable OpenSearch Service Auto-Tune and roll back its changes.
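For reference, a minimal boto3 sketch of the configuration change that option C describes; the domain name and instance type are placeholders.

import boto3

opensearch = boto3.client("opensearch")

# Switch the domain to three dedicated master nodes (placeholder values).
opensearch.update_domain_config(
    DomainName="search-api-domain",
    ClusterConfig={
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": "m6g.large.search",
        "DedicatedMasterCount": 3,
    },
)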
A public sector organization ingests large datasets from various relational databases into an Amazon S3 data lake on a daily basis. Data analysts need a mechanism to profile the data and diagnose data quality issues after the data is ingested into Amazon S3. The solution should allow the data analysts to visualize and explore the data quality metrics through a user interface.
Which set of steps provides a solution that meets these requirements?
A. Create a new AWS Glue DataBrew dataset for each dataset in the S3 data lake. Create a new DataBrew project for each dataset. Create a profile job for each project and schedule it to run daily. Instruct the data analysts to explore the data quality metrics by using the DataBrew console.
B. Create a new AWS Glue ETL job that uses the Deequ Spark library for data validation and schedule the ETL job to run daily. Store the output of the ETL job within an S3 bucket. Instruct the data analysts to query and visualize the data quality metrics by using the Amazon Athena console.
C. Schedule an AWS Lambda function to run daily by using Amazon EventBridge (Amazon CloudWatch Events). Configure the Lambda function to test the data quality of each object and store the results in an S3 bucket. Create an Amazon QuickSight dashboard to query and visualize the results. Instruct the data analysts to explore the data quality metrics using QuickSight.
D. Schedule an AWS Step Functions workflow to run daily by using Amazon EventBridge (Amazon CloudWatch Events). Configure the steps by using AWS Lambda functions to perform the data quality checks and update the catalog tags in the AWS Glue Data Catalog with the results. Instruct the data analysts to explore the data quality metrics using the Data Catalog console.
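For reference, a minimal boto3 sketch of option A for a single dataset; the dataset, bucket, role, and schedule values are placeholders and would be repeated for each dataset in the data lake.

import boto3

databrew = boto3.client("databrew")

# Register one S3 prefix of the data lake as a DataBrew dataset.
databrew.create_dataset(
    Name="orders-dataset",
    Input={"S3InputDefinition": {"Bucket": "example-data-lake", "Key": "orders/"}},
)

# Profile job that writes data quality metrics viewable in the DataBrew console.
databrew.create_profile_job(
    Name="orders-profile-job",
    DatasetName="orders-dataset",
    RoleArn="arn:aws:iam::123456789012:role/ExampleDataBrewRole",
    OutputLocation={"Bucket": "example-databrew-results", "Key": "profiles/orders/"},
)

# Run the profile job once a day.
databrew.create_schedule(
    Name="orders-profile-daily",
    JobNames=["orders-profile-job"],
    CronExpression="cron(0 6 * * ? *)",
)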
A company owns manufacturing facilities with Internet of Things (IoT) devices installed to monitor safety data. The company has configured an Amazon Kinesis data stream as a source for an Amazon Kinesis Data Firehose delivery stream, which outputs data to Amazon S3. The company's operations team wants to gain insights from the IoT data to monitor data quality at ingestion. The insights need to be derived in near-real time, and the output must be logged to Amazon DynamoDB for further analysis.
Which solution meets these requirements?
A. Create an Amazon Kinesis Data Analytics for SQL application to read and analyze the data in the data stream. Add an output configuration so that everything written to an in-application stream persists in a DynamoDB table.
B. Create an Amazon Kinesis Data Analytics for SQL application to read and analyze the data in the data stream. Add an output configuration so that everything written to an in-application stream is passed to an AWS Lambda function that saves the data in a DynamoDB table as persistent data.
C. Configure an AWS Lambda function to analyze the data in the Kinesis Data Firehose delivery stream. Save the output to a DynamoDB table.
D. Configure an AWS Lambda function to analyze the data in the Kinesis Data Firehose delivery stream and save the output to an S3 bucket. Schedule an AWS Glue job to periodically copy the data from the bucket to a DynamoDB table.
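For reference, a rough sketch of the Lambda destination that option B describes, assuming the standard Kinesis Data Analytics output format in which base64-encoded records must be acknowledged per recordId; the table name and payload fields are placeholders.

import base64
import json
from decimal import Decimal

import boto3

# Placeholder table that stores the per-window data quality metrics.
table = boto3.resource("dynamodb").Table("iot-data-quality")

def lambda_handler(event, context):
    results = []
    for record in event["records"]:
        # Each record from the in-application stream arrives base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]), parse_float=Decimal)
        table.put_item(Item=payload)
        # Acknowledge the record so Kinesis Data Analytics does not retry it.
        results.append({"recordId": record["recordId"], "result": "Ok"})
    return {"records": results}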
An online food delivery company wants to optimize its storage costs. The company has been collecting operational data for the last 10 years in a data lake that was built on Amazon S3 by using a Standard storage class. The company does not keep data that is older than 7 years. The data analytics team frequently uses data from the past 6 months for reporting and runs queries on data from the last 2 years about once a month. Data that is more than 2 years old is rarely accessed and is only used for audit purposes.
Which combination of solutions will optimize the company's storage costs? (Choose two.)
A. Create an S3 Lifecycle configuration rule to transition data that is older than 6 months to the S3 Standard-Infrequent Access (S3 Standard-IA) storage class. Create another S3 Lifecycle configuration rule to transition data that is older than 2 years to the S3 Glacier Deep Archive storage class.
B. Create an S3 Lifecycle configuration rule to transition data that is older than 6 months to the S3 One Zone-Infrequent Access (S3 One Zone-IA) storage class. Create another S3 Lifecycle configuration rule to transition data that is older than 2 years to the S3 Glacier Flexible Retrieval storage class.
C. Use the S3 Intelligent-Tiering storage class to store data instead of the S3 Standard storage class.
D. Create an S3 Lifecycle expiration rule to delete data that is older than 7 years.
E. Create an S3 Lifecycle configuration rule to transition data that is older than 7 years to the S3 Glacier Deep Archive storage class.
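For reference, a combined boto3 sketch of the lifecycle transitions and expiration described in options A and D; the bucket name and prefix are placeholders, and the 2-year and 7-year cutoffs are approximated in days.

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-data-lake",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-operational-data",
                "Status": "Enabled",
                "Filter": {"Prefix": "operational/"},
                "Transitions": [
                    # ~6 months -> Standard-IA, ~2 years -> Glacier Deep Archive
                    {"Days": 180, "StorageClass": "STANDARD_IA"},
                    {"Days": 730, "StorageClass": "DEEP_ARCHIVE"},
                ],
            },
            {
                "ID": "expire-after-7-years",
                "Status": "Enabled",
                "Filter": {"Prefix": "operational/"},
                # ~7 years
                "Expiration": {"Days": 2555},
            },
        ]
    },
)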
An ecommerce company uses Amazon Aurora PostgreSQL to process and store live transactional data and uses Amazon Redshift for its data warehouse solution. A nightly ETL job has been implemented to update the Redshift cluster with new data from the PostgreSQL database. The business has grown rapidly and so has the size and cost of the Redshift cluster. The company's data analytics team needs to create a solution to archive historical data and only keep the most recent 12 months of data in Amazon Redshift to reduce costs. Data analysts should also be able to run analytics queries that effectively combine data from live transactional data in PostgreSQL, current data in Redshift, and archived historical data.
Which combination of tasks will meet these requirements? (Choose three.)
A. Configure the Amazon Redshift Federated Query feature to query live transactional data in the PostgreSQL database.
B. Configure Amazon Redshift Spectrum to query live transactional data in the PostgreSQL database.
C. Schedule a monthly job to copy data older than 12 months to Amazon S3 by using the UNLOAD command, and then delete that data from the Redshift cluster. Configure Amazon Redshift Spectrum to access historical data in Amazon S3.
D. Schedule a monthly job to copy data older than 12 months to Amazon S3 Glacier Flexible Retrieval by using the UNLOAD command, and then delete that data from the Redshift cluster. Configure Redshift Spectrum to access historical data with S3 Glacier Flexible Retrieval.
E. Create a late-binding view in Amazon Redshift that combines live, current, and historical data from different sources.
F. Create a materialized view in Amazon Redshift that combines live, current, and historical data from different sources.
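For reference, a rough sketch of the tasks described in options A, C, and E, submitted through the Redshift Data API; the cluster, database, IAM role, schema, table, and Secrets Manager values are placeholders.

import boto3

redshift_data = boto3.client("redshift-data")

SQL_STATEMENTS = [
    # Option C: archive rows older than 12 months to S3, then prune them.
    """
    UNLOAD ('SELECT * FROM sales.orders WHERE order_date < DATEADD(month, -12, CURRENT_DATE)')
    TO 's3://example-archive-bucket/orders/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
    FORMAT AS PARQUET
    """,
    "DELETE FROM sales.orders WHERE order_date < DATEADD(month, -12, CURRENT_DATE)",
    # Option C: expose the archived S3 data through Redshift Spectrum.
    """
    CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum_archive
    FROM DATA CATALOG DATABASE 'archive_db'
    IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
    """,
    # Option A: federate queries to the live Aurora PostgreSQL database.
    """
    CREATE EXTERNAL SCHEMA IF NOT EXISTS live_pg
    FROM POSTGRES DATABASE 'orders' SCHEMA 'public'
    URI 'example-aurora.cluster-abc123.us-east-1.rds.amazonaws.com'
    IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
    SECRET_ARN 'arn:aws:secretsmanager:us-east-1:123456789012:secret:example'
    """,
    # Option E: a late-binding view spanning live, current, and archived data.
    """
    CREATE VIEW analytics.all_orders AS
        SELECT * FROM live_pg.orders
        UNION ALL SELECT * FROM sales.orders
        UNION ALL SELECT * FROM spectrum_archive.orders
    WITH NO SCHEMA BINDING
    """,
]

# In practice each statement's completion would be checked with
# describe_statement before the next one runs.
for sql in SQL_STATEMENTS:
    redshift_data.execute_statement(
        ClusterIdentifier="example-cluster",
        Database="analytics",
        DbUser="admin",
        Sql=sql,
    )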
A financial company uses Amazon Athena to query data from an Amazon S3 data lake. Files are stored in the S3 data lake in Apache ORC format. Data analysts recently introduced nested fields in the data lake ORC files and noticed that queries are taking longer to run in Athena. A data analyst discovered that more data than required is being scanned by the queries.
What is the MOST operationally efficient solution to improve query performance?
A. Flatten nested data and create separate files for each nested dataset.
B. Use the Athena query engine V2 and push the query filter to the source ORC file.
C. Use Apache Parquet format instead of ORC format.
D. Recreate the data partition strategy and further narrow down the data filter criteria.
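For reference, a minimal sketch of option D: a date-partitioned ORC table plus a query whose partition filter limits how much data Athena scans. All names, paths, and dates are placeholders, and in a real script each statement would be polled to completion (get_query_execution) before the next one runs.

import boto3

athena = boto3.client("athena")

ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS trades_partitioned (
    trade_id string,
    details struct<symbol:string, quantity:int, price:double>
)
PARTITIONED BY (trade_date string)
STORED AS ORC
LOCATION 's3://example-data-lake/trades/'
"""

query = """
SELECT trade_id, details.symbol, details.price
FROM trades_partitioned
WHERE trade_date BETWEEN '2023-11-01' AND '2023-11-07'
"""

# Load partition metadata after (re)creating the table, then run the
# partition-pruned query.
for sql in (ddl, "MSCK REPAIR TABLE trades_partitioned", query):
    athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "finance_db"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )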
A company uses Amazon OpenSearch Service (Amazon Elasticsearch Service) to store and analyze its website clickstream data. The company ingests 1 TB of data daily using Amazon Kinesis Data Firehose and stores one day's worth of data in an Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster.
The company has very slow query performance on the Amazon OpenSearch Service (Amazon Elasticsearch Service) index and occasionally sees errors from Kinesis Data Firehose when it attempts to write to the index. The Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster has 10 data nodes running a single index and 3 dedicated master nodes. Each data node has 1.5 TB of Amazon EBS storage attached, and the cluster is configured with 1,000 shards. Occasionally, JVMMemoryPressure errors are found in the cluster logs.
Which solution will improve the performance of Amazon OpenSearch Service (Amazon Elasticsearch Service)?
A. Increase the memory of the Amazon OpenSearch Service (Amazon Elasticsearch Service) master nodes.
B. Decrease the number of Amazon OpenSearch Service (Amazon Elasticsearch Service) data nodes.
C. Decrease the number of Amazon OpenSearch Service (Amazon Elasticsearch Service) shards for the index.
D. Increase the number of Amazon OpenSearch Service (Amazon Elasticsearch Service) shards for the index.
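For reference, a quick sizing calculation relevant to option C, using the commonly cited guidance of roughly 10-50 GB per shard for search workloads: 1 TB spread across 1,000 shards leaves about 1 GB per shard, and the fixed per-shard JVM overhead of so many small shards helps explain the JVMMemoryPressure errors.

# Rough shard sizing for ~1 TB of retained data (figures are approximate).
daily_data_gb = 1000          # ~1 TB ingested and retained per day
current_shards = 1000
target_shard_size_gb = 30     # middle of the 10-50 GB guidance

current_shard_size_gb = daily_data_gb / current_shards
recommended_shards = round(daily_data_gb / target_shard_size_gb)

print(f"Current average shard size: ~{current_shard_size_gb:.0f} GB")  # ~1 GB
print(f"Suggested shard count: ~{recommended_shards}")                 # ~33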