An organization has 10,000 devices that generate 10 GB of telemetry data per day, with each record size around 10 KB. Each record has 100 fields, and one field consists of unstructured log data with a "String" data type in the English language. Some fields are required for the real-time dashboard, but all fields must be available for long-term generation.
The organization also has 10 PB of previously cleaned and structured data, partitioned by Date, in a SAN that must be migrated to AWS within one month. Currently, the organization does not have any real-time capabilities in their solution. Because of storage limitations in the on-premises data warehouse, selective data is loaded while generating the long-term trend with ANSI SQL queries through JDBC for visualization. In addition to the one-time data loading, the organization needs a cost-effective and real-time solution.
How can these requirements be met? (Choose two.)
A. use AWS IoT to send data from devices to an Amazon SQS queue, create a set of workers in an Auto Scaling group and read records in batch from the queue to process and save the data. Fan out to an Amazon SNS queue attached with an AWS Lambda function to filter the request dataset and save it to Amazon Elasticsearch Service for real-time analytics.
B. Create a Direct Connect connection between AWS and the on-premises data center and copy the data to Amazon S3 using S3 Acceleration. Use Amazon Athena to query the data.
C. Use AWS IoT to send the data from devices to Amazon Kinesis Data Streams with the IoT rules engine. Use one Kinesis Data Firehose stream attached to a Kinesis stream to batch and stream the data partitioned by date. Use another Kinesis Firehose stream attached to the same Kinesis stream to filter out the required fields to ingest into Elasticsearch for real-time analytics.
D. Use AWS IoT to send the data from devices to Amazon Kinesis Data Streams with the IoT rules engine. Use one Kinesis Data Firehose stream attached to a Kinesis stream to stream the data into an Amazon S3 bucket partitioned by date. Attach an AWS Lambda function with the same Kinesis stream to filter out the required fields for ingestion into Amazon DynamoDB for real-time analytics.
E. use multiple AWS Snowball Edge devices to transfer data to Amazon S3, and use Amazon Athena to query the data.
A gaming organization is developing a new game and would like to offer real-time competition to their users. The data architecture has the following characteristics:
1.
The game application is writing events directly to Amazon DynamoDB from the user's mobile device.
2.
Users from the website can access their statistics directly from DynamoDB.
3.
The game servers are accessing DynamoDB to update the user's information.
4.
The data science team extracts data from DynamoDB for various applications.
5.
The engineering team has already agreed to the IAM roles and policies to use for the data science team and the application.
Which actions will provide the MOST security, while maintaining the necessary access to the website and game application? (Choose two.)
A. Use Amazon Cognito user pool to authenticate to both the website and the game application.
B. Use IAM identity federation to authenticate to both the website and the game application.
C. Create an IAM policy with PUT permission for both the website and the game application.
D. Create an IAM policy with fine-grained permission for both the website and the game application.
E. Create an IAM policy with PUT permission for the game application and an IAM policy with GET
permission for the website.
A gas company needs to monitor gas pressure in their pipelines. Pressure data is streamed from sensors placed throughout the pipelines to monitor the data in real time. When an anomaly is detected, the system must send a notification to open valve. An Amazon Kinesis stream collects the data from the sensors and an anomaly Kinesis stream triggers an AWS Lambda function to open the appropriate valve.
Which solution is the MOST cost-effective for responding to anomalies in real time?
A. Attach a Kinesis Firehose to the stream and persist the sensor data in an Amazon S3 bucket. Schedule an AWS Lambda function to run a query in Amazon Athena against the data in Amazon S3 to identify anomalies. When a change is detected, the Lambda function sends a message to the anomaly stream to open the valve.
B. Launch an Amazon EMR cluster that uses Spark Streaming to connect to the Kinesis stream and Spark machine learning to detect anomalies. When a change is detected, the Spark application sends a message to the anomaly stream to open the valve.
C. Launch a fleet of Amazon EC2 instances with a Kinesis Client Library application that consumes the stream and aggregates sensor data over time to identify anomalies. When an anomaly is detected, the application sends a message to the anomaly stream to open the valve.
D. Create a Kinesis Analytics application by using the RANDOM_CUT_FOREST function to detect an anomaly. When the anomaly score that is returned from the function is outside of an acceptable range, a message is sent to the anomaly stream to open the valve.
A real-time bidding company is rebuilding their monolithic application and is focusing on serving real-time
data. A large number of reads and writes are generated from thousands of concurrent users who follow
items and bid on the company's sale offers.
The company is experiencing high latency during special event spikes, with millions of concurrent users.
The company needs to analyze and aggregate a part of the data in near real time to feed an internal
dashboard.
What is the BEST approach for serving and analyzing data, considering the constraint of the row latency
on the highly demanded data?
A. Use Amazon Aurora with Multi Availability Zone and read replicas. Use Amazon ElastiCache in front of the read replicas to serve read-only content quickly. Use the same database as datasource for the
dashboard.
B. Use Amazon DynamoDB to store real-time data with Amazon DynamoDB. Accelerator to serve content quickly. use Amazon DynamoDB Streams to replay all changes to the table, process and stream to Amazon Elasti search Service with AWS Lambda.
C. Use Amazon RDS with Multi Availability Zone. Provisioned IOPS EBS volume for storage. Enable up to five read replicas to serve read-only content quickly. Use Amazon EMR with Sqoop to import Amazon RDS data into HDFS for analysis.
D. Use Amazon Redshift with a DC2 node type and a multi-mode cluster. Create an Amazon EC2 instance with pgpoo1 installed. Create an Amazon ElastiCache cluster and route read requests through pgpoo1, and use Amazon Redshift for analysis.
An organization's data warehouse contains sales data for reporting purposes. data governance policies prohibit staff from accessing the customers' credit card numbers.
How can these policies be adhered to and still allow a Data Scientist to group transactions that use the same credit card number?
A. Store a cryptographic hash of the credit card number.
B. Encrypt the credit card number with a symmetric encryption key, and give the key only to the authorized Data Scientist.
C. Mask the credit card numbers to only show the last four digits of the credit card number.
D. Encrypt the credit card number with an asymmetric encryption key and give the decryption key only to the authorized Data Scientist.
Multiple rows in an Amazon Redshift table were accidentally deleted. A System Administrator is restoring the table from the most recent snapshot. The snapshot contains all rows that were in the table before the deletion.
What is the SIMPLEST solution to restore the table without impacting users?
A. Restore the snapshot to a new Amazon Redshift cluster, then UNLOAD the table to Amazon S3. In the original cluster, TRUNCATE the table, then load the data from Amazon S3 by using a COPY command.
B. Use the Restore Table from a Snapshot command and specify a new table name DROP the original table, then RENAME the new table to the original table name.
C. Restore the snapshot to a new Amazon Redshift cluster. Create a DBLINK between the two clusters in the original cluster, TRUNCATE the destination table, then use an INSERT command to copy the data from the new cluster.
D. Use the ALTER TABLE REVERT command and specify a time stamp of immediately before the data deletion. Specify the Amazon Resource Name of the snapshot as the SOURCE and use the OVERWRITE REPLACE option.
An organization needs to store sensitive information on Amazon S3 and process it through Amazon EMR. Data must be encrypted on Amazon S3 and Amazon EMR at rest and in transit. Using Thrift Server, the Data Analysis team users HIVE to interact with this data. The organization would like to grant access to only specific databases and tables, giving permission only to the SELECT statement.
Which solution will protect the data and limit user access to the SELECT statement on a specific portion of data?
A. Configure Transparent Data Encryption on Amazon EMR. Create an Amazon EC2 instance and install Apache Ranger. Configure the authorization on the cluster to use Apache Ranger.
B. Configure data encryption at rest for EMR File System (EMRFS) on Amazon S3. Configure data encryption in transit for traffic between Amazon S3 and EMRFS. Configure storage and SQL base authorization on HiveServer2.
C. Use AWS KMS for encryption of data. Configure and attach multiple roles with different permissions based on the different user needs.
D. Configure Security Group on Amazon EMR. Create an Amazon VPC endpoint for Amazon S3.
Configure HiveServer2 to use Kerberos authentication on the cluster.
An organization is designing an Amazon DynamoDB table for an application that must meet the following requirements:
1.
Item size is 40 KB
2.
Read/write ratio 2000/500 sustained, respectively
3.
Heavily read-oriented and requires low latencies in the order of milliseconds
4.
The application runs on an Amazon EC2 instance
5.
Access to the DynamoDB table must be secure within the VPC
6.
Minimal changes to application code to improve performance using write-through cache
Which design options will BEST meet these requirements?
A. Size the DynamoDB table with 10000 RCUs/20000 WCUs, implement the DynamoDB Accelerator (DAX) for read performance, use VPC endpoints for DynamoDB, and implement an IAM role on the EC2 instance to secure DynamoDB access.
B. Size the DynamoDB table with 20000 RCUs/20000 WCUs, implement the DynamoDB Accelerator (DAX) for read performance, leverage VPC endpoints for DynamoDB, and implement an IAM user on the EC2 instance to secure DynamoDB access.
C. Size the DynamoDB table with 10000 RCUs/20000 WCUs, implement Amazon ElastiCache for read performance, set up a NAT gateway on VPC for the EC2 instance to access DynamoDB, and implement an IAM role on the EC2 instance to secure DynamoDB access.
D. Size the DynamoDB table with 20000 RCUs/20000 WCUs, implement Amazon ElastiCache for read performance, leverage VPC endpoints for DynamoDB, and implement an IAM user on the EC2 instance to secure DynamoDB access.
An advertising organization uses an application to process a stream of events that are received from
clients in multiple unstructured formats.
The application does the following:
Transforms the events into a single structured format and streams them to Amazon Kinesis for real-time
analysis.
Stores the unstructured raw events from the log files on local hard drivers that are rotated and uploaded to
Amazon S3.
The organization wants to extract campaign performance reporting using an existing Amazon redshift
cluster.
Which solution will provide the performance data with the LEAST number of operations?
A. Install the Amazon Kinesis Data Firehose agent on the application servers and use it to stream the log files directly to Amazon Redshift.
B. Create an external table in Amazon Redshift and point it to the S3 bucket where the unstructured raw
events are stored.
C. Write an AWS Lambda function that triggers every hour to load the new log files already in S3 to Amazon redshift.
D. Connect Amazon Kinesis Data Firehose to the existing Amazon Kinesis stream and use it to stream the event directly to Amazon Redshift.
An organization is developing a mobile social application and needs to collect logs from all devices on which it is installed. The organization is evaluating the Amazon Kinesis Data Streams to push logs and Amazon EMR to process data. They want to store data on HDFS using the default replication factor to replicate data among the cluster, but they are concerned about the durability of the data. Currently, they are producing 300 GB of raw data daily, with additional spikes during special events. They will need to scale out the Amazon EMR cluster to match the increase in streamed data.
Which solution prevents data loss and matches compute demand?
A. Use multiple Amazon EBS volumes on Amazon EMR to store processed data and scale out the Amazon EMR cluster as needed.
B. Use the EMR File System and Amazon S3 to store processed data and scale out the Amazon EMR cluster as needed.
C. Use Amazon DynamoDB to store processed data and scale out the Amazon EMR cluster as needed.
D. use Amazon Kinesis Data Firehose and, instead of using Amazon EMR, stream logs directly into Amazon Elasticsearch Service.
Nowadays, the certification exams become more and more important and required by more and more enterprises when applying for a job. But how to prepare for the exam effectively? How to prepare for the exam in a short time with less efforts? How to get a ideal result and how to find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provide not only Amazon exam questions, answers and explanations but also complete assistance on your exam preparation and certification application. If you are confused on your BDS-C00 exam preparations and Amazon certification application, do not hesitate to visit our Vcedump.com to find your solutions here.