Free Amazon MLS-C01 Actual Exam Questions

The questions for MLS-C01 were last updated on Apr 30, 2025.

At ValidExamDumps, we consistently monitor updates to the Amazon MLS-C01 exam questions by Amazon. Whenever our team identifies changes in the exam questions, exam objectives, exam focus areas, or exam requirements, we immediately update our exam questions for both the PDF and online practice exams. This commitment ensures our customers always have access to the most current and accurate questions. By preparing with these actual questions, our customers can pass the Amazon AWS Certified Machine Learning - Specialty exam on their first attempt without needing additional materials or study guides.

Other certification material providers often include questions that are outdated or that Amazon has removed from the Amazon MLS-C01 exam. These outdated questions lead to customers failing their Amazon AWS Certified Machine Learning - Specialty exam. In contrast, we ensure our question bank includes only precise and up-to-date questions, so you can expect to see them in your actual exam. Our main priority is your success in the Amazon MLS-C01 exam, not profiting from selling obsolete exam questions in PDF or online practice test form.

 

Question No. 1

A machine learning (ML) specialist uploads a dataset to an Amazon S3 bucket that is protected by server-side encryption with AWS KMS keys (SSE-KMS). The ML specialist needs to ensure that an Amazon SageMaker notebook instance can read the dataset that is in Amazon S3.

Which solution will meet these requirements?

Correct Answer: C

When an Amazon SageMaker notebook instance needs to access encrypted data in Amazon S3, the ML specialist must ensure that both Amazon S3 access permissions and AWS Key Management Service (KMS) decryption permissions are properly configured. The dataset in this scenario is stored with server-side encryption using an AWS KMS key (SSE-KMS), so the following steps are necessary:

S3 Read Permissions: Attach an IAM role to the SageMaker notebook instance with permissions that allow the s3:GetObject action for the specific S3 bucket storing the data. This will allow the notebook instance to read data from Amazon S3.

KMS Key Policy Permissions: Grant permissions in the KMS key policy to the IAM role assigned to the SageMaker notebook instance. This allows SageMaker to use the KMS key to decrypt data in the S3 bucket.

These steps ensure the SageMaker notebook instance can access the encrypted data stored in S3. The AWS documentation emphasizes that to access SSE-KMS encrypted data, the SageMaker notebook requires appropriate permissions both for Amazon S3 access (through the notebook instance's IAM role) and in the KMS key policy, making Option C the correct and secure approach.
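As a hedged illustration of those two permission pieces, the sketch below builds the policy documents in Python. The role ARN and bucket name are hypothetical placeholders, and the exact policy shapes should be confirmed against the AWS documentation before use.

```python
import json

# Hypothetical identifiers -- replace with your own resources.
NOTEBOOK_ROLE_ARN = "arn:aws:iam::111122223333:role/SageMakerNotebookRole"
BUCKET = "example-ml-dataset-bucket"

# 1) IAM policy attached to the notebook instance's role:
#    allows reading objects from the dataset bucket.
s3_read_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
    }],
}

# 2) Statement added to the KMS key policy: allows the same role
#    to decrypt objects that S3 encrypted with this key (SSE-KMS).
kms_key_policy_statement = {
    "Sid": "AllowNotebookRoleToDecrypt",
    "Effect": "Allow",
    "Principal": {"AWS": NOTEBOOK_ROLE_ARN},
    "Action": ["kms:Decrypt", "kms:DescribeKey"],
    "Resource": "*",  # a key policy applies to the key it is attached to
}

print(json.dumps(s3_read_policy, indent=2))
print(json.dumps(kms_key_policy_statement, indent=2))
```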


Question No. 2

A company is setting up a mechanism for data scientists and engineers from different departments to access an Amazon SageMaker Studio domain. Each department has a unique SageMaker Studio domain.

The company wants to build a central proxy application that data scientists and engineers can log in to by using their corporate credentials. The proxy application will authenticate users by using the company's existing identity provider (IdP). The application will then route users to the appropriate SageMaker Studio domain.

The company plans to maintain a table in Amazon DynamoDB that contains SageMaker domains for each department.

How should the company meet these requirements?

Correct Answer: A

The SageMaker CreatePresignedDomainUrl API is the best option to meet the company's requirements. This API creates a URL for a specified UserProfile in a Domain. When the URL is accessed in a web browser, the user is automatically signed in to the domain and granted access to all of the apps and files associated with the domain's Amazon Elastic File System (EFS) volume. This API can be called only when the domain's authentication mode is IAM, which means the company can use its existing IdP to authenticate users. The company can use the DynamoDB table to store the domain ID and user profile name for each department, and the proxy application can query the table and generate a presigned URL for the appropriate domain based on the user's credentials. The presigned URL is valid only for a specified duration, which is set by the SessionExpirationDurationInSeconds parameter; this helps enhance security and prevents unauthorized access to the domains.
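A minimal sketch of the proxy application's routing step might look like the following. It assumes a hypothetical DynamoDB table named department-domains with a department key and domain_id attribute; the boto3 calls get_item and create_presigned_domain_url are the real APIs, while everything else is illustrative.

```python
import boto3

dynamodb = boto3.resource("dynamodb")
sagemaker = boto3.client("sagemaker")

# Hypothetical table mapping each department to its Studio domain.
table = dynamodb.Table("department-domains")

def route_user_to_studio(department: str, user_profile: str) -> str:
    """Look up the department's domain and return a short-lived login URL."""
    item = table.get_item(Key={"department": department})["Item"]
    response = sagemaker.create_presigned_domain_url(
        DomainId=item["domain_id"],
        UserProfileName=user_profile,
        SessionExpirationDurationInSeconds=1800,  # limit how long the session lasts
    )
    # The proxy redirects the user's browser here; they are signed in automatically.
    return response["AuthorizedUrl"]
```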

The other options are not suitable for the company's requirements. The SageMaker CreateHumanTaskUi API is used to define the settings for the human review workflow user interface, which is not related to accessing the SageMaker Studio domains. The SageMaker ListHumanTaskUis API is used to return information about the human task user interfaces in the account, which is also not relevant to the company's use case. The SageMaker CreatePresignedNotebookInstanceUrl API is used to create a URL to connect to the Jupyter server from a notebook instance, which is different from accessing the SageMaker Studio domain.

References:

* CreatePresignedDomainUrl

* CreatePresignedNotebookInstanceUrl

* CreateHumanTaskUi

* ListHumanTaskUis


Question No. 3

A machine learning (ML) specialist is using the Amazon SageMaker DeepAR forecasting algorithm to train a model on CPU-based Amazon EC2 On-Demand instances. The model currently takes multiple hours to train. The ML specialist wants to decrease the training time of the model.

Which approaches will meet this requirement? (Select TWO.)

Correct Answer: C, D

The best approaches to decrease the training time of the model are C and D, because they can improve the computational efficiency and parallelization of the training process. These approaches have the following benefits:

C: Replacing CPU-based EC2 instances with GPU-based EC2 instances can speed up the training of the DeepAR algorithm, as it can leverage the parallel processing power of GPUs to perform matrix operations and gradient computations faster than CPUs [1][2]. The DeepAR algorithm supports GPU-based EC2 instances such as ml.p2 and ml.p3 [3].

D: Using multiple training instances can also reduce the training time of the DeepAR algorithm, as it can distribute the workload across multiple nodes and perform data parallelism [4]. The DeepAR algorithm supports distributed training with multiple CPU-based or GPU-based EC2 instances [3].
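As a hedged sketch of combining both approaches with the SageMaker Python SDK, the estimator below requests a GPU instance type and more than one instance. The role ARN and S3 paths are hypothetical placeholders; image_uris.retrieve and Estimator are the SDK's real entry points for built-in algorithms.

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
region = session.boto_region_name

# Resolve the built-in DeepAR container image for this region.
image_uri = image_uris.retrieve("forecasting-deepar", region)

estimator = Estimator(
    image_uri=image_uri,
    role="arn:aws:iam::111122223333:role/SageMakerTrainingRole",  # hypothetical
    instance_count=2,                # approach D: distribute across instances
    instance_type="ml.p3.2xlarge",   # approach C: GPU-based instances
    output_path="s3://example-bucket/deepar-output/",  # hypothetical
    sagemaker_session=session,
)

estimator.set_hyperparameters(
    time_freq="H", context_length=72, prediction_length=24, epochs=100
)
estimator.fit({"train": "s3://example-bucket/deepar-train/"})  # hypothetical path
```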

The other options are not effective or relevant, because they have the following drawbacks:

A: Replacing On-Demand Instances with Spot Instances can reduce the cost of the training, but not necessarily the time, as Spot Instances are subject to interruption and availability [5]. Moreover, the DeepAR algorithm does not support checkpointing, which means that the training cannot resume from the last saved state if the Spot Instance is terminated [3].

B: Configuring model auto scaling dynamically to adjust the number of instances automatically is not applicable, as this feature is only available for inference endpoints, not for training jobs [6].

E: Using a pre-trained version of the model and running incremental training is not possible, as the DeepAR algorithm does not support incremental training or transfer learning [3]. The DeepAR algorithm requires a full retraining of the model whenever new data is added or the hyperparameters are changed [7].

References:

[1] GPU vs CPU: What Matters Most for Machine Learning? | by Louis (What's AI) Bouchard | Towards Data Science

[2] How GPUs Accelerate Machine Learning Training | NVIDIA Developer Blog

[3] DeepAR Forecasting Algorithm - Amazon SageMaker

[4] Distributed Training - Amazon SageMaker

[5] Managed Spot Training - Amazon SageMaker

[6] Automatic Scaling - Amazon SageMaker

[7] How the DeepAR Algorithm Works - Amazon SageMaker


Question No. 4

A company wants to conduct targeted marketing to sell solar panels to homeowners. The company wants to use machine learning (ML) technologies to identify which houses already have solar panels. The company has collected 8,000 satellite images as training data and will use Amazon SageMaker Ground Truth to label the data.

The company has a small internal team that is working on the project. The internal team has no ML expertise and no ML experience.

Which solution will meet these requirements with the LEAST amount of effort from the internal team?

Correct Answer: A

Solution A will meet the requirements with the least amount of effort from the internal team because it uses Amazon SageMaker Ground Truth and Amazon Rekognition Custom Labels, which are fully managed services that provide the desired functionality. Solution A involves the following steps:

Set up a private workforce that consists of the internal team. Use the private workforce and the SageMaker Ground Truth active learning feature to label the data. Amazon SageMaker Ground Truth is a service that can create high-quality training datasets for machine learning by using human labelers. A private workforce is a group of labelers that the company can manage and control. The internal team can use the private workforce to label the satellite images as having solar panels or not. The SageMaker Ground Truth active learning feature can reduce the labeling effort by using a machine learning model to automatically label the easy examples and only send the difficult ones to the human labelers [1].

Use Amazon Rekognition Custom Labels for model training and hosting. Amazon Rekognition Custom Labels is a service that can train and deploy custom machine learning models for image analysis. Amazon Rekognition Custom Labels can use the labeled data from SageMaker Ground Truth to train a model that can detect solar panels in satellite images. Amazon Rekognition Custom Labels can also host the model and provide an API endpoint for inference [2].
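As an illustrative sketch only: once a Custom Labels model version is trained and started, calling it from Python might look like the following. The project version ARN, label name, bucket, and image key are hypothetical, while detect_custom_labels is the real boto3 operation.

```python
import boto3

rekognition = boto3.client("rekognition")

# Hypothetical ARN of a trained and started Custom Labels model version.
MODEL_ARN = (
    "arn:aws:rekognition:us-east-1:111122223333:project/"
    "solar-panels/version/solar-panels.2024-01-01T00.00.00/1234567890"
)

def has_solar_panels(bucket: str, image_key: str) -> bool:
    """Return True if the model detects a solar panel in the image."""
    response = rekognition.detect_custom_labels(
        ProjectVersionArn=MODEL_ARN,
        Image={"S3Object": {"Bucket": bucket, "Name": image_key}},
        MinConfidence=70,  # keep only reasonably confident detections
    )
    return any(
        label["Name"] == "solar-panel" for label in response["CustomLabels"]
    )

print(has_solar_panels("example-satellite-images", "house-0001.png"))  # hypothetical
```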

The other options are not suitable because:

Option B: Setting up a private workforce that consists of the internal team, using the private workforce to label the data, and using Amazon Rekognition Custom Labels for model training and hosting will incur more effort from the internal team than using the SageMaker Ground Truth active learning feature. The internal team will have to label all the images manually, without the assistance of the machine learning model that can automate some of the labeling tasks [1].

Option C: Setting up a private workforce that consists of the internal team, using the private workforce and the SageMaker Ground Truth active learning feature to label the data, using the SageMaker Object Detection algorithm to train a model, and using SageMaker batch transform for inference will incur more operational overhead than using Amazon Rekognition Custom Labels. The company will have to manage the SageMaker training job, the model artifact, and the batch transform job. Moreover, SageMaker batch transform is not suitable for real-time inference, as it processes the data in batches and stores the results in Amazon S3 [3].

Option D: Setting up a public workforce, using the public workforce to label the data, using the SageMaker Object Detection algorithm to train a model, and using SageMaker batch transform for inference will incur more operational overhead and cost than using a private workforce and Amazon Rekognition Custom Labels. A public workforce is a group of labelers from Amazon Mechanical Turk, a crowdsourcing marketplace. The company will have to pay the public workforce for each labeling task, and it may not have full control over the quality and security of the labeled data. The company will also have to manage the SageMaker training job, the model artifact, and the batch transform job, as explained in option C [4].

References:

[1] Amazon SageMaker Ground Truth

[2] Amazon Rekognition Custom Labels

[3] Amazon SageMaker Object Detection

[4] Amazon Mechanical Turk


Question No. 5

A company uses a long short-term memory (LSTM) model to evaluate the risk factors of a particular energy sector. The model reviews multi-page text documents to analyze each sentence of the text and categorize it as either a potential risk or no risk. The model is not performing well, even though the Data Scientist has experimented with many different network structures and tuned the corresponding hyperparameters.

Which approach will provide the MAXIMUM performance boost?

Correct Answer: D

Initializing the words with word2vec embeddings pretrained on a large collection of news articles related to the energy sector will provide the maximum performance boost for the LSTM model. Word2vec is a technique that learns distributed representations of words based on their co-occurrence in a large corpus of text. These representations capture semantic and syntactic similarities between words, which can help the LSTM model better understand the meaning and context of the sentences in the text documents. Using word2vec embeddings that are pretrained on a relevant domain (the energy sector) further improves performance by reducing vocabulary mismatch and increasing the coverage of the words in the text documents.
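A minimal sketch of this initialization, assuming a hypothetical domain-specific embedding file energy_news_word2vec.bin and a small illustrative vocabulary; the gensim and Keras APIs shown are real, but every name and value here is an assumption for demonstration purposes.

```python
import numpy as np
import tensorflow as tf
from gensim.models import KeyedVectors

# Hypothetical word2vec vectors pretrained on energy-sector news articles.
w2v = KeyedVectors.load_word2vec_format("energy_news_word2vec.bin", binary=True)

vocab = ["drilling", "outage", "regulation", "profit"]  # illustrative vocabulary
embedding_dim = w2v.vector_size

# Row i of the matrix holds the pretrained vector for word i
# (index 0 is reserved for padding).
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for i, word in enumerate(vocab, start=1):
    if word in w2v:
        embedding_matrix[i] = w2v[word]

model = tf.keras.Sequential([
    # Initialize the embedding layer with the pretrained word2vec vectors.
    tf.keras.layers.Embedding(
        input_dim=len(vocab) + 1,
        output_dim=embedding_dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=True,  # allow fine-tuning on the risk-classification task
    ),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # potential risk vs. no risk
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```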

References:

AWS Machine Learning Specialty Exam Guide

AWS Machine Learning Training - Text Classification with TF-IDF, LSTM, BERT: a comparison of performance

AWS Machine Learning Training - Machine Learning - Exam Preparation Path