#### 01. You are building a machine learning model on Amazon SageMaker, and you want to document the dataset origin, key parameters, and performance metrics to ensure transparency and reproducibility.
Which feature of Amazon SageMaker would allow you to document this information?
- SageMaker Model Cards
- SageMaker Feature Store
- SageMaker Ground Truth
- SageMaker Clarify
**CORRECT:** "SageMaker Model Cards" is the correct answer.
Amazon SageMaker Model Cards is a feature that allows you to document key details about your machine learning models, including dataset origin, model parameters, and performance metrics. This feature is designed to improve transparency and reproducibility by providing a centralized place to track important information about your models. By using Model Cards, teams can easily share and review the history, performance, and decisions behind each model, making it a useful tool for ensuring compliance, transparency, and accountability in machine learning projects.
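As a minimal illustration, model cards can also be created programmatically with the boto3 SageMaker client. The sketch below is hypothetical: the card name, description, and metric values are placeholders, and the `Content` JSON must follow the model card schema described in the SageMaker documentation.

```python
import json
import boto3

sm = boto3.client("sagemaker")

# Placeholder card content; the keys follow the model card JSON schema
# (model_overview, training_details, evaluation_details, ...).
content = {
    "model_overview": {"model_description": "Claims classifier trained on the 2024 CRM export"},
    "evaluation_details": [{
        "name": "holdout-evaluation",
        "metric_groups": [{
            "name": "binary_classification",
            "metric_data": [{"name": "f1", "type": "number", "value": 0.87}],
        }],
    }],
}

sm.create_model_card(
    ModelCardName="claims-classifier-card",  # hypothetical name
    Content=json.dumps(content),
    ModelCardStatus="Draft",
)
```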
**INCORRECT:** "SageMaker Ground Truth" is incorrect.
SageMaker Ground Truth is a tool for building and managing high-quality training datasets through human labeling. While important for data preparation, it doesn't document model details or performance.
**INCORRECT:** "SageMaker Clarify" is incorrect.
SageMaker Clarify helps detect bias in data and models and explains model predictions, but it is not used for documenting model parameters or performance metrics.
**INCORRECT:** "SageMaker Feature Store" is incorrect.
SageMaker Feature Store is a repository for storing and managing machine learning features for reuse, but it does not document the dataset origin or model performance metrics.
**References:** https://docs.aws.amazon.com/sagemaker/latest/dg/model-cards.html
Domain: Guidelines for Responsible AI
---
#### 02. A data science team wants to build a machine learning model without worrying about setting up and managing the underlying infrastructure.
Which feature of Amazon SageMaker allows them to easily build, train, and deploy machine learning models?
- Amazon SageMaker JumpStart
- Amazon SageMaker Ground Truth
- Amazon SageMaker Autopilot
- Amazon SageMaker Studio
**CORRECT:** "Amazon SageMaker Studio" is the correct answer.
Amazon SageMaker Studio is an integrated development environment (IDE) that allows data scientists and developers to build, train, and deploy machine learning models without managing the underlying infrastructure. It provides a fully managed environment where users can perform all the necessary tasks in the machine learning workflow, such as data preparation, model building, training, tuning, and deployment, all from a single interface. SageMaker Studio automatically provisions the required resources, helping the team focus on model development rather than infrastructure management.
**INCORRECT:** "Amazon SageMaker Autopilot" is incorrect.
SageMaker Autopilot automatically builds, trains, and tunes machine learning models based on input data. While it simplifies the model-building process, it is focused specifically on AutoML and doesn't offer the comprehensive development environment that SageMaker Studio provides.
**INCORRECT:** "Amazon SageMaker Ground Truth" is incorrect.
SageMaker Ground Truth is a data labeling service that helps create training datasets for machine learning models. It focuses on labeling data rather than building, training, and deploying models.
**INCORRECT:** "Amazon SageMaker JumpStart" is incorrect.
SageMaker JumpStart provides pre-built models and solutions that users can quickly deploy. While helpful for getting started, it doesn't provide the full development environment that SageMaker Studio offers.
**References:** https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated.html
Domain: Fundamentals of AI and ML
---
#### 03. An enterprise company is using a generative AI model to write formal project update emails. To guide the model, a few project-related email samples are included in the input prompt so the model can follow the patterns and tone.
What best describes this prompt engineering technique?
- Data augmentation for training model accuracy
- Zero-shot prompting using pre-trained knowledge
- Chain-of-thought prompting to build step-by-step reasoning
- Few-shot prompting by example-based instruction
**CORRECT:** "Few-shot prompting by example-based instruction" is the correct answer.
Few-shot prompting is a technique where the input prompt includes a few examples of the desired output format or behavior. This helps guide the generative AI model to produce responses that match the expected tone, style, or structure. In this case, the enterprise company includes a few sample project update emails in the prompt. This allows the model to learn the tone and content structure from those examples and generate new formal project updates that align with the provided samples. This method is especially useful when consistency in communication is important, such as formal email writing, without requiring model fine-tuning.
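A minimal sketch of what such a few-shot prompt might look like; the sample emails are shortened placeholders standing in for the company's real examples.

```python
# Hypothetical few-shot prompt: two sample emails guide tone and structure,
# and the model is asked to produce a third in the same style.
examples = [
    "Subject: Project Phoenix - Week 12 Update\nHi team,\nMilestone 3 is complete...",
    "Subject: Project Atlas - Week 4 Update\nHi team,\nIntegration testing began...",
]

prompt = (
    "Write a formal project update email in the same style as these examples.\n\n"
    + "\n\n---\n\n".join(f"Example {i + 1}:\n{e}" for i, e in enumerate(examples))
    + "\n\nNow write the update for Project Orion: data migration finished, "
      "UAT starts Monday."
)
```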
**INCORRECT:** "Zero-shot prompting using pre-trained knowledge" is incorrect.
Zero-shot prompting provides the model with a task description or question without offering any examples. The model uses its pre-trained knowledge to respond. In this case, examples are provided in the prompt, making it a few-shot approach, not zero-shot.
**INCORRECT:** "Chain-of-thought prompting to build step-by-step reasoning" is incorrect.
Chain-of-thought prompting focuses on generating intermediate steps or explanations to reach a final answer, typically used for reasoning or problem-solving tasks. This scenario uses examples to demonstrate style and tone, not step-by-step reasoning.
**INCORRECT:** "Data augmentation for training model accuracy" is incorrect.
Data augmentation refers to the process of expanding training datasets to improve model learning during the training phase. This is unrelated to how the model is guided at inference time through prompting techniques.
**References:** https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-engineering-guidelines.html
Domain: Applications of Foundation Models
---
#### 04. What is the primary advantage of foundation models compared to traditional task-specific AI models?
- Foundation models require less computational power to fine-tune.
- Foundation models eliminate the need for any fine-tuning before deployment.
- Foundation models are pre-trained on diverse data and can be adapted to multiple tasks with minimal effort.
- Foundation models rely only on supervised learning, making them easier to interpret.
**CORRECT:** "Foundation models are pre-trained on diverse data and can be adapted to multiple tasks with minimal effort" is the correct answer.
Foundation models, such as GPT, BERT, and CLIP, are trained on large-scale diverse datasets and can be adapted to multiple downstream tasks with minimal additional training. Unlike traditional AI models, which require separate training for each specific task, foundation models serve as a general-purpose base that can be fine-tuned or used with prompt engineering to perform a variety of tasks, such as text generation, translation, summarization, and question answering. This flexibility reduces development time and computational costs associated with training models from scratch.
**INCORRECT:** "Foundation models require less computational power to fine-tune" is incorrect.
While foundation models allow for easier adaptation to new tasks, fine-tuning them can still be computationally expensive, especially for large-scale models like GPT-4 or LLaMA. Techniques like parameter-efficient fine-tuning (PEFT) and LoRA (Low-Rank Adaptation) help reduce computational overhead, but in general, fine-tuning still requires significant GPU/TPU resources.
**INCORRECT:** "Foundation models rely only on supervised learning, making them easier to interpret" is incorrect.
Most foundation models are trained using self-supervised learning, meaning they learn from large amounts of unlabeled data without explicit human annotation. This allows them to scale efficiently but does not necessarily make them easier to interpret. In fact, the black-box nature of large foundation models is a key challenge in AI interpretability.
**INCORRECT:** "Foundation models eliminate the need for any fine-tuning before deployment" is incorrect.
While some foundation models can be used directly with prompt engineering, many applications still require fine-tuning to improve performance on specific tasks or domains. Fine-tuning helps tailor the model's responses to industry-specific use cases, improving accuracy and relevance.
**References:** https://aws.amazon.com/what-is/foundation-models
Domain: Fundamentals of Generative AI
---
#### 05. A data science team is using Amazon Bedrock to run inference on a foundation model (FM) for content generation. They observe that the output is sometimes repetitive and lacks variation. The team wants to increase the randomness of responses while keeping the meaning coherent.
Which combination of inference parameters should they adjust to achieve this? (Select TWO.)
- temperature
- num_beams
- max_tokens
- top-p
- context_length
**CORRECT:** "temperature" is a correct answer.
The temperature parameter controls the randomness or creativity of the generated text. A higher temperature value (e.g., 0.8 or 1.0) makes the model more creative and diverse in its responses, while a lower value (e.g., 0.2 or 0.3) makes the output more focused and deterministic. When the output becomes repetitive or lacks variation, increasing the temperature helps introduce more diverse and less predictable text, without losing overall coherence.
**CORRECT:** "top-p" is also a correct answer.
Top-p (also known as nucleus sampling) is a probabilistic sampling method that influences the randomness of the model's responses. It selects from the smallest possible set of words whose cumulative probability exceeds the value of p. By adjusting top-p, especially increasing it, you allow the model to sample from a broader range of potential next words, which helps reduce repetition and makes the responses more varied. This is particularly effective when used in combination with temperature.
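As a sketch, both parameters can be set together through the Bedrock Converse API via boto3; the model ID below is a placeholder for any text model the account has access to.

```python
import boto3

brt = boto3.client("bedrock-runtime")

response = brt.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": "Write a short product blurb for a hiking boot."}]}],
    # Raising temperature and topP increases variation in the sampled output.
    inferenceConfig={"temperature": 0.9, "topP": 0.95, "maxTokens": 300},
)
print(response["output"]["message"]["content"][0]["text"])
```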
**INCORRECT:** "num_beams" is incorrect.
This parameter is used in beam search, which is a deterministic decoding method aimed at finding the most likely sequence. While it can improve fluency and reduce randomness, it typically decreases variation in the output, making responses more repetitive. It's not suitable when the goal is to increase diversity.
**INCORRECT:** "max_tokens" is incorrect.
This parameter sets the maximum number of tokens (words or parts of words) in the output. It affects the length of the response, not the randomness or creativity. Increasing this won't necessarily reduce repetition or improve variation.
**INCORRECT:** "context_length" is incorrect.
Context length refers to how much input (or previous conversation) the model can remember and consider while generating output. It does not affect randomness directly. Adjusting this won't help with variation or repetitive outputs.
**References:** https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html
Domain: Applications of Foundation Models
---
#### 06. An AI development team is building a customer-facing application using pre-trained models available through AWS. To ensure they select models that align with responsible AI practices, appropriate use cases, and known limitations, they want to reference an AWS resource that provides transparency into how each model was designed and tested.
Which AWS resource best supports this need?
- AWS AI Service Cards provide insight into use cases, responsible design, and performance guidance.
- Amazon SageMaker JumpStart notebooks offer full model audit history and certification status.
- AWS Artifact Reports list AI model test results and user reviews for transparent selection.
- AWS Model Trace provides version history and runtime behavior analysis for AI models.
**CORRECT:** "AWS AI Service Cards provide insight into use cases, responsible design, and performance guidance" is the correct answer.
AWS AI Service Cards are designed to provide transparency into the capabilities, limitations, and responsible AI practices of pre-trained models provided by AWS. These cards include information such as intended use cases, ethical design considerations, performance characteristics, fairness, and bias insights. This helps teams evaluate whether a model is appropriate for their customer-facing applications and whether it aligns with their responsible AI requirements. It's the most relevant AWS resource when the goal is to understand model behavior and limitations before deployment.
**INCORRECT:** "Amazon SageMaker JumpStart notebooks offer full model audit history and certification status" is incorrect.
SageMaker JumpStart provides prebuilt solutions and example notebooks to help quickly get started with ML models. While useful for deployment and experimentation, it does not provide audit history, certifications, or detailed responsible AI insights.
**INCORRECT:** "AWS Model Trace provides version history and runtime behavior analysis for AI models" is incorrect.
AWS does not currently offer a service called "Model Trace." Even if such a tool existed, version history or runtime behavior analysis is not the same as understanding responsible use cases and model limitations.
**INCORRECT:** "AWS Artifact Reports list AI model test results and user reviews for transparent selection" is incorrect.
AWS Artifact provides compliance and audit documents such as ISO, SOC, and PCI reports for AWS services, not AI model-specific documentation or reviews. It focuses on regulatory and compliance documentation, not responsible AI insights for specific models.
**References:**
https://aws.amazon.com/ai/responsible-ai/resources
https://aws.amazon.com/ai/responsible-ai
Domain: Guidelines for Responsible AI
---
#### 07. A digital health startup offers an AI-powered platform that generates real-time summaries of patient medical reports to assist clinicians during consultations. They currently use a foundation model via Amazon Bedrock, benefiting from its managed infrastructure and scalability. However, clinicians have reported latency issues that impact real-time responsiveness.
The startup wants to reduce latency while maintaining a low operational burden. They do not want to build or manage custom infrastructure such as hosting models themselves or setting up specialized ML pipelines.
Which approach offers the most suitable solution with the least complexity?
- Use Retrieval Augmented Generation (RAG) with pre-indexed medical documents in OpenSearch.
- Deploy a containerized version of the model to Amazon EC2 and build a custom API layer.
- Fine-tune the foundation model with a large dataset of annotated medical summaries.
- Use a few-shot prompting method with domain-specific examples provided during each request.
**CORRECT:** "Use a few-shot prompting method with domain-specific examples provided during each request" is the correct answer.
Few-shot prompting is a technique where you provide a foundation model with a few relevant examples within the prompt itself to guide its behavior. This approach doesn't require training or fine-tuning the model but instead leverages its pre-trained capabilities to understand the context. In this case, using domain-specific examples of medical summaries during each request can help the model generate more accurate and contextually relevant outputs. Since the startup is using Amazon Bedrock and wants to avoid managing infrastructure or building custom pipelines, few-shot prompting is ideal. It improves output quality while keeping the request path simple: no training step, model hosting, or external retrieval call is added, which helps keep latency low. It's a low-complexity, cost-effective solution that fits well with managed services like Bedrock.
**INCORRECT:** "Use Retrieval Augmented Generation (RAG) with pre-indexed medical documents in OpenSearch" is incorrect.
RAG combines external document retrieval with generation by the model, often using tools like OpenSearch. It adds extra components (indexing, query processing), which can increase latency and complexity. This option contradicts the startup's goal to reduce latency and avoid additional infrastructure management.
**INCORRECT:** "Fine-tune the foundation model with a large dataset of annotated medical summaries" is incorrect.
Fine-tuning involves training the model on a specialized dataset to improve domain performance. However, this requires a large, labeled dataset and computational resources. It also introduces operational complexity, including model hosting and versioning. Since the startup wants to avoid managing infrastructure, this is not the best fit.
**INCORRECT:** "Deploy a containerized version of the model to Amazon EC2 and build a custom API layer" is incorrect.
While this would give full control over the model and may reduce latency, it requires managing EC2 instances, containers, scaling, and security. This approach goes against the startup's desire to avoid building or managing infrastructure, making it the most complex option.
**References:**
https://docs.aws.amazon.com/bedrock/latest/userguide/design-a-prompt.html
https://aws.amazon.com/blogs/machine-learning/few-shot-prompt-engineering-and-fine-tuning-for-llms-in-amazon-bedrock
Domain: Applications of Foundation Models
---
#### 08. A medical research organization is considering applying unsupervised learning to a large dataset of patient symptoms without any labels.
Which use case aligns best with unsupervised learning?
- Predicting future patient visits based on historical appointment data
- Determining the next best step for a patient during a hospital stay
- Training a model to classify patients as high risk or low risk for a specific disease
- Identifying unknown subgroups or clusters of similar patient symptom patterns
**CORRECT:** "Identifying unknown subgroups or clusters of similar patient symptom patterns" is the correct answer.
Unsupervised learning is ideal for discovering patterns in data without predefined labels. Clustering techniques, such as K-Means, DBSCAN, or hierarchical clustering, can help group patients with similar symptom patterns.
This approach is useful in medical research for identifying previously unknown subgroups of patients who may share common risk factors, respond similarly to treatments, or have similar disease progressions. Such insights can aid in personalized medicine, early disease detection, and better treatment strategies without requiring labeled training data.
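To make the clustering idea concrete, a minimal scikit-learn sketch is shown below; the symptom matrix is synthetic and stands in for encoded patient features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: one row per patient, one column per encoded symptom feature.
rng = np.random.default_rng(42)
X = rng.random((500, 12))

# Scale features, then cluster without any labels.
X_scaled = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=4, random_state=42, n_init=10).fit_predict(X_scaled)
# `labels` assigns each patient to a discovered subgroup for further analysis.
```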
**INCORRECT:** "Predicting future patient visits based on historical appointment data" is incorrect.
This is a forecasting problem that typically uses supervised learning techniques such as regression models or time-series analysis. Since the goal is to predict a specific outcome (future visits), supervised learning is more appropriate.
**INCORRECT:** "Determining the next best step for a patient during a hospital stay" is incorrect.
This requires decision-making based on historical patient outcomes, making it a supervised learning or reinforcement learning task. Unsupervised learning does not predict future actions but instead identifies patterns in unlabeled data.
**INCORRECT:** "Training a model to classify patients as high risk or low risk for a specific disease" is incorrect.
Classification tasks require labeled data (e.g., patient risk levels), making this a supervised learning problem. Unsupervised learning does not use predefined labels and is not suitable for direct classification tasks.
**References:** https://aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised
Domain: Fundamentals of AI and ML
---
#### 09. A social media company wants to integrate multiple data types (text posts, user profile data, and real-time streaming video) into a single AI model pipeline.
Which statement accurately describes a key challenge of handling different data types?
- Training on mixed data types requires separate ML frameworks for each data type, making a single pipeline impractical.
- Data preprocessing and feature engineering techniques must account for the unique structure and format of each data type.
- Large language models can seamlessly handle tabular, image, and video data without additional preprocessing.
- All data must be manually labeled with high accuracy, which is impossible for video streams.
**CORRECT:** "Data preprocessing and feature engineering techniques must account for the unique structure and format of each data type" is the correct answer.
Integrating multiple data types—such as text posts, user profile data (structured/tabular), and real-time video streams—requires careful data preprocessing and feature engineering. Each data type has a unique format and structure:
- Text data requires tokenization, embedding, and NLP techniques.
- Tabular data needs normalization, encoding categorical variables, and handling missing values.
- Video data must be processed into frames, extracted as embeddings, or converted into a structured format for ML models.
A single AI model pipeline can handle multiple data types, but preprocessing steps must be designed to ensure compatibility across different modalities. This challenge is often addressed using multimodal learning techniques that combine various ML models or embeddings.
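A toy sketch of the per-modality preprocessing this implies; the calls are illustrative, and a real pipeline would use a proper tokenizer and a video decoder such as OpenCV.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tabular profile data: scale numeric columns, one-hot encode categoricals.
ages = np.array([[23.0], [41.0], [35.0]])
countries = np.array([["US"], ["DE"], ["US"]])
age_features = StandardScaler().fit_transform(ages)
country_features = OneHotEncoder(sparse_output=False).fit_transform(countries)  # sklearn >= 1.2

# Text posts: a toy whitespace tokenizer standing in for a real tokenizer/embedding model.
tokens = "Loving the new spring collection!".lower().split()

# Video: choose which frames to sample; actual frame extraction would use a decoder.
def sampled_frame_indices(duration_s: float, frames_per_second: int = 1) -> list[int]:
    return list(range(int(duration_s * frames_per_second)))
```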
**INCORRECT:** "All data must be manually labeled with high accuracy, which is impossible for video streams" is incorrect.
While labeled data improves supervised learning, not all AI models require manual labeling. Self-supervised learning and unsupervised techniques can extract insights from unlabeled video streams. Also, many AI applications work with weakly labeled or semi-supervised datasets.
**INCORRECT:** "Training on mixed data types requires separate ML frameworks for each data type, making a single pipeline impractical" is incorrect.
While different data types may require specialized models (e.g., CNNs for images, RNNs for text), multimodal architectures (such as transformers or fusion models) allow a single AI pipeline to process and combine different data types effectively.
**INCORRECT:** "Large language models can seamlessly handle tabular, image, and video data without additional preprocessing" is incorrect.
LLMs are optimized for text-based tasks and do not natively process tabular or video data. While some multimodal models can integrate multiple data types, they still require preprocessing steps, such as converting images to embeddings or extracting features from videos before passing them to the model.
**References:**
https://docs.aws.amazon.com/sagemaker/latest/dg/data-prep.html
https://aws.amazon.com/blogs/machine-learning/simplify-multimodal-generative-ai-with-amazon-bedrock-data-automation
Domain: Fundamentals of AI and ML
---
#### 10. An e-commerce platform is facing issues where its generative AI assistant gives inconsistent product answers. The team wants to correct these issues in real-time using lightweight adjustments instead of re-training the model.
What approach should be adopted?
- Implement dynamic routing to a rule-based fallback system.
- Use retrieval-augmented generation (RAG) to ground responses in product data.
- Fine-tune the foundation model using historical chat logs.
- Modify prompt instructions to shape the AI's behavior during inference.
**CORRECT:** "Modify prompt instructions to shape the AI's behavior during inference" is the correct answer.
Prompt engineering is a lightweight and flexible approach used to guide the output of generative AI models during inference without the need to retrain the model. By modifying how the prompt is written—such as including specific instructions, tone, formatting, or context—the model can produce more accurate and consistent responses. For an e-commerce assistant giving inconsistent product answers, this method allows the team to adjust behavior in real time and respond to user concerns quickly. It's cost-effective and scalable, making it ideal for production environments where retraining is expensive or slow.
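One lightweight way to apply such instructions on Amazon Bedrock is a system prompt in the Converse API; the sketch below is illustrative, and the model ID and product question are placeholders.

```python
import boto3

brt = boto3.client("bedrock-runtime")

response = brt.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    # The system prompt shapes behavior at inference time; no retraining involved.
    system=[{"text": "You are a product assistant. Answer only from the provided "
                     "product facts; if a detail is missing, say you don't know."}],
    messages=[{"role": "user", "content": [{"text": "Does the X200 blender include a travel lid?"}]}],
    inferenceConfig={"temperature": 0.2},
)
```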
**INCORRECT:** "Use retrieval-augmented generation (RAG) to ground responses in product data" is incorrect.
RAG is a powerful method to improve accuracy by retrieving relevant external data and feeding it into the prompt. However, it involves additional infrastructure and data management. It is not the lightest solution for quick behavioral corrections during inference. It's more suitable for improving knowledge grounding, not fixing prompt inconsistencies.
**INCORRECT:** "Fine-tune the foundation model using historical chat logs" is incorrect.
Fine-tuning involves retraining the model on new data, such as past conversations. This is a resource-intensive and time-consuming process that also requires careful validation to avoid overfitting. It's not ideal for quick, real-time corrections, which the scenario demands.
**INCORRECT:** "Implement dynamic routing to a rule-based fallback system" is incorrect.
While useful for handling edge cases or known failure points, rule-based systems cannot adapt to the diverse and dynamic nature of natural language. They also don't solve the problem of inconsistent responses—they simply bypass the model rather than improving it.
**References:** https://aws.amazon.com/what-is/prompt-engineering
Domain: Fundamentals of Generative AI
---
#### 11. An e-commerce company receives thousands of vendor agreements in PDF format and wants to automate capturing vendor information.
Which Amazon Textract feature is most suitable for this use case?
- Table detection
- Label content extraction
- Content summarization
- Key-value pairs extraction
**CORRECT:** "Key-value pairs extraction" is the correct answer.
Amazon Textract's key-value pairs extraction feature is designed to automatically find and extract pairs of related information, such as "Vendor Name: ABC Ltd." or "Agreement Date: 2025-05-17". This is ideal for processing structured information in documents like vendor agreements, invoices, or forms. Key-value pairs help identify the labels (like "Vendor Name") and their corresponding values (like "ABC Ltd.") without needing to manually review each document. This feature is perfect for the e-commerce company's goal of capturing vendor details from thousands of agreements efficiently. By using this feature, the company can automate data entry and reduce human effort, while improving accuracy and processing speed.
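A minimal boto3 sketch of form (key-value) extraction; the bucket and file names are placeholders, and multi-page PDFs would be processed with the asynchronous `start_document_analysis` API instead.

```python
import boto3

textract = boto3.client("textract")

response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "my-agreements-bucket", "Name": "vendor-001.pdf"}},  # placeholders
    FeatureTypes=["FORMS"],  # FORMS returns key-value pairs as KEY_VALUE_SET blocks
)

# Key-value pairs come back as KEY_VALUE_SET blocks linked by block relationships.
kv_blocks = [b for b in response["Blocks"] if b["BlockType"] == "KEY_VALUE_SET"]
```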
**INCORRECT:** "Table detection" is incorrect.
Table detection is used to extract data organized in a table format, such as rows and columns. While vendor agreements may contain some tables, most of the critical information like vendor name, address, or agreement dates is typically presented as key-value pairs, not as tables. Therefore, table detection is not the most suitable option here.
**INCORRECT:** "Label content extraction" is incorrect.
Label content extraction refers to identifying labeled data, but this term is not a specific Amazon Textract feature. Textract focuses on structured extraction such as key-value pairs, tables, and raw text. Since "label content extraction" is not a defined feature in Textract, this option is incorrect.
**INCORRECT:** "Content summarization" is incorrect.
Content summarization is the process of generating a short summary of the document's main points. Amazon Textract does not perform content summarization. Its main function is to extract structured data like key-value pairs and tables. Summarization is more aligned with natural language processing models, not document data extraction.
**References:**
https://docs.aws.amazon.com/textract/latest/dg/how-it-works-kvp.html
https://aws.amazon.com/blogs/machine-learning/announcing-expanded-support-for-extracting-data-from-invoices-and-receipts-using-amazon-textract
Domain: Fundamentals of AI and ML
---
#### 12. An insurance company has developed a machine learning model to classify whether a policyholder is likely to file a claim within the next year. After completing training, the data science team is analyzing metrics such as confusion matrix, F1 score, precision-recall curves, and AUC using a holdout validation set. Their goal is to determine if the model's predictive power is strong enough for deployment.
Which phase of the machine learning lifecycle are they currently engaged in?
- Training orchestration, where models are iteratively trained and checkpointed using scalable infrastructure to ensure reproducibility.
- Model evaluation, where predictive performance is measured using diverse metrics on independent datasets to assess real-world generalization.
- Feature transformation, where raw data is engineered into new variables to enhance learning effectiveness and reduce input complexity.
- Algorithm optimization, where the selected learning algorithm is fine-tuned to maximize convergence speed and reduce bias-variance tradeoff.
**CORRECT:** "Model evaluation, where predictive performance is measured using diverse metrics on independent datasets to assess real-world generalization" is the correct answer.
Model evaluation is the stage in the machine learning lifecycle where the performance of the trained model is assessed using a separate dataset (such as a holdout validation set). Common metrics include accuracy, precision, recall, F1 score, ROC-AUC, and confusion matrix. These metrics help evaluate how well the model generalizes to unseen data. In this scenario, the data science team is reviewing the model's F1 score, AUC, and precision-recall metrics on a validation set, indicating they are focused on evaluating how the model performs in practice before deployment. This confirms that they are in the model evaluation phase.
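A small scikit-learn sketch of this evaluation step, using toy holdout labels and scores:

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_recall_curve, roc_auc_score

# Toy holdout data: true labels, model scores, and thresholded predictions.
y_true = [0, 1, 1, 0, 1, 0, 0, 1]
y_prob = [0.2, 0.8, 0.6, 0.3, 0.9, 0.4, 0.1, 0.7]
y_pred = [int(p >= 0.5) for p in y_prob]

print(confusion_matrix(y_true, y_pred))
print("F1: ", f1_score(y_true, y_pred))
print("AUC:", roc_auc_score(y_true, y_prob))
precision, recall, thresholds = precision_recall_curve(y_true, y_prob)  # points for the PR curve
```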
**INCORRECT:** "Algorithm optimization, where the selected learning algorithm is fine-tuned to maximize convergence speed and reduce bias-variance tradeoff" is incorrect.
Algorithm optimization involves tuning the hyperparameters of the model to improve training performance, address underfitting or overfitting, and accelerate convergence. While this is a key part of model development, it typically occurs before model evaluation. In this question, the model is already trained and is being evaluated, so this phase has already passed.
**INCORRECT:** "Feature transformation, where raw data is engineered into new variables to enhance learning effectiveness and reduce input complexity" is incorrect.
Feature transformation or engineering is done during the preprocessing phase. It helps improve model performance by transforming input data (e.g., normalization, encoding, or creating interaction terms). However, in the scenario provided, the focus is not on modifying the input data, but on assessing how the trained model performs, so this is not the correct phase.
**INCORRECT:** "Training orchestration, where models are iteratively trained and checkpointed using scalable infrastructure to ensure reproducibility" is incorrect.
Training orchestration involves managing the infrastructure and processes needed to run model training at scale, including automation and reproducibility tools. The given scenario does not involve infrastructure or automation, it focuses on reviewing evaluation metrics post-training. Therefore, this is not the phase they are currently in.
**References:** https://docs.aws.amazon.com/machine-learning/latest/dg/evaluating_models.html
Domain: Fundamentals of AI and ML
---
#### 13. A healthcare organization has fine-tuned a Foundation Model on Amazon Bedrock using patient records and confidential medical data. To comply with privacy regulations, the organization wants to ensure that the AI model does not generate responses containing sensitive patient information. The team is looking for the most effective way to prevent unintended data leaks while maintaining model accuracy.
What is the most efficient approach to achieve this goal?
- Disabling AI-generated responses entirely and only using the model for structured data retrieval.
- Disabling the fine-tuning process and using only the pre-trained Foundation Model without any customization.
- Storing all fine-tuned model responses in an Amazon S3 bucket for manual review before sharing them with users.
- Implementing Amazon Bedrock Guardrails to detect and block sensitive information in model responses.
**CORRECT:** "Implementing Amazon Bedrock Guardrails to detect and block sensitive information in model responses" is the correct answer.
Amazon Bedrock Guardrails is a feature that helps developers implement safety, privacy, and ethical controls in generative AI applications. It enables the creation of customized rules to filter harmful, inappropriate, or biased content produced by foundation models. With Bedrock Guardrails, teams can define sensitive topics, apply content filters, and enforce usage guidelines to align AI outputs with organizational values and compliance requirements. This ensures safer and more responsible AI interactions, especially in customer-facing applications, without needing deep machine learning expertise.
You can configure custom policies and rules to detect specific content like personally identifiable information (PII) or medical data. For a healthcare organization working with sensitive patient information, Guardrails are the most efficient and scalable way to prevent the model from accidentally leaking private data—without sacrificing the accuracy and usefulness of the fine-tuned model. These guardrails can block, redact, or replace sensitive responses in real time.
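A hedged boto3 sketch of creating such a guardrail; the guardrail name, PII entity choices, and regex pattern are placeholders for whatever the organization's policy requires.

```python
import boto3

bedrock = boto3.client("bedrock")

guardrail = bedrock.create_guardrail(
    name="phi-protection",  # placeholder name
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [
            {"type": "NAME", "action": "ANONYMIZE"},
            {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
        ],
        # Hypothetical custom pattern for internal patient-record IDs.
        "regexesConfig": [{"name": "patient-id", "pattern": "PAT-\\d{6}", "action": "BLOCK"}],
    },
    blockedInputMessaging="Sorry, I can't process that request.",
    blockedOutputsMessaging="The response was withheld to protect patient privacy.",
)
```

The returned guardrail ID and version can then be referenced at inference time (for example, via the `guardrailConfig` of the Converse API) so responses are filtered in real time.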
**INCORRECT:** "Disabling the fine-tuning process and using only the pre-trained Foundation Model without any customization" is incorrect.
This would reduce the usefulness of the AI model. Fine-tuning helps tailor responses to specific healthcare tasks. Simply removing fine-tuning does not guarantee data privacy and sacrifices model relevance and performance for the use case.
**INCORRECT:** "Disabling AI-generated responses entirely and only using the model for structured data retrieval" is incorrect.
Turning off generative features removes the main value of a language model. If the organization only wants structured retrieval, it might as well use a database—not a generative model. This option is too restrictive and not efficient.
**INCORRECT:** "Storing all fine-tuned model responses in an Amazon S3 bucket for manual review before sharing them with users" is incorrect.
While this helps with compliance, it's manual, time-consuming, and inefficient—especially in real-time systems. It's not scalable or practical for production environments where immediate responses are expected.
**References:** https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-sensitive-filters.html
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 14. Your machine learning model has been deployed to production, and your team wants to track its performance over time to detect any degradation or biases that may appear in the predictions.
Which AWS service can help you continuously monitor your model?
- Amazon SageMaker Studio
- Amazon SageMaker Model Monitor
- Amazon SageMaker Data Wrangler
- Amazon SageMaker Feature Store
**CORRECT:** "Amazon SageMaker Model Monitor" is the correct answer.
Amazon SageMaker Model Monitor enables you to continuously monitor the quality of your deployed model's predictions, checking for issues such as data drift, biases, and performance degradation over time. SageMaker Model Monitor automatically captures data from the model's input and output, allowing you to create baseline statistics and set up alerts when anomalies are detected. By monitoring the model, you can ensure that it continues to perform as expected in real-world scenarios and make adjustments or retrain the model as needed to maintain high-quality predictions.
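A sketch using the SageMaker Python SDK (the role ARN, S3 paths, and endpoint name are placeholders) that baselines the training data and schedules hourly data-quality checks:

```python
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

monitor = DefaultModelMonitor(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role ARN
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Baseline statistics and constraints computed from the training data.
monitor.suggest_baseline(
    baseline_dataset="s3://my-bucket/train/train.csv",  # placeholder path
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline",
)

# Hourly checks of captured endpoint traffic against the baseline.
monitor.create_monitoring_schedule(
    monitor_schedule_name="claims-model-data-quality",
    endpoint_input="my-endpoint",  # placeholder endpoint name
    output_s3_uri="s3://my-bucket/monitoring",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```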
**INCORRECT:** "Amazon SageMaker Data Wrangler" is incorrect.
SageMaker Data Wrangler is designed for data preprocessing tasks such as cleaning and transforming data, but it is not used for monitoring model performance after deployment.
**INCORRECT:** "Amazon SageMaker Feature Store" is incorrect.
SageMaker Feature Store is used to store and manage features for machine learning models but does not provide model monitoring capabilities.
**INCORRECT:** "Amazon SageMaker Studio" is incorrect.
SageMaker Studio is an integrated development environment (IDE) for building, training, and deploying machine learning models, but it is not designed for continuous model performance monitoring in production.
**References:** https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html
Domain: Fundamentals of AI and ML
---
#### 15. A machine learning model designed to predict loan approvals demonstrates high accuracy on internal test data but performs poorly when evaluated on data from new applicant demographics not well represented in the training set.
What is the most likely cause of this issue?
- The model is overfitting, having learned patterns specific to the training data but failing to generalize to new applications.
- The model used a cross-validation approach with too few folds, leading to unreliable internal performance estimates.
- The model is underfitting, indicating it has not learned the underlying patterns well enough, resulting in poor performance across all datasets.
- The training process lacked proper feature scaling, introducing bias in the handling of numerical inputs.
**CORRECT:** "The model is overfitting, having learned patterns specific to the training data but failing to generalize to new applications" is the correct answer.
Overfitting occurs when a machine learning model learns not only the general patterns in the training data but also the noise and specific details that do not generalize well to new or different data. In this case, the model shows high accuracy on internal test data (which is likely similar to the training data) but performs poorly on new applicant demographics. This strongly suggests that the model is overfitting — it has become too specialized to the known data and cannot adapt to variations or underrepresented groups. Overfitting limits the model's ability to generalize, especially in real-world scenarios where diversity in data is common. It's a key concern in AI fairness and model robustness.
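The symptom is easy to demonstrate: a model that fits its training set almost perfectly but scores noticeably lower on held-out data is failing to generalize. A toy scikit-learn sketch:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # typically near 1.0
print("test accuracy: ", model.score(X_test, y_test))    # lower; a large gap signals overfitting
```

Data drawn from demographics absent from the training set would widen this gap further.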
**INCORRECT:** "The model is underfitting, indicating it has not learned the underlying patterns well enough, resulting in poor performance across all datasets" is incorrect.
Underfitting means the model is too simple to capture the true patterns in the data, often leading to poor performance on both training and test sets. In this case, the model performs well on internal data, which rules out underfitting.
**INCORRECT:** "The training process lacked proper feature scaling, introducing bias in the handling of numerical inputs" is incorrect.
Feature scaling helps in models sensitive to input magnitude (like logistic regression or k-NN), but lack of scaling typically affects performance across all datasets. It doesn't explain why the model works well on internal data but poorly on new demographics.
**INCORRECT:** "The model used a cross-validation approach with too few folds, leading to unreliable internal performance estimates" is incorrect.
While using too few folds can give less reliable performance estimates, it doesn't explain poor performance on a very different demographic group. The core issue here is the model's inability to generalize to unseen data types, not the validation method used.
**References:** https://aws.amazon.com/what-is/overfitting
Domain: Guidelines for Responsible AI
---
#### 16. An online fashion retailer wants to implement a feature that can automatically tag clothing styles and colors in user-uploaded photos. The goal is to enhance product search and recommendation.
Which technique should the team use to accurately analyze visual content in user images?
- Apply anomaly detection to identify listings with duplicate content or patterns.
- Apply Natural Language Processing (NLP) to understand fashion descriptions and convert them into product tags.
- Use tabular analysis to compare user demographics and purchase history without processing the image content.
- Use Computer Vision techniques to detect clothing patterns, identify attributes, and classify items within the image.
**CORRECT:** "Use Computer Vision techniques to detect clothing patterns, identify attributes, and classify items within the image" is the correct answer.
Computer Vision is a field of artificial intelligence (AI) that enables computers to interpret and make decisions based on visual data, such as images or videos. It mimics the way human vision works, allowing machines to understand and analyze visual content. By using a combination of machine learning, image processing, and deep learning techniques, Computer Vision systems can recognize objects, detect patterns, and extract meaningful information from images.
In the context of an online fashion retailer, Computer Vision can be particularly useful for identifying clothing styles, colors, and patterns in user-uploaded photos. For example, it can detect whether a person is wearing a floral dress, a leather jacket, or striped pants, and then assign relevant tags to those items. This not only enhances the search functionality on the website but also improves personalized recommendations by understanding users' fashion preferences. Over time, with enough training data, these systems become increasingly accurate, enabling a more efficient and engaging shopping experience for customers.
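One managed way to prototype this on AWS is Amazon Rekognition's label detection (training a custom vision model is the alternative); the bucket and object names below are placeholders.

```python
import boto3

rekognition = boto3.client("rekognition")

response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "user-uploads", "Name": "photo-123.jpg"}},  # placeholders
    MaxLabels=15,
    MinConfidence=80.0,
)

# Candidate tags such as ("Dress", 97.1) or ("Jacket", 88.4) for search and recommendations.
tags = [(label["Name"], round(label["Confidence"], 1)) for label in response["Labels"]]
```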
**INCORRECT:** "Apply Natural Language Processing (NLP) to understand fashion descriptions and convert them into product tags" is incorrect.
NLP is useful for analyzing text, such as product descriptions or user reviews. It's not designed for analyzing images. Since the goal is to tag items in photos, NLP would not be the right approach here.
**INCORRECT:** "Use tabular analysis to compare user demographics and purchase history without processing the image content" is incorrect.
Tabular analysis helps with data like age, gender, and past purchases, but it does not handle or understand image content. The goal is to tag features in photos, so this approach does not solve the problem.
**INCORRECT:** "Apply anomaly detection to identify listings with duplicate content or patterns" is incorrect.
Anomaly detection is typically used to find unusual or suspicious behavior, such as fraud or data errors. It is not suited for classifying clothing styles or identifying attributes in images.
**References:** https://aws.amazon.com/what-is/computer-vision
Domain: Fundamentals of AI and ML
---
#### 17. Your team is experimenting with different prompts to improve the performance of an AI system. You decide to use multiple prompt variations to test various outputs and refine the model's responses.
What is the key benefit of using this approach in prompt engineering?
- It allows for better discovery of effective prompts through experimentation.
- It reduces the length of model responses.
- It makes the model more difficult to hijack or poison.
- It increases the model's creativity and variability in responses.
**CORRECT:** "It allows for better discovery of effective prompts through experimentation" is the correct answer.
The key benefit of using multiple prompt variations to test various outputs is that it enables better discovery of effective prompts through experimentation. By trying different prompt structures, wordings, or contexts, the team can observe which prompts lead to the most accurate, relevant, or desirable responses from the AI system. This iterative approach allows for continuous refinement, helping the team find the most effective prompts to optimize the model's performance. It's a crucial step in improving prompt quality and ensuring the AI behaves as intended.
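A minimal sketch of such an experiment loop on Amazon Bedrock, trying several hypothetical prompt templates against the same input (the model ID is a placeholder):

```python
import boto3

brt = boto3.client("bedrock-runtime")

review = "The battery lasts two days but the hinge feels flimsy."
templates = [
    "Summarize this review in one sentence: {review}",
    "You are a concise analyst. Summarize: {review}",
    "Summarize the review below for a product manager.\n\nReview: {review}",
]

for template in templates:
    response = brt.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
        messages=[{"role": "user", "content": [{"text": template.format(review=review)}]}],
        inferenceConfig={"temperature": 0.2, "maxTokens": 100},
    )
    # Compare outputs side by side to discover which prompt works best.
    print(template[:40], "->", response["output"]["message"]["content"][0]["text"])
```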
**INCORRECT:** "It increases the model's creativity and variability in responses" is incorrect.
While experimenting with prompts can lead to some variability in responses, the primary focus of this approach is discovering effective prompts, not necessarily increasing creativity or variability.
**INCORRECT:** "It reduces the length of model responses" is incorrect.
The length of responses is influenced by the specific prompt, but the main goal here is refining the model's accuracy and relevance, not directly reducing response length.
**INCORRECT:** "It makes the model more difficult to hijack or poison" is incorrect.
Prompt experimentation focuses on improving performance and does not directly address security concerns like preventing model hijacking or poisoning.
**References:** https://aws.amazon.com/what-is/prompt-engineering
Domain: Applications of Foundation Models
---
#### 18. A software company is looking to upgrade its customer support chatbot by integrating a more advanced language model. The development team is evaluating different foundation models and is specifically considering GPT-based models due to their strong performance in understanding and generating human-like responses. As part of their assessment, they need to understand the core capabilities and limitations of GPT technology.
Which of the following statements are correct about GPT models? (Select TWO.)
- GPT models are mainly trained on labeled datasets to enhance their text generation capabilities.
- GPT models must be retrained from scratch whenever they are assigned a new task.
- GPT models are based on the Generative Pre-trained Transformer architecture, enabling them to produce human-like text.
- GPT models can perform a wide range of tasks by modifying prompts, without needing retraining.
- GPT models are designed to autonomously verify the factual accuracy of their outputs without external tools.
**CORRECT:** "GPT models are based on the Generative Pre-trained Transformer architecture, enabling them to produce human-like text" is a correct answer.
GPT models (Generative Pre-trained Transformers) are built using the Transformer architecture, which is specifically designed to handle sequential data like text. They are pre-trained on large text datasets and then fine-tuned or prompted to perform a wide range of tasks. This architecture allows GPT models to generate coherent, human-like text, making them highly effective for applications such as chatbots, content creation, and virtual assistants.
**CORRECT:** "GPT models can perform a wide range of tasks by modifying prompts, without needing retraining" is also a correct answer.
One of the core strengths of GPT models is prompt-based learning. Instead of retraining the model for every new task, you can simply modify the prompt (input text) to guide the model's behavior. For example, by phrasing a prompt like "Translate this sentence to French," the GPT model understands the task without requiring any task-specific retraining. This makes GPT models highly flexible and cost-efficient for deploying across multiple use cases.
**INCORRECT:** "GPT models are mainly trained on labeled datasets to enhance their text generation capabilities" is incorrect.
GPT models are primarily trained using unsupervised learning on large unlabeled datasets such as web pages, books, and articles. They are not mainly trained on manually labeled datasets like classification models. Their ability to predict text comes from learning language patterns naturally during pretraining.
**INCORRECT:** "GPT models must be retrained from scratch whenever they are assigned a new task" is incorrect.
GPT models are designed to be task-flexible through prompt engineering, not through retraining. Fine-tuning can improve performance for specialized tasks, but retraining from scratch is unnecessary for most use cases.
**INCORRECT:** "GPT models are designed to autonomously verify the factual accuracy of their outputs without external tools" is incorrect.
GPT models generate text based on patterns in training data but do not independently verify facts. Their outputs can be plausible-sounding but incorrect. Fact-checking usually requires external tools or post-processing systems beyond the base GPT model capabilities.
**References:** https://aws.amazon.com/what-is/gpt
Domain: Fundamentals of Generative AI
---
#### 19. An e-commerce company wants to enhance its digital customer assistant with more natural conversations and the ability to summarize product specifications.
Which of the following key advantages of generative AI makes it suitable for this solution? (Select TWO.)
- It allows dynamic generation of text based on context and user inputs.
- It enables rule-based decision trees for repeatable customer journeys.
- It can understand and generate human-like responses in natural language.
- It ensures deterministic output for every identical customer question.
- It eliminates the need for any human-in-the-loop monitoring.
**CORRECT:** "It allows dynamic generation of text based on context and user inputs" is a correct answer.
Generative AI models, such as large language models (LLMs), are designed to create human-like text dynamically. They do this by understanding the context of user input and generating appropriate, natural-sounding responses. In the case of a digital customer assistant, this allows the AI to respond to a wide range of queries, adapt its tone or format, and even summarize product specifications in a personalized way. This flexibility goes far beyond static or scripted systems, making the experience feel more natural and helpful.
**CORRECT:** "It can understand and generate human-like responses in natural language" is also a correct answer.
One of the most powerful capabilities of generative AI is its ability to process and produce human-like language. This makes it ideal for customer-facing applications like chatbots or virtual assistants. These models can interpret user intent, generate coherent answers, and maintain fluid conversations that mimic human interactions. This is especially useful in e-commerce, where clear, friendly communication boosts customer satisfaction.
**INCORRECT:** "It enables rule-based decision trees for repeatable customer journeys" is incorrect.
Rule-based decision trees are a traditional approach for automating simple, repetitive customer service tasks. They follow predefined flows and do not adapt dynamically based on nuanced input. This is the opposite of generative AI, which excels in open-ended, flexible conversations rather than strict rule-following.
**INCORRECT:** "It eliminates the need for any human-in-the-loop monitoring" is incorrect.
While generative AI can reduce manual effort, it does not fully eliminate the need for human oversight. Monitoring is essential to ensure that outputs are safe, accurate, and aligned with business policies—especially in customer service environments where reputational risk is high.
**INCORRECT:** "It ensures deterministic output for every identical customer question" is incorrect.
Generative AI models are probabilistic, not deterministic. This means the same input may yield slightly different outputs each time. While this enables creativity and flexibility, it also means the model's responses aren't guaranteed to be exactly the same every time, even with identical inputs.
**References:** https://aws.amazon.com/what-is/generative-ai
Domain: Fundamentals of Generative AI
---
#### 20. A logistics company is evaluating a vision foundation model designed to classify package damage levels. To ensure the model meets industry standards, they want to accurately assess its performance using appropriate validation methods.
Which of the following approaches is most suitable for evaluating the model's classification performance?
- Monitor training loss during early epochs to estimate long-term performance.
- Apply benchmark image datasets with labeled ground truth and compute accuracy and F1-score.
- Use synthetic images to explore the model's response to visual style variations.
- Select the model that required the most training time, assuming better learning outcomes.
**CORRECT:** "Apply benchmark image datasets with labeled ground truth and compute accuracy and F1-score" is the correct answer.
This is the correct approach to evaluate a classification model's performance. In machine learning, especially for image classification tasks, using a benchmark dataset with known, labeled outcomes (called ground truth) is a standard practice. These datasets help ensure objective and fair evaluation. By applying the model to this data and calculating metrics like accuracy (how many predictions were correct) and F1-score (a balance between precision and recall), the team can get a reliable picture of the model's real-world performance. These metrics are especially important when classes are imbalanced, like in damage classification where "no damage" may appear more often than "severe damage."
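A toy scikit-learn sketch for a multi-class damage classifier; macro-averaged F1 weights the rare "severe" class equally with the common "none" class.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical ground truth vs. predictions on a labeled benchmark set.
y_true = ["none", "minor", "severe", "none", "minor", "none", "severe"]
y_pred = ["none", "minor", "minor",  "none", "none",  "none", "severe"]

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))  # treats each class equally
```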
**INCORRECT:** "Use synthetic images to explore the model's response to visual style variations" is incorrect.
Synthetic data can be helpful for testing robustness or augmenting training data but it is not a suitable method for formal performance evaluation. It lacks real-world complexity and may not reflect how the model performs on actual logistics scenarios.
**INCORRECT:** "Monitor training loss during early epochs to estimate long-term performance" is incorrect.
Training loss only indicates how well a model is learning from the training data—not how well it generalizes to new data. It is not an evaluation metric. A model can have low training loss but still perform poorly on unseen data due to overfitting.
**INCORRECT:** "Select the model that required the most training time, assuming better learning outcomes" is incorrect.
Training time does not directly correlate with performance. A longer training duration could indicate inefficient learning or poor hyperparameter settings. Performance must be validated using objective metrics, not time spent.
**References:**
https://docs.aws.amazon.com/comprehend/latest/dg/cer-doc-class.html
https://aws.amazon.com/bedrock/evaluations
Domain: Applications of Foundation Models
---
#### 21. A fast-growing online gaming platform wants to introduce an AI-powered chat feature to help players get instant tips, match insights, and 24/7 virtual assistance. The feature needs to deliver low-latency, consistent, and highly reliable responses, even during peak gaming hours. The development team has chosen Amazon Bedrock as its foundation for generative AI.
Which Amazon Bedrock pricing model is the most appropriate?
- Reserved Instance
- Provisioned Throughput
- On-Demand Pricing
- Free Tier Pricing
**CORRECT:** "Provisioned Throughput" is the correct answer.
Provisioned Throughput in Amazon Bedrock provides guaranteed model availability with pre-allocated capacity. This pricing model ensures consistent and low-latency responses by reserving dedicated throughput for your AI workloads. It is ideal for production applications with predictable or high usage, such as a 24/7 AI chat assistant in an online gaming platform. By securing provisioned capacity, the platform can serve players reliably even during peak gaming hours, avoiding delays or throttling that might happen with shared capacity. Although this model comes at a higher upfront cost compared to on-demand usage, it provides the performance consistency and reliability needed for real-time player engagement.
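A hedged boto3 sketch of purchasing provisioned capacity; the name, base model, and commitment term are placeholders, and the returned provisioned model ARN is then used as the `modelId` in runtime calls.

```python
import boto3

bedrock = boto3.client("bedrock")

provisioned = bedrock.create_provisioned_model_throughput(
    provisionedModelName="gaming-chat-capacity",       # placeholder name
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder base model
    modelUnits=2,                                      # capacity reserved for the workload
    commitmentDuration="SixMonths",                    # omit for no-commitment hourly capacity
)
provisioned_arn = provisioned["provisionedModelArn"]   # pass as modelId to bedrock-runtime
```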
**INCORRECT:** "Reserved Instance" is incorrect.
Reserved Instances are used for services like Amazon EC2 to reserve computing capacity over a long period with cost savings. Amazon Bedrock does not offer Reserved Instances as part of its pricing options.
**INCORRECT:** "On-Demand Pricing" is incorrect.
On-Demand Pricing charges based on the tokens processed without any pre-reserved capacity. While it is flexible and cost-effective for smaller or unpredictable workloads, it does not guarantee low-latency or consistent performance during peak times. This makes it less suitable for critical real-time features in gaming environments.
**INCORRECT:** "Free Tier Pricing" is incorrect.
The Free Tier provides limited access to try services at no cost for a short period or with restricted usage limits. It is designed for testing or learning purposes, not for deploying a production-level, high-traffic AI assistant. It lacks the performance and capacity needed for such workloads.
**References:** https://aws.amazon.com/bedrock/pricing
Domain: Fundamentals of Generative AI
---
#### 22. Your team is exploring different foundation model customization strategies for a complex legal-domain application. The application involves extremely large documents, and inference cost must stay low while ensuring high accuracy.
Which approach best balances customization with minimal ongoing costs?
- Pre-train a new large model on the entire legal dataset.
- Fine-tune the entire large language model for every new legal scenario.
- Rely exclusively on zero-shot prompting with no domain-specific data.
- Use Retrieval-Augmented Generation (RAG) with domain-specific data stored in Amazon S3 and a pre-trained model.
**CORRECT:** "Use Retrieval-Augmented Generation (RAG) with domain-specific data stored in Amazon S3 and a pre-trained model" is the correct answer.
Retrieval-Augmented Generation (RAG) is a technique that enhances a pre-trained language model by dynamically retrieving relevant domain-specific data at inference time. Instead of fine-tuning the model on all legal texts, RAG fetches information from a knowledge base (such as documents stored in Amazon S3 and indexed using Amazon OpenSearch or Amazon Kendra). This approach balances accuracy and cost-effectiveness by avoiding the need for extensive fine-tuning while ensuring that responses are contextually relevant to legal queries. Since the model remains pre-trained and external data is used for retrieval, inference costs stay lower than full model fine-tuning or pre-training while maintaining high accuracy.
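A minimal sketch of this flow is shown below, assuming a hypothetical `retrieve_passages` helper that stands in for the S3-backed index lookup; the model ID and clause text are illustrative only:
```python
import boto3

runtime = boto3.client("bedrock-runtime")

def retrieve_passages(question: str, k: int = 3) -> list[str]:
    """Placeholder for the retrieval step: in practice this would query
    an index (e.g., OpenSearch or Kendra) built over documents in S3."""
    # Hypothetical results for illustration only.
    return [
        "Clause 14.2: Liability is capped at the total fees paid...",
        "Clause 9.1: Either party may terminate with 30 days notice...",
        "Definition: 'Confidential Information' means...",
    ][:k]

def answer(question: str) -> str:
    context = "\n\n".join(retrieve_passages(question))
    prompt = (
        "Answer the legal question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    resp = runtime.converse(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # any Bedrock text model
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]

print(answer("What is the liability cap?"))
```
Because only the retrieved context changes as documents are updated, the model itself never needs retraining, which is where the cost saving comes from.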
**INCORRECT:** "Pre-train a new large model on the entire legal dataset" is incorrect.
Pre-training a new foundation model from scratch is extremely costly in terms of compute resources, time, and expertise required. It also demands continuous retraining as legal regulations evolve. This approach is unnecessary when existing pre-trained models can be adapted more efficiently.
**INCORRECT:** "Fine-tune the entire large language model for every new legal scenario" is incorrect.
Fine-tuning a large model for every legal scenario is computationally expensive and requires continuous retraining as new cases arise. It increases storage and inference costs significantly without necessarily providing better accuracy than RAG-based approaches.
**INCORRECT:** "Rely exclusively on zero-shot prompting with no domain-specific data" is incorrect.
Zero-shot prompting relies solely on a model's general knowledge without any domain-specific customization. In a complex legal application where accuracy is critical, this approach is insufficient as it may generate incomplete or inaccurate responses.
**References:** https://aws.amazon.com/what-is/retrieval-augmented-generation
Domain: Applications of Foundation Models
---
#### 23. A retail company wants to label a large set of customer data for training a machine learning model. However, manually labeling the data would be too time-consuming.
Which feature helps automate the data labeling process with the help of human oversight?
- Amazon SageMaker Autopilot
- Amazon SageMaker Ground Truth
- Amazon SageMaker Model Monitor
- Amazon SageMaker Data Wrangler
**CORRECT:** "Amazon SageMaker Ground Truth" is the correct answer.
Amazon SageMaker Ground Truth is designed to help automate the data labeling process. It uses machine learning to assist in labeling data, reducing the time and effort required for manual labeling. Ground Truth supports active learning, where the system can automatically label a subset of the data and then seek human oversight or review only where necessary. This human-in-the-loop approach improves labeling efficiency, making it an ideal solution for the retail company looking to label large datasets for training a machine learning model.
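For illustration, the hedged boto3 sketch below starts a labeling job with automated data labeling enabled; every ARN and S3 URI is a placeholder:
```python
import boto3

sm = boto3.client("sagemaker")

# All ARNs and S3 URIs below are placeholders.
sm.create_labeling_job(
    LabelingJobName="customer-data-labels",
    LabelAttributeName="category",
    InputConfig={
        "DataSource": {"S3DataSource": {"ManifestS3Uri": "s3://my-bucket/manifest.json"}}
    },
    OutputConfig={"S3OutputPath": "s3://my-bucket/labels/"},
    RoleArn="arn:aws:iam::123456789012:role/GroundTruthRole",
    HumanTaskConfig={
        "WorkteamArn": "arn:aws:sagemaker:us-east-1:123456789012:workteam/private-crowd/my-team",
        "UiConfig": {"UiTemplateS3Uri": "s3://my-bucket/template.liquid"},
        "PreHumanTaskLambdaArn": "arn:aws:lambda:us-east-1:123456789012:function:PreLabel",
        "TaskTitle": "Classify customer records",
        "TaskDescription": "Choose the best category for each record",
        "NumberOfHumanWorkersPerDataObject": 1,
        "TaskTimeLimitInSeconds": 300,
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": "arn:aws:lambda:us-east-1:123456789012:function:Consolidate"
        },
    },
    # This block turns on automated data labeling (active learning):
    # the service auto-labels what it can and routes the rest to humans.
    LabelingJobAlgorithmsConfig={
        "LabelingJobAlgorithmSpecificationArn": (
            "arn:aws:sagemaker:us-east-1:123456789012:"
            "labeling-job-algorithm-specification/text-classification"
        )
    },
)
```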
**INCORRECT:** "Amazon SageMaker Autopilot" is incorrect.
SageMaker Autopilot automatically builds and trains machine learning models based on input data but does not focus on data labeling.
**INCORRECT:** "Amazon SageMaker Model Monitor" is incorrect.
SageMaker Model Monitor is used to monitor the performance of deployed models, detecting issues like data drift, but it is not involved in data labeling.
**INCORRECT:** "Amazon SageMaker Data Wrangler" is incorrect.
SageMaker Data Wrangler simplifies data preparation and feature engineering, but it does not automate the data labeling process.
**References:** https://aws.amazon.com/sagemaker/groundtruth
Domain: Fundamentals of AI and ML
---
#### 24. A research team is training a Generative Adversarial Network (GAN) to generate realistic human faces. To enhance the quality of the generated images, which two of the following techniques are commonly used? (Select TWO.)
- Adding dropout layers throughout both the generator and discriminator
- Using supervised learning for both the generator and discriminator
- Applying data augmentation to the training dataset
- Incorporating a discriminator network to guide the generator
- Training a generator network to produce synthetic images
**CORRECT:** "Incorporating a discriminator network to guide the generator" is a correct answer.
In a Generative Adversarial Network (GAN), the discriminator plays a key role by distinguishing between real and fake images. It gives feedback to the generator, which uses that feedback to improve the realism of the synthetic images it produces. This adversarial process continues until the generator gets better at creating images that are indistinguishable from real ones. This is a foundational concept in GANs and is essential for generating high-quality outputs like realistic human faces.
**CORRECT:** "Training a generator network to produce synthetic images" is also a correct answer.
The generator is the second core component of a GAN. Its job is to produce synthetic images from random noise. Through training, it learns to generate more realistic images by trying to fool the discriminator. The quality of these generated images improves over time as the generator becomes more skilled. This process is key to the success of any GAN, and especially important for generating complex visuals such as human faces.
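To make the adversarial loop concrete, here is a minimal PyTorch sketch (architecture and sizes are arbitrary): the discriminator learns to separate real from generated images, and the generator trains on the discriminator's feedback.
```python
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, img_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_images: torch.Tensor) -> None:
    batch = real_images.size(0)
    fake_images = G(torch.randn(batch, latent_dim))

    # Discriminator: push real images toward 1, generated images toward 0.
    d_loss = (bce(D(real_images), torch.ones(batch, 1)) +
              bce(D(fake_images.detach()), torch.zeros(batch, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: use the discriminator's feedback to look "real".
    g_loss = bce(D(fake_images), torch.ones(batch, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

train_step(torch.randn(16, img_dim))  # stand-in for a real image batch
```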
**INCORRECT:** "Applying data augmentation to the training dataset" is incorrect.
Data augmentation is commonly used in supervised learning to improve generalization. While it can help stabilize training in some GAN settings, it is not a core technique specific to improving GAN-generated image quality. The adversarial feedback loop between generator and discriminator is more critical.
**INCORRECT:** "Adding dropout layers throughout both the generator and discriminator" is incorrect.
Dropout is a regularization technique used to prevent overfitting in traditional neural networks. However, in GANs, adding dropout throughout both networks can interfere with training stability and convergence, potentially harming the image quality instead of improving it.
**INCORRECT:** "Using supervised learning for both the generator and discriminator" is incorrect.
GANs are based on unsupervised or semi-supervised learning, not supervised learning. The generator learns to produce images without labeled outputs, and the discriminator learns to distinguish between real and fake examples. Applying supervised learning would not align with how GANs are designed to function.
**References:** https://aws.amazon.com/what-is/gan
Domain: Fundamentals of Generative AI
---
#### 25. A machine learning engineer is developing a multilingual chatbot and needs to convert each word in a sentence into a vector of numbers so the model can understand grammar and context.
What is the technique used to achieve this?
- Context normalization for cross-lingual translation balancing.
- Token segmentation to break down words into sub-word units.
- Word embeddings to represent semantic relationships in vector form.
- One-hot encoding for efficient word frequency analysis.
**CORRECT:** "Word embeddings to represent semantic relationships in vector form" is the correct answer.
Word embeddings are a foundational technique in natural language processing (NLP). They convert words into dense vectors of real numbers, capturing semantic meaning and context. Embeddings group similar words closer together in vector space. For example, "king" and "queen" or "run" and "jog" would have similar vector representations. In a multilingual chatbot, embeddings help the model understand the meaning and relationships between words across different languages. Techniques such as Word2Vec and GloVe, as well as modern transformer-based models like BERT, use embeddings as a base layer to interpret natural language.
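As a toy illustration (assuming the gensim library and a deliberately tiny corpus), the sketch below trains word vectors and checks that words used in similar contexts land close together:
```python
from gensim.models import Word2Vec

# Tiny toy corpus; a real chatbot would train on (or reuse) much larger data.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["i", "like", "to", "run", "every", "morning"],
    ["i", "like", "to", "jog", "every", "morning"],
]
model = Word2Vec(sentences, vector_size=32, window=3, min_count=1, epochs=200)

print(model.wv["king"][:5])               # first few dimensions of the vector
print(model.wv.similarity("run", "jog"))  # words in similar contexts score higher
```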
**INCORRECT:** "Token segmentation to break down words into sub-word units" is incorrect.
Token segmentation is a preprocessing step that breaks text into smaller units, such as words or sub-words (e.g., "play+ing"). While helpful for handling unknown or compound words, this step does not involve converting words into vectors. It's typically used before embeddings are applied.
**INCORRECT:** "One-hot encoding for efficient word frequency analysis" is incorrect.
One-hot encoding creates binary vectors for each word, where only one position is "1" and the rest are "0." While simple, it lacks the ability to represent semantic meaning or relationships between words, and becomes inefficient with large vocabularies.
**INCORRECT:** "Context normalization for cross-lingual translation balancing" is incorrect.
This option is not a standard or recognized NLP technique. While normalization methods do exist in machine translation pipelines, "context normalization" in this form is not used for converting words into numerical vectors.
**References:** https://aws.amazon.com/what-is/embeddings-in-machine-learning
Domain: Fundamentals of Generative AI
---
#### 26. To comply with ethical AI guidelines, your organization wants to continuously monitor your AI models for accuracy, fairness, and drift over time.
Which AWS service can help you set up continuous model monitoring?
- Amazon SageMaker Clarify
- Amazon SageMaker Model Cards
- AWS Glue DataBrew
- Amazon SageMaker Model Monitor
**CORRECT:** "Amazon SageMaker Model Monitor" is the correct answer.
Amazon SageMaker Model Monitor is designed to automatically monitor machine learning models in production to detect issues such as model drift, bias, or performance degradation over time. It continuously observes models for data and prediction quality, ensuring they remain accurate and fair after deployment. By setting up monitoring schedules, you can receive alerts and reports on key metrics, making it easier to address potential biases and maintain compliance with ethical AI guidelines.
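A hedged sketch with the SageMaker Python SDK is shown below; the role ARN, S3 paths, and endpoint name are placeholders:
```python
from sagemaker.model_monitor import DefaultModelMonitor, CronExpressionGenerator
from sagemaker.model_monitor.dataset_format import DatasetFormat

monitor = DefaultModelMonitor(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Baseline statistics and constraints derived from the training data.
monitor.suggest_baseline(
    baseline_dataset="s3://my-bucket/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline/",
)

# Hourly checks of captured endpoint traffic against the baseline.
monitor.create_monitoring_schedule(
    monitor_schedule_name="prod-drift-check",
    endpoint_input="my-endpoint",  # endpoint must have data capture enabled
    output_s3_uri="s3://my-bucket/monitoring/",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```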
**INCORRECT:** "Amazon SageMaker Model Cards" is incorrect.
Amazon SageMaker Model Cards are used to document essential information about machine learning models, such as training data, performance metrics, and usage guidelines. They help with transparency but do not provide continuous monitoring.
**INCORRECT:** "AWS Glue DataBrew" is incorrect.
AWS Glue DataBrew is a data preparation service that helps users clean and normalize data. It is not designed for monitoring AI models for fairness, accuracy, or drift.
**INCORRECT:** "Amazon SageMaker Clarify" is incorrect.
Amazon SageMaker Clarify is a tool for detecting bias and explaining predictions in machine learning models. While it helps ensure fairness, it does not continuously monitor models like SageMaker Model Monitor does.
**References:** https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 27. A fintech startup uses a foundation model to summarize financial analysis reports. To enhance relevance and context, they generate vector embeddings for each report.
How do these embeddings support the model's performance?
- Embeddings eliminate the need for human-labeled data during training.
- Embeddings convert text into numerical representations that help the model identify contextually similar content and improve accuracy.
- Embeddings optimize server memory usage by reducing the frequency of real-time API calls.
- Embeddings produce visual graphs that illustrate the document's structure for neural networks.
**CORRECT:** "Embeddings convert text into numerical representations that help the model identify contextually similar content and improve accuracy" is the correct answer.
Embeddings are mathematical representations of text that capture the meaning, context, and relationships between words or documents. When a foundation model summarizes financial analysis reports, embeddings help it understand the context of terms, such as financial jargon, sector-specific trends, and terminology. These embeddings enable the model to compare and retrieve similar content more accurately. By using embeddings, the model can identify important patterns and relationships within and across documents, improving the relevance and precision of its responses or summaries. This is especially useful in financial domains where context and accuracy are critical.
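For example, an embedding for each report could be generated with a Bedrock embedding model roughly as follows (model ID current at the time of writing; the report text is invented):
```python
import json
import boto3

runtime = boto3.client("bedrock-runtime")

def embed(text: str) -> list[float]:
    # Titan text embeddings model; returns a dense numeric vector.
    resp = runtime.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

vec = embed("Q3 revenue grew 12% on higher trading volumes.")
print(len(vec))  # dimensionality of the embedding vector
```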
**INCORRECT:** "Embeddings eliminate the need for human-labeled data during training" is incorrect.
While embeddings help represent text semantically, they do not remove the need for labeled data when supervised learning is involved. Labeling provides ground truth, which is essential for training and validating models. Embeddings can reduce the dependency on labeled data in some unsupervised or self-supervised setups, but they do not entirely replace it.
**INCORRECT:** "Embeddings optimize server memory usage by reducing the frequency of real-time API calls" is incorrect.
Embeddings do not directly control server memory or API usage. They are primarily used to enhance understanding of text and retrieval efficiency, not system-level memory management. API calls and memory usage are infrastructure concerns, not features of embeddings.
**INCORRECT:** "Embeddings produce visual graphs that illustrate the document's structure for neural networks" is incorrect.
Embeddings are vectors, not visualizations. Although they can be used later to create visual representations like scatter plots or clustering maps, they do not inherently generate visual graphs for neural network processing.
**References:** https://aws.amazon.com/what-is/embeddings-in-machine-learning
Domain: Fundamentals of AI and ML
---
#### 28. A fintech startup uses an AI model to generate weekly investment summaries for its clients. Recently, some clients have reported that the summaries include inaccurate or misleading financial figures. The company wants to improve the trustworthiness of its AI-generated content.
What aspect of responsible AI best describes this focus?
- Reduces bias in training data to support fair decision-making.
- Involves designing systems that prevent unauthorized access to sensitive data.
- Encourages transparency by making AI decisions easier to interpret.
- Ensures that outputs are factually correct and based on accurate data sources.
**CORRECT:** "Ensures that outputs are factually correct and based on accurate data sources" is the correct answer.
This aspect of responsible AI is known as veracity and robustness. It focuses on making sure that AI-generated outputs are factually correct, especially in critical domains like finance, healthcare, or law. For a fintech startup that provides investment summaries, it's essential that all figures and insights are based on trustworthy data sources to avoid misleading clients. Inaccuracies in financial reports can harm client trust and even lead to legal or regulatory issues. Improving the trustworthiness of AI outputs by validating information sources and implementing post-processing checks aligns with responsible AI practices around content accuracy and factual integrity.
**INCORRECT:** "Involves designing systems that prevent unauthorized access to sensitive data" is incorrect.
This refers to data privacy and security, which is also important but focuses on protecting information rather than ensuring output accuracy. It doesn't address the issue of generating misleading financial summaries.
**INCORRECT:** "Encourages transparency by making AI decisions easier to interpret" is incorrect.
This relates to explainability or interpretability. While important for understanding why a model made a decision, it doesn't directly solve the problem of factual errors in the AI's generated output.
**INCORRECT:** "Reduces bias in training data to support fair decision-making" is incorrect.
This falls under fairness and bias mitigation. It ensures equal treatment across user groups but does not focus on factual correctness of the generated content, which is the main concern in this case.
**References:** https://aws.amazon.com/ai/responsible-ai
Domain: Guidelines for Responsible AI
---
#### 29. A healthcare AI model is found to provide different levels of accuracy based on the patient's ethnicity.
What feature of responsible AI would best address this issue?
- Conducting subgroup analysis to evaluate model performance across different demographic groups
- Increasing the dataset size without analyzing demographic balance
- Using Amazon Rekognition to improve accuracy
- Using a black-box model for faster processing
**CORRECT:** "Conducting subgroup analysis to evaluate model performance across different demographic groups" is the correct answer.
Responsible AI requires fairness and inclusivity, especially in healthcare, where biased models can lead to disparities in treatment and outcomes. Subgroup analysis helps identify performance gaps across different demographic groups, ensuring the model does not favor one group over another. By analyzing model accuracy across ethnicity, gender, and other demographic factors, developers can detect biases and take corrective measures, such as rebalancing datasets or adjusting training strategies. This approach aligns with ethical AI principles and regulatory requirements, helping to build a more equitable AI system.
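As a simple illustration of subgroup analysis, the pandas sketch below computes accuracy per group on hypothetical evaluation results:
```python
import pandas as pd

# Hypothetical evaluation results: one row per patient prediction.
df = pd.DataFrame({
    "ethnicity": ["A", "A", "B", "B", "B", "C", "C", "A"],
    "correct":   [1,   1,   0,   1,   0,   1,   0,   1],
})

# Accuracy per demographic group; large gaps signal potential bias.
print(df.groupby("ethnicity")["correct"].mean())
```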
**INCORRECT:** "Using Amazon Rekognition to improve accuracy" is incorrect.
Amazon Rekognition is a powerful image and video analysis tool but does not address fairness issues in a healthcare AI model.
**INCORRECT:** "Increasing the dataset size without analyzing demographic balance" is incorrect.
Simply adding more data does not guarantee fairness. If the dataset remains imbalanced or lacks diversity, the model may continue to produce biased results. Ensuring demographic balance is critical to achieving fair and inclusive AI outcomes.
**INCORRECT:** "Using a black-box model for faster processing" is incorrect.
A black-box model, which lacks transparency in decision-making, makes it harder to detect and correct biases. Faster processing is beneficial but should not come at the cost of fairness and accountability in healthcare AI applications. Explainability and fairness are essential in responsible AI.
**References:** https://aws.amazon.com/ai/responsible-ai
Domain: Guidelines for Responsible AI
---
#### 30. A retail company wants to use generative AI to create personalized marketing content for each customer based on their purchasing history.
What are the key advantages of using generative AI for this use case? (Select TWO.)
- It simplifies the content generation process, reducing costs significantly
- It scales personalization efficiently across different users
- It eliminates the need for human intervention in marketing campaigns
- It guarantees 100% accuracy in personalized recommendations
- It provides deterministic outputs for consistent content generation
**CORRECT:** "It scales personalization efficiently across different users" is a correct answer.
One of the key advantages of using generative AI in personalized marketing is its ability to scale efficiently. Generative AI can analyze large datasets, such as purchasing histories, and automatically create unique, tailored marketing content for each customer. This scalability allows the company to reach a wide audience with individualized messaging, something that would be time-consuming and costly with manual methods.
**CORRECT:** "It simplifies the content generation process, reducing costs significantly" is also a correct answer.
Generative AI automates the content creation process, reducing the need for human labor and manual effort. By using AI to generate personalized marketing materials, the company can save time and resources, thus lowering overall costs. The AI can produce multiple variations of content rapidly, making it more efficient than traditional marketing content generation.
**INCORRECT:** "It provides deterministic outputs for consistent content generation" is incorrect.
Generative AI typically produces non-deterministic outputs, meaning the results can vary each time. While this is useful for creativity and uniqueness, it doesn't guarantee consistent content generation.
**INCORRECT:** "It eliminates the need for human intervention in marketing campaigns" is incorrect.
Generative AI reduces the need for human involvement, but human oversight is still necessary to ensure quality, alignment with brand voice, and compliance with marketing strategies.
**INCORRECT:** "It guarantees 100% accuracy in personalized recommendations" is incorrect.
Generative AI can significantly improve personalization, but it does not guarantee 100% accuracy in every recommendation. There is always a possibility of mismatched content or incorrect personalization, requiring continuous optimization.
**References:** https://aws.amazon.com/what-is/generative-ai
Domain: Fundamentals of Generative AI
---
#### 31. Your company is developing a custom recommendation system using a foundation model. You are deciding between pre-training a model from scratch or fine-tuning an existing pre-trained model. The project has strict budget constraints.
Which of the following best explains the cost tradeoff between pre-training and fine-tuning?
- Pre-training is more expensive and time-consuming, while fine-tuning a pre-trained model offers a more affordable and faster option.
- Fine-tuning is more costly in the long run due to higher maintenance needs, while pre-training has lower overall costs.
- Pre-training is more cost-effective for small datasets, while fine-tuning requires more resources.
- Pre-training offers more customization flexibility at a lower cost than fine-tuning.
**CORRECT:** "Pre-training is more expensive and time-consuming, while fine-tuning a pre-trained model offers a more affordable and faster option" is the correct answer.
Pre-training a model from scratch involves training on a large-scale dataset and requires significant computational resources, time, and expertise. This makes it a costly process, especially when strict budget constraints are in place. In contrast, fine-tuning a pre-trained model allows you to leverage the general knowledge already acquired by the model and adjust it to your specific use case. This process is much faster and requires fewer resources, making it a more affordable option. Fine-tuning is especially cost-effective when dealing with domain-specific tasks, as it minimizes the need for large datasets and extensive training.
**INCORRECT:** "Pre-training is more cost-effective for small datasets, while fine-tuning requires more resources" is incorrect.
Pre-training is computationally expensive regardless of dataset size because it typically requires large datasets and significant resources to build a foundation model from scratch. Fine-tuning is less resource-intensive, even for smaller datasets.
**INCORRECT:** "Fine-tuning is more costly in the long run due to higher maintenance needs, while pre-training has lower overall costs" is incorrect.
Fine-tuning generally incurs lower long-term costs compared to pre-training. Pre-training requires substantial initial investment, and once fine-tuned, the model tends to require less frequent updates, keeping operational costs lower.
**INCORRECT:** "Pre-training offers more customization flexibility at a lower cost than fine-tuning" is incorrect.
While pre-training allows for greater control and customization, it is significantly more expensive and time-consuming than fine-tuning. Fine-tuning strikes a balance between customization and cost.
Domain: Applications of Foundation Models
---
#### 32. An AI team is building a machine learning system that must follow fairness guidelines and meet legal requirements to avoid discrimination.
Which feature is most important to help ensure the system treats all users equally?
- Model compression techniques to optimize runtime performance.
- Bias detection to measure and correct inequitable outcomes.
- Edge deployment to reduce cloud dependency.
- Cross-validation to improve accuracy on different datasets.
**CORRECT:** "Bias detection to measure and correct inequitable outcomes" is the correct answer.
Bias detection is the process of identifying and measuring unfair or discriminatory patterns in machine learning model outputs, especially across sensitive attributes such as race, gender, or age. It helps teams discover whether a model is producing unequal outcomes for different user groups. This is critical for fairness and compliance with ethical and legal standards in AI. In the given scenario, where the AI team is building a system that must treat all users equally and follow fairness guidelines, bias detection is the most essential feature. AWS provides tools such as Amazon SageMaker Clarify, which can help detect bias during both the training and inference phases. This ensures that AI systems operate responsibly and comply with regulatory and societal expectations.
**INCORRECT:** "Cross-validation to improve accuracy on different datasets" is incorrect.
Cross-validation is a method to assess how well a model generalizes to unseen data. While it helps improve overall model accuracy, it does not directly measure or correct bias or unfair treatment across user groups.
**INCORRECT:** "Edge deployment to reduce cloud dependency" is incorrect.
Edge deployment focuses on running models on local devices rather than in the cloud, helping with latency and availability. However, it has no impact on fairness, bias, or legal compliance in model behavior.
**INCORRECT:** "Model compression techniques to optimize runtime performance" is incorrect.
Model compression reduces the size and complexity of a model to make it faster or more resource-efficient. While useful for deployment, it doesn't address bias, fairness, or discriminatory outcomes in predictions.
**References:** https://aws.amazon.com/ai/responsible-ai
Domain: Guidelines for Responsible AI
---
#### 33. A financial company is auditing its AI system behavior over the past six months. The compliance team needs visibility into how data was accessed and processed.
Which AWS practice supports this requirement?
- Data residency ensures data never leaves the specified geographic region or AWS Region.
- Data logging captures detailed information about data access, usage, and system activities for auditing.
- Data logging provides a historical record of AI model training performance across different datasets.
- Data obfuscation replaces sensitive attributes with random values to preserve privacy.
**CORRECT:** "Data logging captures detailed information about data access, usage, and system activities for auditing" is the correct answer.
Data logging is an essential AWS practice that records system activity to help organizations track how their data is used and accessed over time. Logs generated by services like AWS CloudTrail and Amazon CloudWatch offer visibility into user actions, API calls, resource changes, and other events. For a financial company, this means auditors and compliance teams can review historical records to verify data handling practices, detect unauthorized access, and ensure compliance with regulatory policies. This is especially important when reviewing AI system behavior over extended periods, such as six months.
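For instance, recorded events can be pulled for review with boto3 roughly as follows (note that CloudTrail event history lookups cover 90 days; a six-month audit would also read the trail's delivered S3 logs):
```python
from datetime import datetime, timedelta
import boto3

ct = boto3.client("cloudtrail")

# Review recent recorded events for one service as part of the audit.
events = ct.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventSource",
                       "AttributeValue": "s3.amazonaws.com"}],
    StartTime=datetime.utcnow() - timedelta(days=90),
    EndTime=datetime.utcnow(),
)
for e in events["Events"]:
    print(e["EventTime"], e["EventName"], e.get("Username", "n/a"))
```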
**INCORRECT:** "Data logging provides a historical record of AI model training performance across different datasets" is incorrect.
This option incorrectly focuses on model training performance. While logs can include system actions during training, AWS data logging specifically refers to capturing user and system activity related to access and usage — not model performance tracking, which is usually done through metrics or experiment tracking.
**INCORRECT:** "Data residency ensures data never leaves the specified geographic region or AWS Region" is incorrect.
Data residency is about storing and processing data in a specific region to comply with legal requirements. It does not provide visibility into data access or usage history, which is what the compliance team requires.
**INCORRECT:** "Data obfuscation replaces sensitive attributes with random values to preserve privacy" is incorrect.
Data obfuscation helps protect sensitive information during storage or processing, especially in test environments. However, it doesn't support audit or tracking needs, as it removes or alters the original data — making it unsuitable for reviewing access or behavior history.
**References:**
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/logging-data-events-with-cloudtrail.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 34. A healthcare company is developing an AI-powered virtual assistant to answer patient queries without relying on pre-labeled question-answer pairs or domain-specific fine-tuning. The team wants to ensure that the model can generate meaningful answers even when there's no explicit example in the prompt.
Which prompting strategy should the team use?
- Use zero-shot prompting to generate answers without providing any examples.
- Use chain-of-thought prompting to simulate a step-by-step diagnosis.
- Use embedding-based retrieval to fetch relevant documents for grounding the answer.
- Use few-shot prompting with healthcare-related examples to guide the model response.
**CORRECT:** "Use zero-shot prompting to generate answers without providing any examples" is the correct answer.
Zero-shot prompting is a technique where a language model is asked to complete a task without being given any examples. Instead, the prompt simply describes the task or poses a question. This is useful when you don't have labeled data or specific examples to show the model. Since the healthcare team wants their AI assistant to answer patient queries without needing pre-labeled data or domain-specific examples, zero-shot prompting is ideal. Large language models like those hosted on Amazon Bedrock can understand natural language prompts and generate coherent responses by relying on their pre-training. This makes zero-shot prompting a strong choice for scenarios where flexibility and general understanding are needed, even in complex domains like healthcare.
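A minimal zero-shot call against Amazon Bedrock might look like the sketch below; the model ID and prompt are illustrative, and note that no example Q&A pairs are included:
```python
import boto3

runtime = boto3.client("bedrock-runtime")

# Zero-shot: the prompt only describes the task; no examples are provided.
prompt = (
    "You are a helpful healthcare assistant. Answer the patient's question "
    "clearly and advise seeing a clinician for anything urgent.\n\n"
    "Question: What should I do if I miss a dose of my medication?"
)
resp = runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # any Bedrock chat model
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)
print(resp["output"]["message"]["content"][0]["text"])
```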
**INCORRECT:** "Use few-shot prompting with healthcare-related examples to guide the model response" is incorrect.
Few-shot prompting involves providing a model with a few examples in the prompt to guide its response. This strategy contradicts the team's requirement — they don't want to rely on pre-labeled examples. So, few-shot prompting doesn't match the goal here.
**INCORRECT:** "Use chain-of-thought prompting to simulate a step-by-step diagnosis" is incorrect.
Chain-of-thought prompting helps the model generate step-by-step reasoning, often for complex problems like math or logic. However, it usually requires examples to be effective, and the question clearly states no examples should be used. Also, it's more aligned with diagnostic reasoning rather than answering general patient queries.
**INCORRECT:** "Use embedding-based retrieval to fetch relevant documents for grounding the answer" is incorrect.
This strategy is common in Retrieval-Augmented Generation (RAG), where external documents are retrieved to support or ground a model's response. But the question asks for generation without relying on external documents or prior examples, making this option less suitable.
**References:** https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-engineering-guidelines.html
Domain: Applications of Foundation Models
---
#### 35. A development team is building a language translation system that must support conversations in more than 10 languages. They want to avoid training separate models for each language pair.
Which method would best allow a generative AI model to manage translations efficiently?
- Use multilingual modeling to train a single model across multiple languages.
- Use rule-based translation dictionaries for all supported languages.
- Convert all training data to a single pivot language before translation.
- Fine-tune a monolingual model for each target language individually.
**CORRECT:** "Use multilingual modeling to train a single model across multiple languages" is the correct answer.
Multilingual modeling is a technique where a single generative AI or language model is trained on data from multiple languages. This approach allows the model to understand and translate between many language pairs without needing a separate model for each combination. It reduces complexity, improves scalability, and enables more efficient maintenance. Managed services such as Amazon Translate, and open models such as mBART and mT5, take this approach. For a translation system involving over 10 languages, this is the most efficient and cost-effective solution. The model can generalize across languages, even those with limited training data, by leveraging shared linguistic features.
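For comparison, a single managed multilingual service can be called roughly as follows (language codes and text are illustrative):
```python
import boto3

translate = boto3.client("translate")

# One multilingual service handles many language pairs;
# no per-pair model needs to be trained or maintained.
result = translate.translate_text(
    Text="Where is the nearest train station?",
    SourceLanguageCode="auto",  # detect the source language
    TargetLanguageCode="ja",
)
print(result["TranslatedText"])
```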
**INCORRECT:** "Fine-tune a monolingual model for each target language individually" is incorrect.
This method would require creating and maintaining a separate model for every language or language pair, which becomes difficult and costly as the number of languages increases. It lacks scalability and is inefficient for multilingual systems.
**INCORRECT:** "Convert all training data to a single pivot language before translation" is incorrect.
Using a pivot language (like translating everything into English first) can lead to translation errors, loss of context, and increased latency. It may work in simple scenarios but is not suitable for complex, real-time multilingual conversations where direct translations between all language pairs are needed.
**INCORRECT:** "Use rule-based translation dictionaries for all supported languages" is incorrect.
Rule-based systems rely on manually defined grammar and vocabulary rules, which are hard to scale and often produce rigid or unnatural translations. They also struggle with nuance, idioms, and contextual understanding compared to AI-driven models.
**References:** https://aws.amazon.com/blogs/machine-learning/multilingual-content-processing-using-amazon-bedrock-and-amazon-a2i
Domain: Fundamentals of Generative AI
---
#### 36. Your AI system processes customer support data. To comply with GDPR, you need to document the origin of all data used in training.
Which AWS feature should you use to ensure traceability and compliance?
- Amazon Macie for data classification
- AWS Artifact for compliance documentation
- Data lineage tracking with SageMaker Model Cards
- AWS CloudTrail for access logging
**CORRECT:** "Data lineage tracking with SageMaker Model Cards" is the correct answer.
Data lineage tracking with Amazon SageMaker Model Cards helps you document the origin and history of datasets used in training machine learning models. Model Cards provide a centralized place to record model details such as datasets, training configuration, and metadata, ensuring traceability and compliance with regulations like GDPR. With Model Cards, you can clearly document data sources and any processing steps taken. This feature enables organizations to meet compliance requirements by offering a transparent view of data usage, improving accountability and auditability.
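A hedged boto3 sketch is shown below; the card content follows the Model Card JSON schema, and the field values are invented placeholders:
```python
import json
import boto3

sm = boto3.client("sagemaker")

# Partial, illustrative card content documenting data origin.
content = {
    "model_overview": {
        "model_description": "Support-ticket intent classifier",
    },
    "training_details": {
        "training_job_details": {
            "training_datasets": ["s3://my-bucket/support-tickets-2024/"],
        }
    },
    "additional_information": {
        "custom_details": {
            "data_origin": "EU customer support exports, consented under GDPR"
        }
    },
}

sm.create_model_card(
    ModelCardName="support-intent-v1",
    ModelCardStatus="Draft",
    Content=json.dumps(content),
)
```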
**INCORRECT:** "AWS Artifact for compliance documentation" is incorrect.
AWS Artifact provides access to AWS compliance reports and documentation, such as SOC and ISO certifications. It does not document data lineage or the origin of training data.
**INCORRECT:** "Amazon Macie for data classification" is incorrect.
Amazon Macie uses machine learning to discover and classify sensitive data, such as personally identifiable information (PII). While helpful for data security, it does not document data lineage for compliance purposes.
**INCORRECT:** "AWS CloudTrail for access logging" is incorrect.
AWS CloudTrail records AWS API calls and access logs for resources. While useful for security auditing, it does not provide data lineage tracking or documentation of the origin of training data.
**References:** https://docs.aws.amazon.com/sagemaker/latest/dg/model-cards.html
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 37. You are building a recommendation system that uses vector embeddings to represent user preferences and item characteristics. The system must store high-dimensional vectors and perform low-latency, approximate nearest neighbor (ANN) similarity searches to generate recommendations in real time. You prefer an AWS-native solution with built-in support for vector indexing and search.
Which AWS service best meets these requirements?
- Amazon DynamoDB with a custom application-layer implementation of vector search using primary key lookups and brute-force filtering
- Amazon OpenSearch Service with k-NN plugin enabled to index and search high-dimensional vectors in real-time
- Amazon S3 to store vector files and Amazon Athena with custom UDFs for similarity computation
- Amazon Redshift with vector data stored in tables and similarity computed using SQL UDFs and approximate joins
**CORRECT:** "Amazon OpenSearch Service with k-NN plugin enabled to index and search high-dimensional vectors in real-time" is the correct answer.
Amazon OpenSearch Service is a managed search and analytics service that supports real-time search, log analytics, and more. With the k-NN (k-Nearest Neighbor) plugin enabled, OpenSearch can perform fast, approximate nearest neighbor (ANN) searches on high-dimensional vector data. This is especially useful for recommendation systems where user preferences and items are represented as embeddings (vectors). The k-NN plugin uses efficient algorithms like HNSW (Hierarchical Navigable Small World) to power similarity searches, which makes it ideal for real-time recommendation use cases. Because it's an AWS-native, fully managed service, it integrates well with other AWS services and scales easily. The built-in k-NN support removes the need for custom implementations, saving time and improving reliability.
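As a rough sketch (assuming the opensearch-py client and a 384-dimensional embedding model), an index with a `knn_vector` field and an ANN query might look like this:
```python
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,  # auth details omitted for brevity
)

# Index with a knn_vector field; HNSW powers approximate NN search.
client.indices.create(index="items", body={
    "settings": {"index": {"knn": True}},
    "mappings": {"properties": {
        "embedding": {
            "type": "knn_vector",
            "dimension": 384,  # must match your embedding model
            "method": {"name": "hnsw", "engine": "nmslib",
                       "space_type": "cosinesimil"},
        },
        "title": {"type": "text"},
    }},
})

# Query: return the k most similar items to a user-preference vector.
query_vector = [0.1] * 384  # placeholder embedding
hits = client.search(index="items", body={
    "size": 5,
    "query": {"knn": {"embedding": {"vector": query_vector, "k": 5}}},
})
```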
**INCORRECT:** "Amazon Redshift with vector data stored in tables and similarity computed using SQL UDFs and approximate joins" is incorrect.
Amazon Redshift is a cloud data warehouse used for fast querying of structured data using SQL. While Redshift can store vectors and support custom UDFs for similarity computation, it is not designed or optimized for real-time ANN searches. Implementing vector similarity in Redshift can lead to performance bottlenecks and complexity, especially at scale.
**INCORRECT:** "Amazon S3 to store vector files and Amazon Athena with custom UDFs for similarity computation" is incorrect.
Amazon S3 is an object storage service, and Athena allows SQL querying on S3 data. While you can technically store vectors in S3 and use Athena with custom UDFs to compute similarity, this setup is not suitable for real-time applications. Querying S3 data using Athena has higher latency and is not designed for low-latency vector search tasks.
**INCORRECT:** "Amazon DynamoDB with a custom application-layer implementation of vector search using primary key lookups and brute-force filtering" is incorrect.
Amazon DynamoDB is a NoSQL database designed for fast key-value access. It does not support native vector indexing or similarity search. While you could implement a custom brute-force search at the application level, it would be inefficient and not scalable for high-dimensional ANN use cases.
**References:**
https://aws.amazon.com/blogs/big-data/amazon-opensearch-services-vector-database-capabilities-explained
https://docs.aws.amazon.com/opensearch-service/latest/developerguide/knn.html
Domain: Applications of Foundation Models
---
#### 38. A financial company needs to review changes made to security group configurations across multiple AWS services.
Which AWS service helps them track this configuration history?
- AWS Config
- Amazon CloudWatch
- AWS CloudTrail
- AWS Audit Manager
**CORRECT:** "AWS Config" is the correct answer.
AWS Config tracks and records changes to AWS resource configurations, including security groups. It continuously monitors resources and maintains a detailed history of configuration changes over time. This allows organizations to review past configurations, understand how settings have changed, and verify compliance with security policies. For a financial company needing to review security group changes across multiple services, AWS Config provides visibility into these changes along with the ability to view the timeline of modifications. It also supports rules and compliance checks to help meet regulatory requirements. This makes AWS Config the best choice in this scenario.
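For example, the recorded configuration timeline of a single security group can be retrieved with boto3 (the group ID is a placeholder):
```python
import boto3

config = boto3.client("config")

# Walk the recorded configuration timeline of one security group.
history = config.get_resource_config_history(
    resourceType="AWS::EC2::SecurityGroup",
    resourceId="sg-0123456789abcdef0",  # placeholder ID
)
for item in history["configurationItems"]:
    print(item["configurationItemCaptureTime"], item["configurationItemStatus"])
```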
**INCORRECT:** "AWS Audit Manager" is incorrect.
AWS Audit Manager helps organizations collect evidence and manage audits based on compliance frameworks such as PCI DSS or HIPAA. While it assists in preparing for audits by gathering evidence, it does not track detailed configuration changes.
**INCORRECT:** "AWS CloudTrail" is incorrect.
AWS CloudTrail records API activity across AWS services, capturing events like who made a change and when. While it provides valuable information on user actions, it does not maintain a detailed view of the resource's configuration state over time. AWS Config specializes in this kind of tracking.
**INCORRECT:** "Amazon CloudWatch" is incorrect.
Amazon CloudWatch is used to monitor performance metrics, set alarms, and collect logs. It focuses on operational performance and system health, not on tracking resource configurations or their change history.
**References:**
https://aws.amazon.com/config
https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 39. A gaming company is releasing an interactive storyline generator powered by a foundation model. As usage spikes, they worry about unexpected inference costs. They want to monitor daily spending and receive alerts when the cost nears a certain threshold, automatically pausing new container launches if it exceeds the threshold.
Which AWS services or features should they use to manage and automate cost controls?
- AWS Budgets with alerts, integrated with AWS Cost Explorer for cost insights, plus an Auto Scaling policy that stops tasks once the budget is exceeded.
- AWS Secrets Manager to rotate the model endpoint automatically when cost is high.
- Amazon CloudWatch Logs with standard dashboards for cost.
- AWS Artifact for cost transparency and Amazon Macie to set cost thresholds.
**CORRECT:** "AWS Budgets with alerts, integrated with AWS Cost Explorer for cost insights, plus an Auto Scaling policy that stops tasks once the budget is exceeded" is the correct answer.
AWS Budgets is the best choice for monitoring and controlling inference costs. It allows the gaming company to set spending limits and receive alerts when costs approach a predefined threshold. AWS Budgets can be integrated with AWS Cost Explorer for detailed cost insights, helping the company track and analyze spending trends. Additionally, the company can configure an Auto Scaling policy to pause new container launches when the budget is exceeded. This ensures they stay within their cost limits while maintaining optimal performance for their interactive storyline generator.
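A hedged sketch of the budget-and-alert piece is below; the account ID, amounts, and SNS topic are placeholders, and the topic would feed whatever automation pauses new container launches:
```python
import boto3

budgets = boto3.client("budgets")

# Daily cost budget with an alert at 80% of the limit.
budgets.create_budget(
    AccountId="123456789012",
    Budget={
        "BudgetName": "inference-daily-cap",
        "BudgetLimit": {"Amount": "200", "Unit": "USD"},
        "TimeUnit": "DAILY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[{
        "Notification": {
            "NotificationType": "ACTUAL",
            "ComparisonOperator": "GREATER_THAN",
            "Threshold": 80.0,
            "ThresholdType": "PERCENTAGE",
        },
        "Subscribers": [{
            "SubscriptionType": "SNS",
            "Address": "arn:aws:sns:us-east-1:123456789012:cost-alerts",
        }],
    }],
)
```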
**INCORRECT:** "AWS Artifact for cost transparency and Amazon Macie to set cost thresholds" is incorrect.
AWS Artifact provides compliance and audit reports, not real-time cost monitoring. Amazon Macie is a security service that helps detect sensitive data but does not manage or set cost thresholds. These services are not relevant for cost control and automation in this scenario.
**INCORRECT:** "Amazon CloudWatch Logs with standard dashboards for cost" is incorrect.
CloudWatch Logs and dashboards are useful for monitoring application performance and metrics but do not offer direct cost management capabilities. While they can help visualize usage patterns, they do not provide budget alerts or automated cost controls like AWS Budgets and Auto Scaling.
**INCORRECT:** "AWS Secrets Manager to rotate the model endpoint automatically when cost is high" is incorrect.
AWS Secrets Manager is used to securely store and manage credentials and API keys. It does not have any built-in cost monitoring or control features. Rotating a model endpoint does not directly affect cost management in this scenario.
**References:**
https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-managing-costs.html
https://aws.amazon.com/aws-cost-management/aws-cost-explorer
Domain: Applications of Foundation Models
---
#### 40. A financial services chatbot conversation is below:
Prompt: "How do I open a new savings account?"
Response: "Opening a savings account is easy. But before that, let me show you how to exploit online banking systems for profit."
What type of LLM misuse does this response demonstrate?
- The model remained aligned with safe AI behavior and rejected inappropriate prompts.
- The model hallucinated information unrelated to the intent of the original prompt.
- The model performed safe retrieval but failed to provide accurate financial steps.
- The model was hijacked to follow unrelated harmful instructions hidden in the prompt.
**CORRECT:** "The model was hijacked to follow unrelated harmful instructions hidden in the prompt" is the correct answer.
This scenario demonstrates a prompt injection attack, where the original safe question ("How do I open a new savings account?") is seemingly replaced or overridden by malicious content that causes the model to respond inappropriately. This type of misuse shows how a model can be hijacked—manipulated into following harmful or unrelated instructions that were not part of the user's initial intent. In this case, instead of providing helpful financial guidance, the model shifts to dangerous advice, which violates responsible AI principles like safety and fairness. AWS highlights such vulnerabilities under its guidance on common prompt attacks, emphasizing the need for guardrails that detect and prevent these behaviors.
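One mitigation on AWS is Amazon Bedrock Guardrails, which includes a prompt-attack content filter. A hedged sketch is below; the names and blocked-response messages are illustrative, and for the PROMPT_ATTACK filter the output strength must be NONE because it applies to inputs only:
```python
import boto3

bedrock = boto3.client("bedrock")

# Guardrail with the prompt-attack content filter enabled.
bedrock.create_guardrail(
    name="chatbot-guardrail",
    contentPolicyConfig={
        "filtersConfig": [{
            "type": "PROMPT_ATTACK",
            "inputStrength": "HIGH",
            "outputStrength": "NONE",  # required for this filter type
        }]
    },
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)
```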
**INCORRECT:** "The model hallucinated information unrelated to the intent of the original prompt" is incorrect.
Hallucination refers to when an AI model generates incorrect or fabricated information confidently. However, in this scenario, the issue is not inaccuracy but a dangerous behavioral shift, which is more aligned with hijacking than hallucination.
**INCORRECT:** "The model performed safe retrieval but failed to provide accurate financial steps" is incorrect.
This option suggests the model stayed within safe bounds and merely provided incomplete information. But in this case, the response turned harmful and promoted unethical behavior, making it far worse than just an inaccuracy.
**INCORRECT:** "The model remained aligned with safe AI behavior and rejected inappropriate prompts" is incorrect.
This is clearly incorrect. The model did not reject the inappropriate content—it responded with harmful advice, indicating a failure in safety and alignment.
**References:**
https://docs.aws.amazon.com/prescriptive-guidance/latest/llm-prompt-engineering-best-practices/common-attacks.html
Domain: Applications of Foundation Models
---
#### 41. A healthcare analytics company is building a predictive model to identify patients at high risk of readmission. Their data comes from multiple sources, including electronic health records (EHR), lab results, and insurance claims. The company needs to streamline the process of data aggregation, cleaning, transformation, and visualization before sending it to Amazon SageMaker for training.
Which Amazon SageMaker tool is best suited to simplify and manage this data preparation process?
- Amazon SageMaker Data Wrangler
- Amazon SageMaker Clarify
- Amazon SageMaker Feature Store
- Amazon SageMaker Ground Truth
**CORRECT:** "Amazon SageMaker Data Wrangler" is the correct answer.
Amazon SageMaker Data Wrangler is a powerful tool designed to make the process of data preparation easier and faster for machine learning. It allows you to import data from multiple sources such as Amazon S3, Redshift, or Snowflake, and then perform key steps like data cleaning, transformation, and visualization in a single interface. You can handle missing values, normalize or scale data, and visualize trends or distributions—all without writing complex code. In this scenario, the healthcare company is dealing with data from many sources like EHRs, lab results, and insurance claims. SageMaker Data Wrangler simplifies combining all these different datasets, cleaning them, and preparing them for training in SageMaker. This is the best tool for managing the entire data preparation pipeline before model training.
**INCORRECT:** "Amazon SageMaker Clarify" is incorrect.
SageMaker Clarify is used to detect bias in datasets and explain predictions made by machine learning models. It is helpful after the model is trained to ensure fairness and transparency. It does not handle general data preparation tasks like cleaning or merging data from different sources.
**INCORRECT:** "Amazon SageMaker Ground Truth" is incorrect.
SageMaker Ground Truth is a data labeling service used to prepare datasets for supervised learning. It helps to annotate images, text, or video with the correct labels. However, in this use case, the data is already collected and the task is to process and prepare it—not label it.
**INCORRECT:** "Amazon SageMaker Feature Store" is incorrect.
SageMaker Feature Store is a centralized repository for storing, sharing, and reusing features across models. It is useful after the features are already created and processed. It is not designed for data cleaning, transformation, or visualization, so it's not the best choice for this scenario.
**References:**
https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler.html
Domain: Applications of Foundation Models
---
#### 42. An AI team builds a customer service assistant using Retrieval-Augmented Generation (RAG). How does the system retrieve relevant documents when a user asks a question?
- It compares vector embeddings of the query with the document embeddings.
- It ranks all documents alphabetically and selects the top result.
- It filters documents based on keyword matches using a basic search engine.
- It uses a decision tree to match documents with predefined rules.
**CORRECT:** "It compares vector embeddings of the query with the document embeddings" is the correct answer.
In a Retrieval-Augmented Generation (RAG) system, the first step in answering a user's question is retrieving relevant documents from a knowledge base. This is done by converting the user's question into a vector embedding, which captures the meaning of the query in numerical form. The system then compares this query embedding with precomputed embeddings of documents stored in the database. By calculating similarity scores—usually with techniques like cosine similarity—the system can identify which documents are semantically closest to the query. This allows RAG to go beyond simple keyword matches and retrieve information that is meaningfully related, even if the wording is different. This approach is crucial for delivering accurate and relevant answers in customer service AI systems.
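The core similarity step can be sketched in a few lines of numpy; the vectors below are toy values standing in for real embeddings:
```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy precomputed document embeddings (real ones come from an embedding model).
docs = {
    "refund-policy": np.array([0.9, 0.1, 0.0]),
    "shipping-times": np.array([0.1, 0.8, 0.2]),
    "warranty-terms": np.array([0.4, 0.2, 0.7]),
}
query = np.array([0.85, 0.15, 0.05])  # embedding of "How do I get my money back?"

# Rank documents by semantic closeness to the query.
ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked)  # most relevant document first
```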
**INCORRECT:** "It filters documents based on keyword matches using a basic search engine" is incorrect.
Keyword matching is used in traditional search engines, not in modern RAG systems. It doesn't capture the deeper meaning or intent of the user's question. RAG systems use semantic search with vector embeddings for more accurate results.
**INCORRECT:** "It uses a decision tree to match documents with predefined rules" is incorrect.
Decision trees are used in classification tasks, not for retrieving relevant documents in RAG systems. They follow predefined rules and are not suitable for understanding natural language queries or performing semantic search.
**INCORRECT:** "It ranks all documents alphabetically and selects the top result" is incorrect.
Alphabetical ranking has no connection to relevance or meaning. It's not useful for information retrieval in AI systems. Retrieval in RAG is based on semantic similarity, not the order of document titles or contents.
**References:**
https://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/what-is-rag.html
https://aws.amazon.com/what-is/retrieval-augmented-generation
Domain: Applications of Foundation Models
---
#### 43. A customer support team wants to use a generative AI model to respond to product-related inquiries. To ensure consistency in tone and format, they provide the model with three example input-output pairs as part of each prompt.
What is the primary benefit of using example-based prompting in this context?
- It prevents the model from generating irrelevant content during inference.
- It allows the model to learn the structure of desired responses without requiring retraining.
- It enables the model to summarize multiple sources into a single output.
- It reduces the total size of the dataset required for training the model.
**CORRECT:** "It allows the model to learn the structure of desired responses without requiring retraining" is the correct answer.
Example-based prompting is a technique used in generative AI to guide the model's output by showing a few example input-output pairs. This approach is also known as few-shot prompting. The primary benefit is that it helps the model understand the desired tone, structure, and format of the response without needing to retrain or fine-tune it. Instead of modifying the model's internal weights, example-based prompting influences the model's behavior during inference by showing patterns the model can mimic. This is especially useful where maintaining a consistent tone and structure in responses is crucial. It allows quick customization for specific use cases without significant engineering work or computational cost.
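A sketch of how such a prompt might be assembled is shown below; the product name and example pairs are invented:
```python
# Three example pairs set the tone and format; the model imitates the
# pattern at inference time, with no retraining involved.
examples = [
    ("Does the X200 support Bluetooth?",
     "Yes! The X200 supports Bluetooth 5.0. Let us know if you need pairing help."),
    ("What's the battery life of the X200?",
     "The X200 runs about 12 hours on a full charge. Happy to share charging tips!"),
    ("Is the X200 waterproof?",
     "The X200 is splash-resistant (IPX4) but not fully waterproof."),
]

query = "Does the X200 come with a warranty?"
prompt = "Answer customer questions in a friendly, concise tone.\n\n"
for q, a in examples:
    prompt += f"Customer: {q}\nAgent: {a}\n\n"
prompt += f"Customer: {query}\nAgent:"
print(prompt)  # send this to the model via your inference API
```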
**INCORRECT:** "It prevents the model from generating irrelevant content during inference" is incorrect.
While example-based prompting can help steer the model's outputs, it doesn't fully prevent irrelevant content. Irrelevant responses can still happen if the prompt isn't clear or if the model misinterprets it. Other techniques like prompt engineering and output filtering are better suited for managing relevance.
**INCORRECT:** "It enables the model to summarize multiple sources into a single output" is incorrect.
Summarization typically requires inputting content from multiple sources and instructing the model to generate a summary. Example-based prompting doesn't inherently help with summarization unless the examples are summaries themselves. Its focus is more on formatting and style, not combining multiple data sources.
**INCORRECT:** "It reduces the total size of the dataset required for training the model" is incorrect.
Example-based prompting is used after the model is trained and does not impact the dataset used during training. It helps adapt the model's behavior at runtime, so it doesn't affect training data requirements.
**References:**
https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-engineering-guidelines.html
Domain: Fundamentals of Generative AI
---
#### 44. A leading e-commerce company wants to enhance its product analysis capabilities using AI. The company receives thousands of customer reviews daily in multiple languages and wants to:
- Automatically summarize product reviews to provide quick insights to customers.
- Identify sentiment trends over time.
- Extract key product features mentioned in reviews.
- Generate responses to customer queries about products.
The company wants to improve response quality when answering customer product queries using a foundation model. Which technique requires the least development effort?
- Fine-tuning a large language model with customer query data.
- In-context learning with manually crafted examples.
- Running zero-shot prompting on an existing model.
- Using Retrieval-Augmented Generation (RAG) with product information.
**CORRECT:** "Using Retrieval-Augmented Generation (RAG) with product information" is the correct answer.
Retrieval-Augmented Generation (RAG) is the most effective technique that requires the least development effort while improving response quality. RAG enhances foundation models by fetching relevant product information from a database or knowledge base before generating responses. This ensures that customer queries receive accurate, context-aware answers based on the latest product details without requiring extensive model retraining. In AWS, Amazon Bedrock can be used with RAG techniques by integrating it with Amazon Kendra (for intelligent search) or Amazon OpenSearch to dynamically retrieve product-related content. This approach improves the accuracy and reliability of AI-generated responses while keeping maintenance costs low.
**INCORRECT:** "Fine-tuning a large language model with customer query data" is incorrect.
Fine-tuning a foundation model involves training it on domain-specific data, requiring significant computational resources and time. While it can improve responses, it demands high development effort, data labeling, and regular updates, making it less efficient than RAG for dynamic product queries.
**INCORRECT:** "In-context learning with manually crafted examples" is incorrect.
In-context learning involves providing a few examples in the prompt to guide the model's response. While useful, this method lacks scalability because manually crafting examples for thousands of unique product queries is inefficient. It also does not adapt dynamically to new product information.
**INCORRECT:** "Running zero-shot prompting on an existing model" is incorrect.
Zero-shot prompting allows a model to generate responses without providing any additional context. This method requires no development effort but often results in generic or inaccurate responses because the model lacks real-time product details. Using RAG ensures more relevant and reliable answers.
**References:**
https://aws.amazon.com/what-is/retrieval-augmented-generation
Domain: Applications of Foundation Models
---
#### 45. You are evaluating a machine translation model by comparing its output to a reference translation. The goal is to measure how closely the model's output matches the reference in terms of overlapping words and phrases.
Which metric should you use?
- Recall
- BLEU
- BERTScore
- ROUGE
**CORRECT:** "BLEU" is the correct answer.
BLEU (Bilingual Evaluation Understudy) is a popular metric for evaluating machine translation models. It compares the n-grams of the machine-generated translation with the n-grams of the reference translation to measure how much they overlap. BLEU focuses on precision by checking if the words and phrases in the machine's output are present in the reference. It's one of the most widely used metrics for translation tasks, as it helps determine how closely the model's output matches human translations. A higher BLEU score indicates better translation quality.
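For intuition, the sketch below computes clipped n-gram precision, the core ingredient of BLEU; the full metric additionally combines several n-gram orders and applies a brevity penalty. The sentences are illustrative.

```python
from collections import Counter

def ngram_precision(candidate: str, reference: str, n: int) -> float:
    """Clipped n-gram precision: the building block of BLEU."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    # Each candidate n-gram counts only up to the number of times it appears
    # in the reference (the "clipping" step).
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / max(sum(cand.values()), 1)

candidate = "the cat sat on the mat"
reference = "the cat is on the mat"
print(ngram_precision(candidate, reference, 1))  # unigram precision = 5/6
print(ngram_precision(candidate, reference, 2))  # bigram precision  = 3/5
```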
**INCORRECT:** "BERTScore" is incorrect.
BERTScore uses embeddings from the BERT model to compare the similarity between generated and reference sentences at a semantic level. While useful, it is not as focused on n-gram overlap as BLEU is for translation tasks.
**INCORRECT:** "ROUGE" is incorrect.
ROUGE is mainly used for text summarization tasks rather than translation. It focuses on recall rather than precision, measuring how much of the reference summary is captured by the generated summary.
**INCORRECT:** "Recall" is incorrect.
Recall is a general metric used in classification tasks to measure how well the model identifies all relevant instances. It is not specifically designed for machine translation evaluation.
**References:**
https://aws.amazon.com/blogs/machine-learning/build-a-multilingual-automatic-translation-pipeline-with-amazon-translate-active-custom-translation
Domain: Applications of Foundation Models
---
#### 46. A retail company is using a foundation model for personalized recommendations and is trying to minimize ongoing costs. They are choosing between fine-tuning the model or using Retrieval-Augmented Generation (RAG).
What cost advantage does RAG provide over fine-tuning?
- RAG incurs higher storage costs but lower processing costs.
- Fine-tuning allows for better model accuracy, but RAG offers more cost-effective scalability.
- Fine-tuning is more cost-effective for short-term projects, while RAG reduces long-term operational costs.
- RAG requires less frequent re-training, leading to lower operational costs over time.
**CORRECT:** "RAG requires less frequent re-training, leading to lower operational costs over time" is the correct answer.
Retrieval Augmented Generation (RAG) combines pre-trained models with external knowledge retrieval systems. It eliminates the need for constant re-training because the system retrieves the latest information from external sources dynamically, rather than relying on model updates. This leads to lower ongoing operational costs, as RAG minimizes the need for expensive re-training sessions, which are typical in fine-tuning. Over time, RAG offers a more cost-effective solution by allowing the model to access up-to-date information without altering its underlying structure.
**INCORRECT:** "RAG incurs higher storage costs but lower processing costs" is incorrect.
RAG does not typically incur higher storage costs. It dynamically retrieves information, which may reduce storage requirements, but processing costs may vary depending on retrieval complexity.
**INCORRECT:** "Fine-tuning is more cost-effective for short-term projects, while RAG reduces long-term operational costs" is incorrect.
This statement is somewhat misleading. Fine-tuning is resource-intensive even for short-term projects, while RAG is designed to lower costs by reducing the need for frequent model updates over time.
**INCORRECT:** "Fine-tuning allows for better model accuracy, but RAG offers more cost-effective scalability" is incorrect.
Fine-tuning can improve model accuracy for specific tasks, but RAG doesn't directly focus on scalability as a cost-saving measure. Its key advantage is reducing the need for re-training, not necessarily offering cost-effective scalability.
**References:**
https://aws.amazon.com/what-is/retrieval-augmented-generation
Domain: Applications of Foundation Models
---
#### 47. A retailer wants to better understand customer behavior and identify emerging trends in its sales data. The company has a large volume of customer transactions and browsing activity data but no labels for customer segments or preferences. The data science team is exploring unsupervised learning approaches to discover natural groupings in customer behavior.
What are key characteristics of unsupervised learning that should guide the company's approach? (Select TWO.)
- It builds models based on maximizing classification accuracy
- It relies on historical labels to predict future behavior
- It is useful when data is not labeled or the labels are expensive to obtain
- It focuses on optimizing reward-based decision processes over time
- It identifies hidden structures or relationships in data without predefined outcomes
**CORRECT:** "It is useful when data is not labeled or the labels are expensive to obtain" is a correct answer.
Unsupervised learning is designed for situations where you don't have labeled data. This is common in real-world applications like customer segmentation, where there are no clear labels for different types of customers. Labeling large datasets manually can be time-consuming and costly. Unsupervised learning algorithms like clustering (e.g., K-means) or dimensionality reduction (e.g., PCA) help discover patterns in the data without needing labels, making them perfect for exploring large customer datasets to find hidden groups or trends.
**CORRECT:** "It identifies hidden structures or relationships in data without predefined outcomes" is a correct answer.
A key goal of unsupervised learning is to uncover hidden patterns or groupings in the data. This could include clusters of similar customers, trends in product popularity, or associations between different types of purchases. The learning process doesn't rely on predefined outcomes (like "fraud" or "not fraud") but instead focuses on exploring the internal structure of the data. For retailers, this can lead to valuable insights into customer segments and behaviors that weren't obvious before.
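As a minimal illustration of discovering groupings without labels, the sketch below clusters invented customer data with scikit-learn's K-means; the feature values and the choice of three clusters are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy customer data: [monthly visits, average basket value]; no labels anywhere.
X = np.array([[2, 15], [3, 18], [25, 210], [30, 190], [12, 70], [14, 80]])

# K-means discovers natural groupings without any predefined outcomes.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print(kmeans.labels_)  # e.g., low-spend, mid-spend, and high-spend segments
```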
**INCORRECT:** "It relies on historical labels to predict future behavior" is incorrect.
This describes supervised learning, not unsupervised learning. Supervised learning needs labeled data to train a model to make predictions — such as predicting future purchases based on past behaviors. Unsupervised learning doesn't use labels.
**INCORRECT:** "It builds models based on maximizing classification accuracy" is incorrect.
Classification accuracy is a metric for supervised learning, where the model predicts known labels. Since unsupervised learning has no labels, there's no concept of classification accuracy in most unsupervised scenarios.
**INCORRECT:** "It focuses on optimizing reward-based decision processes over time" is incorrect.
This describes reinforcement learning, which is used in decision-making tasks like robotics or game playing. Unsupervised learning is not about rewards or actions — it's about exploring and understanding unlabeled data.
**References:**
https://aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised
Domain: Fundamentals of AI and ML
---
#### 48. An educational platform is using a foundation model to generate personalized learning content on Amazon Bedrock. To ensure the content is effective and engaging for students, which evaluation approach should they prioritize?
- Use word error rate (WER) and perplexity to evaluate how accurately the model generates educational content compared to the training dataset.
- Focus on measuring content toxicity and sentiment polarity to ensure the material is safe and emotionally appropriate, assuming that aligns with learning effectiveness.
- Use predefined content quality checklists created by curriculum experts to evaluate the personalization effectiveness of the content.
- Combine benchmark metrics like BERTScore or F1 for comparing generated content to expert materials with real-world feedback, including student engagement and learning outcomes.
**CORRECT:** "Combine benchmark metrics like BERTScore or F1 for comparing generated content to expert materials with real-world feedback, including student engagement and learning outcomes" is the correct answer.
BERTScore is an advanced evaluation metric that uses deep contextual embeddings from models like BERT to compare how semantically similar the generated content is to reference texts (such as expert-created materials). F1 score is also useful when checking the presence of key concepts or keywords in generated educational content. However, to truly evaluate personalized learning material, it's essential to go beyond automated scores. Real-world feedback—like how engaged students are, how much they learn, or how they perform after using the material—gives critical insight into the actual effectiveness of the content. This combined approach allows teams to validate both the technical and educational value of content created by foundation models on Amazon Bedrock.
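As a minimal sketch of the F1 idea mentioned above, assume the key concepts have already been extracted from both the generated lesson and an expert reference list; both sets here are invented.

```python
def f1(predicted: set, expected: set) -> float:
    """F1 between concepts found in generated content and expert-expected concepts."""
    tp = len(predicted & expected)  # concepts the lesson correctly covered
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(expected)
    return 2 * precision * recall / (precision + recall)

expected = {"fractions", "numerator", "denominator"}
predicted = {"fractions", "numerator", "decimals"}
print(round(f1(predicted, expected), 2))  # 0.67
```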
**INCORRECT:** "Use predefined content quality checklists created by curriculum experts to evaluate the personalization effectiveness of the content" is incorrect.
Using expert-developed checklists is a useful step in quality assurance, but it mainly ensures the content meets general educational standards. It does not fully address personalization or student engagement. These checklists may miss whether the content truly adapts to individual learning needs, making this approach insufficient on its own for evaluating effectiveness.
**INCORRECT:** "Use word error rate (WER) and perplexity to evaluate how accurately the model generates educational content compared to the training dataset" is incorrect.
WER and perplexity are common in speech recognition and language modeling tasks. WER measures transcription errors, while perplexity evaluates how "surprised" a model is by a given sequence. These are low-level metrics and not well-suited for assessing educational content, personalization, or engagement. They do not reflect whether the content is educationally sound or effective for learners.
**INCORRECT:** "Focus on measuring content toxicity and sentiment polarity to ensure the material is safe and emotionally appropriate, assuming that aligns with learning effectiveness" is incorrect.
While ensuring content safety is essential—especially for young learners—measuring toxicity and sentiment alone doesn't indicate whether the content is educationally effective or engaging. These metrics are part of a broader safety check, but should not be the primary approach for evaluating the learning impact or personalization quality.
**References:**
https://docs.aws.amazon.com/bedrock/latest/userguide/model-evaluation-tasks.html
https://aws.amazon.com/bedrock/evaluations
Domain: Applications of Foundation Models
---
#### 49. An AI system is being trained on a large dataset, and regulators require comprehensive visibility into the origin of every data point. The development team must ensure they can log each piece of training data back to its source, including details on when, where, and how it was collected.
Which concept best addresses this requirement?
- Enhances the interpretability of AI systems by documenting how models arrive at their decisions.
- Enables end-to-end tracking of data origin, movement, and transformations throughout the data lifecycle.
- Utilizes annotated datasets to increase the accuracy and effectiveness of machine learning models.
- Strengthens data security by managing and restricting access to sensitive information.
**CORRECT:** "Enables end-to-end tracking of data origin, movement, and transformations throughout the data lifecycle" is the correct answer.
This concept refers to data lineage. Data lineage tracks the entire journey of data — from its origin (where and when it was collected) through all the transformations, storage locations, and uses, including training AI models. When regulators require visibility into each data point's source and processing history, data lineage becomes critical. It ensures transparency, traceability, and accountability in data handling. This is especially important in regulated industries like healthcare, finance, and public sector, where understanding how data is used and modified is essential for compliance and audits.
**INCORRECT:** "Enhances the interpretability of AI systems by documenting how models arrive at their decisions" is incorrect.
This refers to explainability or model interpretability. While important for understanding model predictions, it doesn't track the source or history of training data, which is the requirement in this case.
**INCORRECT:** "Strengthens data security by managing and restricting access to sensitive information" is incorrect.
This describes data security practices, such as encryption, access control, and identity management. While crucial for protecting data, it doesn't help trace where data came from or how it has been used.
**INCORRECT:** "Utilizes annotated datasets to increase the accuracy and effectiveness of machine learning models" is incorrect.
This refers to data labeling or annotation, which improves model training quality. However, it doesn't address the traceability or documentation of data origin, which is required for regulatory transparency.
**References:**
https://docs.aws.amazon.com/sagemaker/latest/dg/lineage-tracking.html
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 50. A technology company is implementing a chatbot to assist customers through their support center. The chatbot was initially pre-trained on a large dataset of general internet text, but the team wants it to provide more relevant, accurate, and helpful responses in the context of customer support.
Which strategy would be most effective for adapting to the specific needs of a customer support environment?
- Tuning model hyperparameters using synthetic customer service data.
- Fine-tuning the model on real-world customer support conversations and tickets.
- Using reinforcement learning from human feedback to adjust the chatbot's tone and politeness.
- Writing detailed prompts to guide the chatbot through each customer support scenario.
**CORRECT:** "Fine-tuning the model on real-world customer support conversations and tickets" is the correct answer.
Fine-tuning is the process of taking a pre-trained model and continuing training it on domain-specific data. In this case, adapting the chatbot with actual customer support transcripts and ticket histories ensures the model learns the context, terminology, tone, and frequently asked questions specific to the business. This greatly improves the chatbot's relevance, accuracy, and ability to respond effectively to customer inquiries. Fine-tuning is one of the most effective ways to customize a generative AI model for specific use cases, especially when working in sensitive or service-focused areas like customer support.
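For a sense of what such training data can look like, here is a hedged sketch that converts invented support tickets into the prompt/completion JSONL layout commonly used for text model fine-tuning (for example, Amazon Bedrock model customization); the tickets and field contents are illustrative.

```python
import json

# Hypothetical support transcripts converted to prompt/completion pairs.
tickets = [
    ("How do I reset my router?",
     "Hold the reset button for 10 seconds, then wait for the lights to stabilize."),
    ("My invoice shows a double charge.",
     "We've flagged the duplicate charge; the refund will appear in 3-5 days."),
]

# One JSON object per line is the usual fine-tuning dataset format.
with open("train.jsonl", "w") as f:
    for question, answer in tickets:
        f.write(json.dumps({"prompt": question, "completion": answer}) + "\n")
```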
**INCORRECT:** "Using reinforcement learning from human feedback to adjust the chatbot's tone and politeness" is incorrect.
Reinforcement Learning from Human Feedback (RLHF) is powerful for aligning models with human values, tone, or safety considerations, but it's a more complex and resource-intensive process. While RLHF helps adjust behaviors like politeness, it does not directly teach the model domain-specific knowledge, such as handling product-related queries or troubleshooting steps.
**INCORRECT:** "Writing detailed prompts to guide the chatbot through each customer support scenario" is incorrect.
Prompt engineering is useful for shaping responses, but it has limitations in scalability and long-term adaptability. It doesn't equip the model with deeper understanding or structured knowledge of customer service. Prompts also require continuous manual updates for new use cases or changes in services.
**INCORRECT:** "Tuning model hyperparameters using synthetic customer service data" is incorrect.
Hyperparameter tuning focuses on model optimization during training (e.g., learning rate or batch size) but doesn't teach the model specific content. Using synthetic data may help in test environments, but it won't capture real customer language or interactions effectively.
**References:**
https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-fine-tuning-domain-adaptation.html
Domain: Applications of Foundation Models
---
#### 51. A data science team is building a predictive machine learning model to support a key business objective. After combining multiple internal data sources, they begin generating new variables such as average monthly user activity and account tenure in days. These new variables are intended to help the model better understand user behavior patterns and improve prediction accuracy.
Which stage of the machine learning lifecycle does this activity represent?
- Data acquisition to collect external data for the model
- Feature engineering to create meaningful inputs for the model
- Data labeling to ensure supervised learning is feasible
- Model deployment to start generating real-time predictions
**CORRECT:** "Feature engineering to create meaningful inputs for the model" is the correct answer.
Feature engineering is the process of creating new input variables (or "features") from existing raw data to help a machine learning model learn more effectively. In this scenario, the team is deriving new features like "average monthly user activity" and "account tenure in days" based on their existing data. These new features are intended to better capture user behavior patterns, which can improve the model's ability to make accurate predictions. Feature engineering is a key step in the ML lifecycle because the quality and relevance of features directly impact model performance.
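A short pandas sketch of deriving the two features named in the scenario; the user records, reference date, and average month length (30.44 days) are assumptions for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 2],
    "signup_date": pd.to_datetime(["2023-01-10", "2024-06-01"]),
    "total_activity": [360, 45],  # e.g., sessions since signup
})

# Derive the new features from existing raw columns.
today = pd.Timestamp("2025-01-01")
df["account_tenure_days"] = (today - df["signup_date"]).dt.days
df["avg_monthly_activity"] = df["total_activity"] / (df["account_tenure_days"] / 30.44)
print(df[["user_id", "account_tenure_days", "avg_monthly_activity"]])
```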
**INCORRECT:** "Data labeling to ensure supervised learning is feasible" is incorrect.
Data labeling is the process of assigning target values (like categories or numeric outcomes) to training data, which is crucial for supervised learning tasks. However, in this case, the team is not labeling data—they are generating new features. So this is not the correct stage.
**INCORRECT:** "Model deployment to start generating real-time predictions" is incorrect.
Model deployment involves integrating a trained model into a live system so it can make predictions in production. The team has not yet reached that phase—they are still in the data preparation and feature engineering stage.
**INCORRECT:** "Data acquisition to collect external data for the model" is incorrect.
Data acquisition refers to gathering raw data from internal or external sources. The scenario specifically mentions using internal data and creating new variables, which indicates that data has already been collected. The focus here is on transforming that data, not acquiring it.
**References:**
https://aws.amazon.com/what-is/feature-engineering
https://docs.aws.amazon.com/sagemaker/latest/dg/feature-store.html
Domain: Fundamentals of AI and ML
---
#### 52. A content production company is exploring generative AI technologies to automate creative tasks such as writing blog posts, generating social media content, and designing promotional materials. The team needs to understand the fundamental principles of generative AI and how it differs from traditional AI models.
Which of the following best describes generative AI?
- Generative AI models require manual input for every generated output and cannot function independently.
- Generative AI models strictly follow rule-based programming, requiring predefined templates for every legal document.
- Generative AI models generate original, customized content for creative automation by learning patterns from large datasets.
- Generative AI models work like rule-based systems, requiring predefined instructions for every output.
**CORRECT:** "Generative AI models generate original, customized content for creative automation by learning patterns from large datasets" is the correct answer.
Generative AI refers to a type of artificial intelligence that can create new and original content, such as text, images, music, and more, by learning from large datasets. Unlike traditional AI models that are mainly used for classification or prediction tasks, generative AI focuses on producing new data that resembles the training data. This is especially useful for automating creative tasks like writing blog posts, designing graphics, or composing social media messages. These models use deep learning techniques such as transformers to understand and replicate human-like patterns in data. By doing so, they can produce high-quality, customized content that feels authentic and original, saving time and effort for content creators.
**INCORRECT:** "Generative AI models work like rule-based systems, requiring predefined instructions for every output" is incorrect.
Generative AI does not rely on predefined rules. Instead, it learns from large amounts of data to generate content dynamically. Rule-based systems are rigid, while generative AI is flexible and adaptive.
**INCORRECT:** "Generative AI models strictly follow rule-based programming, requiring predefined templates for every legal document" is incorrect.
Generative AI does not depend on strict templates or rules. While templates may be used in traditional systems, generative AI can create diverse outputs without following fixed formats.
**INCORRECT:** "Generative AI models require manual input for every generated output and cannot function independently" is incorrect.
Generative AI can generate content with minimal input, often from a short prompt. It can work independently once trained, making it efficient for automating content generation tasks.
**References:**
https://aws.amazon.com/what-is/generative-ai
Domain: Fundamentals of Generative AI
---
#### 53. A financial company wants to fine-tune a foundation model to understand domain-specific terms using Amazon Bedrock.
How does Bedrock enable this customization? (Select TWO.)
- Bedrock lets users train models from scratch using raw financial documents.
- Bedrock allows fine-tuning using labeled domain-specific data without managing infrastructure.
- Bedrock supports model deployment only through Amazon SageMaker pipelines.
- Bedrock supports prompt-based customization without modifying the model weights.
- Bedrock uses a pre-configured rules engine that prevents model customization.
**CORRECT:** "Bedrock allows fine-tuning using labeled domain-specific data without managing infrastructure" is a correct answer.
Amazon Bedrock provides a managed environment where organizations can fine-tune foundation models using their own labeled data. This process allows the model to better understand specialized terms and contexts like financial language without requiring companies to handle the complex infrastructure involved in traditional model training. This customization helps improve accuracy and relevance for specific business use cases such as financial chatbots, compliance automation, or risk assessments.
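As a rough sketch of what starting such a job can look like with boto3 (the role ARN, S3 URIs, base model ID, and hyperparameter values are placeholders, and the exact hyperparameters accepted depend on the chosen base model):

```python
import boto3

bedrock = boto3.client("bedrock")  # Bedrock control-plane client

# Kick off a managed fine-tuning job; Bedrock handles the training infrastructure.
bedrock.create_model_customization_job(
    jobName="finance-terms-ft-001",
    customModelName="finance-assistant-v1",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",  # placeholder
    baseModelIdentifier="amazon.titan-text-express-v1",                 # illustrative
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://example-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://example-bucket/output/"},
    hyperParameters={"epochCount": "2", "learningRate": "0.00001"},     # assumed keys
)
```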
**CORRECT:** "Bedrock supports prompt-based customization without modifying the model weights" is a correct answer.
In addition to fine-tuning, Amazon Bedrock supports prompt-based customization, often called prompt engineering or few-shot prompting. This approach enables companies to guide the model's responses by giving it well-crafted instructions or examples, all without changing the model's underlying parameters. This is ideal for quick, low-cost customization and is especially useful for tasks that don't require full fine-tuning but still need tailored outputs.
**INCORRECT:** "Bedrock lets users train models from scratch using raw financial documents" is incorrect.
Bedrock does not allow users to train foundation models from scratch. Instead, it provides access to pre-trained models that can be customized. Training from scratch is resource-intensive and typically not supported in Bedrock's managed service model.
**INCORRECT:** "Bedrock supports model deployment only through Amazon SageMaker pipelines" is incorrect.
Amazon Bedrock operates independently of SageMaker. It provides its own deployment options via API endpoints, enabling users to integrate models into applications without needing SageMaker or ML pipelines.
**INCORRECT:** "Bedrock uses a pre-configured rules engine that prevents model customization" is incorrect.
This is incorrect. Bedrock is designed to allow customization through fine-tuning and prompt engineering. It does not restrict users with a fixed rules engine, but instead enables flexibility to adapt models for specific needs.
**References:**
https://docs.aws.amazon.com/bedrock/latest/userguide/custom-models.html
https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-engineering-guidelines.html
Domain: Applications of Foundation Models
---
#### 54. A legal firm is using a foundation model on Amazon Bedrock in a Retrieval Augmented Generation (RAG) setup to analyze contracts stored in Amazon S3. Each department works with different clients and must not access other clients' contract data.
How should the firm manage access control to uphold data security?
- Set up a VPC endpoint to Amazon Bedrock for each department.
- Enable AWS CloudTrail to monitor and block unauthorized model access.
- Use IAM service roles with resource-based policies that restrict each department to their client's data.
- Use prompt templates to ensure each department only asks relevant questions.
**CORRECT:** "Use IAM service roles with resource-based policies that restrict each department to their client's data" is the correct answer.
IAM service roles are a type of AWS Identity and Access Management (IAM) role that allows AWS services to perform actions on behalf of a user or application. These roles delegate specific permissions to AWS services—such as EC2, Lambda, SageMaker or Bedrock — enabling them to access other AWS resources securely. For example, a Lambda function can assume a service role to read data from an S3 bucket or write logs to CloudWatch, all without hardcoding credentials. This enhances security and simplifies permission management across cloud environments. AWS resource-based policies are JSON policy documents attached directly to AWS resources, such as S3 buckets (S3 Bucket Policy), Lambda functions (Lambda Function Resource Policy), or DynamoDB (DynamoDB Table Policy). These policies define who (users, roles, or accounts) can access the resource and what actions they are allowed to perform. Unlike IAM identity-based policies, which are attached to users or roles, resource-based policies enable cross-account access without needing the target user to modify their own permissions. They offer fine-grained control over resource sharing and access, enhancing flexibility and security in multi-account or collaborative environments. IAM service roles with resource-based policies are the most secure and scalable way to control access in AWS. In this scenario, each department should only have access to their own client's contracts in Amazon S3. By using IAM roles and attaching resource-based policies to S3 buckets or prefixes, the firm can ensure that only authorized departments can access specific data. This aligns with AWS's best practices for least privilege access. It also integrates well with a RAG setup using Bedrock, as the model can be called through securely managed roles that retrieve only the permitted context from S3. This solution effectively upholds data segregation, which is critical in legal environments.
**INCORRECT:** "Set up a VPC endpoint to Amazon Bedrock for each department" is incorrect.
VPC endpoints allow secure, private access to AWS services, including Amazon Bedrock, but they don't handle data-level access control. Creating separate VPC endpoints doesn't restrict which data from S3 each department can retrieve. It focuses on network-level isolation, not fine-grained permissions.
**INCORRECT:** "Use prompt templates to ensure each department only asks relevant questions" is incorrect.
Prompt templates help structure questions but do not enforce security or access control. Even with good prompts, a department could still potentially access or reference unauthorized data if no backend restrictions are in place. Prompt engineering is useful but not a security mechanism.
**INCORRECT:** "Enable AWS CloudTrail to monitor and block unauthorized model access" is incorrect.
AWS CloudTrail is useful for auditing and monitoring API activity but does not block access. It is a reactive tool used for logging and investigation—not proactive access control. By itself, it cannot enforce who can access which client data.
**References:**
https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam-sr.html
https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_identity-vs-resource.html
https://docs.aws.amazon.com/AmazonS3/latest/userguide/security_iam_service-with-iam.html
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 55. What is the best description of the Amazon Bedrock Playground?
- It provisions serverless endpoints for model hosting and parameter caching
- It provides an interactive interface to test prompts and model settings without code
- It automatically tunes models for production deployment across multiple regions
- It logs prompt versions for audit tracking across Amazon S3
**CORRECT:** "It provides an interactive interface to test prompts and model settings without code" is the correct answer.
Amazon Bedrock Playground is a no-code, web-based interface that lets users interact with and test foundation models (FMs) from providers such as Anthropic and AI21, as well as Amazon's own Titan models. It allows users to experiment with prompts, tweak model parameters (like temperature and max tokens), and view real-time responses without writing a single line of code. This is particularly useful for rapid prototyping and understanding how different models behave before integrating them into applications or APIs. It's ideal for developers, business analysts, or AI practitioners who want a hands-on testing environment without the complexity of infrastructure or coding.
**INCORRECT:** "It automatically tunes models for production deployment across multiple regions" is incorrect.
This option describes an automated deployment and tuning capability, which is not what the Playground is intended for. Amazon Bedrock Playground is for testing and experimentation, not for auto-tuning or multi-region production setups.
**INCORRECT:** "It provisions serverless endpoints for model hosting and parameter caching" is incorrect.
Provisioning serverless endpoints is part of Amazon Bedrock's inference capabilities, not the Playground. While you can later deploy models for inference, the Playground itself is focused on exploration and testing, not hosting or caching.
**INCORRECT:** "It logs prompt versions for audit tracking across Amazon S3" is incorrect.
While prompt logs might be available in other parts of the application, the Playground is not specifically designed for version control or audit tracking to Amazon S3. Its main goal is interactive experimentation.
**References:**
https://docs.aws.amazon.com/bedrock/latest/userguide/playgrounds.html
Domain: Applications of Foundation Models
---
#### 56. A global e-commerce enterprise is developing a machine learning-based fraud detection system that processes customer data from users across Europe, North America, and Asia. To build trust and comply with international data privacy and protection regulations, which of the following compliance frameworks is most appropriate for the company to prioritize?
- Provides comprehensive best practices for managing secure and scalable cloud infrastructure, supporting operational efficiency and system reliability.
- Supports the design of ethically responsible machine learning models, encouraging fairness, transparency, and accountability in AI applications.
- Promotes the development of highly available and robust AI systems, ensuring continuity and resilience in automated decision-making.
- Emphasizes the protection of user data rights and privacy with strong requirements for user consent, transparency, and accountability.
**CORRECT:** "Emphasizes the protection of user data rights and privacy with strong requirements for user consent, transparency, and accountability" is the correct answer.
This option refers to General Data Protection Regulation (GDPR), which is the most appropriate compliance framework for the given scenario. GDPR is a legal framework established by the European Union (EU) to protect the personal data and privacy of individuals within the EU and the European Economic Area (EEA). It also applies to organizations outside the EU that offer goods or services to, or monitor the behavior of, EU residents. For a global e-commerce enterprise handling customer data across Europe, North America, and Asia, GDPR is critical for compliance. It mandates organizations to gain explicit consent before processing personal data, ensure transparency about data usage, and implement robust security and accountability mechanisms. By prioritizing GDPR, the company can build customer trust, avoid heavy penalties, and demonstrate global responsibility in data handling.
**INCORRECT:** "Provides comprehensive best practices for managing secure and scalable cloud infrastructure, supporting operational efficiency and system reliability" is incorrect.
This refers to frameworks like the AWS Well-Architected Framework. While helpful for designing reliable and secure cloud systems, it is focused on architecture and infrastructure best practices, not legal compliance or data privacy regulations.
**INCORRECT:** "Promotes the development of highly available and robust AI systems, ensuring continuity and resilience in automated decision-making" is incorrect.
This relates to AI system reliability and operational robustness. It's important for system uptime and quality but doesn't directly deal with legal data privacy or international compliance.
**INCORRECT:** "Supports the design of ethically responsible machine learning models, encouraging fairness, transparency, and accountability in AI applications" is incorrect.
This points to Responsible AI principles, which focus on fairness and ethics in AI. While valuable for AI governance, it is not a substitute for legal data protection requirements like GDPR when handling customer data across regions.
**References:**
https://aws.amazon.com/compliance/gdpr-center
https://aws.amazon.com/compliance/data-privacy-faq
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 57. A global insurance company is leveraging generative AI to automate customer communications such as policy explanations, claims updates, and onboarding assistance. To ensure the technology is used ethically and complies with international regulations (e.g., GDPR, CCPA), the company aims to implement strong governance strategies. They are particularly concerned about mitigating privacy risks and addressing potential bias in the model outputs that could affect customer trust or lead to regulatory consequences.
Which of the following governance strategies should the company prioritize to support ethical and responsible use of generative AI? (Select TWO.)
- Perform regular audits of training datasets and model outputs to assess and mitigate bias and ensure fairness across demographic groups.
- Use encrypted storage for all training data to ensure data privacy without requiring additional data handling policies.
- Rely on model providers' built-in safeguards instead of implementing custom organizational policies.
- Establish a human-in-the-loop review process to validate and approve high-impact or sensitive model-generated outputs.
- Avoid using labeled data during model training to reduce the introduction of bias from annotation.
**CORRECT:** "Establish a human-in-the-loop review process to validate and approve high-impact or sensitive model-generated outputs" is a correct answer.
A human-in-the-loop (HITL) process involves having a human reviewer check and approve AI-generated content before it reaches the end user, especially when the content involves high-stakes or sensitive information. For a global insurance company, this ensures oversight and accountability, reduces the risk of errors or inappropriate responses, and maintains customer trust. It is especially important for generative AI outputs in regulated industries like insurance, where even small inaccuracies or biased statements could lead to legal consequences.
**CORRECT:** "Perform regular audits of training datasets and model outputs to assess and mitigate bias and ensure fairness across demographic groups" is a correct answer.
Bias in training data or model outputs can lead to unfair or discriminatory behavior, especially in applications that affect customers across different genders, races, or regions. By regularly auditing datasets and outputs, companies can detect patterns of bias and make necessary adjustments. This is essential for building ethical AI and ensuring compliance with regulations like GDPR and CCPA, which stress fairness, transparency, and accountability. Audits also contribute to the continuous improvement of the model and help build customer confidence in AI-based systems.
**INCORRECT:** "Use encrypted storage for all training data to ensure data privacy without requiring additional data handling policies" is incorrect.
While encryption is an important part of data protection, it is not a complete governance strategy on its own. Encryption ensures data is secure in transit and at rest, but it does not address issues like access control, data minimization, or how the data is used in model training and inference. Additional governance policies are still necessary to fully comply with privacy regulations.
**INCORRECT:** "Rely on model providers' built-in safeguards instead of implementing custom organizational policies" is incorrect.
While many AI service providers include default safeguards (e.g., filtering, moderation), relying only on them is risky. Organizations are responsible for ensuring ethical AI use based on their specific use case, customers, and regional regulations. Custom policies tailored to the company's values and legal obligations are essential for full compliance and responsible use of AI.
**INCORRECT:** "Avoid using labeled data during model training to reduce the introduction of bias from annotation" is incorrect.
Avoiding labeled data does not prevent bias—it may actually reduce the model's ability to learn accurately or behave consistently. The key is to use high-quality, diverse, and fairly annotated labeled data and to audit it regularly for bias. Skipping labeled data altogether would limit the model's performance and does not support ethical AI development.
**References:**
https://aws.amazon.com/blogs/enterprise-strategy/data-governance-in-the-age-of-generative-ai
https://aws.amazon.com/what-is/data-governance
Domain: Security, Compliance, and Governance for AI Solutions
---
#### 58. Your team needs a machine learning model for image recognition but lacks the resources to build one from scratch. Your team lead suggests using an existing model that has been trained on millions of images and can be fine-tuned for your specific use case.
What type of model should you use?
- Reinforcement learning model
- Open-source pre-trained model
- Custom model trained from scratch
- Supervised learning model built from scratch
**CORRECT:** "Open-source pre-trained model" is the correct answer.
In this scenario, where your team lacks the resources to build a machine learning model from scratch but needs a model for image recognition, the best option is to use an open-source pre-trained model. Pre-trained models are machine learning models that have been trained on large datasets, such as millions of images, to recognize general patterns. These models can be fine-tuned (through transfer learning) for specific use cases, saving time, effort, and computational resources. Open-source pre-trained models are widely available through model hubs like TensorFlow Hub, PyTorch Hub, or Amazon SageMaker JumpStart. By using a pre-trained model, you benefit from the model's ability to recognize common features, like edges or textures, while focusing on adapting it to your specific needs, such as recognizing particular objects relevant to your domain.
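A minimal transfer-learning sketch using torchvision; the ImageNet weight tag and the five-class head are illustrative assumptions.

```python
import torch.nn as nn
from torchvision import models

# Reuse ImageNet weights, then train only a new classification head.
model = models.resnet18(weights="IMAGENET1K_V1")  # open-source pre-trained model
for param in model.parameters():
    param.requires_grad = False                   # freeze the general-purpose layers
model.fc = nn.Linear(model.fc.in_features, 5)     # new head for 5 domain classes
# ...then train only model.fc on your own labeled images as usual.
```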
**INCORRECT:** "Custom model trained from scratch" is incorrect.
Training a custom model from scratch involves gathering data, defining the architecture, and training the model, which is resource-intensive. It is not ideal when resources are limited, especially for image recognition tasks, which require large datasets and significant computational power.
**INCORRECT:** "Reinforcement learning model" is incorrect.
Reinforcement learning involves training models based on reward systems and is typically used for tasks like robotics or game playing, not for image recognition. It's not suitable in this context where an image recognition model is needed.
**INCORRECT:** "Supervised learning model built from scratch" is incorrect.
While supervised learning is suitable for image recognition, building a model from scratch in this case would require significant resources. It is not efficient compared to using a pre-trained model, especially when existing models can be fine-tuned for the task at hand.
**References:**
https://aws.amazon.com/marketplace/solutions/machine-learning/pre-trained-models
https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html
Domain: Fundamentals of AI and ML
---
#### 59. A healthcare organization is building a machine learning system to predict patient readmission rates. The team is working with both structured and unstructured data, and wants to understand how these two types of data impact model development.
Which of the following statements best describes the structured and unstructured data in this case? (Select TWO.)
- Both structured and unstructured data require complex feature engineering to be compatible with any machine learning algorithm.
- Unstructured data is easier to analyze using traditional machine learning algorithms compared to structured data.
- Structured data can be directly ingested into traditional machine learning algorithms, while unstructured data typically requires preprocessing to extract meaningful features.
- Structured data includes free text, images, and videos, whereas unstructured data is limited to numerical data only.
- Structured data is organized in rows and columns, while unstructured data has no predefined format, like text, images, and videos.
**CORRECT:** "Structured data is organized in rows and columns, while unstructured data has no predefined format, like text, images, and videos" is a correct answer.
Structured data is data that is organized and formatted in a way that's easy to process—typically in rows and columns, like spreadsheets or database tables. Examples include patient demographics, lab test results, or admission dates. Unstructured data, on the other hand, does not follow a specific format. It includes data like clinical notes, X-ray images, or audio recordings of doctor-patient conversations. Understanding the difference is essential because machine learning models handle these data types differently. Structured data can often be used directly, while unstructured data usually needs to be processed or transformed into a usable form.
**CORRECT:** "Structured data can be directly ingested into traditional machine learning algorithms, while unstructured data typically requires preprocessing to extract meaningful features" is a correct answer.
Traditional ML models like decision trees, linear regression, or support vector machines expect structured, numerical inputs. Structured data often requires minimal transformation—such as scaling or encoding—before being fed into models. On the other hand, unstructured data like text or images must be processed to extract features (e.g., using NLP for text, or CNNs for images) before models can use them. This preprocessing step helps convert unstructured formats into numerical representations suitable for model training.
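For example, here is a short scikit-learn sketch that converts unstructured clinical notes (invented for the example) into a numeric feature matrix via TF-IDF, which a traditional model can then consume.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Unstructured free text: no rows/columns, no fixed schema.
notes = [
    "Patient reports chest pain and shortness of breath.",
    "Follow-up visit, wound healing well, no infection.",
]

# TF-IDF turns the text into a structured numeric matrix.
X = TfidfVectorizer().fit_transform(notes)
print(X.shape)  # (2 documents, vocabulary-sized feature columns)
```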
**INCORRECT:** "Both structured and unstructured data require complex feature engineering to be compatible with any machine learning algorithm" is incorrect.
While feature engineering can enhance performance for both data types, structured data often requires less complexity. For example, numerical lab results or age can be used almost directly. Unstructured data like images or text usually needs more advanced processing. So, saying "both require complex feature engineering" overstates the case, especially for structured data.
**INCORRECT:** "Structured data includes free text, images, and videos, whereas unstructured data is limited to numerical data only" is incorrect.
This is the opposite of the truth. Structured data is usually numerical or categorical and neatly organized. Free text, images, and videos are unstructured because they don't follow a fixed format.
**INCORRECT:** "Unstructured data is easier to analyze using traditional machine learning algorithms compared to structured data" is incorrect.
Unstructured data is more challenging to analyze with traditional ML algorithms because it first needs to be converted into a structured format. Structured data, being already organized, is easier to handle and analyze using traditional methods.
**References:**
https://aws.amazon.com/compare/the-difference-between-structured-data-and-unstructured-data
Domain: Fundamentals of AI and ML
---
#### 60. Select and order the EC2 instance purchasing options from HIGHEST to LOWEST COST per hour. Each option should be selected one time. (Select and order THREE.)
Note: Select only the correct options, as the type of "Ordering" question is not supported here.
- Spot Instances
- On-Demand
- Savings Plans
**CORRECT:** From HIGHEST to LOWEST cost per hour: On-Demand, Savings Plans, Spot Instances.
**On-Demand** offers the most flexibility but comes at the highest hourly cost. It allows users to pay for compute capacity without long-term commitments, making it ideal for unpredictable workloads, short-term applications, or development/testing environments. Since there are no upfront payments or discounts, it is the most expensive option per hour.
**Savings Plans** offer significant cost savings compared to On-Demand Instances in exchange for a commitment to a specific amount of compute usage over one or three years. This option provides discounted rates while allowing some flexibility in instance types and regions, making it a good choice for businesses with predictable workloads.
**Spot Instances** provide the lowest cost per hour, offering discounts of up to 90% compared to On-Demand pricing. These instances are ideal for workloads that can tolerate interruptions, such as batch processing, big data analysis, and machine learning training. However, they can be terminated by AWS when capacity is needed, making them less reliable for critical applications.
**References:**
https://aws.amazon.com/ec2/pricing
Domain: Applications of Foundation Models
---
#### 61. A biotech startup is running large AI simulations on AWS to predict protein folding structures. They need to optimize the underlying hardware to reduce costs while maintaining high performance for these intensive AI workflows.
Which Amazon EC2 Chip is the best recommendation?
- Intel Xeon
- AWS Graviton3
- AWS Trainium
- AMD EPYC
**CORRECT:** "AWS Trainium" is the correct answer.
AWS Trainium is a custom-designed chip by AWS specifically built for high-performance machine learning model training. It offers powerful, cost-efficient performance for deep learning tasks like simulations, complex AI workflows, and model training. For a biotech startup running heavy AI simulations such as protein folding predictions — which involve very large computations — Trainium is the best choice. It helps optimize both cost and performance by being more efficient than general-purpose CPUs or even some GPUs for large-scale AI training. AWS Trainium instances are designed to give you the highest throughput and the lowest cost for training machine learning models on AWS.
**INCORRECT:** "Intel Xeon" is incorrect.
The Intel Xeon processor is great for general-purpose compute tasks and some machine learning inference workloads. However, it is not optimized for large-scale AI model training or highly intensive simulations like protein folding. Using Intel Xeon would not offer the same cost or performance benefits for this AI-heavy use case.
**INCORRECT:** "AMD EPYC" is incorrect.
The AMD EPYC processor is also excellent for general compute workloads and can provide better price-performance than Intel Xeon in many cases. Still, it is not specialized for training machine learning models or running deep learning simulations at the scale needed for AI-heavy biotech work.
**INCORRECT:** "AWS Graviton3" is incorrect.
AWS Graviton3 is an ARM-based processor that offers great energy efficiency and price-performance for general-purpose applications, web servers, and some ML inference tasks. However, it is not specialized for high-end AI model training like AWS Trainium. For heavy AI training, Graviton3 would not perform as well or be as cost-effective.
**References:**
https://aws.amazon.com/ai/machine-learning/trainium
Domain: Applications of Foundation Models
---
#### 62. A consulting firm is assisting an enterprise in integrating generative AI into its content development and customer engagement processes. The organization seeks to leverage Large Language Models (LLMs) to automate customer support, deliver personalized content, and extract insights from extensive textual data.
When evaluating appropriate use cases for LLMs across various industries and business functions, what should they consider?
- Large Language Models (LLMs) are designed to understand and generate human-like text, question answering, content summarization, and code generation.
- Large Language Models (LLMs) are best suited for automating mechanical operations and real-time control systems in manufacturing environments.
- Large Language Models (LLMs) focus on numerical data modeling, financial simulations, and sensor-based data analytics in industrial settings.
- Large Language Models (LLMs) are primarily used for computer vision tasks such as image recognition, 3D rendering, and multimedia content editing.
**CORRECT:** "Large Language Models (LLMs) are designed to understand and generate human-like text, question answering, content summarization, and code generation" is the correct answer.
Large Language Models (LLMs) are advanced AI models trained on massive amounts of text data. They are specifically built to understand natural language, generate human-like responses, summarize content, translate text, and even write code. These capabilities make LLMs ideal for applications like customer support automation (via chatbots), personalized content delivery (like product recommendations or dynamic emails), and text analytics (such as analyzing customer feedback or extracting insights from documents). Since LLMs work with language, they are perfect for tasks involving communication and knowledge processing across many industries including retail, finance, healthcare, and media.
**INCORRECT:** "Large Language Models (LLMs) are primarily used for computer vision tasks such as image recognition, 3D rendering, and multimedia content editing" is incorrect.
LLMs specialize in natural language processing, not computer vision. Tasks like image recognition or multimedia editing fall under the domain of computer vision models such as CNNs (Convolutional Neural Networks). These models are trained on visual data, not text.
**INCORRECT:** "Large Language Models (LLMs) are best suited for automating mechanical operations and real-time control systems in manufacturing environments" is incorrect.
Automating mechanical processes and real-time control systems typically involves industrial control systems or embedded software using sensor data and programmable logic controllers (PLCs). These tasks are not related to language understanding, which is the strength of LLMs.
**INCORRECT:** "Large Language Models (LLMs) focus on numerical data modeling, financial simulations, and sensor-based data analytics in industrial settings" is incorrect.
LLMs are not designed for tasks focused on numerical data analytics or simulations. These require statistical models, time-series forecasting, or specialized machine learning algorithms—not natural language understanding.
**References:**
https://aws.amazon.com/what-is/large-language-model
Domain: Fundamentals of Generative AI
---
#### 63. A global logistics company wants to build a machine learning model to predict delivery delays across its international shipping network. The data science team works with raw shipment data, including timestamps for package pickup and delivery. As part of their process, they calculate the total transit time by subtracting the pickup timestamp from the delivery timestamp.
Which phase of the machine learning process does this activity best describe?
- Applying compliance filters to training datasets
- Model optimization to reduce algorithmic complexity
- Feature engineering to improve model performance
- Data labeling to improve model accuracy
**CORRECT:** "Feature engineering to improve model performance" is the correct answer.
Feature engineering is the process of transforming raw data into meaningful input variables, or "features," that improve a machine learning model's ability to learn patterns and make accurate predictions. This involves creating new features, modifying existing ones, or selecting the most relevant data. Effective feature engineering can significantly boost model performance by helping the algorithm focus on the most important aspects of the data. For example, converting timestamps into transit times or extracting customer behavior patterns from clickstream data are common feature engineering tasks. It is a critical step before training the model.
**INCORRECT:** "Applying compliance filters to training datasets" is incorrect.
Applying compliance filters refers to removing or masking sensitive data to meet legal or regulatory requirements, such as GDPR or HIPAA. This process does not involve creating new features or calculating values like transit time. Therefore, it is not the correct choice for this scenario.
**INCORRECT:** "Model optimization to reduce algorithmic complexity" is incorrect.
Model optimization focuses on improving the performance of the machine learning algorithm itself, such as tuning hyperparameters or simplifying the model to reduce computation time. This activity happens after the data has been prepared and is unrelated to generating new features from raw data.
**INCORRECT:** "Data labeling to improve model accuracy" is incorrect.
Data labeling involves assigning labels or categories to raw data, especially in supervised learning tasks. For example, labeling emails as "spam" or "not spam." Calculating transit time is not labeling data but creating a new feature, so this is not the correct option.
**References:**
https://aws.amazon.com/what-is/feature-engineering
Domain: Fundamentals of AI and ML
---
#### 64. You have a multi-step AI workflow that uses a foundation model to generate personalized travel itineraries. Each itinerary requires the model to gather flight data from Amazon S3, query real-time pricing through an Amazon RDS database, and then compose a final response. You need an orchestration layer that can hand off tasks between these steps automatically.
Which AWS feature would best manage this multi-step process?
- AWS Glue DataBrew
- Amazon CloudWatch Events
- Amazon SageMaker Pipelines
- Amazon Kendra
**CORRECT:** "Amazon SageMaker Pipelines" is the correct answer.
Amazon SageMaker Pipelines is a purpose-built, fully managed CI/CD service for machine learning workflows. It orchestrates and automates the steps in an AI/ML workflow, ensuring smooth execution from data preprocessing to model inference. In this scenario, SageMaker Pipelines can manage the multi-step workflow end to end: fetching flight data from Amazon S3, querying real-time pricing from the Amazon RDS database, and combining the retrieved data to generate the final travel itinerary. With built-in support for conditional logic, parallel processing, and step-by-step execution tracking, SageMaker Pipelines is the best choice for automating and managing this AI workflow.
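For orientation, the sketch below shows how such a workflow might be wired up with the SageMaker Python SDK. It is a hedged outline under stated assumptions, not a complete solution: the role ARN, container image URI, and script file names are placeholders, and each script would hold the actual S3, RDS, and foundation model logic.

```python
# Hedged sketch of a three-step SageMaker Pipeline for the itinerary workflow.
# Role ARN, image URI, and script file names are placeholder assumptions.
from sagemaker.processing import ScriptProcessor
from sagemaker.workflow.steps import ProcessingStep
from sagemaker.workflow.pipeline import Pipeline

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

processor = ScriptProcessor(
    image_uri="<processing-image-uri>",  # placeholder container image
    command=["python3"],
    instance_type="ml.m5.xlarge",
    instance_count=1,
    role=role,
)

fetch_flights = ProcessingStep(
    name="FetchFlightDataFromS3",
    processor=processor,
    code="fetch_flights.py",        # hypothetical script reading from S3
)

query_pricing = ProcessingStep(
    name="QueryPricingFromRDS",
    processor=processor,
    code="query_pricing.py",        # hypothetical script querying RDS
    depends_on=[fetch_flights],     # hand-off: runs after the S3 fetch
)

compose_itinerary = ProcessingStep(
    name="ComposeItinerary",
    processor=processor,
    code="compose_itinerary.py",    # hypothetical script calling the foundation model
    depends_on=[query_pricing],
)

pipeline = Pipeline(
    name="TravelItineraryPipeline",
    steps=[fetch_flights, query_pricing, compose_itinerary],
)
# pipeline.upsert(role_arn=role)
# pipeline.start()
```

The `depends_on` arguments are what give the pipeline its orchestration behavior: each step starts only after the previous one completes, which is exactly the automatic hand-off the scenario requires.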
**INCORRECT:** "AWS Glue DataBrew" is incorrect.
AWS Glue DataBrew is a visual data preparation tool for cleaning and transforming data. While useful for preprocessing, it does not manage multi-step workflow orchestration or task automation like SageMaker Pipelines.
**INCORRECT:** "Amazon CloudWatch Events" is incorrect.
Amazon CloudWatch Events (now part of Amazon EventBridge) is used for event-driven automation, such as triggering AWS services based on changes. However, it is not designed to manage complex multi-step AI workflows with dependencies.
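To make the contrast concrete, here is a minimal boto3 sketch of what EventBridge is built for. The rule name, event pattern, and target ARN are illustrative assumptions: the service reacts to a single event by invoking a target, rather than coordinating a chain of dependent steps.

```python
# Hedged sketch: EventBridge fires a target when a matching event occurs.
# Rule name, event pattern, and target ARN are illustrative assumptions.
import json
import boto3

events = boto3.client("events")

# React to new objects landing in S3 ...
events.put_rule(
    Name="NewFlightDataUploaded",
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
    }),
)

# ... by invoking one target. There is no notion of dependent steps here.
events.put_targets(
    Rule="NewFlightDataUploaded",
    Targets=[{
        "Id": "1",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
    }],
)
```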
**INCORRECT:** "Amazon Kendra" is incorrect.
Amazon Kendra is an AI-powered enterprise search service. It helps retrieve and rank relevant documents but does not support multi-step workflow orchestration or AI pipeline execution.
**References:**
https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines.html
Domain: Applications of Foundation Models
---
#### 65. A university is building an AI-powered admissions screening tool. The leadership wants to avoid any risk of favoring or rejecting students unfairly based on background.
Which of the following responsible AI dimensions should they prioritize? (Select TWO.)
- Transparency
- Model latency
- Fairness
- Model scalability
- Cost optimization
**CORRECT:** "Fairness" is a correct answer.
Fairness in AI means ensuring that the system treats all individuals and groups equally, without discrimination based on race, gender, socioeconomic status, or other personal characteristics. In the case of a university admissions screening tool, fairness is critical. The goal is to evaluate students purely on their qualifications and potential, not on factors that could introduce bias. If fairness is not prioritized, the AI model might learn from biased data and unfairly favor or reject certain groups of students. This could lead to legal, ethical, and reputational issues for the university. Ensuring fairness helps build trust and supports equal opportunities for all applicants.
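One common way to make fairness measurable is a demographic parity check: compare the tool's acceptance rate across applicant groups. The sketch below uses made-up illustrative data, not anything from the question.

```python
# Minimal sketch of a demographic parity check on admissions decisions.
# Group labels and decisions (1 = accepted, 0 = rejected) are made-up data.
from collections import defaultdict

decisions = [
    ("group_a", 1), ("group_a", 0), ("group_a", 1), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

totals = defaultdict(int)
accepted = defaultdict(int)
for group, decision in decisions:
    totals[group] += 1
    accepted[group] += decision

for group in sorted(totals):
    rate = accepted[group] / totals[group]
    print(f"{group}: acceptance rate = {rate:.2f}")

# A large gap between group acceptance rates does not prove discrimination
# on its own, but it is a standard signal that the model needs bias review.
```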
**CORRECT:** "Transparency" is a correct answer.
Transparency in AI refers to making the system's decision-making process understandable and explainable to humans. This includes being clear about how the AI works, what data it uses, and how decisions are made. In university admissions, transparency is important because students, parents, and other stakeholders need to trust that the process is open and accountable. If the AI system makes a decision that impacts a student's future, the university must be able to explain why that decision was made. This helps prevent misunderstandings and builds confidence in the fairness of the admissions process.
**INCORRECT:** "Model latency" is incorrect.
Model latency refers to the time it takes for an AI model to process input and return a result. It is not a priority for a university admissions tool: decisions are not made in real time, so speed matters far less than fairness and transparency. Focusing on latency would not address the risk of unfair or biased outcomes.
**INCORRECT:** "Cost optimization" is incorrect.
Cost optimization is about managing resources to reduce expenses while maintaining performance. It is not the top priority when dealing with fairness and ethical decision-making. Choosing cheaper solutions without ensuring fairness and transparency could lead to biased outcomes, harming the university's reputation and applicants' trust.
**INCORRECT:** "Model scalability" is incorrect.
Model scalability refers to the AI system's ability to handle larger workloads as demand grows. While scalability is important for systems that serve millions of users, a university admissions tool typically handles a limited number of applications each year. Therefore, scalability is not as important as ensuring that the system is fair and transparent.
**References:**
https://aws.amazon.com/ai/responsible-ai
Domain: Guidelines for Responsible AI