02 AWS AI Practitioner Practice Test

#### 01. What is the primary function of the Stop sequences parameter in foundation models used through Amazon Bedrock? - It determines the maximum number of concurrent inference requests. - It forces the model to include specific keywords in the response. - It defines the minimum and maximum length of the model's response. - It specifies sequences of characters that end the model's generation process. **CORRECT:** "It specifies sequences of characters that end the model's generation process." is the correct answer. The Stop sequences parameter in Amazon Bedrock tells the foundation model when to stop generating text. You can provide one or more specific character sequences (like "###" or "\nUser:") as a stopping point. Once the model encounters any of these sequences while generating a response, it immediately stops further output. This is useful for making chatbot responses shorter, well-bounded, and task-focused, especially when the model tends to continue talking even after the question has been answered. For example, if a chatbot should only reply with a single sentence or stop when it sees a cue for the next message, stop sequences help enforce that behavior. **INCORRECT:** "It forces the model to include specific keywords in the response." is incorrect. This describes prompt engineering or the use of specific instructions, not the stop sequence feature. Stop sequences don't force keyword usage—they only tell the model when to stop generating text. **INCORRECT:** "It determines the maximum number of concurrent inference requests." is incorrect. This is related to system-level performance tuning or API rate limits, not how the model generates responses. Stop sequences are about the content, not request handling. **INCORRECT:** "It defines the minimum and maximum length of the model's response." is incorrect. Length control is done using parameters like max_tokens, not stop sequences. While max_tokens limits how long the response can be, it doesn't stop based on specific phrases like stop sequences do. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html Domain: Applications of Foundation Models --- #### 02. A financial institution is building machine learning models to assess credit risk by analyzing structured data, such as transaction history and customer profiles, as well as unstructured data, such as customer reviews and support interactions. The data science team needs to apply appropriate feature engineering techniques for each data type to ensure accurate risk predictions. What is the key difference in feature engineering tasks for structured data compared to unstructured data in the context of machine learning? (Select TWO.) - Structured data is always easier to preprocess than unstructured data. - Unstructured data, such as text or images, often requires dimensionality reduction techniques like PCA before training. - Unstructured data does not require preprocessing before training machine learning models. - Feature selection is more commonly applied in structured data, whereas feature extraction is essential for unstructured data. - Structured data typically requires tokenization and embeddings, whereas unstructured data does not. **CORRECT:** "Feature selection is more commonly applied in structured data, whereas feature extraction is essential for unstructured data." is the correct answer. Feature engineering is different for structured and unstructured data. With structured data (like transaction history or customer demographics), the data is already organized into clear fields (like age, income, or account balance). So, data scientists often use feature selection to choose the most important fields or columns to improve the model. In contrast, unstructured data (like customer reviews or chat logs) is not organized into columns. It needs feature extraction, which means converting raw data (like text) into a numerical form using techniques like TF-IDF, word embeddings, or NLP pipelines so that machine learning models can understand it. This is a key distinction in how we handle each type of data in machine learning. **CORRECT:** "Unstructured data, such as text or images, often requires dimensionality reduction techniques like PCA before training." is the correct answer. Unstructured data tends to be high-dimensional. For example, converting text into embeddings or images into pixel matrices results in very large feature sets. To reduce complexity and avoid overfitting, dimensionality reduction techniques like PCA (Principal Component Analysis) or t-SNE are often used. These techniques help in simplifying the data while preserving important patterns, making model training more efficient and effective. Structured data, in contrast, usually has fewer features and may not require such reductions. **INCORRECT:** "Structured data typically requires tokenization and embeddings, whereas unstructured data does not." is incorrect. This is incorrect because tokenization and embeddings are primarily used with unstructured data like text. Structured data is already formatted into fields and does not require tokenization. **INCORRECT:** "Structured data is always easier to preprocess than unstructured data." is incorrect. While structured data is usually easier to handle due to its organized format, the word "always" makes this option misleading. Some structured datasets may still require complex preprocessing (like handling missing values or outliers), while certain unstructured data might be simpler depending on the use case. **INCORRECT:** "Unstructured data does not require preprocessing before training machine learning models." is incorrect. This is incorrect. Unstructured data needs significant preprocessing—for example, converting text into vectors or cleaning noisy data. You cannot directly use raw unstructured data in most machine learning models. **References:** https://docs.aws.amazon.com/wellarchitected/latest/machine-learning-lens/feature-engineering.html Domain: Fundamentals of AI and ML --- #### 03. You are developing an AI-powered customer service system that needs to handle a series of tasks, such as identifying the customer's issue, retrieving relevant information from a knowledge base, and then generating a personalized response. This requires multiple steps to be completed in sequence while interacting with different systems. Which feature of Amazon Bedrock would help manage these multi-step tasks efficiently? - Retrieval Augmented Generation (RAG) - Fine-tuning the model for task-specific responses - Amazon Bedrock Agents - Pre-trained models **CORRECT:** "Amazon Bedrock Agents" is the correct answer. Amazon Bedrock Agents is designed to manage multi-step tasks by automating interactions with different systems and orchestrating the completion of tasks in a sequence. This agents can break down complex workflows into individual tasks and interact with various external services, such as retrieving information from a knowledge base and generating personalized responses based on that information. In the context of an AI-powered customer service system, Amazon Bedrock Agents would efficiently manage the sequential nature of tasks like identifying the customer's issue, retrieving relevant information, and generating a response, all within a cohesive and automated framework. **INCORRECT:** "Pre-trained models" is incorrect. Pre-trained models in Amazon Bedrock can provide foundational capabilities, such as understanding customer queries and generating responses, but it does not manage multi-step task workflows or interactions with external systems. **INCORRECT:** "Fine-tuning the model for task-specific responses" is incorrect. Fine-tuning a model improves its accuracy for specific tasks, but it does not address the orchestration of multi-step processes or system interactions, which are critical in this scenario. **INCORRECT:** "Retrieval Augmented Generation (RAG)" is incorrect. RAG focuses on improving the accuracy of generated responses by retrieving relevant information from a knowledge base, but it does not handle the management of sequential tasks or interactions with different systems as required in this use case. **References:** https://aws.amazon.com/bedrock/agents Domain: Applications of Foundation Models --- #### 04. A machine learning team is training a regression model using Amazon SageMaker. They want to systematically explore combinations of learning rate, batch size, and number of epochs to minimize the model's mean squared error on the validation dataset. What is this process known as? - Hyperparameter tuning to optimize model performance - Model compression to reduce inference latency - Feature engineering to reduce model complexity - Model evaluation using cross-validation techniques **CORRECT:** "Hyperparameter tuning to optimize model performance" is the correct answer. Hyperparameter tuning is the process of finding the best set of hyperparameters—such as learning rate, batch size, and number of epochs—that lead to optimal model performance. These parameters are not learned from the data but are set before training begins. In Amazon SageMaker, hyperparameter tuning jobs (HPO) can automatically try many combinations of these settings to minimize a specific metric, like mean squared error in regression tasks. This process helps the model generalize better and achieve higher accuracy or lower error on unseen data. In the question, the team is adjusting multiple training parameters systematically to reduce error, which perfectly describes hyperparameter tuning. **INCORRECT:** "Model evaluation using cross-validation techniques" is incorrect. Cross-validation is a technique used to assess how well a model generalizes to unseen data. It involves splitting the training data into multiple folds, training on some folds, and validating on the remaining ones. This helps evaluate a model's robustness, it doesn't involve changing hyperparameters to improve performance. **INCORRECT:** "Feature engineering to reduce model complexity" is incorrect. Feature engineering involves creating, selecting, or transforming input features to improve model performance. It can help reduce complexity and increase accuracy but it's not the same as adjusting learning rate or batch size, which are hyperparameters. **INCORRECT:** "Model compression to reduce inference latency" is incorrect. Model compression refers to techniques like pruning, quantization, or distillation used to reduce model size and speed up inference. It's typically applied after training, not during training, and has no role in optimizing training metrics like mean squared error. **References:** https://aws.amazon.com/what-is/hyperparameter-tuning https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html Domain: Fundamentals of AI and ML --- #### 05. A retail company manages a large database containing customer transactions, inventory records, and sales analytics. They want to empower non-technical employees, such as store managers and marketing staff, to easily retrieve insights without needing to know SQL or database structures. Which solution should the company implement to achieve this goal? - Use a clustering algorithm to find patterns in user queries and automatically categorize them for later analysis. - Implement a basic keyword-based search tool without the ability to translate full natural language queries into structured formats. - Build a static dashboard with pre-built SQL queries for common reporting tasks that users can select from. - Use a generative AI model to convert natural language instructions into structured SQL queries dynamically. **CORRECT:** "Use a generative AI model to convert natural language instructions into structured SQL queries dynamically." is the correct answer. A generative AI model designed to translate natural language into SQL allows non-technical users to interact with complex databases easily. Instead of writing SQL code, users can simply type natural language questions like "Show me last month's top-selling products," and the AI model automatically creates and runs the correct SQL query. This approach makes data access intuitive and empowers employees across different departments to retrieve insights without technical training. AWS services and best practices support using AI to bridge the gap between human language and structured data operations, improving productivity and decision-making for businesses. **INCORRECT:** "Use a clustering algorithm to find patterns in user queries and automatically categorize them for later analysis." is incorrect. Clustering algorithms are good for finding hidden patterns or grouping similar data points. However, clustering doesn't directly help users retrieve specific data insights. It organizes data but doesn't turn natural language queries into actionable SQL commands, so it doesn't solve the user's main problem. **INCORRECT:** "Build a static dashboard with pre-built SQL queries for common reporting tasks that users can select from." is incorrect. While static dashboards are useful, they are limited to predefined queries. Users can only access the reports that were anticipated by the dashboard designers. If a store manager needs custom or unexpected insights, static dashboards won't allow dynamic querying like generative AI does. **INCORRECT:** "Implement a basic keyword-based search tool without the ability to translate full natural language queries into structured formats." is incorrect. A simple keyword search tool can help users find documents or records based on keywords but lacks the ability to understand full user intent or generate new SQL queries. It is a basic solution that does not support complex data retrieval needs. **References:** https://aws.amazon.com/what-is/gpt Domain: Fundamentals of Generative AI --- #### 06. A company is using Amazon SageMaker Clarify to detect potential bias in their AI model that predicts job candidates' suitability for a role. They observe that female candidates are often ranked lower than their male counterparts, despite having similar qualifications. Which feature of SageMaker Clarify would best help the company identify the root cause of this bias? - Real-time monitoring of model performance in production. - Automated hyperparameter tuning to improve model accuracy. - Fairness detection by comparing outcomes for different groups. - Human audit of model decisions after deployment. **CORRECT:** "Fairness detection by comparing outcomes for different groups." is the correct answer. Amazon SageMaker Clarify provides a feature for detecting bias by comparing the outcomes of different groups, such as gender, race, or age. In this case, the company observes that female candidates are ranked lower than their male counterparts, despite similar qualifications. SageMaker Clarify's fairness detection would allow the company to analyze and compare the outcomes of the model for female and male candidates, helping them identify any systematic bias in the model's predictions. This feature helps to pinpoint the root cause by examining how the model treats different demographic groups. **INCORRECT:** "Human audit of model decisions after deployment." is incorrect. A human audit involves manually reviewing model decisions, but it is not a built-in feature of SageMaker Clarify. It wouldn't directly help identify the root cause of bias within the model like fairness detection. **INCORRECT:** "Real-time monitoring of model performance in production." is incorrect. Real-time monitoring tracks how well a model performs over time but does not directly help identify bias. It focuses more on the model's performance metrics, like accuracy or latency, rather than fairness. **INCORRECT:** "Automated hyperparameter tuning to improve model accuracy." is incorrect. Hyperparameter tuning optimizes a model's performance but does not specifically address fairness or bias. Improving accuracy does not necessarily resolve issues related to biased outcomes for certain groups. **References:** https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-configure-processing-jobs.html Domain: Guidelines for Responsible AI --- #### 07. A travel company wants to use machine learning to forecast customer demand for various holiday packages. The data analysts prefer a platform with ready-made models and quick setup options to avoid complex coding from scratch. They are assessing Amazon SageMaker JumpStart for their requirements. Which TWO statements correctly describe the key features of Amazon SageMaker JumpStart? (Select TWO.) - Includes built-in workflows that allow teams to deploy ML solutions quickly. - Offers pre-built machine learning models for rapid forecasting and predictions. - Automatically translates booking data into multiple languages for analysis. - Automatically identifies fraudulent travel transactions without additional configuration. - Directly provides real-time travel advisory updates without model customization. **CORRECT:** "Offers pre-built machine learning models for rapid forecasting and predictions." is the correct answer. Amazon SageMaker JumpStart provides a collection of pre-trained models and ready-to-use solutions for common machine learning tasks, such as demand forecasting, image classification, and text analysis. This makes it easier for teams—especially those without deep ML expertise—to quickly start projects. For the travel company, JumpStart's pre-built models can help forecast customer demand for holiday packages without writing code from scratch, making it ideal for fast and effective ML adoption. **CORRECT:** "Includes built-in workflows that allow teams to deploy ML solutions quickly." is the correct answer. One of the strengths of SageMaker JumpStart is its built-in end-to-end workflows. These templates guide users through the full machine learning lifecycle, from data processing to model training, evaluation, and deployment. This simplifies complex ML tasks and lets teams focus on results rather than the technical details. It's perfect for data analysts who want to move quickly and avoid spending time on setup and configuration. **INCORRECT:** "Automatically translates booking data into multiple languages for analysis." is incorrect. Translation of text data is handled by Amazon Translate, not SageMaker JumpStart. JumpStart does not focus on language translation tasks. **INCORRECT:** "Directly provides real-time travel advisory updates without model customization." is incorrect. SageMaker JumpStart doesn't offer real-time travel updates. It focuses on ML models and workflows, not live data feeds or external advisories. **INCORRECT:** "Automatically identifies fraudulent travel transactions without additional configuration." is incorrect. Detecting fraud typically requires custom models trained on specific business data. JumpStart may provide fraud detection templates, but they still need configuration and tuning based on your use case. **References:** https://aws.amazon.com/sagemaker/jumpstart Domain: Fundamentals of AI and ML --- #### 08. A financial institution notices that its classification model has high precision but lower recall. They want a metric that reflects both values fairly. What does the F1 Score represent? - The sum of precision and recall - The average of true positive and true negative rates - The difference between true positives and false positives - The harmonic mean of precision and recall **CORRECT:** "The harmonic mean of precision and recall" is the correct answer. The F1 Score is a metric used to evaluate the performance of a classification model, especially when the data is imbalanced or when both precision (how many of the predicted positives are actually correct) and recall (how many actual positives are correctly predicted) matter. The F1 Score is calculated as the harmonic mean of precision and recall, which means it gives a balanced measure that penalizes extreme values. If either precision or recall is very low, the F1 Score will also be low. This makes it an effective metric when a financial institution needs to find the right balance between catching all fraud (recall) and avoiding too many false alarms (precision). **INCORRECT:** "The sum of precision and recall" is incorrect. This would give a simple total without balancing their impact. It doesn't penalize imbalance between the two values and can be misleading when one value is significantly lower. **INCORRECT:** "The difference between true positives and false positives" is incorrect. This describes a part of the confusion matrix, but is not how F1 Score is calculated. It's more related to measuring precision, but not recall or their balance. **INCORRECT:** "The average of true positive and true negative rates" is incorrect. This sounds similar to accuracy or balanced accuracy, but it is not related to the F1 Score. F1 specifically focuses on precision and recall, not true negatives. **References:** https://docs.aws.amazon.com/machine-learning/latest/dg/binary-classification.html Domain: Fundamentals of AI and ML --- #### 09. A generative model in a customer service application occasionally produces inconsistent and imaginative responses, even when the input intent is clear. You want to constrain the model to produce more focused and repeatable outputs without reducing the vocabulary scope significantly. Which hyperparameter modification best supports this objective? - Expand the token limit to ensure more deterministic output - Increase the number of attention heads to refine contextual representation - Reduce model depth to minimize creative reasoning paths - Lower the temperature value to reduce sampling probability dispersion **CORRECT:** "Lower the temperature value to reduce sampling probability dispersion" is the correct answer. The temperature hyperparameter controls the randomness of a generative model's output. A lower temperature value reduces the range of possible token choices, making the output more deterministic, focused, and repeatable. This is ideal for customer service applications where consistency and clarity are more important than creativity. Lowering the temperature doesn't shrink the model's vocabulary but makes the model favor higher-probability tokens over more creative or unexpected ones. This helps the model stay on-topic and avoid imaginative or off-brand responses while still using natural language effectively. **INCORRECT:** "Expand the token limit to ensure more deterministic output" is incorrect. The token limit controls how much text the model can process or generate in one request. While increasing this allows for handling longer inputs or outputs, it does not influence how deterministic or focused the response is. It doesn't directly control randomness or creativity. **INCORRECT:** "Increase the number of attention heads to refine contextual representation" is incorrect. Increasing attention heads affects the architecture of the model itself. This is a training-time decision and cannot be changed during inference. Also, it doesn't directly relate to controlling the creativity or consistency of output during inference. **INCORRECT:** "Reduce model depth to minimize creative reasoning paths" is incorrect. Model depth refers to the number of layers in the neural network. Reducing it would require retraining the model from scratch and would degrade performance. It's also unrelated to output control during inference, and such a change would reduce understanding, not creativity. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html Domain: Applications of Foundation Models --- #### 10. A financial services company is deploying an AI model to assess customer eligibility for loan approval. To ensure customers are evaluated fairly regardless of their demographic or socioeconomic background, which responsible AI principle should the company prioritize? - Fairness - Performance - Sustainability - Explainability **CORRECT:** "Fairness" is the correct answer. Fairness is a key principle of responsible AI that focuses on ensuring that AI systems do not create or reinforce bias and that all users are treated equitably. In the context of loan approval, fairness means that the AI model should not discriminate against customers based on factors such as race, gender, age, or socioeconomic background. The goal is to make decisions that are unbiased and reflect legitimate financial criteria, not historical inequalities or systemic bias in the data. Prioritizing fairness helps organizations comply with ethical standards and regulatory requirements while building trust with customers. **INCORRECT:** "Explainability" is incorrect. Explainability refers to how well the model's decisions can be understood by humans. While important for building trust and transparency, it is not the primary concern in this scenario. The key issue here is how people are treated by the model, not why a decision was made. **INCORRECT:** "Sustainability" is incorrect. Sustainability in AI focuses on minimizing the environmental impact of training and running models. Although it is a responsible practice, it does not address concerns about fairness or bias in loan assessments. **INCORRECT:** "Performance" is incorrect. Performance relates to how accurately and efficiently the model works. It must not come at the cost of ethical principles like fairness, especially when human outcomes are involved. **References:** https://aws.amazon.com/ai/responsible-ai Domain: Guidelines for Responsible AI --- #### 11. A business needs to create a voice-based chatbot to improve customer interaction. They need a service that can convert text to speech and recognize spoken words to create a conversational agent. Which AWS services should they choose for this purpose? - Amazon S3 and Amazon Transcribe - Amazon Rekognition and Amazon Lex - Amazon Kendra and Amazon Textract - Amazon Polly and Amazon Lex **CORRECT:** "Amazon Polly and Amazon Lex" is the correct answer. Amazon Lex is a service designed for building conversational interfaces like chatbots. It can recognize spoken words and interpret user intent using natural language understanding (NLU). Amazon Polly is a text-to-speech service that converts text responses generated by the chatbot into lifelike speech, allowing for interactive voice-based communication. By combining Amazon Lex (for understanding spoken input) and Amazon Polly (for converting text to speech), a business can create a fully functional voice-based chatbot to improve customer interaction. **INCORRECT:** "Amazon Rekognition and Amazon Lex" is incorrect. Amazon Rekognition is used for image and video analysis, not voice processing. Amazon Lex is appropriate, but Rekognition does not fit the voice-based chatbot use case. **INCORRECT:** "Amazon S3 and Amazon Transcribe" is incorrect. Amazon S3 is for data storage, and Amazon Transcribe converts speech to text but does not handle the chatbot functionality or text-to-speech conversion needed for a voice-based chatbot. **INCORRECT:** "Amazon Kendra and Amazon Textract" is incorrect. Amazon Kendra is an enterprise search service, and Amazon Textract extracts text from documents. Neither is designed for creating conversational agents or voice interactions. **References:** https://docs.aws.amazon.com/lex/latest/dg/what-is.html https://docs.aws.amazon.com/polly/latest/dg/what-is.html Domain: Fundamentals of AI and ML --- #### 12. A SaaS provider is building an AI system for multiple enterprise clients using a shared AWS infrastructure. Each client's training data must be isolated and accessible only to their users. Which method offers secure and scalable access control? - Assign a universal IAM role with conditional access based on user metadata. - Use S3 bucket ACLs to assign client-specific permissions. - Configure IAM users for each dataset and manually manage their credentials. - Create a separate S3 bucket per client with custom IAM roles scoped to each. **CORRECT:** "Create a separate S3 bucket per client with custom IAM roles scoped to each." is the correct answer. Creating a separate Amazon S3 bucket for each client and assigning custom IAM roles scoped to that bucket ensures strong data isolation and access control. This approach enables the SaaS provider to enforce per-client data boundaries using IAM policies attached to roles. Clients can be granted access only to their own bucket, and each role can define precise permissions (e.g., read-only, write, delete). This method is scalable, as new clients can be onboarded by simply creating a new bucket and role, and secure, since no client can accidentally or maliciously access another's data. **INCORRECT:** "Use S3 bucket ACLs to assign client-specific permissions." is incorrect. Access Control Lists (ACLs) are a less flexible way of managing S3 permissions. They are difficult to scale and maintain for multiple clients and don't provide the same level of control or auditing as IAM policies. AWS also recommends using IAM policies over ACLs for most use cases. **INCORRECT:** "Assign a universal IAM role with conditional access based on user metadata." is incorrect. Using a single universal IAM role with conditional logic might seem efficient, but it introduces security risks and complexity. Conditions based on user metadata can be difficult to audit and maintain, and any misconfiguration could lead to data leakage between clients. **INCORRECT:** "Configure IAM users for each dataset and manually manage their credentials." is incorrect. Creating IAM users for every dataset is not scalable, especially for a multi-tenant SaaS application. Manual credential management is error-prone and insecure, and it violates best practices, which recommend using roles and temporary credentials instead of long-term IAM user credentials. **References:** https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-best-practices.html https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-iam-policies.html Domain: Security, Compliance, and Governance for AI Solutions --- #### 13. A marketing agency wants to generate personalized product descriptions for different users using a generative AI model. The descriptions must be unique and tailored to user preferences. Which techniques and generative AI model components would be most effective in this scenario? (Select TWO.) - Transformer-based language models with embeddings - Few-shot prompt engineering to guide the model's responses - Supervised learning with classification techniques - Neural networks with labeled time-series data - Real-time inferencing with clustering algorithms **CORRECT:** "Transformer-based language models with embeddings" is the correct answer. Transformer-based language models with embeddings are highly effective for generating personalized product descriptions. These models, like GPT, use embeddings to represent words and phrases in a continuous vector space, capturing semantic relationships. This allows the model to understand context and user preferences better, enabling it to generate unique and tailored product descriptions. Transformer models excel in generating coherent, human-like text that adapts to different user inputs, making them ideal for creating personalized marketing content. **CORRECT:** "Few-shot prompt engineering to guide the model's responses" is the correct answer. Few-shot prompt engineering allows the model to generate more accurate and personalized content with minimal examples. By providing a few specific examples in the prompt, you can guide the generative AI model to create tailored product descriptions that align with individual user preferences. This technique enhances the model's ability to adapt to different contexts without needing extensive fine-tuning, making it both efficient and effective for generating unique marketing content. **INCORRECT:** "Neural networks with labeled time-series data" is incorrect. Neural networks trained on time-series data are not suited for generating personalized product descriptions. They are primarily used for forecasting and analyzing sequential data like stock prices or sensor readings, not for text generation. **INCORRECT:** "Real-time inferencing with clustering algorithms" is incorrect. Clustering algorithms are useful for segmenting users based on behavior but are not designed for generating text or product descriptions. They group data points rather than producing personalized content. **INCORRECT:** "Supervised learning with classification techniques" is incorrect. Supervised learning with classification techniques is used for tasks like identifying categories or labels but is not suitable for generating personalized and unique text. It lacks the flexibility and creativity required for generating tailored product descriptions. **References:** https://aws.amazon.com/what-is/generative-ai Domain: Fundamentals of Generative AI --- #### 14. A financial services firm is developing a generative AI solution to summarize analyst reports. The security team must address model outputs that could unintentionally reveal confidential strategies. Which discipline should be prioritized to address this challenge? - Identity management enhances account security and ensures appropriate access control. - Risk management identifies potential disclosure risks in AI outputs and guides mitigation strategies. - Logging improves system transparency and supports auditability. - Network isolation strengthens system protection and limits external access. **CORRECT:** "Risk management identifies potential disclosure risks in AI outputs and guides mitigation strategies." is the correct answer. Risk management is a core discipline in building secure and responsible AI systems, especially for generative AI use cases. It helps organizations identify, assess, and mitigate potential risks such as unintended disclosures, hallucinations, or data leakage from model outputs. In the context of summarizing analyst reports, there's a risk that the model might generate summaries that inadvertently include sensitive financial strategies, proprietary methods, or non-public insights. Risk management allows the firm to proactively identify these threats, define safety boundaries for model behavior, and implement mitigation measures such as output filtering, human-in-the-loop reviews, or prompt engineering safeguards. AWS emphasizes responsible AI practices where risk management plays a central role in addressing both technical vulnerabilities and business concerns around trust, compliance, and safety. **INCORRECT:** "Identity management enhances account security and ensures appropriate access control." is incorrect. While identity management ensures that only authorized users can access systems and data, it does not directly address the content of AI model outputs. It helps protect who sees what, but not what the AI says. **INCORRECT:** "Network isolation strengthens system protection and limits external access." is incorrect. Network isolation is useful for restricting access to resources from outside the environment. However, it doesn't help evaluate or filter model-generated content, which is the primary concern in this scenario. **INCORRECT:** "Logging improves system transparency and supports auditability." is incorrect. Logging is valuable for tracking system use and detecting anomalies. But by itself, it doesn't prevent or mitigate the risk of confidential information being generated by the model. It may help identify when it happened, but not how to stop it. **References:** https://aws.amazon.com/blogs/security/securing-generative-ai-an-introduction-to-the-generative-ai-security-scoping-matrix Domain: Security, Compliance, and Governance for AI Solutions --- #### 15. An insurance company is building AI models to predict claim risks. The team is analyzing different datasets but needs to understand how labeled data differs from unlabeled data to choose the right ML approach. What is a key difference between labeled and unlabeled data in machine learning? - Labeled data includes only structured fields, whereas unlabeled data has unstructured formats. - Labeled data requires unsupervised learning models, and unlabeled data fits supervised learning approaches. - Labeled data is suitable for time series models, and unlabeled data is limited to regression tasks only. - Labeled data contains input-output pairs, while unlabeled data only includes inputs without known outputs. **CORRECT:** "Labeled data contains input-output pairs, while unlabeled data only includes inputs without known outputs." is the correct answer. Labeled data means that every input example is paired with the correct output (also called a label). For example, if you have an image of a car and it's labeled "car," that's labeled data. Labeled data is used for supervised learning, where the model learns the relationship between inputs and known outputs. In contrast, unlabeled data only contains inputs without any associated outputs. The model must find patterns in the data itself, which is used in unsupervised learning. Understanding this difference helps teams decide whether they need supervised or unsupervised approaches based on the availability of labels. **INCORRECT:** "Labeled data is suitable for time series models, and unlabeled data is limited to regression tasks only." is incorrect. Labeled and unlabeled data are not restricted to specific types of models like time series or regression. Both types of models (time series forecasting, regression, classification) can work with labeled data, and unsupervised learning can work with unlabeled data. The key difference lies in the presence of labels, not the type of task. **INCORRECT:** "Labeled data requires unsupervised learning models, and unlabeled data fits supervised learning approaches." is incorrect. Labeled data is used in supervised learning, where the model learns from known outputs. Unlabeled data is used in unsupervised learning, where the model must find hidden patterns without labeled outputs. **INCORRECT:** "Labeled data includes only structured fields, whereas unlabeled data has unstructured formats." is incorrect. Both labeled and unlabeled data can be structured (like tables) or unstructured (like text, images, or videos). The distinction between labeled and unlabeled data is based on whether there are known outputs for the inputs, not the format of the data. **References:** https://aws.amazon.com/what-is/data-labeling Domain: Fundamentals of AI and ML --- #### 16. A startup aims to launch a generative AI-powered chatbot to enhance customer support. However, the team has minimal DevOps resources and wants to avoid the overhead of provisioning, scaling, and maintaining infrastructure while focusing on rapid development and deployment. Which key advantage of AWS generative AI services allows you to focus on building the application without worrying about managing servers? - Fully managed and serverless architecture enabling faster development - Speed-to-market driven by pre-trained model availability - Built-in accessibility for multi-modal input support - Cost-optimization through pay-per-inference pricing **CORRECT:** "Fully managed and serverless architecture enabling faster development" is the correct answer. A fully managed and serverless architecture means AWS takes care of all the heavy lifting—like provisioning servers, managing infrastructure, scaling to meet demand, and maintaining the backend. This allows developers and startups to focus purely on building and deploying their applications without needing deep DevOps expertise. AWS generative AI services like Amazon Bedrock and Amazon SageMaker JumpStart are great examples. They offer pre-built capabilities with no need to set up or manage servers, which is perfect for teams that want to move quickly. **INCORRECT:** "Built-in accessibility for multi-modal input support" is incorrect. This refers to the ability of some models to handle different input types (text, images, audio, etc.) in one system. While helpful for building richer experiences, it doesn't address the challenge of managing infrastructure or development speed. So, it's not the right fit for this scenario. **INCORRECT:** "Cost-optimization through pay-per-inference pricing" is incorrect. This is a real benefit of using AWS generative AI services—you pay only for what you use. However, this feature relates more to cost control than to infrastructure management. It doesn't directly solve the issue of avoiding provisioning or scaling. **INCORRECT:** "Speed-to-market driven by pre-trained model availability" is incorrect. Having access to pre-trained models definitely helps reduce time to market. However, the key issue in this scenario is managing infrastructure with limited DevOps, not model availability. So while this is a great advantage, it's not the main one needed here. **References:** https://docs.aws.amazon.com/serverless Domain: Fundamentals of Generative AI --- #### 17. A fashion retailer is developing a model to predict the style category of each clothing item. Each item must belong to only one category, such as formal, casual, or sportswear. Which type of classification is best suited? - Multi-label classification - Regression - Multi-class classification - Binary classification **CORRECT:** "Multi-class classification" is the correct answer. Multi-class classification is a type of supervised machine learning where each input is categorized into one class from three or more possible classes. It is used when the output variable has more than two categories and only one category can be correct for each input. For example, classifying clothing as formal, casual, or sportswear is a multi-class problem because each item must belong to exactly one of those categories. This approach uses algorithms like decision trees, logistic regression, or neural networks to learn patterns from labeled data and make predictions. In this scenario, the fashion retailer needs a model that classifies clothing items into one of several distinct and mutually exclusive categories—formal, casual, or sportswear. Since each item belongs to only one category, this makes it a perfect use case for multi-class classification. The model will learn patterns in the clothing features (like color, fabric, cut, etc.) and map them to the correct style category. **INCORRECT:** "Multi-label classification" is incorrect. Multi-label classification is used when an input can belong to multiple categories at the same time. For example, an email could be both "work" and "urgent." However, in this use case, each clothing item belongs to only one category. So, multi-label classification would not be appropriate here. **INCORRECT:** "Binary classification" is incorrect. Binary classification is used when there are only two possible outcomes or classes, such as "yes" or "no," or "spam" vs. "not spam." Since the clothing categories include more than two options (formal, casual, sportswear, etc.), binary classification cannot handle this task accurately. **INCORRECT:** "Regression" is incorrect. Regression is used for predicting continuous numerical values, such as predicting sales or temperature. It is not suitable for categorizing items into discrete labels or classes. Since this task involves choosing a specific category, regression is not applicable. **References:** https://docs.aws.amazon.com/machine-learning/latest/dg/types-of-ml-models.html https://aws.amazon.com/blogs/machine-learning/amazon-comprehend-now-supports-multi-label-custom-classification Domain: Fundamentals of AI and ML --- #### 18. A media company is exploring the use of generative AI to improve productivity and creativity. The company wants to use a model that can perform tasks like summarizing long-form articles, generating promotional content, and translating blogs into multiple languages. The company does not have in-house ML expertise and prefers a fully managed solution with minimal setup effort. They are exploring Amazon Bedrock for implementing this solution and is learning about the types of models it can utilize. Which of the following accurately describes a Foundation Model (FM) as defined in Amazon Bedrock? - A general-purpose model trained on a large diverse set of unstructured data and usable across multiple tasks without retraining - A low-latency model designed for near real-time personalization based on user activity - A model designed exclusively for vector search and document indexing - A model specialized for a single task and trained only on labeled data **CORRECT:** "A general-purpose model trained on a large diverse set of unstructured data and usable across multiple tasks without retraining." is the correct answer. Foundation Models (FMs) in Amazon Bedrock are large, general-purpose models trained on vast amounts of unstructured data such as text, images, code, or audio. These models are designed to understand and generate human-like language, which makes them adaptable to a wide range of tasks without needing to be retrained. For example, the same FM can be used to summarize long articles, generate creative marketing content, and translate text into multiple languages. Amazon Bedrock provides access to FMs from leading providers like AI21 Labs, Anthropic, Cohere, Meta, and Amazon's own Titan models. Since it is a fully managed service, customers can integrate these models into their applications using simple APIs—without needing machine learning expertise or infrastructure setup. **INCORRECT:** "A model specialized for a single task and trained only on labeled data." is incorrect. This describes a traditional machine learning model, not a foundation model. Foundation Models are general-purpose and trained on unlabeled and diverse data, making them useful for many tasks. Specialized models require labeled datasets and are not flexible without retraining, which makes them less ideal for dynamic tasks like content generation or translation. **INCORRECT:** "A model designed exclusively for vector search and document indexing." is incorrect. This option refers to vector database search systems or embedding models—not Foundation Models. While FMs can generate embeddings that help with semantic search, they are not exclusively designed for search or indexing. FMs have broader capabilities, such as generating content, understanding context, and answering questions. **INCORRECT:** "A low-latency model designed for near real-time personalization based on user activity." is incorrect. This describes a real-time recommendation or personalization system, which is often powered by smaller, task-specific models optimized for speed. On the other hand, Foundation Models focus on general-purpose understanding and generation tasks, and although they can contribute to personalization, that is not their primary design goal. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/foundation-models-reference.html https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html Domain: Fundamentals of Generative AI --- #### 19. A fintech startup is developing several machine learning (ML) models for fraud detection, credit scoring, and personalized financial recommendations. Each model is managed by a separate team with different responsibilities. To minimize the risk of accidental model misuse or unauthorized access, the startup wants to implement least-privilege access controls, ensuring team members only have access to the specific SageMaker resources they need to perform their job roles. Which Amazon SageMaker tool is best suited to help the company define and manage fine-grained access controls for different ML teams? - Use AWS IAM Password Policy to enforce access token rotation as a substitute for permission controls. - Use Amazon SageMaker Role Manager to define and assign fine-grained, role-based permissions aligned with the least-privilege principle. - Use Amazon SageMaker Model Monitor to limit login access and detect unauthorized user activities in real-time. - Use Amazon SageMaker Model Cards to set usage quotas and allocate model access based on documentation compliance. **CORRECT:** "Use Amazon SageMaker Role Manager to define and assign fine-grained, role-based permissions aligned with the least-privilege principle." is the correct answer. Amazon SageMaker Role Manager is a governance tool that allows organizations to define and manage fine-grained access controls in alignment with the principle of least privilege. This tool helps administrators create custom roles tailored to different job functions (e.g., data scientists, ML engineers, analysts), specifying exactly what each role can access within SageMaker, such as training jobs, models, endpoints, and notebooks. By using SageMaker Role Manager, the fintech startup can ensure that each team only has access to the specific ML resources needed for their role, reducing the risk of accidental misuse or unauthorized access. It simplifies security management while maintaining productivity and compliance. **INCORRECT:** "Use Amazon SageMaker Model Cards to set usage quotas and allocate model access based on documentation compliance." is incorrect. Model Cards in SageMaker are used for documentation and transparency—they provide details about a model's purpose, performance, ethical considerations, and limitations. They do not enforce access controls, nor can they manage permissions or usage quotas. **INCORRECT:** "Use Amazon SageMaker Model Monitor to limit login access and detect unauthorized user activities in real-time." is incorrect. Model Monitor is designed to track model performance in production, such as detecting data drift or bias. It does not control user access or prevent misuse. It's a monitoring tool, not an access control mechanism. **INCORRECT:** "Use AWS IAM Password Policy to enforce access token rotation as a substitute for permission controls." is incorrect. IAM password policies enforce security best practices for user credentials but do not provide fine-grained resource-level access control. They are not a substitute for role-based access in SageMaker or any other service. **References:** https://docs.aws.amazon.com/sagemaker/latest/dg/role-manager.html Domain: Security, Compliance, and Governance for AI Solutions --- #### 20. A team is fine-tuning a generative AI model to create creative writing prompts. To encourage more coherent and focused outputs, they want to limit the number of most likely next words the model can consider during generation. Which parameter should they adjust to achieve this? - Temperature - Top-k - Max Token Length - Top-p **CORRECT:** "Top-k" is the correct answer. Top-k is a decoding parameter used in generative AI models that limits the number of possible next words the model can choose from during text generation. When top-k is set, the model only considers the k most likely next tokens (words or subwords) based on its probability distribution, effectively filtering out less probable or irrelevant options. This can help make the output more focused and coherent, especially when generating creative or structured text like writing prompts. By narrowing the model's choices to the most likely tokens, it reduces randomness and encourages outputs that stay on topic and maintain logical flow. This is especially useful in applications where creativity needs to be guided and not completely open-ended. **INCORRECT:** "Max Token Length" is incorrect. This parameter controls how long the generated output can be, not which words are chosen during generation. It sets a hard limit on the number of tokens (words or characters) the model can generate, but doesn't influence coherence or word selection. **INCORRECT:** "Temperature" is incorrect. Temperature controls the randomness of the output. A lower temperature makes the model more deterministic, while a higher value makes it more creative. Although it affects generation quality, it doesn't limit the number of words considered—it just changes how probabilities are interpreted. **INCORRECT:** "Top-p" is incorrect. Top-p limits the percentage of next word choices to the smallest set of words whose combined probability exceeds a certain threshold (e.g., 0.9). While it helps with coherent text too, it works differently from top-k. Top-p adapts to the shape of the distribution, whereas top-k applies a fixed cut-off. The question specifically asks about limiting the number of possible next words, which makes top-k more precise. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html Domain: Applications of Foundation Models --- #### 21. You are building a real-time AI summarization system to be accessed globally. The model must support sudden surges in traffic with automatic scaling and require as little hands-on infrastructure configuration as possible. Which AWS-based solution most effectively meets this requirement? - Amazon SageMaker asynchronous inference with autoscaling configuration - Amazon ECS service with containerized model and application autoscaling - Amazon EC2 GPU fleet with custom horizontal scaling logic - Amazon Bedrock for serverless foundation model access with elastic scaling **CORRECT:** "Amazon Bedrock for serverless foundation model access with elastic scaling" is the correct answer. Amazon Bedrock is a fully managed, serverless platform that allows developers to access and integrate foundation models from leading AI providers without managing any underlying infrastructure. It automatically scales based on demand, making it ideal for real-time AI summarization systems with unpredictable traffic. With Bedrock, there's no need to configure servers, clusters, or autoscaling policies—you simply send requests to the model endpoint, and the service handles the rest. This makes it the most effective choice for globally distributed systems that need to support sudden surges in usage while minimizing operational overhead. **INCORRECT:** "Amazon SageMaker asynchronous inference with autoscaling configuration" is incorrect. While SageMaker asynchronous inference supports autoscaling and is cost-effective for batch or delayed workloads, it's not optimized for real-time use cases where low latency is critical. It introduces delays due to queuing and response polling. **INCORRECT:** "Amazon EC2 GPU fleet with custom horizontal scaling logic" is incorrect. Using EC2 GPU instances offers high control and customization but requires manual infrastructure management and complex scaling logic. It lacks the ease and elasticity of a serverless solution, making it unsuitable when you want minimal hands-on configuration. **INCORRECT:** "Amazon ECS service with containerized model and application autoscaling" is incorrect. ECS (Elastic Container Service) allows containerized model deployment and supports autoscaling, but you must manage clusters, scaling policies, and container health, which increases operational complexity. It's more effort to configure and scale than serverless solutions like Bedrock. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html Domain: Applications of Foundation Models --- #### 22. A financial institution is migrating sensitive financial data to AWS Cloud services. They must understand the AWS shared responsibility model to comply with regulatory standards. Which TWO statements best describe the Shared Responsibility Model between AWS and the financial institution? (Select TWO.) - The financial institution must physically secure the AWS data centers used for data storage. - AWS is responsible for managing the customer's internal compliance policies and procedures. - AWS ensures the customer's financial data is backed up regularly at the application level. - AWS is responsible for maintaining and protecting the cloud hardware infrastructure. - The financial institution is responsible for data encryption, identity management, and application security. **CORRECT:** "AWS is responsible for maintaining and protecting the cloud hardware infrastructure." is the correct answer. Under the AWS Shared Responsibility Model, AWS is responsible for the "security of the cloud". This includes maintaining and securing the physical infrastructure, such as servers, networking hardware, storage, and data centers. AWS ensures that this global infrastructure is protected from unauthorized access, power failures, and other physical threats. **CORRECT:** "The financial institution is responsible for data encryption, identity management, and application security." is the correct answer. Customers (in this case, the financial institution) are responsible for the "security in the cloud." This includes encrypting their data, managing user access and IAM (Identity and Access Management), as well as securing the applications they run on AWS. These responsibilities are critical for complying with industry regulations and ensuring the confidentiality and integrity of sensitive financial data. **INCORRECT:** "AWS is responsible for managing the customer's internal compliance policies and procedures." is incorrect. AWS provides tools and compliance certifications, but it is not responsible for a customer's internal policies. The financial institution must define and manage their own compliance processes based on their industry requirements. **INCORRECT:** "The financial institution must physically secure the AWS data centers used for data storage." is incorrect. AWS handles the physical security of its data centers. Customers do not have access to AWS data centers and are not responsible for securing the facilities. **INCORRECT:** "AWS ensures the customer's financial data is backed up regularly at the application level." is incorrect. Backups at the application level are the customer's responsibility. While AWS offers services like Amazon S3, RDS, and backup tools, the customer must configure and manage backups based on their requirements. **References:** https://aws.amazon.com/compliance/shared-responsibility-model Domain: Security, Compliance, and Governance for AI Solutions --- #### 23. Arrange the prompt engineering techniques from LEAST structured to MOST structured. (Select and order THREE.) Note: Select only the correct options, as the type of "Ordering" question is not supported here. - Few-shot prompting - Zero-shot prompting - Chain-of-thought prompting **CORRECT:** "Zero-shot prompting" is the correct answer. Zero-shot prompting is the least structured approach in prompt engineering. It involves providing a model with a simple instruction without any examples. The model must generate a response based solely on its pre-trained knowledge. This technique is useful when dealing with general knowledge tasks but may lead to less accurate or unpredictable responses for complex queries. Since it lacks context or training examples, it relies entirely on the model's ability to infer the correct answer from the given prompt. **CORRECT:** "Few-shot prompting" is the correct answer. Few-shot prompting adds some structure by including a few examples in the prompt before asking the model to generate an answer. This method helps the model understand the expected response format and context better than zero-shot prompting. By providing relevant examples, few-shot prompting improves the model's accuracy and reliability, especially in cases where the task is nuanced or requires specific patterns. It serves as a middle ground between zero-shot and chain-of-thought prompting. **CORRECT:** "Chain-of-thought prompting" is the correct answer. Chain-of-thought prompting is the most structured technique. It instructs the model to break down complex problems into step-by-step reasoning before arriving at a final answer. This method is particularly effective for tasks that require logical reasoning, multi-step calculations, or in-depth problem-solving. By explicitly guiding the model to reason through a problem, chain-of-thought prompting significantly enhances the quality and accuracy of responses, especially for complex AI tasks. **References:** https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-prompt-engineering.html https://aws.amazon.com/what-is/prompt-engineering Domain: Applications of Foundation Models --- #### 24. You are developing a language translation AI application and want to ensure that it follows ethical guidelines and avoids any societal bias. Which practice would be most effective for maintaining responsible AI? - Using a diverse dataset for training - Choosing the fastest algorithm for the task - Regularly updating the software and models - Implementing auto-scaling to handle more requests **CORRECT:** "Using a diverse dataset for training" is the correct answer. Using a diverse dataset for training is critical in ensuring that the language translation AI application does not exhibit societal biases and reflects a more inclusive range of language patterns and usages. Diverse datasets help in training models that can generalize better to various demographic groups, languages, dialects, and contexts, and also minimize the risk of reinforcing existing biases. This approach is aligned with ethical AI practices, which emphasize fairness and inclusivity in AI systems. **INCORRECT:** "Regularly updating the software and models" is incorrect. Regular updates can improve system performance and security but do not specifically address the ethical considerations related to bias in training data or algorithms. **INCORRECT:** "Implementing auto-scaling to handle more requests" is incorrect. Auto-scaling improves application availability and performance under varying loads but does not contribute to addressing bias or ensuring ethical AI practices. **INCORRECT:** "Choosing the fastest algorithm for the task" is incorrect. Selecting the fastest algorithm may enhance the application's efficiency but does not inherently address ethical issues such as bias or fairness in the AI system. **References:** https://aws.amazon.com/blogs/enterprise-strategy/your-ai-is-only-as-good-as-your-data Domain: Guidelines for Responsible AI --- #### 25. Your AI-powered legal document summarization model needs real-time access to a large knowledge base and must integrate structured and unstructured data sources. Which AWS service should you use? - Amazon Textract - Amazon Translate - Amazon OpenSearch Service - Amazon Kendra **CORRECT:** "Amazon Kendra" is the correct answer. Amazon Kendra is an AI-powered enterprise search service that provides real-time access to structured and unstructured data across various sources. It is designed to integrate knowledge bases, documents, and databases, making it ideal for an AI-powered legal document summarization model that needs quick and accurate access to relevant information. With natural language processing (NLP) capabilities, Amazon Kendra allows users to search and retrieve information efficiently from PDFs, Word documents, emails, and other legal data sources. It also supports integrations with Amazon S3, SharePoint, RDS, and other structured data repositories, ensuring comprehensive knowledge retrieval. Since legal document summarization involves analyzing large volumes of text, Amazon Kendra's semantic search capabilities help understand context, extract key information, and improve accuracy in AI-driven summarization models. **INCORRECT:** "Amazon Translate" is incorrect. Amazon Translate is a machine translation service that converts text between languages. It is not used for retrieving legal knowledge from structured and unstructured sources. **INCORRECT:** "Amazon OpenSearch Service" is incorrect. Amazon OpenSearch Service is designed for search and analytics on log data but lacks built-in NLP capabilities for understanding and summarizing legal documents effectively. **INCORRECT:** "Amazon Textract" is incorrect. Amazon Textract is an OCR (Optical Character Recognition) service that extracts text from scanned documents but does not perform real-time knowledge retrieval or integrate structured and unstructured data sources. **References:** https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html Domain: Applications of Foundation Models --- #### 26. A financial AI system processes confidential transaction records on AWS and stores results in Amazon S3. The security team wants to ensure all encryption keys are centrally managed, and access to those keys is restricted by role. What is the most appropriate solution? - Use client-side encryption and store the keys in a local KMS server - Configure Amazon S3 bucket policies to handle encryption at rest - Encrypt data using SSE-KMS with AWS-managed keys (aws/s3) - Use AWS KMS with customer-managed keys and IAM policies for fine-grained control **CORRECT:** "Use AWS KMS with customer-managed keys and IAM policies for fine-grained control" is the correct answer. AWS Key Management Service (KMS) allows users to create customer-managed keys (CMKs), giving them full control over key policies, rotation, deletion, and access. When sensitive financial data is involved—like transaction records—centralized key management with strict IAM-based access control is essential. Using CMKs lets the security team define fine-grained permissions for which roles or services can use the key for encryption or decryption. This approach ensures that only authorized roles can access encrypted data, and all key usage is logged in AWS CloudTrail for auditing. It aligns with security best practices for regulated industries like finance. **INCORRECT:** "Encrypt data using SSE-KMS with AWS-managed keys (aws/s3)" is incorrect. This option uses SSE-KMS with AWS-managed keys, which means AWS controls the key management lifecycle. While secure and easy to use, it doesn't offer the same granular control or auditability as customer-managed keys. It's not suitable when the organization needs central key control and fine-grained role restrictions. **INCORRECT:** "Use client-side encryption and store the keys in a local KMS server" is incorrect. Client-side encryption shifts the responsibility for encryption to the application, which can be complex and harder to manage at scale. Using a local KMS server breaks the centralized AWS-native security model, introduces operational overhead, and could reduce security if not properly configured and monitored. **INCORRECT:** "Configure Amazon S3 bucket policies to handle encryption at rest" is incorrect. S3 bucket policies control who can access or perform actions on the bucket, but they don't manage encryption keys. Encryption at rest must be defined using server-side encryption options like SSE-S3, SSE-KMS, or client-side encryption. Bucket policies alone can't provide key management or access restrictions to KMS keys. **References:** https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html https://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html Domain: Security, Compliance, and Governance for AI Solutions --- #### 27. An online retailer receives thousands of customer product reviews every month on their e-commerce platform. The business team wants to extract valuable insights from these reviews to understand what customers talk about and automatically organize them into categories. The team decides to use Amazon Comprehend to automate this analysis. Question: Which of the following actions can Amazon Comprehend perform to meet these goals? (Select TWO) - Summarize customer reviews into a single paragraph. - Detect key entities such as product names, brands, or features. - Classify reviews by sentiment or topics. - Generate a response for each customer review using AI. - Recommend products based on customer purchase history. **CORRECT:** "Detect key entities such as product names, brands, or features." is the correct answer. Amazon Comprehend uses Natural Language Processing (NLP) to identify and extract key entities in unstructured text. Entities can include product names, company names, places, dates, quantities, and more. In the context of customer reviews, Comprehend can recognize specific products, brands, and features mentioned by customers, which helps the business understand what topics are being frequently discussed. This supports better decision-making for marketing, product development, and customer experience improvements. **CORRECT:** "Classify reviews by sentiment or topics" is the correct answer. Amazon Comprehend includes built-in capabilities for sentiment analysis (positive, negative, neutral, or mixed) and topic modeling. This allows companies to automatically determine the tone of customer feedback and organize reviews into relevant categories, such as delivery issues, product quality, or pricing concerns. These insights help companies respond faster and prioritize areas needing attention, making this a perfect match for the retailer's goal of organizing reviews and extracting valuable insights. **INCORRECT:** "Generate a response for each customer review using AI" is incorrect. Amazon Comprehend does not generate text responses. This task would require a generative AI service, such as Amazon Bedrock (using models like Claude or Titan) or a custom chatbot using Amazon Lex. Comprehend is designed for analyzing and understanding text, not creating it. **INCORRECT:** "Recommend products based on customer purchase history" is incorrect. Product recommendation systems typically use collaborative filtering or machine learning models available through Amazon Personalize. Amazon Comprehend does not analyze user purchase history or generate recommendations. Its focus is on analyzing the content of text, not historical behavioral data. **INCORRECT:** "Summarize customer reviews into a single paragraph" is incorrect. Text summarization is not a native feature of Amazon Comprehend. While it can identify key phrases and sentiments, generating summaries would require a generative AI service such as Amazon Bedrock or a fine-tuned model on Amazon SageMaker that supports text summarization. **References:** https://docs.aws.amazon.com/comprehend/latest/dg/what-is.html https://aws.amazon.com/comprehend/features Domain: Fundamentals of AI and ML --- #### 28. A healthcare organization is analyzing a large dataset containing patient medical records, treatment histories, and health outcomes to gain insights into patient care and optimize treatment plans. The team is focused on calculating various statistical measures to summarize the data and using visualizations to identify trends and correlations. These tasks are crucial for understanding the underlying patterns and relationships in the data before moving on to building predictive models or advanced analytics. Which stage of the data science process does this work primarily fall under? - Model inference - Exploratory Data Analysis (EDA) - Data collection - Model training **CORRECT:** "Exploratory Data Analysis (EDA)" is the correct answer. Exploratory Data Analysis (EDA) is an early step in the data science process. It helps teams understand the structure, patterns, and relationships in their data before they build predictive models. In this scenario, the healthcare organization is calculating statistics and creating visualizations to explore medical records and treatment outcomes. These tasks—like finding averages, checking distributions, identifying correlations, and spotting missing values—are all part of EDA. The insights from EDA help guide decisions on data cleaning, feature selection, and model choice. It's like getting to know your data before you start making predictions. **INCORRECT:** "Model training" is incorrect. This step happens after EDA. It involves feeding clean, structured data into ML algorithms, not exploring the data or visualizing trends. **INCORRECT:** "Data collection" is incorrect. Data collection comes even earlier in the process. It involves gathering raw data but does not include analysis or visualization. **INCORRECT:** "Model inference" is incorrect. Model inference is the end of the pipeline. It's about making predictions with a model that has already been trained, very different from analyzing and summarizing raw data. **References:** https://aws.amazon.com/blogs/machine-learning/exploratory-data-analysis-feature-engineering-and-operationalizing-your-data-flow-into-your-ml-pipeline-with-amazon-sagemaker-data-wrangler Domain: Fundamentals of AI and ML --- #### 29. A financial services company has deployed a customer support chatbot powered by a generative AI model to handle queries about account balances, loan options, and transaction history. However, users have started experimenting with prompt injection attacks, where they craft inputs designed to manipulate the model's behavior—for example, trying to override system instructions or produce misleading financial advice. To protect the chatbot from such manipulation and ensure reliable, safe responses, the development team wants to implement the best mitigation strategy against prompt injection. Which approach is most effective in this scenario? - Use a larger foundation model with more parameters for better understanding the input context. - Implement input validation and sanitization before passing user input to the generative model. - Enable multi-turn memory to keep track of the full conversation context. - Use spelling and grammar correction to remove potentially confusing language. **CORRECT:** "Implement input validation and sanitization before passing user input to the generative model." is the correct answer. Prompt injection is a security risk in generative AI systems where users craft prompts to trick the model into ignoring system instructions or behaving unexpectedly. One of the most effective defenses against prompt injection is to validate and sanitize user input before it is passed to the model. This involves filtering out suspicious or manipulative language, removing prompt-like structures (e.g., "Ignore the above instructions…"), and enforcing strict formatting rules. By cleaning the input, developers can reduce the chances of a user injecting harmful instructions or influencing the model to produce unsafe or misleading outputs. This helps maintain the chatbot's reliability, especially in sensitive domains like financial services where trust and safety are critical. **INCORRECT:** "Use spelling and grammar correction to remove potentially confusing language." is incorrect. While spelling and grammar correction can improve user experience and model comprehension, it does not directly protect against malicious input patterns or instruction injection. Prompt injection is a logic-based attack, not a language quality issue. **INCORRECT:** "Enable multi-turn memory to keep track of the full conversation context." is incorrect. Multi-turn memory helps models remember context across a conversation, improving coherence. However, it does not prevent users from embedding harmful prompts in a single message. It can actually make things worse if malicious prompts are retained in memory. **INCORRECT:** "Use a larger foundation model with more parameters for better understanding the input context." is incorrect. Using a larger model may improve understanding and response quality, but it does not inherently protect against prompt injection. Bigger models can still be misled by cleverly crafted inputs if no validation is in place. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-injection.html https://docs.aws.amazon.com/prescriptive-guidance/latest/llm-prompt-engineering-best-practices/common-attacks.html Domain: Guidelines for Responsible AI --- #### 30. Case Study: A technology consulting firm manages large volumes of unstructured text data from client feedback forms, emails, and social media posts. The firm wants to use AWS AI services and foundation models to classify sentiment, extract key topics, and generate summarized reports for each client. Question: The firm wants to create automated summaries of client feedback by using a foundation model (FM) that can handle lengthy text inputs. Which approach is the MOST cost-effective and requires the LEAST custom training? - Fine-tuning a large foundation model with domain-specific data. - Retrieval augmented generation (RAG) with the firm's feedback data for context. - Continued pre-training of an open source foundation model on the firm's entire corpus. - Zero-shot prompting with no domain context. **CORRECT:** "Retrieval Augmented Generation (RAG) with the firm's feedback data for context." is the correct answer. RAG is a powerful technique that combines information retrieval with generative AI models. Instead of fine-tuning a model, RAG dynamically fetches relevant client feedback data from a knowledge base and provides it as context for the foundation model. This approach is cost-effective because it eliminates the need for extensive training while improving the accuracy of generated summaries. AWS services like Amazon Bedrock and Amazon Kendra can be used for retrieval, while models like Anthropic Claude or Llama can generate high-quality summaries. By using RAG, the firm can handle lengthy text inputs efficiently and ensure that responses are relevant to each client's needs. This method reduces costs, improves scalability, and minimizes the need for custom model training. **INCORRECT:** "Fine-tuning a large foundation model with domain-specific data" is incorrect. Fine-tuning a large foundation model requires significant computational resources and expertise. It involves modifying a pre-trained model using labeled client feedback data, which can be expensive and time-consuming. Since the firm only needs sentiment classification, topic extraction, and summarization, fine-tuning is unnecessary. RAG provides a more cost-effective and flexible alternative. **INCORRECT:** "Continued pre-training of an open-source foundation model on the firm's entire corpus" is incorrect. Pre-training involves training a model on massive amounts of text data from scratch or continuing training on a new dataset. This process is even more resource-intensive than fine-tuning and requires high-end infrastructure, making it impractical for cost-conscious organizations. **INCORRECT:** "Zero-shot prompting with no domain context" is incorrect. Zero-shot prompting allows a model to generate responses without additional context or examples. While this is a quick and cheap approach, it may lead to inaccurate or generic summaries because the model lacks specific client data. RAG enhances output quality by providing relevant information at runtime, making it a better solution for personalized and meaningful client feedback summaries. **References:** https://aws.amazon.com/what-is/retrieval-augmented-generation Domain: Fundamentals of Generative AI --- #### 31. A retail company wants to deploy a generative AI chatbot without training a model from scratch. They choose to use a foundation model hosted on AWS. Which of the following are advantages of using a foundation model? (Select TWO.) - It provides a flexible base that can be reused across multiple domains and applications. - It automatically filters out all biased or offensive content without additional safeguards. - It removes the need for any human oversight or feedback during development. - It reduces the need for training from scratch and can be fine-tuned for specific use cases. - It eliminates the need for prompt engineering or customization. **CORRECT:** "It reduces the need for training from scratch and can be fine-tuned for specific use cases." is the correct answer. Foundation models are large pre-trained models that have been trained on vast amounts of general data. One of their biggest advantages is that you don't have to start from scratch. Instead, you can fine-tune or customize them for your specific business needs, such as creating a chatbot for customer service. This saves significant time, resources, and computational power. Fine-tuning helps the model adapt its responses to fit the company's tone, industry, and customer expectations. **CORRECT:** "It provides a flexible base that can be reused across multiple domains and applications." is the correct answer. Foundation models are designed to be general-purpose. Because they understand and generate natural language across a wide range of topics, they can be reused for many applications such as summarization, translation, chatbots, and more. This flexibility makes them ideal for businesses that want to deploy AI solutions in various parts of their operations without building new models for each task. **INCORRECT:** "It removes the need for any human oversight or feedback during development." is incorrect. Even with powerful models, human oversight is still important, especially to review output quality, address errors, and prevent misuse. Responsible AI requires human involvement for validation, testing, and continuous improvement. **INCORRECT:** "It automatically filters out all biased or offensive content without additional safeguards." is incorrect. Foundation models can reduce harmful outputs, but they are not perfect. Developers must still implement content moderation, ethical checks, and policies to ensure safe usage. Additional safeguards and monitoring are needed to handle edge cases. **INCORRECT:** "It eliminates the need for prompt engineering or customization." is incorrect. Prompt engineering is often necessary to get the best performance from a foundation model. Customizing prompts helps ensure the responses are relevant, accurate, and aligned with your specific use case. **References:** https://aws.amazon.com/what-is/foundation-models https://docs.aws.amazon.com/bedrock/latest/userguide/custom-models.html Domain: Fundamentals of Generative AI --- #### 32. A financial company is expanding its AI capabilities by integrating Large Language Models (LLMs) into its customer service and fraud detection systems. The company want to use AWS Cloud to ensure scalability, security, and efficient management of LLMs. The development team is searching for AWS services that provide end-to-end support for training, deploying, and managing these models while maintaining seamless cloud integration. Which AWS services should the company use to build and manage its LLM-powered AI solutions? (Select TWO.) - Amazon SageMaker - AWS Glue - Amazon Kinesis - Amazon Bedrock - Amazon Rekognition **CORRECT:** "Amazon Bedrock" is the correct answer. Amazon Bedrock is a fully managed service that allows you to build and scale generative AI applications using pre-trained foundation models from leading AI companies via an API. It offers serverless deployment, so there's no need to manage infrastructure. Bedrock integrates well with AWS services and provides built-in security and compliance, making it ideal for companies that want to implement Large Language Models (LLMs) quickly and securely. **CORRECT:** "Amazon SageMaker" is the correct answer. Amazon SageMaker is a comprehensive machine learning service that provides tools to build, train, fine-tune, deploy, and manage machine learning models, including LLMs. It supports custom model development, automatic model tuning, MLOps, and model hosting. For a financial company with specific needs (e.g., fraud detection or sensitive data handling), SageMaker is excellent for building custom LLM solutions with high control and enterprise-grade security. **INCORRECT:** "Amazon Rekognition" is incorrect. Amazon Rekognition is used for image and video analysis, such as facial recognition or object detection. It is not designed for handling LLMs or text-based AI workflows. **INCORRECT:** "AWS Glue" is incorrect. AWS Glue is primarily a data integration service for preparing and transforming data for analytics. While it supports data pipelines, it is not used for training or managing LLMs. **INCORRECT:** "Amazon Kinesis" is incorrect. Amazon Kinesis is designed for real-time data streaming and analytics. It can work with data pipelines feeding into ML models, but it doesn't offer direct support for training or managing LLMs. **References:** https://aws.amazon.com/bedrock https://aws.amazon.com/sagemaker Domain: Fundamentals of Generative AI --- #### 33. Your company is developing an AI model and must ensure that the data used is in compliance with data governance policies and regulatory requirements. Which of the following strategies provides the most effective approach to data governance? - Apply S3 bucket policies to enforce organization-wide compliance. - Use IAM roles to restrict developer access to training datasets. - Anonymize all data to ensure it meets compliance standards. - Apply logging, retention, and monitoring for data lifecycle management. **CORRECT:** "Apply logging, retention, and monitoring for data lifecycle management." is the correct answer. This is the most effective approach to data governance because it provides visibility, accountability, and control over how data is used and stored throughout its lifecycle. Logging ensures that every access or change to the data is recorded, retention policies define how long data should be stored, and monitoring helps detect unusual activity or potential violations. These practices help organizations comply with regulatory requirements such as GDPR or HIPAA and are aligned with AWS best practices for governance. A well-managed data lifecycle ensures that sensitive data is used responsibly and that expired or unnecessary data is deleted in a timely manner. **INCORRECT:** "Use IAM roles to restrict developer access to training datasets." is incorrect. While IAM roles are important for access control, they are just one part of a complete data governance strategy. They do not cover the full data lifecycle, such as logging, monitoring, or retention, which are critical for compliance and auditability. **INCORRECT:** "Anonymize all data to ensure it meets compliance standards" is incorrect. Anonymization helps protect sensitive information but is not a complete governance strategy. Some regulations require more than just anonymization—such as audit logging, data retention policies, and access tracking. Also, not all data can be anonymized if it needs to retain certain attributes for training purposes. **INCORRECT:** "Apply S3 bucket policies to enforce organization-wide compliance" is incorrect. S3 bucket policies control access at the storage level, but they are not sufficient alone for ensuring compliance. They don't provide insights into how data is used or track data over time, and they lack retention or monitoring capabilities. **References:** https://aws.amazon.com/compliance/data-protection Domain: Security, Compliance, and Governance for AI Solutions --- #### 34. An AI startup is exploring generative AI solutions for creating personalized music tracks based on user preferences, which can include genre, mood, and instruments. Which generative AI techniques would be most effective for creating music? (Select TWO.) - Reinforcement learning - Diffusion models - Decision Trees - Generative Adversarial Networks (GANs) - Transformer-based models **CORRECT:** "Generative Adversarial Networks (GANs)" is the correct answer. Generative Adversarial Networks (GANs) are highly effective for generating creative outputs, including music. GANs can be trained on music data, allowing the generator to create new music tracks based on learned patterns, while the discriminator ensures the quality of the output. By training GANs on user-preferred genres, moods, and instruments, they can produce personalized music that matches specific user preferences. GANs are known for their ability to generate high-quality, unique compositions, making them ideal for music creation tasks. **CORRECT:** "Transformer-based models" is the correct answer. Transformer-based models are effective for generating music, especially when user preferences include multiple variables like genre, mood, and instruments. These models can handle multi-modal inputs, such as text descriptions or user-defined parameters, and transform them into music outputs. By utilizing the attention mechanisms in transformers, the model can focus on specific musical attributes that match user preferences, enabling it to create personalized music tracks. Transformers are well-suited for handling the complexity of music generation due to their ability to process sequential data, such as notes and rhythms. **INCORRECT:** "Diffusion models" is incorrect. Diffusion models are primarily used for generating images and visual data. They are not optimized for creating sequential data like music, which requires capturing patterns over time rather than static representations. **INCORRECT:** "Decision Trees" is incorrect. Decision trees are used for classification or regression tasks, not generative tasks. They lack the complexity needed to model patterns in audio or music data, making them unsuitable for creating music. **INCORRECT:** "Reinforcement learning" is incorrect. Reinforcement learning is used for optimizing decisions based on feedback, but it is not commonly used for generating music. While it could be applied in a secondary role (e.g., refining music based on user feedback), it is not the primary technique for music generation. **References:** https://aws.amazon.com/what-is/gan https://aws.amazon.com/what-is/generative-ai Domain: Fundamentals of Generative AI --- #### 35. A retail company is using a foundation model trained on general internet text. They now want it to generate product descriptions tailored to their unique catalog and brand tone. The model must learn from existing product descriptions without retraining from scratch. Which approach should they use? - Zero-shot inference using prompt variations - Transfer learning via domain-specific fine-tuning - Token-level augmentation for vocabulary expansion - Embedding clustering with unsupervised grouping **CORRECT:** "Transfer learning via domain-specific fine-tuning" is the correct answer. Transfer learning via domain-specific fine-tuning allows you to take a foundation model that's already trained on a broad dataset (like general internet text) and adapt it to a specific domain—such as a retail catalog—by training it further on relevant, labeled examples. In this case, the company can fine-tune the model using their existing product descriptions, helping it learn the brand's tone, style, and unique terminology. This is far more efficient than training a model from scratch, and it ensures the output becomes more personalized and aligned with the company's content standards. AWS services like Amazon SageMaker and Amazon Bedrock support fine-tuning workflows for foundation models. **INCORRECT:** "Zero-shot inference using prompt variations" is incorrect. Zero-shot inference uses prompts to guide a model without additional training. While this can work for generic outputs, it doesn't allow the model to deeply internalize the brand-specific tone or product nuances. It is limited when consistency and brand voice are required across many outputs. **INCORRECT:** "Embedding clustering with unsupervised grouping" is incorrect. Embedding clustering is useful for grouping similar data points (e.g., products or customers) based on vector similarity, but it doesn't train or adapt a model for generating personalized content. It's more relevant to recommendation or search systems. **INCORRECT:** "Token-level augmentation for vocabulary expansion" is incorrect. Token-level augmentation modifies the model's tokenizer or vocabulary to handle new terms, but this is a low-level change. It does not teach the model how to generate coherent, brand-specific content and typically requires retraining from scratch, which the company wants to avoid. **References:** https://aws.amazon.com/what-is/transfer-learning Domain: Applications of Foundation Models --- #### 36. A development team has deployed a generative AI model with guardrails in place. However, during testing, they discovered that cleverly crafted inputs can still manipulate the model into producing outputs that violate the intended safety guidelines, despite restrictions. What security risk does this scenario most accurately represent? - Indirect prompt injection through prompt chaining - Data leakage from model memory during inference - Unauthorized parameter access via fine-tuning backdoors - Jailbreaking of the model using adversarial prompts **CORRECT:** "Jailbreaking of the model using adversarial prompts" is the correct answer. Jailbreaking is a technique where users craft special, tricky inputs (called adversarial prompts) to bypass the safety or ethical guidelines of an AI system. Even if guardrails are implemented, some users may find ways to "trick" the model into generating unsafe, biased, or restricted content. In our case, the team discovered that the model still produces problematic outputs when tested with cleverly crafted prompts. This directly points to a jailbreaking scenario, where input manipulation is used to override intended safety mechanisms. This is a common concern in generative AI models, making it crucial to continuously test and improve safeguards. **INCORRECT:** "Indirect prompt injection through prompt chaining" is incorrect. Indirect prompt injection happens when an AI system uses dynamic content from external sources (like user-generated input or APIs) and combines it into its final prompt without proper filtering. This can lead to unintended behavior. However, in our case, it doesn't involve external sources or chaining—just direct input manipulation—so it's not the right choice. **INCORRECT:** "Unauthorized parameter access via fine-tuning backdoors" is incorrect. This risk refers to someone inserting hidden triggers or malicious data during model training or fine-tuning. These "backdoors" can be activated later by specific inputs to change the model's behavior. But in our case, the model is already deployed, and the problem arises from input manipulation, not training-time tampering. **INCORRECT:** "Data leakage from model memory during inference" is incorrect. This happens when a model unintentionally reveals private or sensitive data it has seen during training (e.g., names, passwords, or confidential text). While dangerous, this isn't what's described here. The issue isn't about leaking data—it's about violating safety rules through crafted prompts. **References:** https://aws.amazon.com/blogs/machine-learning/implementing-advanced-prompt-engineering-with-amazon-bedrock Domain: Applications of Foundation Models --- #### 37. A healthcare company is using machine learning to predict patient readmission rates. The data science team must select an appropriate learning type based on the availability of outcome data. Which of the following statements are true about supervised and unsupervised learning? (Select TWO) - Unsupervised learning always achieves higher accuracy with structured datasets. - Supervised learning relies on statistical distributions instead of actual labeled datasets. - Supervised learning uses generative models to predict missing input data. - Supervised learning works with labeled data and learns from the outcome to make future predictions. - Unsupervised learning detects relationships and structures within data where the labels or categories are unknown. **CORRECT:** "Supervised learning works with labeled data and learns from the outcome to make future predictions." is the correct answer. Supervised learning is a machine learning approach where the algorithm is trained on a labeled dataset. This means that the data used for training includes both the input features and the correct output (outcome or label). The goal is to learn a mapping from inputs to outputs so that the model can accurately predict outcomes for new, unseen data. In the context of healthcare, if a dataset contains patient information along with a label indicating whether they were readmitted or not, supervised learning can be used to build a model that predicts readmission risk. This learning type is widely used for classification and regression problems. **CORRECT:** "Unsupervised learning detects relationships and structures within data where the labels or categories are unknown." is the correct answer. Unsupervised learning is used when the data does not have any labels or known outcomes. The goal is to uncover hidden patterns, relationships, or groupings in the data. Techniques like clustering and dimensionality reduction fall under this category. In healthcare, unsupervised learning can help find patterns in patient data, like grouping patients with similar symptoms, without prior knowledge of the outcome. It's especially useful for exploratory data analysis when outcomes are not clearly defined. **INCORRECT:** "Supervised learning uses generative models to predict missing input data." is incorrect. Supervised learning primarily focuses on predicting outputs from known inputs using discriminative models. Generative models, which attempt to model how data is generated, are typically associated with unsupervised or semi-supervised learning. Supervised learning does not aim to predict missing inputs but rather learn the relationship between input features and known outcomes. **INCORRECT:** "Unsupervised learning always achieves higher accuracy with structured datasets." is incorrect. Because accuracy isn't always a clear or appropriate metric in unsupervised learning, especially since there are no labels to compare results against. While structured datasets might make it easier to discover patterns, it doesn't guarantee higher performance. The effectiveness of unsupervised learning depends on the algorithm used and the nature of the data. **INCORRECT:** "Supervised learning relies on statistical distributions instead of actual labeled datasets." is incorrect. Supervised learning fundamentally depends on labeled datasets where each input is paired with an output. While statistical methods might be used during modeling, the core of supervised learning is using actual labeled data to train the model. **References:** https://aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised Domain: Fundamentals of AI and ML --- #### 38. A legal tech company wants its AI chatbot to answer legal queries using up-to-date laws without retraining the model. What is the primary advantage of using Retrieval-Augmented Generation (RAG) for this use case? - RAG enables the model to generate new data from scratch without using any source. - RAG allows the model to retrieve updated information during runtime without retraining. - RAG restricts the model from using external data to avoid information leakage. - RAG fine-tunes the model to store data permanently in its architecture. **CORRECT:** "RAG allows the model to retrieve updated information during runtime without retraining." is the correct answer. Retrieval-Augmented Generation (RAG) is a technique that allows AI models to improve their responses by retrieving relevant, up-to-date information from an external data source, such as a document database or knowledge base. This happens at runtime, meaning the model can pull the latest information when answering a query without needing to be retrained. For a legal tech company, RAG helps the chatbot provide accurate and current legal information by retrieving the latest documents or records on-demand. This makes the system more flexible, cost-effective, and reliable because it reduces the need for frequent retraining every time laws change. **INCORRECT:** "RAG enables the model to generate new data from scratch without using any source." is incorrect. While AI models can generate text based on their training data, RAG specifically works by combining generation with retrieval of external information. It does not generate information from scratch without any source. **INCORRECT:** "RAG fine-tunes the model to store data permanently in its architecture." is incorrect. Fine-tuning changes the model's internal knowledge by retraining it on new data. RAG does not require fine-tuning or storing data permanently inside the model. Instead, it retrieves data from an external source at runtime. This makes fine-tuning unnecessary for keeping information up to date. **INCORRECT:** "RAG restricts the model from using external data to avoid information leakage." is incorrect. RAG is designed to use external data, not restrict it. Its main advantage is allowing the model to retrieve relevant information from trusted external sources to improve responses. **References:** https://aws.amazon.com/what-is/retrieval-augmented-generation Domain: Applications of Foundation Models --- #### 39. A global consulting firm is building a platform to automatically summarize meeting recordings and generate action items. The platform processes audio data and converts it into structured summaries. Which approach is most suitable for transforming human speech into meaningful text summaries? - Use robotic process automation (RPA) to analyze the audio pattern of the meeting and infer summaries visually. - Use Amazon Transcribe to convert speech to text, then apply Amazon Comprehend to analyze and summarize the text. - Apply graph neural networks to learn the spatial structure of voice patterns in raw audio waveforms. - Use Natural Language Processing (NLP) techniques to transcribe, analyze, and summarize human conversations effectively. **CORRECT:** "Use Amazon Transcribe to convert speech to text, then apply Amazon Comprehend to analyze and summarize the text." is the correct answer. This is the most suitable approach for transforming human speech into meaningful text summaries. Amazon Transcribe is designed to convert spoken language into written text using automatic speech recognition (ASR). Once the audio is transcribed, Amazon Comprehend can analyze the text using Natural Language Processing (NLP) to extract key phrases, entities, and sentiment, and even detect topics. This two-step pipeline enables a platform to first understand the content of the meeting through transcription and then derive structured summaries and action items using text analysis. It aligns well with AWS's recommended architecture for processing audio data and extracting insights from human language. **INCORRECT:** "Use Natural Language Processing (NLP) techniques to transcribe, analyze, and summarize human conversations effectively." is incorrect. While NLP is a core component of the solution, but it does not handle speech directly. NLP works on text, not raw audio. A transcription step is needed before applying NLP. Without converting speech to text first, NLP alone cannot analyze audio content. **INCORRECT:** "Use robotic process automation (RPA) to analyze the audio pattern of the meeting and infer summaries visually," is incorrect. RPA is used to automate repetitive digital tasks, like clicking through applications or copying data. It is not designed to analyze or understand audio content. It lacks the intelligence needed to perform speech recognition or text summarization. **INCORRECT:** "Apply graph neural networks to learn the spatial structure of voice patterns in raw audio waveforms." is incorrect. Graph neural networks are used for problems involving graph-structured data, such as social networks or molecular structures. They are not suited for processing raw audio for speech-to-text or summarization. This approach is unrelated to the task of summarizing human speech. **References:** https://docs.aws.amazon.com/transcribe/latest/dg/what-is.html https://docs.aws.amazon.com/comprehend/latest/dg/what-is.html Domain: Fundamentals of AI and ML --- #### 40. An MLOps engineer needs to deploy a computer vision model using Amazon SageMaker for an autonomous vehicle application. The model must provide ultra-low latency inference and scale instantaneously to handle varying sensor data input rates. Which SageMaker deployment strategy is MOST suitable, considering cost-efficiency for prolonged operation? - Amazon SageMaker Batch Transform with distributed data processing - Amazon SageMaker Multi-Model Endpoints with infrequent model updates - Amazon SageMaker Asynchronous Inference with GPU instances - Amazon SageMaker Real-Time Inference with Provisioned Concurrency **CORRECT:** "Amazon SageMaker Real-Time Inference with Provisioned Concurrency" is the correct answer. Amazon SageMaker Real-Time Inference involves deploying a model to a persistent endpoint that can respond to inference requests in real time. Provisioned Concurrency allows you to keep a specified number of inference containers initialized and ready to respond immediately. This is crucial for ultra-low latency requirements in an autonomous vehicle application. By pre-allocating compute resources, you minimize cold start times and ensure consistent performance, even with sudden spikes in sensor data. For prolonged operation, while it requires paying for provisioned instances, it is cost effective in the long term for latency-sensitive applications that need constant availability. **INCORRECT:** "Amazon SageMaker Asynchronous Inference with GPU instances" is incorrect. Asynchronous Inference is designed for processing large batches of data without requiring immediate responses. While GPUs can accelerate inference, asynchronous inference is not suitable for ultra-low latency requirements. It's better for tasks that can tolerate delays, which is not the case for real-time autonomous vehicle operations. **INCORRECT:** "Amazon SageMaker Batch Transform with distributed data processing" is incorrect. Batch Transform processes data in batches, which is not suitable for real-time inference. It's designed for offline inference tasks where latency is not a critical factor. Autonomous vehicles require immediate responses to sensor inputs, making batch processing inappropriate. **INCORRECT:** "Amazon SageMaker Multi-Model Endpoints with infrequent model updates" is incorrect. Multi-Model Endpoints allow you to host multiple models on a single endpoint, reducing costs when you have many models with infrequent usage. While it can be cost-effective for managing multiple models, it does not guarantee ultra-low latency and instantaneous scaling needed for real-time autonomous vehicle applications. The model loading time between different models can add significant latency. **References:** https://aws.amazon.com/blogs/machine-learning/announcing-provisioned-concurrency-for-amazon-sagemaker-serverless-inference Domain: Applications of Foundation Models --- #### 41. You are developing a machine learning based application that responds to user behavior with dynamic prompts in under 10 milliseconds. The backend requires globally distributed, low-latency access to structured session data. Which AWS database architecture is the most appropriate? - Amazon Redshift with materialized joins - Amazon DocumentDB with custom indexing - Amazon DynamoDB with global tables - Amazon Aurora Global Database **CORRECT:** "Amazon DynamoDB with global tables" is the correct answer. Amazon DynamoDB is a fully managed NoSQL database service that delivers single-digit millisecond performance at any scale. With global tables, DynamoDB allows you to replicate your data across multiple AWS Regions, enabling low-latency, high-speed access to structured session data anywhere in the world. This setup is ideal for real-time applications, such as machine learning systems that need to generate dynamic responses in under 10 milliseconds. Global tables are designed to handle workloads that require both high availability and minimal read/write latency across regions—making them perfect for globally distributed ML-based applications. **INCORRECT:** "Amazon Aurora Global Database" is incorrect. Aurora Global Database is a high-performance, MySQL- and PostgreSQL-compatible relational database designed for globally distributed applications. While it supports low-latency reads from secondary regions, its write operations are limited to a single primary region and do not support sub-10 ms response times for write-heavy or high-speed applications like real-time ML prompts. **INCORRECT:** "Amazon Redshift with materialized joins" is incorrect. Amazon Redshift is a powerful data warehouse service used for complex queries and analytics over large datasets. Although materialized joins can improve performance for analytical workloads, Redshift is not optimized for real-time, low-latency data access and isn't suitable for operational session management or real-time ML responses. **INCORRECT:** "Amazon DocumentDB with custom indexing" is incorrect. Amazon DocumentDB is designed for JSON document-based workloads, similar to MongoDB. While it supports indexing and can manage structured session data, it lacks the global replication and ultra-low latency capabilities of DynamoDB global tables, making it less suitable for highly responsive, globally distributed ML applications. **References:** https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GlobalTables.html Domain: Applications of Foundation Models --- #### 42. A software development team is building a system that needs to detect fraudulent transactions in real time. Initially, the team considered using hard-coded rules based on predefined thresholds. However, the business requirements are expected to change frequently, and the team is now evaluating the use of machine learning (ML) to improve adaptability and long-term performance. What is the key benefit of training a machine learning (ML) model instead of using hard-coded rules in a software system? - ML models depend entirely on fixed logic, offering consistency over adaptability. - ML models can be manually reprogrammed to adapt to future business needs. - ML models generalize from data to recognize patterns and improve over time without explicit programming. - ML models eliminate the need for algorithmic logic or input features during training. **CORRECT:** "ML models generalize from data to recognize patterns and improve over time without explicit programming." is the correct answer. Machine Learning (ML) allows systems to learn from historical data and make decisions or predictions without being explicitly programmed with fixed rules. This is especially useful in fraud detection, where patterns are often complex and constantly evolving. Instead of relying on hard-coded rules (which can quickly become outdated), ML models analyze data to detect subtle patterns and anomalies that might indicate fraudulent activity. As more data becomes available, the model can be retrained to adapt and improve. This makes ML much more flexible and scalable in dynamic environments compared to static rule-based systems. **INCORRECT:** "ML models can be manually reprogrammed to adapt to future business needs." is incorrect. This statement confuses manual programming with learning from data. ML models don't require manual reprogramming — that's the key advantage. They adapt by retraining on new data, not by rewriting code. **INCORRECT:** "ML models depend entirely on fixed logic, offering consistency over adaptability." is incorrect. This is the opposite of what ML offers. ML is about adaptability, not fixed logic. Rule-based systems offer consistency, but ML models are designed to learn and adjust as new data comes in. **INCORRECT:** "ML models eliminate the need for algorithmic logic or input features during training." is incorrect. ML still requires algorithmic logic (choice of algorithm like decision trees, neural networks, etc.) and input features (such as transaction amount, location, user history). The model learns patterns from these features — they are essential to training. **References:** https://aws.amazon.com/what-is/machine-learning Domain: Fundamentals of AI and ML --- #### 43. A company plans to fine-tune a foundation model using Amazon Bedrock. They need a secure and scalable solution to store their large dataset of customer reviews. Which AWS service should they use? - Amazon RDS - Amazon S3 - Amazon DynamoDB - Amazon Redshift **CORRECT:** "Amazon S3" is the correct answer. Amazon S3 (Simple Storage Service) is a scalable object storage service used to store and retrieve any amount of data from anywhere. It's widely used for backups, data lakes, and static content storage. Amazon S3 is used to store input data, prompt templates, and fine-tuning datasets. You can upload your data to S3 and configure Bedrock to access it by specifying the S3 URI. This allows Bedrock to utilize the data for tasks such as model customization, evaluation, or inference input/output management. In this case, the company is preparing to fine-tune a foundation model on Amazon Bedrock and needs to store a large dataset of customer reviews. Amazon S3 is the best choice because it can securely store massive datasets, integrate directly with many AWS AI/ML services (including Bedrock and SageMaker), and support fine-tuning tasks that require access to training data stored in a reliable and scalable location. Additionally, S3 offers features like access control, encryption, versioning, and lifecycle policies, making it a secure and cost-effective option for storing training data. **INCORRECT:** "Amazon DynamoDB" is incorrect. DynamoDB is a fast and flexible NoSQL database service optimized for key-value and document data. While great for low-latency lookups and real-time applications, it is not designed to store large unstructured datasets like text files used for model training. **INCORRECT:** "Amazon Redshift" is incorrect. Redshift is a fully managed data warehouse used for analytical workloads and structured data queries. It is suitable for running complex SQL queries on structured datasets, but it is not optimized for storing and retrieving large unstructured datasets like customer reviews in text files. **INCORRECT:** "Amazon RDS" is incorrect. Amazon RDS is a relational database service that supports SQL-based databases like MySQL and PostgreSQL. It's good for transactional applications but not for storing large-scale training datasets for ML models, especially when those datasets consist of files or large text records. **References:** https://aws.amazon.com/s3 https://aws.amazon.com/blogs/aws/customize-models-in-amazon-bedrock-with-your-own-data-using-fine-tuning-and-continued-pre-training Domain: Applications of Foundation Models --- #### 44. A developer wants their application to read text with a cheerful tone and insert natural pauses after commas. Which Amazon Polly feature enables these customizations? - Custom vocabulary API - Neural TTS - SSML - Voice cloning **CORRECT:** "SSML" is the correct answer. SSML (Speech Synthesis Markup Language) is a powerful feature supported by Amazon Polly that lets developers customize how text is spoken. With SSML, you can control voice tone, pitch, volume, speed, and insert natural pauses using <break> tags. You can also set an emotional tone, such as a cheerful or empathetic voice, using supported Amazon Polly neural voices with the <amazon:emotion> tag. For example, inserting pauses after commas or making the voice sound more cheerful helps make synthesized speech sound more human and engaging. This makes SSML the perfect tool for the developer's goal of customizing tone and pacing. **INCORRECT:** "Neural TTS" is incorrect. Neural Text-to-Speech (TTS) provides more natural-sounding voices using deep learning. While it improves the overall voice quality, it does not offer direct control over tone or pauses. That level of customization requires SSML. **INCORRECT:** "Custom vocabulary API" is incorrect. This feature helps Amazon Polly pronounce specific terms, like brand names or technical jargon, correctly. However, it does not affect tone or pauses in speech, so it's not suitable for the developer's use case. **INCORRECT:** "Voice cloning" is incorrect. Voice cloning allows the creation of custom voices that sound like a specific person, but it is not available for general use and does not control tone or pauses dynamically. It's unrelated to the basic customization the developer wants. **References:** https://docs.aws.amazon.com/polly/latest/dg/ssml.html Domain: Applications of Foundation Models --- #### 45. A financial services company is building a document summarization feature using Amazon Bedrock. The product owner asks how the pricing works to predict costs. Which statement describes the Amazon Bedrock pricing model? - Monthly fee covers unlimited API requests and data processing - Charges based on the tokens processed in the input and output - Fixed pricing based on the number of end users supported - Pay based on the number of inference requests submitted **CORRECT:** "Charges based on the tokens processed in the input and output" is the correct answer. Amazon Bedrock provides access to foundation models from various providers for tasks like summarization, text generation, and more. Its pricing model is based on the number of tokens processed, which includes both the tokens in the input prompt and the tokens in the generated output. Tokens can be parts of words, full words, or symbols, depending on the model's tokenizer. For example, summarizing a document would involve tokenizing the input document and the summary produced. This token-based billing helps businesses estimate costs based on how much content they process. **INCORRECT:** "Pay based on the number of inference requests submitted" is incorrect. While some AWS AI services charge per request, Amazon Bedrock charges based on token usage, not just the number of requests. A short request and a long document may have very different costs because they process different numbers of tokens. **INCORRECT:** "Monthly fee covers unlimited API requests and data processing" is incorrect. Amazon Bedrock does not offer an unlimited usage monthly fee. Instead, it uses a pay-as-you-go pricing model based on the number of tokens processed. This provides flexibility based on actual usage rather than a fixed monthly subscription. **INCORRECT:** "Fixed pricing based on the number of end users supported" is incorrect. Amazon Bedrock pricing is not tied to the number of end users. Costs depend on how much data is processed through the models, measured in tokens, not on how many people use the feature. **References:** https://aws.amazon.com/bedrock/pricing Domain: Fundamentals of Generative AI --- #### 46. When training a generative AI model for text summarization, which AWS service would provide the fastest and most cost-effective way to get started without building your infrastructure from scratch? - Amazon SageMaker Data Wrangler - Amazon SageMaker Clarify - Amazon SageMaker JumpStart - Amazon SageMaker Feature Store **CORRECT:** "Amazon SageMaker JumpStart" is the correct answer. Amazon SageMaker JumpStart is a fully managed capability that helps you quickly get started with machine learning by providing access to pre-trained models, example notebooks, and solution templates for common use cases—including generative AI tasks like text summarization. It eliminates the need to build infrastructure or code models from scratch. You can deploy, fine-tune, or experiment with models directly in a few clicks, making it the fastest and most cost-effective way to begin building generative AI solutions in SageMaker. JumpStart is especially useful for teams that want to experiment and prototype quickly. **INCORRECT:** "Amazon SageMaker Clarify" is incorrect. Clarify is a SageMaker feature focused on model bias detection, explainability, and fairness. It helps understand how models make predictions but does not assist in setting up or training generative models like text summarizers. **INCORRECT:** "Amazon SageMaker Data Wrangler" is incorrect. Data Wrangler simplifies data preparation and transformation tasks. While it helps clean and preprocess datasets, it doesn't offer pre-built models or starter infrastructure for generative AI tasks, so it's not ideal for quickly starting model training. **INCORRECT:** "Amazon SageMaker Feature Store" is incorrect. Feature Store is a central repository to store, retrieve, and share machine learning features across teams. It's valuable for managing ML features at scale but not suitable for quickly launching a generative AI model or starting a new project. **References:** https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html Domain: Fundamentals of Generative AI --- #### 47. A media company deploys an AI-powered tool to generate brief news article summaries from extensive reports. The team must evaluate the quality and relevance of the summaries before publishing. Which of the following evaluation methods best assesses the summarization quality? - ROC AUC - F1-score - ROUGE - BLEU **CORRECT:** "ROUGE" is the correct answer. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics designed to evaluate automatic summarization and machine translation by comparing the overlap between machine-generated content and human-written reference texts. It focuses on recall, measuring how much of the reference content is captured in the generated summary. ROUGE-N evaluates n-gram overlaps, while ROUGE-L assesses the longest common subsequence between texts. In the context of summarization, ROUGE is widely adopted because it effectively quantifies the similarity between AI-generated summaries and human-authored ones, ensuring the summaries retain essential information from the original reports. **INCORRECT:** "ROC AUC" is incorrect. ROC AUC (Receiver Operating Characteristic - Area Under Curve) is a performance measurement for classification problems at various threshold settings. It evaluates the ability of a model to distinguish between classes, making it suitable for binary classification tasks. However, summarization is a generative task, not a classification one, so ROC AUC isn't applicable for assessing summary quality. **INCORRECT:** "F1-score" is incorrect. The F1-score is the harmonic mean of precision and recall, commonly used in classification tasks to evaluate a model's accuracy. While useful for tasks like sentiment analysis, it doesn't assess the quality or relevance of generated summaries, making it unsuitable for summarization evaluation. **INCORRECT:** "BLEU" is incorrect. BLEU (Bilingual Evaluation Understudy) is primarily used for machine translation and evaluates how closely a machine-generated sentence matches a human-written one. While it can be applied to summarization, it is better suited for translation tasks and is less effective at evaluating content coverage and relevance than ROUGE. **References:** https://aws.amazon.com/blogs/machine-learning/evaluate-the-text-summarization-capabilities-of-llms-for-enhanced-decision-making-on-aws Domain: Applications of Foundation Models --- #### 48. A company is developing an AI solution on AWS that involves multiple departments including Data Science, DevOps, and Business Analytics. Each department requires specific permissions to AWS services but must not access resources outside their scope. What is the best way to enforce least privilege access across all departments? - Assign users to IAM groups and add inline policies based on user tasks. - Use IAM roles with permission boundaries and assign them based on department function. - Enable cross-account access between all teams and limit access with security groups. - Use AWS Organizations to assign service control policies (SCPs) to each department. **CORRECT:** "Use IAM roles with permission boundaries and assign them based on department function." is the correct answer. IAM roles allow assigning temporary, role-based access to AWS resources. When paired with permission boundaries, they provide fine-grained control over the maximum permissions a role (or user) can have—even if a more permissive policy is attached later. This approach helps enforce least privilege by ensuring each department (e.g., Data Science, DevOps, Business Analytics) only gets the access it needs for its function. It's scalable, flexible, and aligns with AWS best practices for access management. By using roles instead of permanent user credentials, organizations also benefit from increased security and better policy control. **INCORRECT:** "Use AWS Organizations to assign service control policies (SCPs) to each department." is incorrect. SCPs are useful for setting permission guardrails at the account level, not individual IAM users or roles. While SCPs can restrict access across accounts in an AWS Organization, they are not intended for managing permissions between departments within the same account. So this option lacks the flexibility needed for per-department control. **INCORRECT:** "Assign users to IAM groups and add inline policies based on user tasks." is incorrect. Inline policies are attached directly to users or groups and are harder to manage and audit over time. This approach is less scalable and prone to misconfigurations. Also, using only groups and inline policies does not support temporary access or permission boundaries, which limits enforcement of strict least privilege. **INCORRECT:** "Enable cross-account access between all teams and limit access with security groups." is incorrect. Cross-account access is used when users or services in different AWS accounts need to interact. Security groups are mainly for controlling network-level access, not IAM-level permissions. This option confuses network controls with identity and access controls, making it unsuitable for enforcing least privilege within the same organization. **References:** https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_boundaries.html https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html Domain: Security, Compliance, and Governance for AI Solutions --- #### 49. An AI development team is assessing their model's outputs to ensure equitable treatment across various demographic groups. Which AWS Responsible AI principle are they primarily addressing? - Considering impacts on different groups of stakeholders to promote equitable outcomes. - Appropriately obtaining, using, and protecting data and models to maintain user privacy. - Implementing mechanisms to monitor and steer AI system behavior effectively. - Ensuring the AI system's outputs are understandable and can be evaluated by stakeholders. **CORRECT:** "Considering impacts on different groups of stakeholders to promote equitable outcomes." is the correct answer. AWS emphasizes fairness as one of the core principles of Responsible AI. This principle focuses on ensuring that AI systems provide equitable outcomes for all users, regardless of their background, identity, or circumstances. In this scenario, the AI development team is actively assessing how their model performs across various demographic groups, which directly reflects their effort to identify and reduce any potential biases. By doing this, they are working to ensure that the AI system treats all individuals fairly and avoids unintended discrimination. This aligns directly with AWS's commitment to promoting fairness and equity in AI applications. **INCORRECT:** "Ensuring the AI system's outputs are understandable and can be evaluated by stakeholders." is incorrect. This refers to explainability, which ensures that both technical and non-technical users can understand how an AI system works and why it made a particular decision. While important, this option does not focus on fairness or demographic impact. **INCORRECT:** "Implementing mechanisms to monitor and steer AI system behavior effectively." is incorrect. This refers to governance, which is about controlling how an AI system behaves over time through monitoring, feedback loops, and adjustments. It's more about system control rather than fairness across groups. **INCORRECT:** "Appropriately obtaining, using, and protecting data and models to maintain user privacy." is incorrect. This addresses privacy and security, ensuring that data used in AI systems is collected and handled responsibly. While vital for user trust, this principle doesn't directly deal with assessing fairness for demographic groups. **References:** https://aws.amazon.com/ai/responsible-ai Domain: Guidelines for Responsible AI --- #### 50. A customer service chatbot is being enhanced to answer complex refund-related queries. Developers include a few example Q&A pairs in the prompt and add step-by-step reasoning in the examples. What prompting method is being used? - This is a zero-shot prompting with chain-of-thought prompting - This is a few-shot prompting combined with chain-of-thought prompting - This is a zero-shot prompting with multi-modal input explanation - This is an embedding-based vector search combined with supervised fine-tuning **CORRECT:** "This is a few-shot prompting combined with chain-of-thought prompting" is the correct answer. Few-shot prompting is a technique used in large language models (LLMs) where a few example input-output pairs (Q&A examples) are provided in the prompt. This helps the model learn the expected pattern or reasoning from the examples without actual model retraining. Chain-of-thought prompting is an additional method where each example includes a step-by-step reasoning process to guide the model in producing more logical and accurate answers. In this case, developers are enhancing the chatbot by adding a few example queries and step-by-step answers—clearly using both techniques together. This helps the model understand how to answer refund-related queries with clarity and reasoning. **INCORRECT:** "This is a zero-shot prompting with multi-modal input explanation" is incorrect. Zero-shot prompting means the model is given a task without any examples. Multi-modal refers to inputs that include text, images, or audio. Since the prompt includes examples and step-by-step reasoning, this is not a zero-shot scenario, and no multi-modal input is used here. **INCORRECT:** "This is an embedding-based vector search combined with supervised fine-tuning" is incorrect. Embedding-based vector search is used for retrieving similar documents or content based on similarity scores, and supervised fine-tuning involves training a model on labeled data to improve performance. These techniques involve more infrastructure and are not related to the example-based prompt method described in the question. **INCORRECT:** "This is a zero-shot prompting with chain-of-thought prompting" is incorrect. Zero-shot prompting does not use examples in the prompt, which contradicts the scenario where developers include example Q&A pairs. Hence, this option is incorrect. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/design-a-prompt.html https://aws.amazon.com/what-is/prompt-engineering Domain: Applications of Foundation Models --- #### 51. A fashion company wants to use a generative AI to analyze images of clothing to extract features such as sleeve length, neckline type, color palette, and fabric texture. What task is the generative model best suited to perform? - Manage warehouse inventory levels using sales performance tracking. - Predict customer purchase likelihood using historical sales data. - Can extract visual features and generate new clothing designs. - Assign pricing to clothing items based on demand and seasonal factors. **CORRECT:** "Can extract visual features and generate new clothing designs." is the correct answer. Generative AI models are highly capable when it comes to processing and understanding visual data. In this case, the model can analyze clothing images and automatically extract relevant visual features such as sleeve length, neckline type, color palette, and texture. This information is valuable for categorizing fashion items, enhancing search and recommendation engines, and generating insights into design trends. Additionally, generative AI can go beyond feature extraction to create entirely new clothing designs by learning from existing image patterns and styles. This is especially useful for design teams looking to accelerate innovation or offer AI-assisted customization. These capabilities make generative AI a perfect fit for fashion tech applications where creativity and automation intersect. **INCORRECT:** "Predict customer purchase likelihood using historical sales data." is incorrect. This task falls under predictive analytics and traditional machine learning, not generative AI. It involves statistical modeling of structured data rather than generating or analyzing image content. **INCORRECT:** "Assign pricing to clothing items based on demand and seasonal factors." is incorrect. Pricing optimization is typically handled using regression models and time-series analysis. Generative AI isn't the ideal tool for setting prices based on market dynamics. **INCORRECT:** "Manage warehouse inventory levels using sales performance tracking." is incorrect. Inventory management is a logistics and operations task, better suited for traditional forecasting models or rule-based systems, not generative AI. **References:** https://aws.amazon.com/ai/generative-ai Domain: Fundamentals of Generative AI --- #### 52. A global news agency is developing an AI system that recommends articles to users based on reading history and automatically summarizes each article before display. To ensure the summaries are contextually accurate and grounded in the original source material, which design strategy should be applied? - Implement Retrieval-Augmented Generation to combine document search with summarization - Train a reinforcement learning model on reader click-through data - Use fine-tuned transformer models with zero-shot classification - Apply prompt engineering with few-shot learning for query optimization **CORRECT:** "Implement Retrieval-Augmented Generation to combine document search with summarization" is the correct answer. Retrieval-Augmented Generation (RAG) is a strategy that improves the factual accuracy and contextual relevance of AI-generated content by combining traditional document retrieval with generative models. In this approach, the system first retrieves relevant context or source material (e.g., the original article) using a search or embedding-based retrieval mechanism. This context is then passed into a generative model (such as a foundation model in Amazon Bedrock) to generate a summary that is grounded in the actual source content. For a global news agency, RAG ensures that generated summaries remain factually accurate and aligned with the original article—especially important in journalism where misinformation must be avoided. **INCORRECT:** "Use fine-tuned transformer models with zero-shot classification" is incorrect. While fine-tuning can help adapt models to specific domains, zero-shot classification is about labeling content without examples. It does not support context-grounded summarization. Also, this strategy alone lacks the retrieval component necessary to ensure accuracy from source materials. **INCORRECT:** "Apply prompt engineering with few-shot learning for query optimization" is incorrect. Prompt engineering with few-shot learning can guide the model's behavior using examples, but it doesn't ensure factual consistency with long or detailed source content. It's helpful but not sufficient when summarization must be tightly aligned with specific documents. **INCORRECT:** "Train a reinforcement learning model on reader click-through data" is incorrect. Reinforcement learning (RL) from click-through data is better suited for personalized recommendation optimization, not for generating summaries grounded in original content. It can help with article ranking but not content accuracy in summaries. **References:** https://aws.amazon.com/what-is/retrieval-augmented-generation Domain: Applications of Foundation Models --- #### 53. A recommendation engine is being improved using RLHF to better reflect subjective user experiences. Which action should the development team take before training the reinforcement component? - Collect human preference data by comparing alternative model outputs - Analyze historical user interaction logs using statistical inference techniques - Collect publicly available datasets with comprehensive documentation - Generate predicted reward values for unannotated data points using a pre-trained model **CORRECT:** "Collect human preference data by comparing alternative model outputs" is the correct answer. Reinforcement Learning with Human Feedback (RLHF) is a method used to align AI models with human values and preferences. Before training the reinforcement component, a crucial step is collecting human preference data. This involves presenting human evaluators with multiple model-generated responses and asking them to rank or choose the preferred output. These comparisons help build a reward model that can guide future reinforcement learning. For a recommendation engine aiming to reflect subjective user experiences—like satisfaction, tone, or content relevance—this human-in-the-loop data collection is essential. It ensures the system learns from real user preferences, not just mathematical patterns. **INCORRECT:** "Collect publicly available datasets with comprehensive documentation" is incorrect. While public datasets are useful for pretraining or benchmarking, they typically do not contain human preference comparisons specific to your use case. RLHF relies on task-specific human feedback, not general data. **INCORRECT:** "Generate predicted reward values for unannotated data points using a pre-trained model" is incorrect. Using a pre-trained model to simulate reward values might help later, but it depends on having a trained reward model first—which is built using actual human feedback. So this step comes after collecting human preference data, not before. **INCORRECT:** "Analyze historical user interaction logs using statistical inference techniques" is incorrect. Interaction logs can be useful for understanding behavior trends, but they don't provide direct preference comparisons needed for RLHF. The goal is to know which output users prefer—something that logs alone don't clearly show. **References:** https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback Domain: Applications of Foundation Models --- #### 54. An e-learning company is using a foundation model deployed on Amazon Bedrock to auto-generate educational summaries. The quality of output is inconsistent across topics. The data science team wants to improve the clarity, accuracy, and structure of the generated content without changing the model or fine-tuning it. What technique should the team use to improve the model's response quality? - Pre-tokenization using custom vocabulary for domain adaptation - Transfer learning using structured input features - Embedding vector clustering for context personalization - Prompt engineering to guide the model using optimized input phrasing **CORRECT:** "Prompt engineering to guide the model using optimized input phrasing" is the correct answer. Prompt engineering is the process of crafting and optimizing the input prompts given to a foundation model to get more accurate, relevant, and well-structured responses. It's especially useful when teams want to improve output quality without changing the model architecture or fine-tuning. In Amazon Bedrock, where users interact with powerful foundation models through APIs, the quality of the input prompt greatly influences the output. By providing more detailed instructions, structured questions, or examples in the prompt, the team can help the model produce clearer, more topic-specific educational summaries. It's a cost-effective, model-agnostic technique suitable for real-time applications like this one. **INCORRECT:** "Transfer learning using structured input features" is incorrect. Transfer learning involves adapting a pre-trained model to a new task by further training it on related data. This approach does require modifying or fine-tuning the model, which goes against the team's goal of improving performance without altering the model. **INCORRECT:** "Embedding vector clustering for context personalization" is incorrect. Embedding clustering is a technique to group similar pieces of information using vector representations. It is helpful in information retrieval or personalization tasks, but it does not directly improve the structure or clarity of generated text from a foundation model. **INCORRECT:** "Pre-tokenization using custom vocabulary for domain adaptation" is incorrect. Pre-tokenization and custom vocabulary are used in the model training phase to improve understanding of domain-specific terms. This requires changing how the model processes text and is not applicable when using fixed models like those deployed on Amazon Bedrock without fine-tuning. **References:** https://aws.amazon.com/what-is/prompt-engineering Domain: Fundamentals of Generative AI --- #### 55. A global media company is building an AI-powered application on AWS. The company operates in regions with strict data protection laws. To comply with these laws, it wants to ensure that customer data is only stored and processed within the customer's region. What is the primary benefit of using data residency? - Data residency enables automatic deletion of expired data from Amazon S3 to optimize storage costs across regions. - Data residency allows customer data to remain within a specific geographic region to meet legal or regulatory requirements. - Data residency ensures that all training data is logged and retained for audit purposes within the AWS Cloud. - Data residency guarantees that encrypted data cannot be accessed by users outside the customer's IAM role. **CORRECT:** "Data residency allows customer data to remain within a specific geographic region to meet legal or regulatory requirements." is the correct answer. Data residency refers to the practice of ensuring that data is stored and processed in a specific geographic location. This is especially important for organizations that operate in regions with strict regulations, such as the EU's GDPR or data protection laws in countries like Canada or Australia. AWS enables customers to choose the region where their data resides, which helps them stay compliant with local legal or contractual obligations. By using services configured for a specific AWS Region, customers can make sure that both storage and processing of their data remain within the borders of that region. This is the key benefit of data residency — ensuring compliance with data sovereignty and regulatory requirements. **INCORRECT:** "Data residency ensures that all training data is logged and retained for audit purposes within the AWS Cloud." is incorrect. This option confuses data residency with logging and audit features. While AWS services do provide logging (e.g., CloudTrail, CloudWatch), these are separate from data residency. Data residency is about where the data is stored and processed — not about retaining all data for audit purposes. **INCORRECT:** "Data residency guarantees that encrypted data cannot be accessed by users outside the customer's IAM role." is incorrect. IAM (Identity and Access Management) policies control who can access data, not where it resides. Encryption and IAM work together to secure access, but data residency is about complying with regional legal requirements, not access control mechanisms. **INCORRECT:** "Data residency enables automatic deletion of expired data from Amazon S3 to optimize storage costs across regions." is incorrect. This refers to S3 lifecycle policies, which are used for cost optimization by deleting or archiving data — not about meeting data residency or compliance requirements. Data residency is about the physical/geographic location of data, not its retention policy. **References:** https://aws.amazon.com/blogs/security/addressing-data-residency-with-aws https://docs.aws.amazon.com/prescriptive-guidance/latest/strategy-aws-semicon-workloads/meeting-data-residency-requirements.html Domain: Security, Compliance, and Governance for AI Solutions --- #### 56. A company using generative AI for customer service is concerned about the model's interpretability. They need to ensure the model's responses are understandable and explainable to both users and regulators. Which limitation of generative AI should they be aware of in this context? - Generative AI models lack transparency and can be difficult to interpret. - Generative AI models are limited to specific industries like e-commerce. - Generative AI models do not support real-time customer interaction. - Generative AI models can provide only short, template-based responses. **CORRECT:** "Generative AI models lack transparency and can be difficult to interpret." is the correct answer. Generative AI models, especially those based on deep learning architectures like transformers, are often considered "black boxes" due to their complex internal workings and large number of parameters. This lack of transparency makes it challenging to understand how the model arrives at specific responses. For companies concerned about interpretability, this is a significant limitation. They need to ensure that the model's decision-making process is understandable and explainable to both users and regulators to maintain trust, comply with regulatory requirements, and facilitate debugging and improvement of the model. Techniques like Explainable AI (XAI) can help mitigate this issue, but they may not fully resolve the inherent opacity of such models. **INCORRECT:** "Generative AI models can provide only short, template-based responses." is incorrect. Generative AI models are designed to produce diverse and contextually rich outputs, not just short or template-based responses. They generate content based on learned patterns from large datasets, allowing them to create varied and creative responses. In customer service applications, this enables more natural and personalized interactions, enhancing user experience. **INCORRECT:** "Generative AI models do not support real-time customer interaction." is incorrect. Generative AI models can be deployed to handle real-time interactions with customers. With proper optimization and deployment strategies, such as using scalable infrastructure and efficient inference techniques, these models can generate responses quickly enough for live customer service scenarios. **INCORRECT:** "Generative AI models are limited to specific industries like e-commerce." is incorrect. Generative AI models are versatile and applicable across a wide range of industries, including but not limited to healthcare, finance, education, and customer service. Their ability to generate human-like text makes them valuable for various applications such as drafting emails, writing code, creating content, and more. They are not confined to any specific industry. **References:** https://aws.amazon.com/what-is/generative-ai https://aws.amazon.com/blogs/publicsector/generative-ai-understand-the-challenges-to-realize-the-opportunities Domain: Fundamentals of Generative AI --- #### 57. A media production company is considering the cloud to manage rendering workloads, video editing pipelines, and content distribution. The CTO is focused on how cloud computing might enhance their agility and scalability for dynamic workloads. Which of the following are the advantages of cloud computing? (Select TWO.) - Automatic conversion of legacy software into cloud-native applications without code refactoring. - Guaranteed isolation from other tenants at the hardware level in all cloud configurations. - Ability to provision large-scale computing capacity in minutes without upfront hardware investment. - Rapid deployment of high-performance computing resources with global reach. - Permanent data residency guarantees in any geographic region, regardless of regulations. **CORRECT:** "Ability to provision large-scale computing capacity in minutes without upfront hardware investment." is the correct answer. One of the key advantages of cloud computing is on-demand scalability. Media companies with dynamic workloads—like rendering and video processing—can instantly scale up computing power without needing to buy or maintain physical hardware. This flexibility allows teams to focus on creativity and delivery while minimizing infrastructure costs, especially for bursty or high-performance tasks like rendering. **CORRECT:** "Rapid deployment of high-performance computing resources with global reach." is the correct answer. Cloud providers like AWS offer high-performance compute (HPC) resources across multiple regions worldwide. This allows media production teams to process and distribute content closer to end-users, improving performance and reducing latency. It also means they can quickly spin up compute environments in different geographies, supporting remote teams or regional content delivery without delay. **INCORRECT:** "Permanent data residency guarantees in any geographic region, regardless of regulations." is incorrect. Cloud providers allow data residency controls, but they cannot guarantee permanent residency regardless of legal or regulatory changes. Data residency depends on how the customer configures and manages resources within compliance frameworks. **INCORRECT:** "Guaranteed isolation from other tenants at the hardware level in all cloud configurations." is incorrect. While dedicated instances offer hardware-level isolation, not all cloud configurations provide this. Standard cloud services often use shared infrastructure with logical isolation rather than physical separation. **INCORRECT:** "Automatic conversion of legacy software into cloud-native applications without code refactoring." is incorrect. Legacy software typically needs some level of modification or containerization to become cloud-native. There is no automatic conversion without developer involvement. **References:** https://docs.aws.amazon.com/whitepapers/latest/aws-overview/six-advantages-of-cloud-computing.html Domain: Fundamentals of AI and ML --- #### 58. A government agency has deployed sensitive risk analysis models in Amazon SageMaker within a VPC that is fully isolated from the internet. To perform real-time inference, the model must securely retrieve new data from Amazon S3 without exposing any external traffic. Which of the following best meets the agency's requirements? - Grant S3 access via AWS Direct Connect without VPC endpoint configuration. - Use a NAT gateway to route S3 traffic while applying endpoint access control. - Set up an S3 interface VPC endpoint to establish private connectivity between SageMaker and Amazon S3. - Enable VPC peering to Amazon S3 from SageMaker's subnet. **CORRECT:** "Set up an S3 interface VPC endpoint to establish private connectivity between SageMaker and Amazon S3." is the correct answer. An S3 interface VPC endpoint (powered by AWS PrivateLink) enables secure, private connectivity between Amazon SageMaker and Amazon S3 without sending traffic over the public internet. This is ideal for sensitive workloads, such as those in a government agency, where compliance and security are top priorities. By configuring the SageMaker instances within a VPC and connecting to S3 through a VPC endpoint, the agency ensures data remains inside the AWS network. This configuration eliminates the need for internet access or NAT gateways, reducing attack surfaces and improving security posture. Access to S3 buckets can be further restricted using endpoint policies and IAM roles, allowing tight control over which services and resources can be accessed. **INCORRECT:** "Use a NAT gateway to route S3 traffic while applying endpoint access control." is incorrect. While a NAT gateway allows private subnets to access the internet (including S3), it still routes traffic through the internet, which contradicts the agency's requirement for full isolation. **INCORRECT:** "Enable VPC peering to Amazon S3 from SageMaker's subnet." is incorrect. VPC peering allows communication between two VPCs but cannot be used to connect directly to Amazon S3, as S3 is a regional service, not in a customer VPC. **INCORRECT:** "Grant S3 access via AWS Direct Connect without VPC endpoint configuration." is incorrect. AWS Direct Connect offers dedicated network connections but does not inherently provide private access to S3 unless paired with a VPC endpoint or a Direct Connect Gateway configuration—which adds complexity and may not meet real-time inference needs efficiently. **References:** https://docs.aws.amazon.com/AmazonS3/latest/userguide/privatelink-interface-endpoints.html Domain: Security, Compliance, and Governance for AI Solutions --- #### 59. An AI startup is deciding which type of inference (batch or real-time) to use for processing user requests. The startup expects millions of daily requests, each requiring a near-instant response. Which approach meets these requirements most effectively? - Use a hybrid approach where half the requests are processed in real time and half are batch processed. - Use batch inference with offline metrics to predict user behavior in advance. - Use batch inference to minimize operational costs. - Use real-time inference to process each request with minimal latency. **CORRECT:** "Use real-time inference to process each request with minimal latency." is the correct answer. Real-time inference is the best approach when a system requires immediate responses to user requests. Since the startup expects millions of daily requests that require near-instant responses, batch inference would not be suitable, as it processes data in bulk at scheduled intervals rather than instantly. Real-time inference processes each request as it arrives, ensuring low latency and quick decision-making. This is essential for applications like chatbots, recommendation systems, fraud detection, and AI-driven customer support. While real-time inference may have higher operational costs than batch processing, it ensures a smooth and responsive user experience, which is critical for AI applications handling high request volumes. **INCORRECT:** "Use batch inference to minimize operational costs." is incorrect. Batch inference is cost-effective for processing large datasets at scheduled times, but it is unsuitable for applications requiring real-time responses. Since the startup expects millions of user requests daily, batch inference would introduce unacceptable delays, making it impractical for real-time applications. **INCORRECT:** "Use a hybrid approach where half the requests are processed in real time and half are batch processed." is incorrect. A hybrid approach is useful for some use cases, such as when some AI tasks require real-time inference while others can be processed in batches. However, for a startup expecting millions of real-time user requests, a hybrid approach would not effectively meet the requirement of minimal latency. **INCORRECT:** "Use batch inference with offline metrics to predict user behavior in advance." is incorrect. While offline batch processing can generate insights and predictive models, it does not address the need for instant responses to user queries. Predicting user behavior in advance does not replace the need for real-time inference when immediate decision-making is required. **References:** https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html Domain: Fundamentals of AI and ML --- #### 60. A retail company is using generative AI to create personalized marketing content. The security team wants to stay ahead of cyberattacks and also reduce any weaknesses in the AI system. What is the best way to meet both goals? - Use threat detection to check for old software libraries, and have vulnerability management block external API calls. - Use threat detection to monitor for live cyberattacks and vulnerability management to find and fix weak spots before they are exploited. - Let vulnerability management track traffic spikes and use threat detection to deal with outdated APIs. - Assign fraud alerts to vulnerability management, and let threat detection focus on finding code injections. **CORRECT:** "Use threat detection to monitor for live cyberattacks and vulnerability management to find and fix weak spots before they are exploited." is the correct answer. Threat detection involves continuously monitoring systems, networks, and applications to detect signs of cyberattacks in real time. It helps identify malicious behaviors such as unauthorized access, data breaches, or suspicious user activity. Vulnerability management is the process of identifying, evaluating, and addressing security flaws or misconfigurations before attackers can exploit them. This includes scanning for outdated software, weak permissions, or exposed endpoints. Together, these two practices form a proactive and reactive security approach: threat detection catches live attacks, while vulnerability management reduces the chances of those attacks being successful by patching known weaknesses. In the case of a retail company using generative AI, this strategy ensures the system is protected both against ongoing cyber threats and future risks, especially when AI systems are connected to customer data or external services. **INCORRECT:** "Let vulnerability management track traffic spikes and use threat detection to deal with outdated APIs." is incorrect. Tracking traffic spikes is generally handled by performance monitoring or anomaly detection, not vulnerability management. Outdated APIs are usually identified through vulnerability scans—not threat detection—which focuses on identifying attacks, not system updates. **INCORRECT:** "Use threat detection to check for old software libraries, and have vulnerability management block external API calls." is incorrect. Threat detection is not designed to scan for outdated software libraries; that's a function of vulnerability management. Also, blocking all external API calls is a rigid approach and not typically handled by vulnerability management tools—it can disrupt legitimate operations. **INCORRECT:** "Assign fraud alerts to vulnerability management, and let threat detection focus on finding code injections." is incorrect. Fraud alerts are often handled by separate systems focused on fraud analytics, not vulnerability management. While threat detection does look for signs like code injections, this answer splits responsibilities incorrectly and oversimplifies the broader roles of these security tools. **References:** https://aws.amazon.com/ai/generative-ai/security Domain: Security, Compliance, and Governance for AI Solutions --- #### 61. A global enterprise needs its generative AI model to be available in multiple regions to serve a worldwide user base. However, they are concerned about the additional costs of regional coverage. What is the key tradeoff they should consider? - Increased regional coverage reduces availability. - Expanding to multiple regions can increase costs. - Global redundancy automatically decreases costs. - High availability always reduces latency and costs. **CORRECT:** "Expanding to multiple regions can increase costs" is the correct answer. When deploying a generative AI model across multiple regions, a company must balance performance and cost. Expanding to multiple regions can indeed increase costs, as it requires maintaining infrastructure and resources in several data centers worldwide. This expansion improves latency and ensures better availability for users in different geographical locations. However, companies must account for expenses related to data replication, redundancy, and regional resource usage, all of which can significantly raise operational costs. Using cloud platforms like AWS, enterprises can optimize these costs by using regional pricing and designing their architecture to balance cost-efficiency with global performance. The tradeoff here is between improved user experience and the associated expenses of global deployment. **INCORRECT:** "Increased regional coverage reduces availability" is incorrect. Expanding to multiple regions typically improves availability by distributing workloads and reducing the risk of a single point of failure. This option is incorrect because increased regional coverage enhances availability rather than reducing it. **INCORRECT:** "Global redundancy automatically decreases costs" is incorrect. While global redundancy can improve reliability and fault tolerance, it does not automatically reduce costs. In fact, maintaining redundant systems across multiple regions tends to increase costs due to resource duplication. Thus, this statement is incorrect. **INCORRECT:** "High availability always reduces latency and costs" is incorrect. High availability may reduce latency by serving users from closer regions, but it often increases costs due to the need for additional infrastructure and resources. The assertion that high availability always reduces costs is incorrect. **References:** https://docs.aws.amazon.com/sap/latest/general/arch-guide-multi-region-architecture-patterns.html Domain: Fundamentals of Generative AI --- #### 62. An online educational platform is using a foundation model to power its AI tutor, which helps students by answering questions related to math, science, and history. The company wants to ensure the AI behaves responsibly—specifically, that it does not generate offensive, biased, or factually incorrect responses, which could harm student trust and violate ethical guidelines. What should the team implement to ensure safe and responsible content generation? - Choose transparent open-source models to guarantee safety and ethical alignment. - Use prompt engineering to guide the model toward safe and appropriate responses. - Use output logging and post-analysis to detect problematic behavior after deployment. - Use guardrails and content filtering mechanisms to ensure responsible output. **CORRECT:** "Use guardrails and content filtering mechanisms to ensure responsible output." is the correct answer. Amazon Bedrock Guardrails is a feature that designed to ensure the safe and responsible use of foundation models. Guardrails help developers control the behavior of generative AI applications by enforcing boundaries on model outputs. These include content filtering for profanity, hate speech, violence, sexual content, and the ability to block specific topics or define custom banned words. This functionality is especially important for use cases in sensitive domains like education, healthcare, or customer support. To configure content filtering in Amazon Bedrock, users can define Guardrail policies using the Bedrock console, AWS SDKs, or CLI. When creating or modifying a Guardrail, developers specify which content categories to block, set confidence thresholds, and optionally define custom filters or blocklists. These Guardrails are then associated with specific model invocations, allowing Bedrock to automatically screen and filter generated content in real time before it's returned to the end user. This makes it easier to deploy compliant, trustworthy applications without requiring complex custom moderation pipelines. **INCORRECT:** "Use prompt engineering to guide the model toward safe and appropriate responses." is incorrect. Prompt engineering helps shape model responses, but it does not guarantee safety or filter harmful outputs. It is a useful technique to encourage desirable behavior, but it cannot replace guardrails or content filters for safety-critical applications. **INCORRECT:** "Use output logging and post-analysis to detect problematic behavior after deployment." is incorrect. Logging and analyzing outputs after deployment helps identify issues, but it's a reactive approach. It doesn't prevent harmful responses from being shown to users in real time, which is a risk in educational settings. **INCORRECT:** "Choose transparent open-source models to guarantee safety and ethical alignment." is incorrect. While open-source models offer transparency, they do not guarantee safety or responsible behavior. Additional safety measures like guardrails or content filtering must still be implemented regardless of model openness. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html https://aws.amazon.com/ai/responsible-ai Domain: Guidelines for Responsible AI --- #### 63. Select and order the Amazon SageMaker Inference based on their latency from the LOWEST to the HIGHEST. Each option should be selected one time. (Select and order THREE.) Note: Select only the correct options, as the type of "Ordering" question is not supported here. - Real-Time Inference - Asynchronous Inference - Batch Transform **CORRECT:** "Real-Time Inference" is the correct answer. Real-Time Inference provides the lowest latency among the three options. It is designed for applications that require immediate responses, such as chatbots, fraud detection systems, or personalized recommendations. Real-time inference is typically implemented using services like Amazon SageMaker Real-Time Inference, which maintains an always-on endpoint capable of responding to prediction requests in milliseconds. This makes it ideal for time-sensitive use cases where users expect instant feedback. **CORRECT:** "Asynchronous Inference" is the correct answer. Asynchronous Inference offers a balance between performance and flexibility. While it introduces slightly higher latency than real-time inference due to request queuing and potential cold start delays, it supports larger payloads and longer processing times. This mode is suitable for tasks where a small delay is acceptable, such as document processing, video analysis, or background AI tasks. AWS services like SageMaker Asynchronous Inference allow developers to submit a job and receive results via a callback or later retrieval. **CORRECT:** "Batch Transform" is the correct answer. Batch Transform has the highest latency of the three inference options. It is optimized for high-throughput processing rather than low-latency responses. Batch Transform is ideal for running predictions on large volumes of data in bulk, such as performing sentiment analysis across a massive dataset of customer reviews. Instead of delivering results in real-time or near-real-time, it processes the entire dataset and returns the output once the job completes. This makes it well-suited for offline or scheduled workloads where immediate response is not necessary. **References:** https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference.html https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html Domain: Applications of Foundation Models --- #### 64. A media company is deploying a generative AI model to automatically produce summaries of news articles. To ensure the summaries reflect the company's editorial voice and tone, including specific language style, structure, and content preferences, the company wants to tailor the model accordingly. What is the most effective approach to achieve this goal? - Train a new language model from scratch using Amazon Comprehend to generate custom summaries. - Fine-tune the foundation model on Amazon Bedrock using a curated dataset of articles paired with editorial-approved summaries. - Use prompt engineering techniques within Amazon Bedrock to manually guide summary generation, without relying on training data. - Build a retrieval-augmented summarization system using Amazon Kendra to extract key sentences from existing content. **CORRECT:** "Fine-tune the foundation model on Amazon Bedrock using a curated dataset of articles paired with editorial-approved summaries." is the correct answer. Fine-tuning a foundation model on Amazon Bedrock using a carefully curated dataset is the most effective approach when an organization wants the model to adopt a specific tone, style, or content structure. In this case, the media company can provide training data that pairs full articles with summaries crafted or approved by their editorial team. This helps the model learn and replicate the organization's preferred voice and summarization standards. Amazon Bedrock supports fine-tuning for select models, allowing companies to customize generative outputs without managing the infrastructure. By leveraging this method, the model can consistently produce high-quality summaries that align with editorial policies. **INCORRECT:** "Train a new language model from scratch using Amazon Comprehend to generate custom summaries." is incorrect. Amazon Comprehend is designed for natural language processing tasks like entity recognition, sentiment analysis, and topic modeling—not for training generative models. Training a model from scratch also requires significant data, compute power, and expertise, which is unnecessary when effective fine-tuning of foundation models is available. **INCORRECT:** "Use prompt engineering techniques within Amazon Bedrock to manually guide summary generation, without relying on training data." is incorrect. Prompt engineering can guide output behavior but is limited when consistent tone, style, and structural alignment are needed. It's not a scalable solution for enforcing detailed editorial guidelines across thousands of summaries. **INCORRECT:** "Build a retrieval-augmented summarization system using Amazon Kendra to extract key sentences from existing content." is incorrect. Amazon Kendra is a search and retrieval service, not a generative AI platform. While it can identify relevant content, it does not generate new summaries or align output with a specific editorial tone. It's more suitable for Q&A or document search use cases. **References:** https://docs.aws.amazon.com/bedrock/latest/userguide/custom-models.html Domain: Fundamentals of Generative AI --- #### 65. A company wants to store and reuse features created during the data preprocessing stage across multiple machine learning models. Which AWS feature is best suited for this task? - Amazon SageMaker Ground Truth - Amazon SageMaker Feature Store - Amazon SageMaker Studio - Amazon SageMaker Data Wrangler **CORRECT:** "Amazon SageMaker Feature Store" is the correct answer. Amazon SageMaker Feature Store is a fully managed repository designed to store, share, and manage machine learning features used in models. It allows data scientists to create, store, and retrieve curated features for both online and offline use, ensuring consistency across model training and inference. By centralizing feature management, it helps eliminate redundant feature engineering, ensures version control, and supports collaboration among teams. SageMaker Feature Store integrates with other SageMaker services, making it easier to streamline the ML pipeline, reduce development time, and improve the scalability of machine learning workflows. **INCORRECT:** "Amazon SageMaker Ground Truth" is incorrect. SageMaker Ground Truth is a tool for labeling data to create high-quality datasets for training models, but it does not focus on storing or reusing features. **INCORRECT:** "Amazon SageMaker Studio" is incorrect. SageMaker Studio is an integrated development environment for machine learning, but it is not specifically designed for managing and storing features for reuse across models. **INCORRECT:** "Amazon SageMaker Data Wrangler" is incorrect. SageMaker Data Wrangler is used for preparing and transforming data, but it does not provide a centralized repository for storing features across multiple models. **References:** https://docs.aws.amazon.com/sagemaker/latest/dg/feature-store.html Domain: Fundamentals of AI and ML