MACHINE LEARNING
Test title: MACHINE LEARNING. Description: ML Certification




You want to train an AutoML model to predict house prices by using a small public dataset stored in BigQuery. You need to prepare the data and want to use the simplest, most efficient approach. What should you do?. Write a query that preprocesses the data by using BigQuery and creates a new table. Create a Vertex AI managed dataset with the new table as the data source. Use Dataflow to preprocess the data. Write the output in TFRecord format to a Cloud Storage bucket. Write a query that preprocesses the data by using BigQuery. Export the query results as CSV files, and use those files to create a Vertex AI managed dataset. Use a Vertex AI Workbench notebook instance to preprocess the data by using the pandas library. Export the data as CSV files, and use those files to create a Vertex AI managed dataset. You developed a Vertex AI ML pipeline that consists of preprocessing and training steps and each set of steps runs on a separate custom Docker image. Your organization uses GitHub and GitHub Actions as CI/CD to run unit and integration tests. You need to automate the model retraining workflow so that it can be initiated both manually and when a new version of the code is merged in the main branch. You want to minimize the steps required to build the workflow while also allowing for maximum flexibility. How should you configure the CI/CD workflow?. Trigger a Cloud Build workflow to run tests, build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines. Trigger GitHub Actions to run the tests, launch a job on Cloud Run to build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines. Trigger GitHub Actions to run the tests, build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines. 
Trigger GitHub Actions to run the tests, launch a Cloud Build workflow to build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines. You are working with a dataset that contains customer transactions. You need to build an ML model to predict customer purchase behavior. You plan to develop the model in BigQuery ML, and export it to Cloud Storage for online prediction. You notice that the input data contains a few categorical features, including product category and payment method. You want to deploy the model as quickly as possible. What should you do?. Use the TRANSFORM clause with the ML.ONE_HOT_ENCODER function on the categorical features at model creation and select the categorical and non-categorical features. Use the ML.ONE_HOT_ENCODER function on the categorical features and select the encoded categorical features and non-categorical features as inputs to create your model. Use the CREATE MODEL statement and select the categorical and non-categorical features. Use the ML.MULTI_HOT_ENCODER function on the categorical features, and select the encoded categorical features and non-categorical features as inputs to create your model. You need to develop an image classification model by using a large dataset that contains labeled images in a Cloud Storage bucket. What should you do?. Use Vertex AI Pipelines with the Kubeflow Pipelines SDK to create a pipeline that reads the images from Cloud Storage and trains the model. Use Vertex AI Pipelines with TensorFlow Extended (TFX) to create a pipeline that reads the images from Cloud Storage and trains the model. Import the labeled images as a managed dataset in Vertex AI and use AutoML to train the model. Convert the image dataset to a tabular format using Dataflow. Load the data into BigQuery and use BigQuery ML to train the model. You are developing a model to detect fraudulent credit card transactions.
You need to prioritize detection, because missing even one fraudulent transaction could severely impact the credit card holder. You used AutoML to train a model on users' profile information and credit card transaction data. After training the initial model, you notice that the model is failing to detect many fraudulent transactions. How should you adjust the training parameters in AutoML to improve model performance? (Choose two.). Increase the score threshold. Decrease the score threshold. Add more positive examples to the training set. Add more negative examples to the training set. Reduce the maximum number of node hours for training. You need to deploy a scikit-learn classification model to production. The model must be able to serve requests 24/7, and you expect millions of requests per second to the production application from 8 am to 7 pm. You need to minimize the cost of deployment. What should you do?. Deploy an online Vertex AI prediction endpoint. Set the max replica count to 1. Deploy an online Vertex AI prediction endpoint. Set the max replica count to 100. Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 1. Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 100. You work with a team of researchers to develop state-of-the-art algorithms for financial analysis. Your team develops and debugs complex models in TensorFlow. You want to maintain the ease of debugging while also reducing the model training time. How should you set up your training environment?. Configure a v3-8 TPU VM. SSH into the VM to train and debug the model. Configure a v3-8 TPU node. Use Cloud Shell to SSH into the Host VM to train and debug the model. Configure a n1-standard-4 VM with 4 NVIDIA P100 GPUs. SSH into the VM and use ParameterServerStrategy to train the model. Configure a n1-standard-4 VM with 4 NVIDIA P100 GPUs.
SSH into the VM and use MultiWorkerMirroredStrategy to train the model. You created an ML pipeline with multiple input parameters. You want to investigate the tradeoffs between different parameter combinations. The parameter options are • Input dataset • Max tree depth of the boosted tree regressor • Optimizer learning rate You need to compare the pipeline performance of the different parameter combinations measured in F1 score, time to train, and model complexity. You want your approach to be reproducible, and track all pipeline runs on the same platform. What should you do?. 1. Use BigQueryML to create a boosted tree regressor, and use the hyperparameter tuning capability. 2. Configure the hyperparameter syntax to select different input datasets: max tree depths, and optimizer learning rates. Choose the grid search option. 1. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline’s parameters to include those you are investigating. 2. In the custom training step, use the Bayesian optimization method with F1 score as the target to maximize. 1. Create a Vertex AI Workbench notebook for each of the different input datasets. 2. In each notebook, run different local training jobs with different combinations of the max tree depth and optimizer learning rate parameters. 3. After each notebook finishes, append the results to a BigQuery table. 1. Create an experiment in Vertex AI Experiments. 2. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline’s parameters to include those you are investigating. 3. Submit multiple runs to the same experiment, using different values for the parameters. You received a training-serving skew alert from a Vertex AI Model Monitoring job running in production. You retrained the model with more recent training data, and deployed it back to the Vertex AI endpoint, but you are still receiving the same alert. What should you do?. 
Update the model monitoring job to use a lower sampling rate. Update the model monitoring job to use the more recent training data that was used to retrain the model. Temporarily disable the alert. Enable the alert again after a sufficient amount of new production traffic has passed through the Vertex AI endpoint. Temporarily disable the alert until the model can be retrained again on newer training data. Retrain the model again after a sufficient amount of new production traffic has passed through the Vertex AI endpoint. You developed a custom model by using Vertex AI to forecast the sales of your company’s products based on historical transactional data. You anticipate changes in the feature distributions and the correlations between the features in the near future. You also expect to receive a large volume of prediction requests. You plan to use Vertex AI Model Monitoring for drift detection and you want to minimize the cost. What should you do?. Use the features for monitoring. Set a monitoring-frequency value that is higher than the default. Use the features for monitoring. Set a prediction-sampling-rate value that is closer to 1 than 0. Use the features and the feature attributions for monitoring. Set a monitoring-frequency value that is lower than the default. Use the features and the feature attributions for monitoring. Set a prediction-sampling-rate value that is closer to 0 than 1. You have recently trained a scikit-learn model that you plan to deploy on Vertex AI. This model will support both online and batch prediction. You need to preprocess input data for model inference. You want to package the model for deployment while minimizing additional code. What should you do?. 1. Upload your model to the Vertex AI Model Registry by using a prebuilt scikit-learn prediction container. 2. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job that uses the instanceConfig.instanceType setting to transform your input data. 1.
Wrap your model in a custom prediction routine (CPR), and build a container image from the CPR local model. 2. Upload your scikit-learn model container to Vertex AI Model Registry. 3. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job. 1. Create a custom container for your scikit-learn model. 2. Define a custom serving function for your model. 3. Upload your model and custom container to Vertex AI Model Registry. 4. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job. 1. Create a custom container for your scikit-learn model. 2. Upload your model and custom container to Vertex AI Model Registry. 3. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job that uses the instanceConfig.instanceType setting to transform your input data. You work for a food product company. Your company’s historical sales data is stored in BigQuery. You need to use Vertex AI’s custom training service to train multiple TensorFlow models that read the data from BigQuery and predict future sales. You plan to implement a data preprocessing algorithm that performs min-max scaling and bucketing on a large number of features before you start experimenting with the models. You want to minimize preprocessing time, cost, and development effort. How should you configure this workflow?. Write the transformations into Spark that uses the spark-bigquery-connector, and use Dataproc to preprocess the data. Write SQL queries to transform the data in-place in BigQuery. Add the transformations as a preprocessing layer in the TensorFlow models. Create a Dataflow pipeline that uses the BigQueryIO connector to ingest the data, process it, and write it back to BigQuery. You have created a Vertex AI pipeline that includes two steps. The first step preprocesses 10 TB of data, completes in about 1 hour, and saves the result in a Cloud Storage bucket. The second step uses the processed data to train a model.
You need to update the model’s code to allow you to test different algorithms. You want to reduce pipeline execution time and cost while also minimizing pipeline changes. What should you do?. Add a pipeline parameter and an additional pipeline step. Depending on the parameter value, the pipeline step conducts or skips data preprocessing, and starts model training. Create another pipeline without the preprocessing step, and hardcode the preprocessed Cloud Storage file location for model training. Configure a machine with more CPU and RAM from the compute-optimized machine family for the data preprocessing step. Enable caching for the pipeline job, and disable caching for the model training step. You work for a bank. You have created a custom model to predict whether a loan application should be flagged for human review. The input features are stored in a BigQuery table. The model is performing well, and you plan to deploy it to production. Due to compliance requirements the model must provide explanations for each prediction. You want to add this functionality to your model code with minimal effort and provide explanations that are as accurate as possible. What should you do?. Create an AutoML tabular model by using the BigQuery data with integrated Vertex Explainable AI. Create a BigQuery ML deep neural network model and use the ML.EXPLAIN_PREDICT method with the num_integral_steps parameter. Upload the custom model to Vertex AI Model Registry and configure feature-based attribution by using sampled Shapley with input baselines. Update the custom serving container to include sampled Shapley-based explanations in the prediction outputs. You recently used XGBoost to train a model in Python that will be used for online serving. Your model prediction service will be called by a backend service implemented in Golang running on a Google Kubernetes Engine (GKE) cluster. Your model requires pre and postprocessing steps. 
You need to implement the processing steps so that they run at serving time. You want to minimize code changes and infrastructure maintenance, and deploy your model into production as quickly as possible. What should you do?. Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server, and deploy it on your organization’s GKE cluster. Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server. Upload the image to Vertex AI Model Registry and deploy it to a Vertex AI endpoint. Use the Predictor interface to implement a custom prediction routine. Build the custom container, upload the container to Vertex AI Model Registry and deploy it to a Vertex AI endpoint. Use the XGBoost prebuilt serving container when importing the trained model into Vertex AI. Deploy the model to a Vertex AI endpoint. Work with the backend engineers to implement the pre- and postprocessing steps in the Golang backend service. You recently deployed a pipeline in Vertex AI Pipelines that trains and pushes a model to a Vertex AI endpoint to serve real-time traffic. You need to continue experimenting and iterating on your pipeline to improve model performance. You plan to use Cloud Build for CI/CD. You want to quickly and easily deploy new pipelines into production, and you want to minimize the chance that the new pipeline implementations will break in production. What should you do?. Set up a CI/CD pipeline that builds and tests your source code. If the tests are successful, use the Google Cloud console to upload the built container to Artifact Registry and upload the compiled pipeline to Vertex AI Pipelines. Set up a CI/CD pipeline that builds your source code and then deploys built artifacts into a pre-production environment. Run unit tests in the pre-production environment. If the tests are successful, deploy the pipeline to production.
Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, deploy the pipeline to production. Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, rebuild the source code and deploy the artifacts to production. You work for a bank with strict data governance requirements. You recently implemented a custom model to detect fraudulent transactions. You want your training code to download internal data by using an API endpoint hosted in your project’s network. You need the data to be accessed in the most secure way, while mitigating the risk of data exfiltration. What should you do?. Enable VPC Service Controls for peerings, and add Vertex AI to a service perimeter. Create a Cloud Run endpoint as a proxy to the data. Use Identity and Access Management (IAM) authentication to secure access to the endpoint from the training job. Configure VPC Peering with Vertex AI, and specify the network of the training job. Download the data to a Cloud Storage bucket before calling the training job. You are deploying a new version of a model to a production Vertex AI endpoint that is serving traffic. You plan to direct all user traffic to the new model. You need to deploy the model with minimal disruption to your application. What should you do?. 1. Create a new endpoint. 2. Create a new model. Set it as the default version. Upload the model to Vertex AI Model Registry. 3. Deploy the new model to the new endpoint. 4. Update Cloud DNS to point to the new endpoint. 1. Create a new endpoint. 2. Create a new model. Set the parentModel parameter to the model ID of the currently deployed model and set it as the default version. Upload the model to Vertex AI Model Registry. 3.
Deploy the new model to the new endpoint, and set the new model to 100% of the traffic. 1. Create a new model. Set the parentModel parameter to the model ID of the currently deployed model. Upload the model to Vertex AI Model Registry. 2. Deploy the new model to the existing endpoint, and set the new model to 100% of the traffic. 1. Create a new model. Set it as the default version. Upload the model to Vertex AI Model Registry. 2. Deploy the new model to the existing endpoint. You are training an ML model on a large dataset. You are using a TPU to accelerate the training process. You notice that the training process is taking longer than expected. You discover that the TPU is not reaching its full capacity. What should you do?. Increase the learning rate. Increase the number of epochs. Decrease the learning rate. Increase the batch size. You work for a retail company. You have a managed tabular dataset in Vertex AI that contains sales data from three different stores. The dataset includes several features, such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon. You need to split the data between the training, validation, and test sets. What approach should you use to split the data?. Use Vertex AI manual split, using the store name feature to assign one store for each set. Use Vertex AI default data split. Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable. Use Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set. You have developed a BigQuery ML model that predicts customer churn, and deployed the model to Vertex AI Endpoints. You want to automate the retraining of your model by using minimal additional code when model feature values change. You also want to minimize the number of times that your model is retrained to reduce training costs. What should you do?.
1. Enable request-response logging on Vertex AI Endpoints. 2. Schedule a TensorFlow Data Validation job to monitor prediction drift. 3. Execute model retraining if there is significant distance between the distributions. 1. Enable request-response logging on Vertex AI Endpoints. 2. Schedule a TensorFlow Data Validation job to monitor training/serving skew. 3. Execute model retraining if there is significant distance between the distributions. 1. Create a Vertex AI Model Monitoring job configured to monitor prediction drift. 2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected. 3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery. 1. Create a Vertex AI Model Monitoring job configured to monitor training/serving skew. 2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected. 3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery. You have been tasked with deploying prototype code to production. The feature engineering code is in PySpark and runs on Dataproc Serverless. The model training is executed by using a Vertex AI custom training job. The two steps are not connected, and the model training must currently be run manually after the feature engineering step finishes. You need to create a scalable and maintainable production process that runs end-to-end and tracks the connections between steps. What should you do?. Create a Vertex AI Workbench notebook. Use the notebook to submit the Dataproc Serverless feature engineering job. Use the same notebook to submit the custom model training job. Run the notebook cells sequentially to tie the steps together end-to-end. Create a Vertex AI Workbench notebook. Initiate an Apache Spark context in the notebook and run the PySpark feature engineering code. Use the same notebook to run the custom model training job in TensorFlow.
Run the notebook cells sequentially to tie the steps together end-to-end. Use the Kubeflow Pipelines SDK to write code that specifies two components: - The first is a Dataproc Serverless component that launches the feature engineering job - The second is a custom component wrapped in the create_custom_training_job_from_component utility that launches the custom model training job. Create a Vertex AI Pipelines job to link and run both components. Use the Kubeflow Pipelines SDK to write code that specifies two components: - The first component initiates an Apache Spark context that runs the PySpark feature engineering code - The second component runs the TensorFlow custom model training code. Create a Vertex AI Pipelines job to link and run both components. You recently deployed a scikit-learn model to a Vertex AI endpoint. You are now testing the model on live production traffic. While monitoring the endpoint, you discover twice as many requests per hour as expected throughout the day. You want the endpoint to efficiently scale when the demand increases in the future to prevent users from experiencing high latency. What should you do?. Deploy two models to the same endpoint, and distribute requests among them evenly. Configure an appropriate minReplicaCount value based on expected baseline traffic. Set the target utilization percentage in the autoscalingMetricSpecs configuration to a higher value. Change the model’s machine type to one that utilizes GPUs. You work at a bank. You have a custom tabular ML model that was provided by the bank’s vendor. The training data is not available due to its sensitivity. The model is packaged as a Vertex AI Model serving container, which accepts a string as input for each prediction instance. In each string, the feature values are separated by commas. You want to deploy this model to production for online predictions and monitor the feature distribution over time with minimal effort. What should you do?. 1.
Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint. 2. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective, and provide an instance schema. 1. Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint. 2. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective, and provide an instance schema. 1. Refactor the serving container to accept key-value pairs as input format. 2. Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint. 3. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective. 1. Refactor the serving container to accept key-value pairs as input format. 2. Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint. 3. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective. You are implementing a batch inference ML pipeline in Google Cloud. The model was developed using TensorFlow and is stored in SavedModel format in Cloud Storage. You need to apply the model to a historical dataset containing 10 TB of data that is stored in a BigQuery table. How should you perform the inference?. Export the historical data to Cloud Storage in Avro format. Configure a Vertex AI batch prediction job to generate predictions for the exported data. Import the TensorFlow model by using the CREATE MODEL statement in BigQuery ML. Apply the historical data to the TensorFlow model. Export the historical data to Cloud Storage in CSV format. Configure a Vertex AI batch prediction job to generate predictions for the exported data. Configure a Vertex AI batch prediction job to apply the model to the historical data in BigQuery. You recently deployed a model to a Vertex AI endpoint.
Your data drifts frequently, so you have enabled request-response logging and created a Vertex AI Model Monitoring job. You have observed that your model is receiving higher traffic than expected. You need to reduce the model monitoring cost while continuing to quickly detect drift. What should you do?. Replace the monitoring job with a Dataflow pipeline that uses TensorFlow Data Validation (TFDV). Replace the monitoring job with a custom SQL script to calculate statistics on the features and predictions in BigQuery. Decrease the sample_rate parameter in the RandomSampleConfig of the monitoring job. Increase the monitor_interval parameter in the ScheduleConfig of the monitoring job. You work for a retail company. You have created a Vertex AI forecast model that produces monthly item sales predictions. You want to quickly create a report that will help to explain how the model calculates the predictions. You have one month of recent actual sales data that was not included in the training dataset. How should you generate data for your report?. Create a batch prediction job by using the actual sales data. Compare the predictions to the actuals in the report. Create a batch prediction job by using the actual sales data, and configure the job settings to generate feature attributions. Compare the results in the report. Generate counterfactual examples by using the actual sales data. Create a batch prediction job using the actual sales data and the counterfactual examples. Compare the results in the report. Train another model by using the same training dataset as the original, and exclude some columns. Using the actual sales data, create one batch prediction job by using the new model and another one with the original model. Compare the two sets of predictions in the report. Your team has a model deployed to a Vertex AI endpoint. You have created a Vertex AI pipeline that automates the model training process and is triggered by a Cloud Function.
You need to prioritize keeping the model up-to-date, but also minimize retraining costs. How should you configure retraining?. Configure Pub/Sub to call the Cloud Function when a sufficient amount of new data becomes available. Configure a Cloud Scheduler job that calls the Cloud Function at a predetermined frequency that fits your team’s budget. Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when anomalies are detected. Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when feature drift is detected. Your company stores a large number of audio files of phone calls made to your customer call center in an on-premises database. Each audio file is in WAV format and is approximately 5 minutes long. You need to analyze these audio files for customer sentiment. You plan to use the Speech-to-Text API. You want to use the most efficient approach. What should you do?. 1. Upload the audio files to Cloud Storage. 2. Call the speech:longrunningrecognize API endpoint to generate transcriptions. 3. Call the predict method of an AutoML sentiment analysis model to analyze the transcriptions. 1. Upload the audio files to Cloud Storage. 2. Call the speech:longrunningrecognize API endpoint to generate transcriptions. 3. Create a Cloud Function that calls the Natural Language API by using the analyzeSentiment method. 1. Iterate over your local files in Python. 2. Use the Speech-to-Text Python library to create a speech.RecognitionAudio object, and set the content to the audio file data. 3. Call the speech:recognize API endpoint to generate transcriptions. 4. Call the predict method of an AutoML sentiment analysis model to analyze the transcriptions. 1. Iterate over your local files in Python. 2. Use the Speech-to-Text Python library to create a speech.RecognitionAudio object and set the content to the audio file data. 3. Call the speech:longrunningrecognize API endpoint to generate transcriptions. 4.
Call the Natural Language API by using the analyzeSentiment method. You work for a social media company. You want to create a no-code image classification model for an iOS mobile application to identify fashion accessories. You have a labeled dataset in Cloud Storage. You need to configure a training workflow that minimizes cost and serves predictions with the lowest possible latency. What should you do?. Train the model by using AutoML, and register the model in Vertex AI Model Registry. Configure your mobile application to send batch requests during prediction. Train the model by using AutoML Edge, and export it as a Core ML model. Configure your mobile application to use the .mlmodel file directly. Train the model by using AutoML Edge, and export the model as a TFLite model. Configure your mobile application to use the .tflite file directly. Train the model by using AutoML, and expose the model as a Vertex AI endpoint. Configure your mobile application to invoke the endpoint during prediction. You work for a retail company. You have been asked to develop a model to predict whether a customer will purchase a product on a given day. Your team has processed the company’s sales data, and created a table with the following columns: • Customer_id • Product_id • Date • Days_since_last_purchase (measured in days) • Average_purchase_frequency (measured in 1/days) • Purchase (binary class, if customer purchased product on the Date) You need to interpret your model’s results for each individual prediction. What should you do?. Create a BigQuery table.
Use BigQuery ML to build a boosted tree classifier. Inspect the partition rules of the trees to understand how each prediction flows through the trees. Create a Vertex AI tabular dataset. Train an AutoML model to predict customer purchases. Deploy the model to a Vertex AI endpoint and enable feature attributions. Use the “explain” method to get feature attribution values for each individual prediction. Create a BigQuery table. Use BigQuery ML to build a logistic regression classification model. Use the values of the coefficients of the model to interpret the feature importance, with higher values corresponding to more importance. Create a Vertex AI tabular dataset. Train an AutoML model to predict customer purchases. Deploy the model to a Vertex AI endpoint. At each prediction, enable L1 regularization to detect non-informative features. You work for a company that captures live video footage of checkout areas in their retail stores. You need to use the live video footage to build a model to detect the number of customers waiting for service in near real time. You want to implement a solution quickly and with minimal effort. How should you build the model?. Use the Vertex AI Vision Occupancy Analytics model. Use the Vertex AI Vision Person/vehicle detector model. Train an AutoML object detection model on an annotated dataset by using Vertex AutoML. Train a Seq2Seq+ object detection model on an annotated dataset by using Vertex AutoML. You work as an analyst at a large banking firm. You are developing a robust, scalable ML pipeline to train several regression and classification models. Your primary focus for the pipeline is model interpretability. You want to productionize the pipeline as quickly as possible. What should you do?. Use Tabular Workflow for Wide & Deep through Vertex AI Pipelines to jointly train wide linear models and deep neural networks. Use Google Kubernetes Engine to build a custom training pipeline for XGBoost-based models.
Use Tabular Workflow for TabNet through Vertex AI Pipelines to train attention-based models. Use Cloud Composer to build the training pipelines for custom deep learning-based models. You developed a Transformer model in TensorFlow to translate text. Your training data includes millions of documents in a Cloud Storage bucket. You plan to use distributed training to reduce training time. You need to configure the training job while minimizing the effort required to modify code and to manage the cluster’s configuration. What should you do?. Create a Vertex AI custom training job with GPU accelerators for the second worker pool. Use tf.distribute.MultiWorkerMirroredStrategy for distribution. Create a Vertex AI custom distributed training job with Reduction Server. Use N1 high-memory machine type instances for the first and second pools, and use N1 high-CPU machine type instances for the third worker pool. Create a training job that uses Cloud TPU VMs. Use tf.distribute.TPUStrategy for distribution. Create a Vertex AI custom training job with a single worker pool of A2 GPU machine type instances. Use tf.distribute.MirroredStrategy for distribution. You are developing a process for training and running your custom model in production. You need to be able to show lineage for your model and predictions. What should you do?. 1. Create a Vertex AI managed dataset. 2. Use a Vertex AI training pipeline to train your model. 3. Generate batch predictions in Vertex AI. 1. Use a Vertex AI Pipelines custom training job component to train your model. 2. Generate predictions by using a Vertex AI Pipelines model batch predict component. 1. Upload your dataset to BigQuery. 2. Use a Vertex AI custom training job to train your model. 3. Generate predictions by using Vertex AI SDK custom prediction routines. 1. Use Vertex AI Experiments to train your model. 2. Register your model in Vertex AI Model Registry. 3. Generate batch predictions in Vertex AI.
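Study note on the distributed-training question above: tf.distribute.MultiWorkerMirroredStrategy finds its peer workers through the TF_CONFIG environment variable, a JSON string with a "cluster" section and a "task" section. The sketch below only builds that string in plain Python, to show its shape. The host names and the helper name build_tf_config are hypothetical, and a managed training service may set the variable for you, so treat this as an illustration rather than required setup.

```python
import json


def build_tf_config(worker_hosts, task_index):
    """Return a TF_CONFIG-style JSON string for one worker (illustrative helper)."""
    if not 0 <= task_index < len(worker_hosts):
        raise ValueError("task_index must refer to one of worker_hosts")
    config = {
        # Every worker sees the same cluster description.
        "cluster": {"worker": list(worker_hosts)},
        # Each worker differs only in its own task index.
        "task": {"type": "worker", "index": task_index},
    }
    return json.dumps(config)


# Hypothetical two-worker cluster; the addresses are placeholders.
hosts = ["worker-0.example:2222", "worker-1.example:2222"]
tf_config_for_second_worker = build_tf_config(hosts, task_index=1)
```

In a real job the resulting string would be placed in the TF_CONFIG environment variable before the strategy object is created.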
You work for a hotel and have a dataset that contains customers’ written comments scanned from paper-based customer feedback forms, which are stored as PDF files. Every form has the same layout. You need to quickly predict an overall satisfaction score from the customer comments on each form. How should you accomplish this task?. Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores. Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores. Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores. Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores. You developed a Vertex AI pipeline that trains a classification model on data stored in a large BigQuery table. The pipeline has four steps, where each step is created by a Python function that uses the Kubeflow v2 API. The components have the following names: (image 1) You launch your Vertex AI pipeline as the following: (image 2) You perform many model iterations by adjusting the code and parameters of the training step. You observe high costs associated with the development, particularly the data export and preprocessing steps. You need to reduce model development costs. What should you do?. Change the components’ YAML filenames to export.yaml, preprocess.yaml, f"train-{dt}.yaml", f"calibrate-{dt}.yaml". Add the {"kubeflow.v1.caching": True} parameter to the set of params provided to your PipelineJob. Move the first step of your pipeline to a separate step, and provide a cached path to Cloud Storage as an input to the main pipeline.
Change the name of the pipeline to f"my-awesome-pipeline-{dt}". You work for a startup that has multiple data science workloads. Your compute infrastructure is currently on-premises, and the data science workloads are native to PySpark. Your team plans to migrate their data science workloads to Google Cloud. You need to build a proof of concept to migrate one data science job to Google Cloud. You want to propose a migration process that requires minimal cost and effort. What should you do first?. Create a n2-standard-4 VM instance and install Java, Scala, and Apache Spark dependencies on it. Create a Google Kubernetes Engine cluster with a basic node pool configuration, install Java, Scala, and Apache Spark dependencies on it. Create a Standard (1 master, 3 workers) Dataproc cluster, and run a Vertex AI Workbench notebook instance on it. Create a Vertex AI Workbench notebook with instance type n2-standard-4. You work for a bank. You have been asked to develop an ML model that will support loan application decisions. You need to determine which Vertex AI services to include in the workflow. You want to track the model’s training parameters and the metrics per training epoch. You plan to compare the performance of each version of the model to determine the best model based on your chosen metrics. Which Vertex AI services should you use?. Vertex ML Metadata, Vertex AI Feature Store, and Vertex AI Vizier. Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Vizier. Vertex ML Metadata, Vertex AI Experiments, and Vertex AI TensorBoard. Vertex AI Pipelines, Vertex AI Feature Store, and Vertex AI TensorBoard. You work for an auto insurance company. You are preparing a proof-of-concept ML application that uses images of damaged vehicles to infer damaged parts. Your team has assembled a set of annotated images from damage claim documents in the company’s database. 
The annotations associated with each image consist of a bounding box for each identified damaged part and the part name. You have been given a sufficient budget to train models on Google Cloud. You need to quickly create an initial model. What should you do?. Download a pre-trained object detection model from TensorFlow Hub. Fine-tune the model in Vertex AI Workbench by using the annotated image data. Train an object detection model in AutoML by using the annotated image data. Create a pipeline in Vertex AI Pipelines and configure the AutoMLTrainingJobRunOp component to train a custom object detection model by using the annotated image data. Train an object detection model in Vertex AI custom training by using the annotated image data. You are analyzing customer data for a healthcare organization that is stored in Cloud Storage. The data contains personally identifiable information (PII). You need to perform data exploration and preprocessing while ensuring the security and privacy of sensitive fields. What should you do?. Use the Cloud Data Loss Prevention (DLP) API to de-identify the PII before performing data exploration and preprocessing. Use customer-managed encryption keys (CMEK) to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing. Use a VM inside a VPC Service Controls security perimeter to perform data exploration and preprocessing. Use Google-managed encryption keys to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing. You are building a predictive maintenance model to preemptively detect part defects in bridges. You plan to use high definition images of the bridges as model inputs. You need to explain the output of the model to the relevant stakeholders so they can take appropriate action. How should you build the model?. Use scikit-learn to build a tree-based model, and use SHAP values to explain the model output.
Use scikit-learn to build a tree-based model, and use partial dependence plots (PDP) to explain the model output. Use TensorFlow to create a deep learning-based model, and use Integrated Gradients to explain the model output. Use TensorFlow to create a deep learning-based model, and use the sampled Shapley method to explain the model output. You work for a hospital that wants to optimize how it schedules operations. You need to create a model that uses the relationship between the number of surgeries scheduled and beds used. You want to predict how many beds will be needed for patients each day in advance based on the scheduled surgeries. You have one year of data for the hospital organized in 365 rows. The data includes the following variables for each day: • Number of scheduled surgeries • Number of beds occupied • Date You want to maximize the speed of model development and testing. What should you do?. Create a BigQuery table. Use BigQuery ML to build a regression model, with number of beds as the target variable, and number of scheduled surgeries and date features (such as day of week) as the predictors. Create a BigQuery table. Use BigQuery ML to build an ARIMA model, with number of beds as the target variable, and date as the time variable. Create a Vertex AI tabular dataset. Train an AutoML regression model, with number of beds as the target variable, and number of scheduled minor surgeries and date features (such as day of the week) as the predictors. Create a Vertex AI tabular dataset. Train a Vertex AI AutoML Forecasting model, with number of beds as the target variable, number of scheduled surgeries as a covariate and date as the time variable. You recently developed a wide and deep model in TensorFlow. You generated training datasets using a SQL script that preprocessed raw data in BigQuery by performing instance-level transformations of the data. You need to create a training pipeline to retrain the model on a weekly basis.
The trained model will be used to generate daily recommendations. You want to minimize model development and training time. How should you develop the training pipeline?. Use the Kubeflow Pipelines SDK to implement the pipeline. Use the BigQueryJobOp component to run the preprocessing script and the CustomTrainingJobOp component to launch a Vertex AI training job. Use the Kubeflow Pipelines SDK to implement the pipeline. Use the DataflowPythonJobOp component to preprocess the data and the CustomTrainingJobOp component to launch a Vertex AI training job. Use the TensorFlow Extended SDK to implement the pipeline. Use the ExampleGen component with the BigQuery executor to ingest the data, the Transform component to preprocess the data, and the Trainer component to launch a Vertex AI training job. Use the TensorFlow Extended SDK to implement the pipeline. Implement the preprocessing steps as part of the input_fn of the model. Use the ExampleGen component with the BigQuery executor to ingest the data and the Trainer component to launch a Vertex AI training job. You are training a custom language model for your company using a large dataset. You plan to use the Reduction Server strategy on Vertex AI. You need to configure the worker pools of the distributed training job. What should you do?. Configure the machines of the first two worker pools to have GPUs, and to use a container image where your training code runs. Configure the third worker pool to have GPUs, and use the reductionserver container image. Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth. Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs.
Configure the third worker pool without accelerators, and use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth. Configure the machines of the first two pools to have TPUs, and to use a container image where your training code runs. Configure the third pool to have TPUs, and use the reductionserver container image. You have trained a model by using data that was preprocessed in a batch Dataflow pipeline. Your use case requires real-time inference. You want to ensure that the data preprocessing logic is applied consistently between training and serving. What should you do?. Perform data validation to ensure that the input data to the pipeline is the same format as the input data to the endpoint. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Use the same code in the endpoint. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Share this code with the end users of the endpoint. Batch the real-time requests by using a time window and then use the Dataflow pipeline to preprocess the batched requests. Send the preprocessed requests to the endpoint. You need to develop a custom TensorFlow model that will be used for online predictions. The training data is stored in BigQuery. You need to apply instance-level data transformations to the data for model training and serving. You want to use the same preprocessing routine during model training and serving. How should you configure the preprocessing routine?. Create a BigQuery script to preprocess the data, and write the result to another BigQuery table. Create a pipeline in Vertex AI Pipelines to read the data from BigQuery and preprocess it using a custom preprocessing component. Create a preprocessing function that reads and transforms the data from BigQuery.
Create a Vertex AI custom prediction routine that calls the preprocessing function at serving time. Create an Apache Beam pipeline to read the data from BigQuery and preprocess it by using TensorFlow Transform and Dataflow. You are pre-training a large language model on Google Cloud. This model includes custom TensorFlow operations in the training loop. Model training will use a large batch size, and you expect training to take several weeks. You need to configure a training architecture that minimizes both training time and compute costs. What should you do?. Implement 8 workers of a2-megagpu-16g machines by using tf.distribute.MultiWorkerMirroredStrategy. Implement a TPU Pod slice with --accelerator-type=v4-128 by using tf.distribute.TPUStrategy. Implement 16 workers of c2d-highcpu-32 machines by using tf.distribute.MirroredStrategy. Implement 16 workers of a2-highgpu-8g machines by using tf.distribute.MultiWorkerMirroredStrategy. You are developing an ML pipeline using Vertex AI Pipelines. You want your pipeline to upload a new version of the XGBoost model to Vertex AI Model Registry and deploy it to Vertex AI Endpoints for online inference. You want to use the simplest approach. What should you do?. Use the Vertex AI REST API within a custom component based on a vertex-ai/prediction/xgboost-cpu image. Use the Vertex AI ModelEvaluationOp component to evaluate the model. Use the Vertex AI SDK for Python within a custom component based on a python:3.10 image. Chain the Vertex AI ModelUploadOp and ModelDeployOp components together. You work for an online retailer. Your company has a few thousand short lifecycle products. Your company has five years of sales data stored in BigQuery. You have been asked to build a model that will make monthly sales predictions for each product. You want to use a solution that can be implemented quickly with minimal effort. What should you do?. Use Prophet on Vertex AI Training to build a custom model.
Use Vertex AI Forecast to build a NN-based model. Use BigQuery ML to build a statistical ARIMA_PLUS model. Use TensorFlow on Vertex AI Training to build a custom model. You are creating a model training pipeline to predict sentiment scores from text-based product reviews. You want to have control over how the model parameters are tuned, and you will deploy the model to an endpoint after it has been trained. You will use Vertex AI Pipelines to run the pipeline. You need to decide which Google Cloud pipeline components to use. What components should you choose?. TabularDatasetCreateOp, CustomTrainingJobOp, and EndpointCreateOp. TextDatasetCreateOp, AutoMLTextTrainingOp, and EndpointCreateOp. TabularDatasetCreateOp, AutoMLTextTrainingOp, and ModelDeployOp. TextDatasetCreateOp, CustomTrainingJobOp, and ModelDeployOp. Your team frequently creates new ML models and runs experiments. Your team pushes code to a single repository hosted on Cloud Source Repositories. You want to create a continuous integration pipeline that automatically retrains the models whenever there is any modification of the code. What should be your first step to set up the CI pipeline?. Configure a Cloud Build trigger with the event set as "Pull Request". Configure a Cloud Build trigger with the event set as "Push to a branch". Configure a Cloud Function that builds the repository each time there is a code change. Configure a Cloud Function that builds the repository each time a new branch is created. You have built a custom model that performs several memory-intensive preprocessing tasks before it makes a prediction. You deployed the model to a Vertex AI endpoint, and validated that results were received in a reasonable amount of time. After routing user traffic to the endpoint, you discover that the endpoint does not autoscale as expected when receiving multiple requests. What should you do?. Use a machine type with more memory. Decrease the number of workers per machine.
Increase the CPU utilization target in the autoscaling configurations. Decrease the CPU utilization target in the autoscaling configurations. Your company manages an ecommerce website. You developed an ML model that recommends additional products to users in near real time based on items currently in the user’s cart. The workflow will include the following processes: 1. The website will send a Pub/Sub message with the relevant data and then receive a message with the prediction from Pub/Sub 2. Predictions will be stored in BigQuery 3. The model will be stored in a Cloud Storage bucket and will be updated frequently You want to minimize prediction latency and the effort required to update the model. How should you reconfigure the architecture?. Write a Cloud Function that loads the model into memory for prediction. Configure the function to be triggered when messages are sent to Pub/Sub. Create a pipeline in Vertex AI Pipelines that performs preprocessing, prediction, and postprocessing. Configure the pipeline to be triggered by a Cloud Function when messages are sent to Pub/Sub. Expose the model as a Vertex AI endpoint. Write a custom DoFn in a Dataflow job that calls the endpoint for prediction. Use the RunInference API with WatchFilePattern in a Dataflow job that wraps around the model and serves predictions. You are collaborating on a model prototype with your team. You need to create a Vertex AI Workbench environment for the members of your team and also limit access to other employees in your project. What should you do?. 1. Create a new service account and grant it the Notebook Viewer role 2. Grant the Service Account User role to each team member on the service account 3. Grant the Vertex AI User role to each team member 4. Provision a Vertex AI Workbench user-managed notebook instance that uses the new service account. 1. Grant the Vertex AI User role to the default Compute Engine service account 2. 
Grant the Service Account User role to each team member on the default Compute Engine service account 3. Provision a Vertex AI Workbench user-managed notebook instance that uses the default Compute Engine service account. 1. Create a new service account and grant it the Vertex AI User role 2. Grant the Service Account User role to each team member on the service account 3. Grant the Notebook Viewer role to each team member. 4. Provision a Vertex AI Workbench user-managed notebook instance that uses the new service account. 1. Grant the Vertex AI User role to the primary team member 2. Grant the Notebook Viewer role to the other team members 3. Provision a Vertex AI Workbench user-managed notebook instance that uses the primary user’s account. You work at a leading healthcare firm developing state-of-the-art algorithms for various use cases. You have unstructured textual data with custom labels. You need to extract and classify various medical phrases with these labels. What should you do?. Use the Healthcare Natural Language API to extract medical entities. Use a BERT-based model to fine-tune a medical entity extraction model. Use AutoML Entity Extraction to train a medical entity extraction model. Use TensorFlow to build a custom medical entity extraction model. You developed a custom model by using Vertex AI to predict your application's user churn rate. You are using Vertex AI Model Monitoring for skew detection. The training data stored in BigQuery contains two sets of features - demographic and behavioral. You later discover that two separate models trained on each set perform better than the original model. You need to configure a new model monitoring pipeline that splits traffic among the two models. You want to use the same prediction-sampling-rate and monitoring-frequency for each model. You also want to minimize management effort. What should you do?. Keep the training dataset as is.
Deploy the models to two separate endpoints, and submit two Vertex AI Model Monitoring jobs with appropriately selected feature-thresholds parameters. Keep the training dataset as is. Deploy both models to the same endpoint and submit a Vertex AI Model Monitoring job with a monitoring-config-from-file parameter that accounts for the model IDs and feature selections. Separate the training dataset into two tables based on demographic and behavioral features. Deploy the models to two separate endpoints, and submit two Vertex AI Model Monitoring jobs. Separate the training dataset into two tables based on demographic and behavioral features. Deploy both models to the same endpoint, and submit a Vertex AI Model Monitoring job with a monitoring-config-from-file parameter that accounts for the model IDs and training datasets. You work for a pharmaceutical company based in Canada. Your team developed a BigQuery ML model to predict the number of flu infections for the next month in Canada. Weather data is published weekly, and flu infection statistics are published monthly. You need to configure a model retraining policy that minimizes cost. What should you do?. Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model weekly. Download the weather and flu data each month. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model monthly. Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model every month. Download the weather data each week, and download the flu data each month. Deploy the model to a Vertex AI endpoint with feature drift monitoring, and retrain the model if a monitoring alert is detected. You are building a MLOps platform to automate your company’s ML experiments and model retraining. You need to organize the artifacts for dozens of pipelines. How should you store the pipelines’ artifacts?. 
Store parameters in Cloud SQL, and store the models’ source code and binaries in GitHub. Store parameters in Cloud SQL, store the models’ source code in GitHub, and store the models’ binaries in Cloud Storage. Store parameters in Vertex ML Metadata, store the models’ source code in GitHub, and store the models’ binaries in Cloud Storage. Store parameters in Vertex ML Metadata and store the models’ source code and binaries in GitHub. You work for a telecommunications company. You’re building a model to predict which customers may fail to pay their next phone bill. The purpose of this model is to proactively offer at-risk customers assistance such as service discounts and bill deadline extensions. The data is stored in BigQuery and the predictive features that are available for model training include: - Customer_id - Age - Salary (measured in local currency) - Sex - Average bill value (measured in local currency) - Number of phone calls in the last month (integer) - Average duration of phone calls (measured in minutes) You need to investigate and mitigate potential bias against disadvantaged groups, while preserving model accuracy. What should you do?. Determine whether there is a meaningful correlation between the sensitive features and the other features. Train a BigQuery ML boosted trees classification model and exclude the sensitive features and any meaningfully correlated features. Train a BigQuery ML boosted trees classification model with all features. Use the ML.GLOBAL_EXPLAIN method to calculate the global attribution values for each feature of the model. If the feature importance value for any of the sensitive features exceeds a threshold, discard the model and train without this feature. Train a BigQuery ML boosted trees classification model with all features. Use the ML.EXPLAIN_PREDICT method to calculate the attribution values for each feature for each customer in a test set.
If for any individual customer, the importance value for any feature exceeds a predefined threshold, discard the model and train the model again without this feature. Define a fairness metric that is represented by accuracy across the sensitive features. Train a BigQuery ML boosted trees classification model with all features. Use the trained model to make predictions on a test set. Join the data back with the sensitive features, and calculate a fairness metric to investigate whether it meets your requirements. You recently trained a XGBoost model that you plan to deploy to production for online inference. Before sending a predict request to your model’s binary, you need to perform a simple data preprocessing step. This step exposes a REST API that accepts requests in your internal VPC Service Controls and returns predictions. You want to configure this preprocessing step while minimizing cost and effort. What should you do?. Store a pickled model in Cloud Storage. Build a Flask-based app, package the app in a custom container image, and deploy the model to Vertex AI Endpoints. Build a Flask-based app, package the app and a pickled model in a custom container image, and deploy the model to Vertex AI Endpoints. Build a custom predictor class based on XGBoost Predictor from the Vertex AI SDK, package it and a pickled model in a custom container image based on a Vertex built-in image, and deploy the model to Vertex AI Endpoints. Build a custom predictor class based on XGBoost Predictor from the Vertex AI SDK, and package the handler in a custom container image based on a Vertex built-in container image. Store a pickled model in Cloud Storage, and deploy the model to Vertex AI Endpoints. You work at a bank. You need to develop a credit risk model to support loan application decisions. You decide to implement the model by using a neural network in TensorFlow. Due to regulatory requirements, you need to be able to explain the model’s predictions based on its features. 
When the model is deployed, you also want to monitor the model’s performance over time. You decided to use Vertex AI for both model development and deployment. What should you do?. Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution drift. Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution skew. Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution drift. Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution skew. You are investigating the root cause of a misclassification error made by one of your models. You used Vertex AI Pipelines to train and deploy the model. The pipeline reads data from BigQuery, creates a copy of the data in Cloud Storage in TFRecord format, trains the model in Vertex AI Training on that copy, and deploys the model to a Vertex AI endpoint. You have identified the specific version of that model that misclassified, and you need to recover the data this model was trained on. How should you find that copy of the data?. Use Vertex AI Feature Store. Modify the pipeline to use the feature store, and ensure that all training data is stored in it. Search the feature store for the data used for the training. Use the lineage feature of Vertex AI Metadata to find the model artifact. Determine the version of the model and identify the step that creates the data copy and search in the metadata for its location. Use the logging features in the Vertex AI endpoint to determine the timestamp of the model’s deployment. Find the pipeline run at that timestamp. Identify the step that creates the data copy, and search in the logs for its location. Find the job ID in Vertex AI Training corresponding to the training for the model.
Search in the logs of that job for the data used for the training. You work for a manufacturing company. You need to train a custom image classification model to detect product defects at the end of an assembly line. Although your model is performing well, some images in your holdout set are consistently mislabeled with high confidence. You want to use Vertex AI to understand your model’s results. What should you do?. Configure feature-based explanations by using Integrated Gradients. Set visualization type to PIXELS, and set clip_percent_upperbound to 95. Create an index by using Vertex AI Matching Engine. Query the index with your mislabeled images. Configure feature-based explanations by using XRAI. Set visualization type to OUTLINES, and set polarity to positive. Configure example-based explanations. Specify the embedding output layer to be used for the latent space representation. You are training models in Vertex AI by using data that spans across multiple Google Cloud projects. You need to find, track, and compare the performance of the different versions of your models. Which Google Cloud services should you include in your ML workflow?. Dataplex, Vertex AI Feature Store, and Vertex AI TensorBoard. Vertex AI Pipelines, Vertex AI Feature Store, and Vertex AI Experiments. Dataplex, Vertex AI Experiments, and Vertex AI ML Metadata. Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Metadata. You are using Keras and TensorFlow to develop a fraud detection model. Records of customer transactions are stored in a large table in BigQuery. You need to preprocess these records in a cost-effective and efficient way before you use them to train the model. The trained model will be used to perform batch inference in BigQuery. How should you implement the preprocessing workflow?. Implement a preprocessing pipeline by using Apache Spark, and run the pipeline on Dataproc. Save the preprocessed data as CSV files in a Cloud Storage bucket. 
Load the data into a pandas DataFrame. Implement the preprocessing steps using pandas transformations, and train the model directly on the DataFrame. Perform preprocessing in BigQuery by using SQL. Use the BigQueryClient in TensorFlow to read the data directly from BigQuery. Implement a preprocessing pipeline by using Apache Beam, and run the pipeline on Dataflow. Save the preprocessed data as CSV files in a Cloud Storage bucket. You need to use TensorFlow to train an image classification model. Your dataset is located in a Cloud Storage directory and contains millions of labeled images. Before training the model, you need to prepare the data. You want the data preprocessing and model training workflow to be as efficient, scalable, and low maintenance as possible. What should you do?. 1. Create a Dataflow job that creates sharded TFRecord files in a Cloud Storage directory. 2. Reference tf.data.TFRecordDataset in the training script. 3. Train the model by using Vertex AI Training with a V100 GPU. 1. Create a Dataflow job that moves the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label. 2. Reference tfds.folder_dataset.ImageFolder in the training script. 3. Train the model by using Vertex AI Training with a V100 GPU. 1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance. 2. Write a Python script that creates sharded TFRecord files in a directory inside the instance. 3. Reference tf.data.TFRecordDataset in the training script. 4. Train the model by using the Workbench instance. 1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance. 2. Write a Python script that copies the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label. 3. Reference tfds.folder_dataset.ImageFolder in the training script. 4. Train the model by using the Workbench instance.
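Study note on the "sharded TFRecord files" mentioned in the options above: sharding means splitting one large dataset across many files so that readers can consume them in parallel. The sketch below shows the idea in plain Python only; it does not call TensorFlow, Dataflow, or Cloud Storage. The function names, the round-robin assignment, and the "prefix-00003-of-00010" naming pattern are assumptions made for illustration.

```python
def shard_filename(prefix, shard_index, num_shards):
    """Build a shard file name such as 'train-00003-of-00010.tfrecord'."""
    return f"{prefix}-{shard_index:05d}-of-{num_shards:05d}.tfrecord"


def assign_to_shards(items, num_shards):
    """Distribute items across num_shards lists in round-robin order."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    shards = [[] for _ in range(num_shards)]
    for position, item in enumerate(items):
        shards[position % num_shards].append(item)
    return shards


# Hypothetical image paths; a real job would list objects in a bucket.
image_paths = [f"images/img_{i}.jpg" for i in range(7)]
plan = {
    shard_filename("train", index, 3): paths
    for index, paths in enumerate(assign_to_shards(image_paths, 3))
}
```

Each key in plan is a shard file name and each value is the list of inputs that would be serialized into that file; a training script would then read all shards matching the prefix.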
You are building a custom image classification model and plan to use Vertex AI Pipelines to implement the end-to-end training. Your dataset consists of images that need to be preprocessed before they can be used to train the model. The preprocessing steps include resizing the images, converting them to grayscale, and extracting features. You have already implemented some Python functions for the preprocessing tasks. Which components should you use in your pipeline?. DataprocSparkBatchOp and CustomTrainingJobOp. DataflowPythonJobOp, WaitGcpResourcesOp, and CustomTrainingJobOp. dsl.ParallelFor, dsl.component, and CustomTrainingJobOp. ImageDatasetImportDataOp, dsl.component, and AutoMLImageTrainingJobRunOp. You work for a retail company that is using a regression model built with BigQuery ML to predict product sales. This model is being used to serve online predictions. Recently you developed a new version of the model that uses a different architecture (custom model). Initial analysis revealed that both models are performing as expected. You want to deploy the new version of the model to production and monitor the performance over the next two months. You need to minimize the impact to the existing and future model users. How should you deploy the model?. Import the new model to the same Vertex AI Model Registry as a different version of the existing model. Deploy the new model to the same Vertex AI endpoint as the existing model, and use traffic splitting to route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model. Import the new model to the same Vertex AI Model Registry as the existing model. Deploy the models to one Vertex AI endpoint. Route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model. Import the new model to the same Vertex AI Model Registry as the existing model. Deploy each model to a separate Vertex AI endpoint. 
Deploy the new model to a separate Vertex AI endpoint. Create a Cloud Run service that routes the prediction requests to the corresponding endpoints based on the input feature values. You are using Vertex AI and TensorFlow to develop a custom image classification model. You need the model’s decisions and the rationale to be understandable to your company’s stakeholders. You also want to explore the results to identify any issues or potential biases. What should you do?. 1. Use TensorFlow to generate and visualize features and statistics. 2. Analyze the results together with the standard model evaluation metrics. 1. Use TensorFlow Profiler to visualize the model execution. 2. Analyze the relationship between incorrect predictions and execution bottlenecks. 1. Use Vertex Explainable AI to generate example-based explanations. 2. Visualize the results of sample inputs from the entire dataset together with the standard model evaluation metrics. 1. Use Vertex Explainable AI to generate feature attributions. Aggregate feature attributions over the entire dataset. 2. Analyze the aggregation result together with the standard model evaluation metrics. You work for a large retailer, and you need to build a model to predict customer churn. The company has a dataset of historical customer data, including customer demographics, purchase history, and website activity. You need to create the model in BigQuery ML and thoroughly evaluate its performance. What should you do?. Create a linear regression model in BigQuery ML, and register the model in Vertex AI Model Registry. Evaluate the model performance in Vertex AI. Create a logistic regression model in BigQuery ML and register the model in Vertex AI Model Registry. Evaluate the model performance in Vertex AI. Create a linear regression model in BigQuery ML. Use the ML.EVALUATE function to evaluate the model performance. Create a logistic regression model in BigQuery ML.
Use the ML.CONFUSION_MATRIX function to evaluate the model performance. You are developing a model to identify traffic signs in images extracted from videos taken from the dashboard of a vehicle. You have a dataset of 100,000 images that were cropped to show one out of ten different traffic signs. The images have been labeled accordingly for model training, and are stored in a Cloud Storage bucket. You need to be able to tune the model during each training run. How should you train the model?. Train a model for object detection by using Vertex AI AutoML. Train a model for image classification by using Vertex AI AutoML. Develop the model training code for object detection, and train a model by using Vertex AI custom training. Develop the model training code for image classification, and train a model by using Vertex AI custom training. You have deployed a scikit-learn model to a Vertex AI endpoint using a custom model server. You enabled autoscaling; however, the deployed model fails to scale beyond one replica, which led to dropped requests. You notice that CPU utilization remains low even during periods of high load. What should you do?. Attach a GPU to the prediction nodes. Increase the number of workers in your model server. Schedule scaling of the nodes to match expected demand. Increase the minReplicaCount in your DeployedModel configuration. You work for a pet food company that manages an online forum. Customers upload photos of their pets on the forum to share with others. About 20 photos are uploaded daily. You want to automatically and in near real time detect whether each uploaded photo has an animal. You want to prioritize time and minimize cost of your application development and deployment. What should you do?. Send user-submitted images to the Cloud Vision API. Use object localization to identify all objects in the image and compare the results against a list of animals. Download an object detection model from TensorFlow Hub.
Deploy the model to a Vertex AI endpoint. Send new user-submitted images to the model endpoint to classify whether each photo has an animal. Manually label previously submitted images with bounding boxes around any animals. Build an AutoML object detection model by using Vertex AI. Deploy the model to a Vertex AI endpoint. Send new user-submitted images to your model endpoint to detect whether each photo has an animal. Manually label previously submitted images as having animals or not. Create an image dataset on Vertex AI. Train a classification model by using Vertex AutoML to distinguish the two classes. Deploy the model to a Vertex AI endpoint. Send new user-submitted images to your model endpoint to classify whether each photo has an animal. You work at a mobile gaming startup that creates online multiplayer games. Recently, your company observed an increase in players cheating in the games, leading to a loss of revenue and a poor user experience. You built a binary classification model to determine whether a player cheated after a completed game session, and then send a message to other downstream systems to ban the player that cheated. Your model has performed well during testing, and you now need to deploy the model to production. You want your serving solution to provide immediate classifications after a completed game session to avoid further loss of revenue. What should you do?. Import the model into Vertex AI Model Registry. Use the Vertex Batch Prediction service to run batch inference jobs. Save the model files in a Cloud Storage bucket. Create a Cloud Function to read the model files and make online inference requests on the Cloud Function. Save the model files in a VM. Load the model files each time there is a prediction request, and run an inference job on the VM. Import the model into Vertex AI Model Registry. Create a Vertex AI endpoint that hosts the model, and make online inference requests.
You have created a Vertex AI pipeline that automates custom model training. You want to add a pipeline component that enables your team to most easily collaborate when running different executions and comparing metrics both visually and programmatically. What should you do?. Add a component to the Vertex AI pipeline that logs metrics to a BigQuery table. Query the table to compare different executions of the pipeline. Connect BigQuery to Looker Studio to visualize metrics. Add a component to the Vertex AI pipeline that logs metrics to a BigQuery table. Load the table into a pandas DataFrame to compare different executions of the pipeline. Use Matplotlib to visualize metrics. Add a component to the Vertex AI pipeline that logs metrics to Vertex ML Metadata. Use Vertex AI Experiments to compare different executions of the pipeline. Use Vertex AI TensorBoard to visualize metrics. Add a component to the Vertex AI pipeline that logs metrics to Vertex ML Metadata. Load the Vertex ML Metadata into a pandas DataFrame to compare different executions of the pipeline. Use Matplotlib to visualize metrics. Your team is training a large number of ML models that use different algorithms, parameters, and datasets. Some models are trained in Vertex AI Pipelines, and some are trained on Vertex AI Workbench notebook instances. Your team wants to compare the performance of the models across both services. You want to minimize the effort required to store the parameters and metrics. What should you do?. Implement an additional step for all the models running in pipelines and notebooks to export parameters and metrics to BigQuery. Create a Vertex AI experiment. Submit all the pipelines as experiment runs. For models trained on notebooks, log parameters and metrics by using the Vertex AI SDK. Implement all models in Vertex AI Pipelines. Create a Vertex AI experiment, and associate all pipeline runs with that experiment.
Store all model parameters and metrics as model metadata by using the Vertex AI Metadata API. You work on a team that builds state-of-the-art deep learning models by using the TensorFlow framework. Your team runs multiple ML experiments each week, which makes it difficult to track the experiment runs. You want a simple approach to effectively track, visualize, and debug ML experiment runs on Google Cloud while minimizing any overhead code. How should you proceed?. Set up Vertex AI Experiments to track metrics and parameters. Configure Vertex AI TensorBoard for visualization. Set up a Cloud Function to write and save metrics files to a Cloud Storage bucket. Configure a Google Cloud VM to host TensorBoard locally for visualization. Set up a Vertex AI Workbench notebook instance. Use the instance to save metrics data in a Cloud Storage bucket and to host TensorBoard locally for visualization. Set up a Cloud Function to write and save metrics files to a BigQuery table. Configure a Google Cloud VM to host TensorBoard locally for visualization. You work for a textile manufacturing company. Your company has hundreds of machines, and each machine has many sensors. Your team used the sensory data to build hundreds of ML models that detect machine anomalies. Models are retrained daily, and you need to deploy these models in a cost-effective way. The models must operate 24/7 without downtime and make sub-millisecond predictions. What should you do?. Deploy a Dataflow batch pipeline and a Vertex AI Prediction endpoint. Deploy a Dataflow batch pipeline with the RunInference API, and use model refresh. Deploy a Dataflow streaming pipeline and a Vertex AI Prediction endpoint with autoscaling. Deploy a Dataflow streaming pipeline with the RunInference API, and use automatic model refresh. You are developing an ML model that predicts the cost of used automobiles based on data such as location, condition, model type, color, and engine/battery efficiency.
The data is updated every night. Car dealerships will use the model to determine appropriate car prices. You created a Vertex AI pipeline that reads the data, splits the data into training/evaluation/test sets, performs feature engineering, trains the model by using the training dataset, and validates the model by using the evaluation dataset. You need to configure a retraining workflow that minimizes cost. What should you do?. Compare the training and evaluation losses of the current run. If the losses are similar, deploy the model to a Vertex AI endpoint. Configure a cron job to redeploy the pipeline every night. Compare the training and evaluation losses of the current run. If the losses are similar, deploy the model to a Vertex AI endpoint with training/serving skew threshold model monitoring. When the model monitoring threshold is triggered, redeploy the pipeline. Compare the results to the evaluation results from a previous run. If the performance improved, deploy the model to a Vertex AI endpoint. Configure a cron job to redeploy the pipeline every night. Compare the results to the evaluation results from a previous run. If the performance improved, deploy the model to a Vertex AI endpoint with training/serving skew threshold model monitoring. When the model monitoring threshold is triggered, redeploy the pipeline. You recently used BigQuery ML to train an AutoML regression model. You shared results with your team and received positive feedback. You need to deploy your model for online prediction as quickly as possible. What should you do?. Retrain the model by using BigQuery ML, and specify Vertex AI as the model registry. Deploy the model from Vertex AI Model Registry to a Vertex AI endpoint. Retrain the model by using Vertex AI. Deploy the model from Vertex AI Model Registry to a Vertex AI endpoint. Alter the model by using BigQuery ML, and specify Vertex AI as the model registry. Deploy the model from Vertex AI Model Registry to a Vertex AI endpoint.
Export the model from BigQuery ML to Cloud Storage. Import the model into Vertex AI Model Registry. Deploy the model to a Vertex AI endpoint. You built a deep learning-based image classification model by using on-premises data. You want to use Vertex AI to deploy the model to production. Due to security concerns, you cannot move your data to the cloud. You are aware that the input data distribution might change over time. You need to detect model performance changes in production. What should you do?. Use Vertex Explainable AI for model explainability. Configure feature-based explanations. Use Vertex Explainable AI for model explainability. Configure example-based explanations. Create a Vertex AI Model Monitoring job. Enable training-serving skew detection for your model. Create a Vertex AI Model Monitoring job. Enable feature attribution skew and drift detection for your model. You trained a model, packaged it with a custom Docker container for serving, and deployed it to Vertex AI Model Registry. When you submit a batch prediction job, it fails with this error: "Error model server never became ready. Please validate that your model file or container configuration are valid.
" There are no additional errors in the logs. What should you do?. Add a logging configuration to your application to emit logs to Cloud Logging. Change the HTTP port in your model’s configuration to the default value of 8080. Change the healthRoute value in your model’s configuration to /healthcheck. Pull the Docker image locally, and use the docker run command to launch it locally. Use the docker logs command to explore the error logs. You are developing an ML model to identify your company’s products in images. You have access to over one million images in a Cloud Storage bucket. You plan to experiment with different TensorFlow models by using Vertex AI Training. You need to read images at scale during training while minimizing data I/O bottlenecks. What should you do?. Load the images directly into the Vertex AI compute nodes by using Cloud Storage FUSE. Read the images by using the tf.data.Dataset.from_tensor_slices function. Create a Vertex AI managed dataset from your image data. Access the AIP_TRAINING_DATA_URI environment variable to read the images by using the tf.data.Dataset.list_files function. Convert the images to TFRecords and store them in a Cloud Storage bucket. Read the TFRecords by using the tf.data.TFRecordDataset function. Store the URLs of the images in a CSV file. Read the file by using the tf.data.experimental.CsvDataset function. You work at an ecommerce startup. You need to create a customer churn prediction model. Your company’s recent sales records are stored in a BigQuery table. You want to understand how your initial model is making predictions. You also want to iterate on the model as quickly as possible while minimizing cost. How should you build your first model?. Export the data to a Cloud Storage bucket. Load the data into a pandas DataFrame on Vertex AI Workbench and train a logistic regression model with scikit-learn. Create a tf.data.Dataset by using the TensorFlow BigQueryClient. Implement a deep neural network in TensorFlow. 
Prepare the data in BigQuery and associate the data with a Vertex AI dataset. Create an AutoMLTabularTrainingJob to train a classification model. Export the data to a Cloud Storage bucket. Create a tf.data.Dataset to read the data from Cloud Storage. Implement a deep neural network in TensorFlow. You are developing a training pipeline for a new XGBoost classification model based on tabular data. The data is stored in a BigQuery table. You need to complete the following steps: 1. Randomly split the data into training and evaluation datasets in a 65/35 ratio. 2. Conduct feature engineering. 3. Obtain metrics for the evaluation dataset. 4. Compare models trained in different pipeline executions. How should you execute these steps?. 1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering. 2. Enable autologging of metrics in the training component. 3. Compare pipeline runs in Vertex AI Experiments. 1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering. 2. Enable autologging of metrics in the training component. 3. Compare models using the artifacts’ lineage in Vertex ML Metadata. 1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type and use BigQuery to handle the data splits. 2. Use a SQL view to apply feature engineering and train the model using the data in that view. 3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement. 1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type and use BigQuery to handle the data splits. 2. Use ML TRANSFORM to specify the feature engineering transformations and train the model using the data in the table. 3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement.
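As an illustration of the 65/35 train/evaluation split described in the question above, the following is a minimal sketch that assigns each row to a split by hashing a row key. This is a swapped-in, deterministic stand-in for a random split, not the method any of the options prescribes; the function names (split_bucket, split_rows) and the "row-N" keys are hypothetical.

```python
import hashlib


def split_bucket(row_key: str, train_fraction: float = 0.65) -> str:
    """Return 'train' or 'eval' for a row, based on a hash of its key."""
    digest = hashlib.sha256(row_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to [0, 1].
    value = int.from_bytes(digest[:8], "big") / 2**64
    return "train" if value < train_fraction else "eval"


def split_rows(row_keys, train_fraction: float = 0.65):
    """Partition row keys into (train, evaluation) lists."""
    train, evaluation = [], []
    for key in row_keys:
        if split_bucket(key, train_fraction) == "train":
            train.append(key)
        else:
            evaluation.append(key)
    return train, evaluation


if __name__ == "__main__":
    # Hypothetical row identifiers, used only to exercise the sketch.
    keys = [f"row-{i}" for i in range(10_000)]
    train, evaluation = split_rows(keys)
    print(len(train), len(evaluation))
```

Hashing a stable key keeps each row in the same split across pipeline executions, which makes metrics from different runs easier to compare; the realized ratio is only approximately 65/35.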
You work for a company that sells corporate electronic products to thousands of businesses worldwide. Your company stores historical customer data in BigQuery. You need to build a model that predicts customer lifetime value over the next three years. You want to use the simplest approach to build the model and you want to have access to visualization tools. What should you do?. Create a Vertex AI Workbench notebook to perform exploratory data analysis. Use IPython magics to create a new BigQuery table with input features. Use the BigQuery console to run the CREATE MODEL statement. Validate the results by using the ML.EVALUATE and ML.PREDICT statements. Run the CREATE MODEL statement from the BigQuery console to create an AutoML model. Validate the results by using the ML.EVALUATE and ML.PREDICT statements. Create a Vertex AI Workbench notebook to perform exploratory data analysis and create input features. Save the features as a CSV file in Cloud Storage. Import the CSV file as a new BigQuery table. Use the BigQuery console to run the CREATE MODEL statement. Validate the results by using the ML.EVALUATE and ML.PREDICT statements. Create a Vertex AI Workbench notebook to perform exploratory data analysis. Use IPython magics to create a new BigQuery table with input features, create the model, and validate the results by using the CREATE MODEL, ML.EVALUATE, and ML.PREDICT statements. You work for a delivery company. You need to design a system that stores and manages features such as parcels delivered and truck locations over time. The system must retrieve the features with low latency and feed those features into a model for online prediction. The data science team will retrieve historical data at a specific point in time for model training. You want to store the features with minimal effort. What should you do?. Store features in Bigtable as key/value data. Store features in Vertex AI Feature Store.
Store features as a Vertex AI dataset, and use those features to train the models hosted in Vertex AI endpoints. Store features in BigQuery timestamp partitioned tables, and use the BigQuery Storage Read API to serve the features. You are working on a prototype of a text classification model in a managed Vertex AI Workbench notebook. You want to quickly experiment with tokenizing text by using a Natural Language Toolkit (NLTK) library. How should you add the library to your Jupyter kernel?. Install the NLTK library from a terminal by using the pip install nltk command. Write a custom Dataflow job that uses NLTK to tokenize your text and saves the output to Cloud Storage. Create a new Vertex AI Workbench notebook with a custom image that includes the NLTK library. Install the NLTK library from a Jupyter cell by using the !pip install nltk --user command. You have recently used TensorFlow to train a classification model on tabular data. You have created a Dataflow pipeline that can transform several terabytes of data into training or prediction datasets consisting of TFRecords. You now need to productionize the model, and you want the predictions to be automatically uploaded to a BigQuery table on a weekly schedule. What should you do?. Import the model into Vertex AI and deploy it to a Vertex AI endpoint. On Vertex AI Pipelines, create a pipeline that uses the DataflowPythonJobOp and the ModelBatchPredictOp components. Import the model into Vertex AI and deploy it to a Vertex AI endpoint. Create a Dataflow pipeline that reuses the data processing logic, sends requests to the endpoint, and then uploads predictions to a BigQuery table. Import the model into Vertex AI. On Vertex AI Pipelines, create a pipeline that uses the DataflowPythonJobOp and the ModelBatchPredictOp components. Import the model into BigQuery. Implement the data processing logic in a SQL query.
On Vertex AI Pipelines, create a pipeline that uses the BigqueryQueryJobOp and the BigqueryPredictModelJobOp components. You work for an online grocery store. You recently developed a custom ML model that recommends a recipe when a user arrives at the website. You chose the machine type on the Vertex AI endpoint to optimize costs by using the queries per second (QPS) that the model can serve, and you deployed it on a single machine with 8 vCPUs and no accelerators. A holiday season is approaching and you anticipate four times more traffic during this time than the typical daily traffic. You need to ensure that the model can scale efficiently to the increased demand. What should you do?. 1. Maintain the same machine type on the endpoint. 2. Set up a monitoring job and an alert for CPU usage. 3. If you receive an alert, add a compute node to the endpoint. 1. Change the machine type on the endpoint to have 32 vCPUs. 2. Set up a monitoring job and an alert for CPU usage. 3. If you receive an alert, scale the vCPUs further as needed. 1. Maintain the same machine type on the endpoint. Configure the endpoint to enable autoscaling based on vCPU usage. 2. Set up a monitoring job and an alert for CPU usage. 3. If you receive an alert, investigate the cause. 1. Change the machine type on the endpoint to have a GPU. Configure the endpoint to enable autoscaling based on the GPU usage. 2. Set up a monitoring job and an alert for GPU usage. 3. If you receive an alert, investigate the cause. You recently trained an XGBoost model on tabular data. You plan to expose the model for internal use as an HTTP microservice. After deployment, you expect a small number of incoming requests. You want to productionize the model with the least amount of effort and latency. What should you do?. Deploy the model to BigQuery ML by using CREATE MODEL with the BOOSTED_TREE_REGRESSOR statement, and invoke the BigQuery API from the microservice. Build a Flask-based app.
Package the app in a custom container on Vertex AI, and deploy it to Vertex AI Endpoints. Build a Flask-based app. Package the app in a Docker image, and deploy it to Google Kubernetes Engine in Autopilot mode. Use a prebuilt XGBoost Vertex container to create a model, and deploy it to Vertex AI Endpoints. You work for an international manufacturing organization that ships scientific products all over the world. Instruction manuals for these products need to be translated to 15 different languages. Your organization’s leadership team wants to start using machine learning to reduce the cost of manual human translations and increase translation speed. You need to implement a scalable solution that maximizes accuracy and minimizes operational overhead. You also want to include a process to evaluate and fix incorrect translations. What should you do?. Create a workflow using Cloud Function triggers. Configure a Cloud Function that is triggered when documents are uploaded to an input Cloud Storage bucket. Configure another Cloud Function that translates the documents using the Cloud Translation API, and saves the translations to an output Cloud Storage bucket. Use human reviewers to evaluate the incorrect translations. Create a Vertex AI pipeline that processes the documents, launches an AutoML Translation training job, evaluates the translations, and deploys the model to a Vertex AI endpoint with autoscaling and model monitoring. When there is a predetermined skew between training and live data, re-trigger the pipeline with the latest data. Use AutoML Translation to train a model. Configure a Translation Hub project, and use the trained model to translate the documents. Use human reviewers to evaluate the incorrect translations. Use Vertex AI custom training jobs to fine-tune a state-of-the-art open source pretrained model with your data. Deploy the model to a Vertex AI endpoint with autoscaling and model monitoring.
When there is a predetermined skew between the training and live data, configure a trigger to run another training job with the latest data. You have developed an application that uses a chain of multiple scikit-learn models to predict the optimal price for your company’s products. The workflow logic is shown in the diagram. Members of your team use the individual models in other solution workflows. You want to deploy this workflow while ensuring version control for each individual model and the overall workflow. Your application needs to be able to scale down to zero. You want to minimize the compute resource utilization and the manual effort required to manage this solution. What should you do?. Expose each individual model as an endpoint in Vertex AI Endpoints. Create a custom container endpoint to orchestrate the workflow. Create a custom container endpoint for the workflow that loads each model’s individual files. Track the versions of each individual model in BigQuery. Expose each individual model as an endpoint in Vertex AI Endpoints. Use Cloud Run to orchestrate the workflow. Load each model’s individual files into Cloud Run. Use Cloud Run to orchestrate the workflow. Track the versions of each individual model in BigQuery. You are developing a model to predict whether a failure will occur in a critical machine part. You have a dataset consisting of a multivariate time series and labels indicating whether the machine part failed. You recently started experimenting with a few different preprocessing and modeling approaches in a Vertex AI Workbench notebook. You want to log data and track artifacts from each run. How should you set up your experiments?. 1. Use the Vertex AI SDK to create an experiment and set up Vertex ML Metadata. 2. Use the log_time_series_metrics function to track the preprocessed data, and use the log_metrics function to log loss values. 1. Use the Vertex AI SDK to create an experiment and set up Vertex ML Metadata. 2.
Use the log_time_series_metrics function to track the preprocessed data, and use the log_metrics function to log loss values. 1. Create a Vertex AI TensorBoard instance and use the Vertex AI SDK to create an experiment and associate the TensorBoard instance. 2. Use the assign_input_artifact method to track the preprocessed data and use the log_time_series_metrics function to log loss values. 1. Create a Vertex AI TensorBoard instance, and use the Vertex AI SDK to create an experiment and associate the TensorBoard instance. 2. Use the log_time_series_metrics function to track the preprocessed data, and use the log_metrics function to log loss values. You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing, and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?. Create a Vertex AI Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables. Create a Vertex AI Workbench managed notebook to browse and query the tables directly from the JupyterLab interface. Create a Vertex AI Workbench user-managed notebook on a Dataproc Hub, and use the %%bigquery magic commands in Jupyter to query the tables. Create a Vertex AI Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables. You recently deployed a model to a Vertex AI endpoint and set up online serving in Vertex AI Feature Store. You have configured a daily batch ingestion job to update your featurestore.
During the batch ingestion jobs, you discover that CPU utilization is high in your featurestore’s online serving nodes and that feature retrieval latency is high. You need to improve online serving performance during the daily batch ingestion. What should you do?. Schedule an increase in the number of online serving nodes in your featurestore prior to the batch ingestion jobs. Enable autoscaling of the online serving nodes in your featurestore. Enable autoscaling for the prediction nodes of your DeployedModel in the Vertex AI endpoint. Increase the worker_count in the ImportFeatureValues request of your batch ingestion job. You are developing a custom TensorFlow classification model based on tabular data. Your raw data is stored in BigQuery, contains hundreds of millions of rows, and includes both categorical and numerical features. You need to use a MaxMin scaler on some numerical features, and apply a one-hot encoding to some categorical features such as SKU names. Your model will be trained over multiple epochs. You want to minimize the effort and cost of your solution. What should you do?. 1. Write a SQL query to create a separate lookup table to scale the numerical features. 2. Deploy a TensorFlow-based model from Hugging Face to BigQuery to encode the text features. 3. Feed the resulting BigQuery view into Vertex AI Training. 1. Use BigQuery to scale the numerical features. 2. Feed the features into Vertex AI Training. 3. Allow TensorFlow to perform the one-hot text encoding. 1. Use TFX components with Dataflow to encode the text features and scale the numerical features. 2. Export results to Cloud Storage as TFRecords. 3. Feed the data into Vertex AI Training. 1. Write a SQL query to create a separate lookup table to scale the numerical features. 2. Perform the one-hot text encoding in BigQuery. 3. Feed the resulting BigQuery view into Vertex AI Training. You work for a retail company.
You have been tasked with building a model to determine the probability of churn for each customer. You need the predictions to be interpretable so the results can be used to develop marketing campaigns that target at-risk customers. What should you do?. Build a random forest regression model in a Vertex AI Workbench notebook instance. Configure the model to generate feature importances after the model is trained. Build an AutoML tabular regression model. Configure the model to generate explanations when it makes predictions. Build a custom TensorFlow neural network by using Vertex AI custom training. Configure the model to generate explanations when it makes predictions. Build a random forest classification model in a Vertex AI Workbench notebook instance. Configure the model to generate feature importances after the model is trained. You work for a company that is developing an application to help users with meal planning. You want to use machine learning to scan a corpus of recipes and extract each ingredient (e.g., carrot, rice, pasta) and each kitchen cookware (e.g., bowl, pot, spoon) mentioned. Each recipe is saved in an unstructured text file. What should you do?. Create a text dataset on Vertex AI for entity extraction. Create two entities called “ingredient” and “cookware”, and label at least 200 examples of each entity. Train an AutoML entity extraction model to extract occurrences of these entity types. Evaluate performance on a holdout dataset. Create a multi-label text classification dataset on Vertex AI. Create a test dataset, and label each recipe that corresponds to its ingredients and cookware. Train a multi-class classification model. Evaluate the model’s performance on a holdout dataset. Use the Entity Analysis method of the Natural Language API to extract the ingredients and cookware from each recipe. Evaluate the model's performance on a prelabeled dataset. Create a text dataset on Vertex AI for entity extraction. 
Create as many entities as there are different ingredients and cookware. Train an AutoML entity extraction model to extract those entities. Evaluate the model’s performance on a holdout dataset. You work for an organization that operates a streaming music service. You have a custom production model that is serving a “next song” recommendation based on a user's recent listening history. Your model is deployed on a Vertex AI endpoint. You recently retrained the same model by using fresh data. The model received positive test results offline. You now want to test the new model in production while minimizing complexity. What should you do?. Create a new Vertex AI endpoint for the new model and deploy the new model to that new endpoint. Build a service to randomly send 5% of production traffic to the new endpoint. Monitor end-user metrics such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new endpoint. Capture incoming prediction requests in BigQuery. Create an experiment in Vertex AI Experiments. Run batch predictions for both models using the captured data. Use the user’s selected song to compare the models’ performance side by side. If the new model’s performance metrics are better than the previous model, deploy the new model to production. Deploy the new model to the existing Vertex AI endpoint. Use traffic splitting to send 5% of production traffic to the new model. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new model. Configure a model monitoring job for the existing Vertex AI endpoint. Configure the monitoring job to detect prediction drift and set a threshold for alerts. Update the model on the endpoint from the previous model to the new model. If you receive an alert of prediction drift, revert to the previous model. 
You created a model that uses BigQuery ML to perform linear regression. You need to retrain the model on the cumulative data collected every week. You want to minimize the development effort and the scheduling cost. What should you do?. Use BigQuery’s scheduling service to run the model retraining query periodically. Create a pipeline in Vertex AI Pipelines that executes the retraining query, and use the Cloud Scheduler API to run the query weekly. Use Cloud Scheduler to trigger a Cloud Function every week that runs the query for retraining the model. Use the BigQuery API Connector and Cloud Scheduler to trigger Workflows every week that retrains the model. You want to migrate a scikit-learn classifier model to TensorFlow. You plan to train the TensorFlow classifier model using the same training set that was used to train the scikit-learn model, and then compare the performances using a common test set. You want to use the Vertex AI Python SDK to manually log the evaluation metrics of each model and compare them based on their F1 scores and confusion matrices. How should you log the metrics?. Use the aiplatform.log_classification_metrics function to log the F1 score, and use the aiplatform.log_metrics function to log the confusion matrix. Use the aiplatform.log_classification_metrics function to log the F1 score and the confusion matrix. Use the aiplatform.log_metrics function to log the F1 score and the confusion matrix. Use the aiplatform.log_metrics function to log the F1 score, and use the aiplatform.log_classification_metrics function to log the confusion matrix. You are developing a model to help your company create more targeted online advertising campaigns. You need to create a dataset that you will use to train the model. You want to avoid creating or reinforcing unfair bias in the model. What should you do? (Choose two.). Include a comprehensive set of demographic features. 
Include only the demographic groups that most frequently interact with advertisements. Collect a random sample of production traffic to build the training dataset. Collect a stratified sample of production traffic to build the training dataset. Conduct fairness tests across sensitive categories and demographics on the trained model. You are developing an ML model in a Vertex AI Workbench notebook. You want to track artifacts and compare models during experimentation using different approaches. You need to rapidly and easily transition successful experiments to production as you iterate on your model implementation. What should you do?. 1. Initialize the Vertex SDK with the name of your experiment. Log parameters and metrics for each experiment, and attach dataset and model artifacts as inputs and outputs to each execution. 2. After a successful experiment create a Vertex AI pipeline. 1. Initialize the Vertex SDK with the name of your experiment. Log parameters and metrics for each experiment, save your dataset to a Cloud Storage bucket, and upload the models to Vertex AI Model Registry. 2. After a successful experiment, create a Vertex AI pipeline. 1. Create a Vertex AI pipeline with parameters you want to track as arguments to your PipelineJob. Use the Metrics, Model, and Dataset artifact types from the Kubeflow Pipelines DSL as the inputs and outputs of the components in your pipeline. 2. Associate the pipeline with your experiment when you submit the job. 1. Create a Vertex AI pipeline. Use the Dataset and Model artifact types from the Kubeflow Pipelines DSL as the inputs and outputs of the components in your pipeline. 2. In your training component, use the Vertex AI SDK to create an experiment run. Configure the log_params and log_metrics functions to track parameters and metrics of your experiment. You recently created a new Google Cloud project. 
After testing that you can submit a Vertex AI Pipeline job from the Cloud Shell, you want to use a Vertex AI Workbench user-managed notebook instance to run your code from that instance. You created the instance and ran the code but this time the job fails with an insufficient permissions error. What should you do?. Ensure that the Workbench instance that you created is in the same region of the Vertex AI Pipelines resources you will use. Ensure that the Vertex AI Workbench instance is on the same subnetwork of the Vertex AI Pipeline resources that you will use. Ensure that the Vertex AI Workbench instance is assigned the Identity and Access Management (IAM) Vertex AI User role. Ensure that the Vertex AI Workbench instance is assigned the Identity and Access Management (IAM) Notebooks Runner role. You work for a semiconductor manufacturing company. You need to create a real-time application that automates the quality control process. High-definition images of each semiconductor are taken at the end of the assembly line in real time. The photos are uploaded to a Cloud Storage bucket along with tabular data that includes each semiconductor’s batch number, serial number, dimensions, and weight. You need to configure model training and serving while maximizing model accuracy. What should you do?. Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model. Deploy the model, and configure Pub/Sub to publish a message when an image is categorized into the failing class. Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model. Schedule a daily batch prediction job that publishes a Pub/Sub message when the job completes. Convert the images into an embedding representation. Import this data into BigQuery, and train a BigQuery ML K-means clustering model with two clusters. 
Deploy the model and configure Pub/Sub to publish a message when a semiconductor’s data is categorized into the failing cluster. Import the tabular data into BigQuery, use Vertex AI Data Labeling Service to label the data and train an AutoML tabular classification model. Deploy the model, and configure Pub/Sub to publish a message when a semiconductor’s data is categorized into the failing class. You work for a rapidly growing social media company. Your team builds TensorFlow recommender models in an on-premises CPU cluster. The data contains billions of historical user events and 100,000 categorical features. You notice that as the data increases, the model training time increases. You plan to move the models to Google Cloud. You want to use the most scalable approach that also minimizes training time. What should you do?. Deploy the training jobs by using TPU VMs with TPUv3 Pod slices, and use the TPUEmbedding API. Deploy the training jobs in an autoscaling Google Kubernetes Engine cluster with CPUs. Deploy a matrix factorization model training job by using BigQuery ML. Deploy the training jobs by using Compute Engine instances with A100 GPUs, and use the tf.nn.embedding_lookup API. You are training and deploying updated versions of a regression model with tabular data by using Vertex AI Pipelines, Vertex AI Training, Vertex AI Experiments, and Vertex AI Endpoints. The model is deployed in a Vertex AI endpoint, and your users call the model by using the Vertex AI endpoint. You want to receive an email when the feature data distribution changes significantly, so you can retrigger the training pipeline and deploy an updated version of your model. What should you do?. Use Vertex AI Model Monitoring. Enable prediction drift monitoring on the endpoint, and specify a notification email. In Cloud Logging, create a logs-based alert using the logs in the Vertex AI endpoint. Configure Cloud Logging to send an email when the alert is triggered. 
In Cloud Monitoring create a logs-based metric and a threshold alert for the metric. Configure Cloud Monitoring to send an email when the alert is triggered. Export the container logs of the endpoint to BigQuery. Create a Cloud Function to run a SQL query over the exported logs and send an email. Use Cloud Scheduler to trigger the Cloud Function. You have trained an XGBoost model that you plan to deploy on Vertex AI for online prediction. You are now uploading your model to Vertex AI Model Registry, and you need to configure the explanation method that will serve online prediction requests to be returned with minimal latency. You also want to be alerted when feature attributions of the model meaningfully change over time. What should you do?. 1. Specify sampled Shapley as the explanation method with a path count of 5. 2. Deploy the model to Vertex AI Endpoints. 3. Create a Model Monitoring job that uses prediction drift as the monitoring objective. 1. Specify Integrated Gradients as the explanation method with a path count of 5. 2. Deploy the model to Vertex AI Endpoints. 3. Create a Model Monitoring job that uses prediction drift as the monitoring objective. 1. Specify sampled Shapley as the explanation method with a path count of 50. 2. Deploy the model to Vertex AI Endpoints. 3. Create a Model Monitoring job that uses training-serving skew as the monitoring objective. 1. Specify Integrated Gradients as the explanation method with a path count of 50. 2. Deploy the model to Vertex AI Endpoints. 3. Create a Model Monitoring job that uses training-serving skew as the monitoring objective. You work at a gaming startup that has several terabytes of structured data in Cloud Storage. This data includes gameplay time data, user metadata, and game metadata. You want to build a model that recommends new games to users that requires the least amount of coding. What should you do?. Load the data in BigQuery. Use BigQuery ML to train an Autoencoder model. 
Load the data in BigQuery. Use BigQuery ML to train a matrix factorization model. Read data to a Vertex AI Workbench notebook. Use TensorFlow to train a two-tower model. Read data to a Vertex AI Workbench notebook. Use TensorFlow to train a matrix factorization model. You work for a large bank that serves customers through an application hosted in Google Cloud that is running in the US and Singapore. You have developed a PyTorch model to classify transactions as potentially fraudulent or not. The model is a three-layer perceptron that uses both numerical and categorical features as input, and hashing happens within the model. You deployed the model to the us-central1 region on n1-highcpu-16 machines, and predictions are served in real time. The model's current median response latency is 40 ms. You want to reduce latency, especially in Singapore, where some customers are experiencing the longest delays. What should you do?. Attach an NVIDIA T4 GPU to the machines being used for online inference. Change the machines being used for online inference to n1-highcpu-32. Deploy the model to Vertex AI private endpoints in the us-central1 and asia-southeast1 regions, and allow the application to choose the appropriate endpoint. Create another Vertex AI endpoint in the asia-southeast1 region, and allow the application to choose the appropriate endpoint. You need to train an XGBoost model on a small dataset. Your training code requires custom dependencies. You want to minimize the startup time of your training job. How should you set up your Vertex AI custom training job?. Store the data in a Cloud Storage bucket, and create a custom container with your training application. In your training application, read the data from Cloud Storage and train the model. Use the XGBoost prebuilt custom container. Create a Python source distribution that includes the data and installs the dependencies at runtime. 
In your training application, load the data into a pandas DataFrame and train the model. Create a custom container that includes the data. In your training application, load the data into a pandas DataFrame and train the model. Store the data in a Cloud Storage bucket, and use the XGBoost prebuilt custom container to run your training application. Create a Python source distribution that installs the dependencies at runtime. In your training application, read the data from Cloud Storage and train the model. You are creating an ML pipeline for data processing, model training, and model deployment that uses different Google Cloud services. You have developed code for each individual task, and you expect a high frequency of new files. You now need to create an orchestration layer on top of these tasks. You only want this orchestration pipeline to run if new files are present in your dataset in a Cloud Storage bucket. You also want to minimize the compute node costs. What should you do?. Create a pipeline in Vertex AI Pipelines. Configure the first step to compare the contents of the bucket to the last time the pipeline was run. Use the scheduler API to run the pipeline periodically. Create a Cloud Function that uses a Cloud Storage trigger and deploys a Cloud Composer directed acyclic graph (DAG). Create a pipeline in Vertex AI Pipelines. Create a Cloud Function that uses a Cloud Storage trigger and deploys the pipeline. Deploy a Cloud Composer directed acyclic graph (DAG) with a GCSObjectUpdateSensor class that detects when a new file is added to the Cloud Storage bucket. You are using Kubeflow Pipelines to develop an end-to-end PyTorch-based MLOps pipeline. The pipeline reads data from BigQuery, processes the data, conducts feature engineering, model training, model evaluation, and deploys the model as a binary file to Cloud Storage. 
You are writing code for several different versions of the feature engineering and model training steps, and running each new version in Vertex AI Pipelines. Each pipeline run is taking over an hour to complete. You want to speed up the pipeline execution to reduce your development time, and you want to avoid additional costs. What should you do?. Comment out the part of the pipeline that you are not currently updating. Enable caching in all the steps of the Kubeflow pipeline. Delegate feature engineering to BigQuery and remove it from the pipeline. Add a GPU to the model training step. You work at a large organization that recently decided to move their ML and data workloads to Google Cloud. The data engineering team has exported the structured data to a Cloud Storage bucket in Avro format. You need to propose a workflow that performs analytics, creates features, and hosts the features that your ML models use for online prediction. How should you configure the pipeline?. Ingest the Avro files into Cloud Spanner to perform analytics. Use a Dataflow pipeline to create the features, and store them in Vertex AI Feature Store for online prediction. Ingest the Avro files into BigQuery to perform analytics. Use a Dataflow pipeline to create the features, and store them in Vertex AI Feature Store for online prediction. Ingest the Avro files into Cloud Spanner to perform analytics. Use a Dataflow pipeline to create the features, and store them in BigQuery for online prediction. Ingest the Avro files into BigQuery to perform analytics. Use BigQuery SQL to create features and store them in a separate BigQuery table for online prediction. You work at an organization that maintains a cloud-based communication platform that integrates conventional chat, voice, and video conferencing into one platform. The audio recordings are stored in Cloud Storage. All recordings have an 8 kHz sample rate and are more than one minute long. 
You need to implement a new feature in the platform that will automatically transcribe voice call recordings into a text for future applications, such as call summarization and sentiment analysis. How should you implement the voice call transcription feature following Google-recommended best practices?. Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with synchronous recognition. Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition. Upsample the audio recordings to 16 kHz, and transcribe the audio by using the Speech-to-Text API with synchronous recognition. Upsample the audio recordings to 16 kHz, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition. You work for a multinational organization that has recently begun operations in Spain. Teams within your organization will need to work with various Spanish documents, such as business, legal, and financial documents. You want to use machine learning to help your organization get accurate translations quickly and with the least effort. Your organization does not require domain-specific terms or jargon. What should you do?. Create a Vertex AI Workbench notebook instance. In the notebook, extract sentences from the documents, and train a custom AutoML text model. Use Google Translate to translate 1,000 phrases from Spanish to English. Using these translated pairs, train a custom AutoML Translation model. Use the Document Translation feature of the Cloud Translation API to translate the documents. Create a Vertex AI Workbench notebook instance. In the notebook, convert the Spanish documents into plain text, and create a custom TensorFlow seq2seq translation model. You have a custom job that runs on Vertex AI on a weekly basis. 
The job is implemented using a proprietary ML workflow that produces the datasets, models, and custom artifacts, and sends them to a Cloud Storage bucket. Many different versions of the datasets and models were created. Due to compliance requirements, your company needs to track which model was used for making a particular prediction, and needs access to the artifacts for each model. How should you configure your workflows to meet these requirements?. Use the Vertex AI Metadata API inside the custom job to create context, execution, and artifacts for each model, and use events to link them together. Create a Vertex AI experiment, and enable autologging inside the custom job. Configure a TensorFlow Extended (TFX) ML Metadata database, and use the ML Metadata API. Register each model in Vertex AI Model Registry, and use model labels to store the related dataset and model information. You have recently developed a custom model for image classification by using a neural network. You need to automatically identify the values for learning rate, number of layers, and kernel size. To do this, you plan to run multiple jobs in parallel to identify the parameters that optimize performance. You want to minimize custom code development and infrastructure management. What should you do?. Train an AutoML image classification model. Create a custom training job that uses the Vertex AI Vizier SDK for parameter optimization. Create a Vertex AI hyperparameter tuning job. Create a Vertex AI pipeline that runs different model training jobs in parallel. You work for a small company that has deployed an ML model with autoscaling on Vertex AI to serve online predictions in a production environment. The current model receives about 20 prediction requests per hour with an average response time of one second. You have retrained the same model on a new batch of data, and now you are canary testing it, sending ~10% of production traffic to the new model. 
During this canary test, you notice that prediction requests for your new model are taking between 30 and 180 seconds to complete. What should you do?. Submit a request to raise your project quota to ensure that multiple prediction services can run concurrently. Turn off auto-scaling for the online prediction service of your new model. Use manual scaling with one node always available. Remove your new model from the production environment. Compare the new model and existing model codes to identify the cause of the performance bottleneck. Remove your new model from the production environment. For a short trial period, send all incoming prediction requests to BigQuery. Request batch predictions from your new model, and then use the Data Labeling Service to validate your model’s performance before promoting it to production. You are an ML engineer at a retail company. You have built a model that predicts a coupon to offer an ecommerce customer at checkout based on the items in their cart. When a customer goes to checkout, your serving pipeline, which is hosted on Google Cloud, joins the customer's existing cart with a row in a BigQuery table that contains the customers' historic purchase behavior and uses that as the model's input. The web team is reporting that your model is returning predictions too slowly to load the coupon offer with the rest of the web page. How should you speed up your model's predictions?. Attach an NVIDIA P100 GPU to your deployed model’s instance. Use a low latency database for the customers’ historic purchase behavior. Deploy your model to more instances behind a load balancer to distribute traffic. Create a materialized view in BigQuery with the necessary data for predictions. You built a custom ML model using scikit-learn. Training time is taking longer than expected. You decide to migrate your model to Vertex AI Training, and you want to improve the model’s training time. What should you try out first?. 
Train your model in a distributed mode using multiple Compute Engine VMs. Train your model using Vertex AI Training with CPUs. Migrate your model to TensorFlow, and train it using Vertex AI Training. Train your model using Vertex AI Training with GPUs. You are an ML engineer at a manufacturing company. You are creating a classification model for a predictive maintenance use case. You need to predict whether a crucial machine will fail in the next three days so that the repair crew has enough time to fix the machine before it breaks. Regular maintenance of the machine is relatively inexpensive, but a failure would be very costly. You have trained several binary classifiers to predict whether the machine will fail, where a prediction of 1 means that the ML model predicts a failure. You are now evaluating each model on an evaluation dataset. You want to choose a model that prioritizes detection while ensuring that more than 50% of the maintenance jobs triggered by your model address an imminent machine failure. Which model should you choose?. The model with the highest area under the receiver operating characteristic curve (AUC ROC) and precision greater than 0.5. The model with the lowest root mean squared error (RMSE) and recall greater than 0.5. The model with the highest recall where precision is greater than 0.5. The model with the highest precision where recall is greater than 0.5. You work for a magazine distributor and need to build a model that predicts which customers will renew their subscriptions for the upcoming year. Using your company’s historical data as your training set, you created a TensorFlow model and deployed it to Vertex AI. You need to determine which customer attribute has the most predictive power for each prediction served by the model. What should you do?. Stream prediction results to BigQuery. Use BigQuery’s CORR(X1, X2) function to calculate the Pearson correlation coefficient between each feature and the target variable. 
Use Vertex Explainable AI. Submit each prediction request with the 'explain' keyword to retrieve feature attributions using the sampled Shapley method. Use Vertex AI Workbench user-managed notebooks to perform a Lasso regression analysis on your model, which will eliminate features that do not provide a strong signal. Use the What-If tool in Google Cloud to determine how your model will perform when individual features are excluded. Rank the feature importance in order of those that caused the most significant performance drop when removed from the model. Your organization manages an online message board. A few months ago, you discovered an increase in toxic language and bullying on the message board. You deployed an automated text classifier that flags certain comments as toxic or harmful. Now some users are reporting that benign comments referencing their religion are being misclassified as abusive. Upon further inspection, you find that your classifier's false positive rate is higher for comments that reference certain underrepresented religious groups. Your team has a limited budget and is already overextended. What should you do?. Add synthetic training data where those phrases are used in non-toxic ways. Remove the model and replace it with human moderation. Replace your model with a different text classifier. Raise the threshold for comments to be considered toxic or harmful. You are a data scientist at an industrial equipment manufacturing company. You are developing a regression model to estimate the power consumption in the company’s manufacturing plants based on sensor data collected from all of the plants. The sensors collect tens of millions of records every day. You need to schedule daily training runs for your model that use all the data collected up to the current date. You want your model to scale smoothly and require minimal development work. What should you do?. 
Develop a custom TensorFlow regression model, and optimize it using Vertex AI Training. Develop a regression model using BigQuery ML. Develop a custom scikit-learn regression model, and optimize it using Vertex AI Training. Develop a custom PyTorch regression model, and optimize it using Vertex AI Training. You recently developed a deep learning model. To test your new model, you trained it for a few epochs on a large dataset. You observe that the training and validation losses barely changed during the training run. You want to quickly debug your model. What should you do first?. Verify that your model can obtain a low loss on a small subset of the dataset. Add handcrafted features to inject your domain knowledge into the model. Use the Vertex AI hyperparameter tuning service to identify a better learning rate. Use hardware accelerators and train your model for more epochs. You need to build classification workflows over several structured datasets currently stored in BigQuery. Because you will be performing the classification several times, you want to complete the following steps without writing code: exploratory data analysis, feature selection, model building, training, and hyperparameter tuning and serving. What should you do?. Train a TensorFlow model on Vertex AI. Train a classification Vertex AutoML model. Run a logistic regression job on BigQuery ML. Use scikit-learn in Vertex AI Workbench user-managed notebooks with pandas library. You are training an object detection machine learning model on a dataset that consists of three million X-ray images, each roughly 2 GB in size. You are using Vertex AI Training to run a custom training application on a Compute Engine instance with 32-cores, 128 GB of RAM, and 1 NVIDIA P100 GPU. You notice that model training is taking a very long time. You want to decrease training time without sacrificing model performance. What should you do?. Increase the instance memory to 512 GB, and increase the batch size. 
Replace the NVIDIA P100 GPU with a K80 GPU in the training job. Enable early stopping in your Vertex AI Training job. Use the tf.distribute.Strategy API and run a distributed training job. You are developing a classification model to support predictions for your company’s various products. The dataset you were given for model development has class imbalance. You need to minimize false positives and false negatives. What evaluation metric should you use to properly train the model?. F1 score. Recall. Accuracy. Precision. You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the performance of your classifier?. Modify the target variable using the Box-Cox transformation. Z-normalize all the numeric features. Oversample the fraudulent transaction 10 times. Log transform all numeric features. You are developing an ML model using a dataset with categorical input variables. You have randomly split half of the data into training and test sets. After applying one-hot encoding on the categorical variables in the training set, you discover that one categorical variable is missing from the test set. What should you do?. Use sparse representation in the test set. Randomly redistribute the data, with 70% for the training set and 30% for the test set. Apply one-hot encoding on the categorical variables in the test data. Collect more data representing all categories. While running a model training pipeline on Vertex AI, you discover that the evaluation step is failing because of an out-of-memory error. You are currently using TensorFlow Model Analysis (TFMA) with a standard Evaluator TensorFlow Extended (TFX) pipeline component for the evaluation step. You want to stabilize the pipeline without downgrading the evaluation quality while minimizing infrastructure overhead. What should you do?. 
Include the flag --runner=DataflowRunner in beam_pipeline_args to run the evaluation step on Dataflow. Move the evaluation step out of your pipeline and run it on custom Compute Engine VMs with sufficient memory. Migrate your pipeline to Kubeflow hosted on Google Kubernetes Engine, and specify the appropriate node parameters for the evaluation step. Add tfma.MetricsSpec() to limit the number of metrics in the evaluation step. You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metric would give you the most confidence in your model?. Precision. Recall. RMSE. F1 score. |
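
Several questions above turn on how accuracy, precision, recall, and F1 behave on skewed data. The following is a minimal sketch, assuming a hypothetical evaluation set of 100 examples with the same 96/4 split as the logo question. The labels, the two sets of predictions, and the helper names `confusion_counts` and `classification_metrics` are illustrative assumptions, not part of the quiz or of any library.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1, treating 0/0 as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical skewed evaluation set: 96 negatives followed by 4 positives.
y_true = [0] * 96 + [1] * 4

# Trivial predictor that never predicts the positive class.
always_negative = [0] * 100

# Hypothetical model: 2 false alarms, then finds 3 of the 4 positives.
candidate = [1] * 2 + [0] * 94 + [1] * 3 + [0] * 1

baseline_metrics = classification_metrics(y_true, always_negative)
candidate_metrics = classification_metrics(y_true, candidate)

print("always negative:", baseline_metrics)
print("candidate model:", candidate_metrics)
```

In this made-up data the always-negative predictor still reaches 96% accuracy while its precision, recall, and F1 are all zero. That gap is why accuracy alone is usually a weak signal when one class dominates.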