9) ML Engineer
You work for a bank. You have been asked to develop an ML model that will support loan application decisions. You need to determine which Vertex AI services to include in the workflow. You want to track the model's training parameters and the metrics per training epoch. You plan to compare the performance of each version of the model to determine the best model based on your chosen metrics. Which Vertex AI services should you use?
A. Vertex ML Metadata, Vertex AI Feature Store, and Vertex AI Vizier.
B. Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Vizier.
C. Vertex ML Metadata, Vertex AI Experiments, and Vertex AI TensorBoard.
D. Vertex AI Pipelines, Vertex AI Feature Store, and Vertex AI TensorBoard.

You work for an auto insurance company. You are preparing a proof-of-concept ML application that uses images of damaged vehicles to infer damaged parts. Your team has assembled a set of annotated images from damage claim documents in the company's database. The annotations associated with each image consist of a bounding box for each identified damaged part and the part name. You have been given a sufficient budget to train models on Google Cloud. You need to quickly create an initial model. What should you do?
A. Download a pre-trained object detection model from TensorFlow Hub. Fine-tune the model in Vertex AI Workbench by using the annotated image data.
B. Train an object detection model in AutoML by using the annotated image data.
C. Create a pipeline in Vertex AI Pipelines and configure the AutoMLTrainingJobRunOp component to train a custom object detection model by using the annotated image data.
D. Train an object detection model in Vertex AI custom training by using the annotated image data.

You are analyzing customer data for a healthcare organization that is stored in Cloud Storage. The data contains personally identifiable information (PII). You need to perform data exploration and preprocessing while ensuring the security and privacy of sensitive fields. What should you do?
A. Use the Cloud Data Loss Prevention (DLP) API to de-identify the PII before performing data exploration and preprocessing.
B. Use customer-managed encryption keys (CMEK) to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.
C. Use a VM inside a VPC Service Controls security perimeter to perform data exploration and preprocessing.
D. Use Google-managed encryption keys to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.

You are building a predictive maintenance model to preemptively detect part defects in bridges. You plan to use high-definition images of the bridges as model inputs. You need to explain the output of the model to the relevant stakeholders so they can take appropriate action. How should you build the model?
A. Use scikit-learn to build a tree-based model, and use SHAP values to explain the model output.
B. Use scikit-learn to build a tree-based model, and use partial dependence plots (PDP) to explain the model output.
C. Use TensorFlow to create a deep learning-based model, and use Integrated Gradients to explain the model output.
D. Use TensorFlow to create a deep learning-based model, and use the sampled Shapley method to explain the model output.
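Integrated Gradients, referenced in the bridge-inspection question above, attributes an image model's prediction to individual pixels by accumulating gradients along a straight-line path from a baseline image to the real input. A minimal TensorFlow sketch of the idea, assuming a hypothetical Keras classifier `model` that returns class logits (Vertex Explainable AI provides a managed version of this method for deployed models, so this is purely illustrative):

```python
import tensorflow as tf


def integrated_gradients(model, baseline, image, target_class, steps=50):
    """Approximate Integrated Gradients attributions for a single (H, W, C) image.

    Interpolates between the baseline (e.g. an all-black image) and the input,
    accumulates gradients of the target-class logit along that path, and
    scales the averaged gradients by (image - baseline).
    """
    alphas = tf.linspace(0.0, 1.0, steps + 1)
    # Shape (steps + 1, H, W, C): one interpolated image per alpha.
    interpolated = baseline[None] + alphas[:, None, None, None] * (image - baseline)[None]

    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        logits = model(interpolated)
        target = logits[:, target_class]

    grads = tape.gradient(target, interpolated)
    # Trapezoidal approximation of the path integral.
    avg_grads = tf.reduce_mean((grads[:-1] + grads[1:]) / 2.0, axis=0)
    return (image - baseline) * avg_grads


# Example call with an all-black baseline; `model` and `image` are placeholders.
# attributions = integrated_gradients(model, tf.zeros_like(image), image, target_class=1)
```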
You work for a hospital that wants to optimize how it schedules operations. You need to create a model that uses the relationship between the number of surgeries scheduled and beds used. You want to predict how many beds will be needed for patients each day in advance based on the scheduled surgeries. You have one year of data for the hospital organized in 365 rows. The data includes the following variables for each day:
• Number of scheduled surgeries
• Number of beds occupied
• Date
You want to maximize the speed of model development and testing. What should you do?
A. Create a BigQuery table. Use BigQuery ML to build a regression model, with number of beds as the target variable, and number of scheduled surgeries and date features (such as day of week) as the predictors.
B. Create a BigQuery table. Use BigQuery ML to build an ARIMA model, with number of beds as the target variable, and date as the time variable.
C. Create a Vertex AI tabular dataset. Train an AutoML regression model, with number of beds as the target variable, and number of scheduled minor surgeries and date features (such as day of the week) as the predictors.
D. Create a Vertex AI tabular dataset. Train a Vertex AI AutoML Forecasting model, with number of beds as the target variable, number of scheduled surgeries as a covariate, and date as the time variable.

You recently developed a wide and deep model in TensorFlow. You generated training datasets using a SQL script that preprocessed raw data in BigQuery by performing instance-level transformations of the data. You need to create a training pipeline to retrain the model on a weekly basis. The trained model will be used to generate daily recommendations. You want to minimize model development and training time. How should you develop the training pipeline?
A. Use the Kubeflow Pipelines SDK to implement the pipeline. Use the BigQueryJobOp component to run the preprocessing script and the CustomTrainingJobOp component to launch a Vertex AI training job.
B. Use the Kubeflow Pipelines SDK to implement the pipeline. Use the DataflowPythonJobOp component to preprocess the data and the CustomTrainingJobOp component to launch a Vertex AI training job.
C. Use the TensorFlow Extended SDK to implement the pipeline. Use the ExampleGen component with the BigQuery executor to ingest the data, the Transform component to preprocess the data, and the Trainer component to launch a Vertex AI training job.
D. Use the TensorFlow Extended SDK to implement the pipeline. Implement the preprocessing steps as part of the input_fn of the model. Use the ExampleGen component with the BigQuery executor to ingest the data and the Trainer component to launch a Vertex AI training job.
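Option A of the previous question corresponds to a short Kubeflow Pipelines definition built from Google Cloud pipeline components. A sketch follows; the component import paths, parameter names, and the SQL, table, and image names are assumptions that depend on the installed google-cloud-pipeline-components version:

```python
from kfp import compiler, dsl
from google_cloud_pipeline_components.v1.bigquery import BigqueryQueryJobOp
from google_cloud_pipeline_components.v1.custom_job import CustomTrainingJobOp

PROJECT = "my-project"   # hypothetical project ID
REGION = "us-central1"


@dsl.pipeline(name="weekly-wide-and-deep-retraining")
def retraining_pipeline():
    # Re-run the existing SQL preprocessing script to refresh the training table.
    prep = BigqueryQueryJobOp(
        project=PROJECT,
        location="US",
        query=(
            "CREATE OR REPLACE TABLE `my_dataset.training_data` AS "
            "SELECT * FROM `my_dataset.raw_events`"  # stand-in for the real script
        ),
    )

    # Launch a Vertex AI custom training job that retrains the wide and deep model.
    train = CustomTrainingJobOp(
        project=PROJECT,
        location=REGION,
        display_name="wide-and-deep-weekly-train",
        worker_pool_specs=[{
            "machine_spec": {"machine_type": "n1-standard-8"},
            "replica_count": 1,
            "container_spec": {"image_uri": "us-docker.pkg.dev/my-project/repo/trainer:latest"},
        }],
    )
    train.after(prep)


# Compile to a pipeline spec that Vertex AI Pipelines can run on a weekly schedule.
compiler.Compiler().compile(retraining_pipeline, "weekly_retraining_pipeline.json")
```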
You are training a custom language model for your company using a large dataset. You plan to use the Reduction Server strategy on Vertex AI. You need to configure the worker pools of the distributed training job. What should you do?
A. Configure the machines of the first two worker pools to have GPUs, and to use a container image where your training code runs. Configure the third worker pool to have GPUs, and use the reductionserver container image.
B. Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth.
C. Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth.
D. Configure the machines of the first two worker pools to have TPUs, and to use a container image where your training code runs. Configure the third worker pool to have TPUs, and use the reductionserver container image.

You have trained a model by using data that was preprocessed in a batch Dataflow pipeline. Your use case requires real-time inference. You want to ensure that the data preprocessing logic is applied consistently between training and serving. What should you do?
A. Perform data validation to ensure that the input data to the pipeline is the same format as the input data to the endpoint.
B. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Use the same code in the endpoint.
C. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Share this code with the end users of the endpoint.
D. Batch the real-time requests by using a time window and then use the Dataflow pipeline to preprocess the batched requests. Send the preprocessed requests to the endpoint.

You need to develop a custom TensorFlow model that will be used for online predictions. The training data is stored in BigQuery. You need to apply instance-level data transformations to the data for model training and serving. You want to use the same preprocessing routine during model training and serving. How should you configure the preprocessing routine?
A. Create a BigQuery script to preprocess the data, and write the result to another BigQuery table.
B. Create a pipeline in Vertex AI Pipelines to read the data from BigQuery and preprocess it using a custom preprocessing component.
C. Create a preprocessing function that reads and transforms the data from BigQuery. Create a Vertex AI custom prediction routine that calls the preprocessing function at serving time.
D. Create an Apache Beam pipeline to read the data from BigQuery and preprocess it by using TensorFlow Transform and Dataflow.

You are pre-training a large language model on Google Cloud. This model includes custom TensorFlow operations in the training loop. Model training will use a large batch size, and you expect training to take several weeks. You need to configure a training architecture that minimizes both training time and compute costs. What should you do?
A. Implement 8 workers of a2-megagpu-16g machines by using tf.distribute.MultiWorkerMirroredStrategy.
B. Implement a TPU Pod slice with accelerator-type v4-128 by using tf.distribute.TPUStrategy.
C. Implement 16 workers of c2d-highcpu-32 machines by using tf.distribute.MirroredStrategy.
D. Implement 16 workers of a2-highgpu-8g machines by using tf.distribute.MultiWorkerMirroredStrategy.
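For reference on the strategies named above, a tf.distribute.TPUStrategy setup looks roughly like the sketch below (assuming the code runs on a provisioned TPU VM or Pod slice; custom TensorFlow ops must have XLA/TPU-compatible implementations for this path to work, while the GPU-based options would instead use tf.distribute.MultiWorkerMirroredStrategy with the cluster described by the TF_CONFIG environment variable that Vertex AI custom training sets for you):

```python
import tensorflow as tf

# Resolve and initialize the TPU system the job was provisioned with.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    # Variables created inside the scope are replicated across the TPU cores.
    # The toy architecture below only stands in for the real language model.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(50_000, 256),
        tf.keras.layers.LSTM(256),
        tf.keras.layers.Dense(50_000),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Scale the per-replica batch size by the replica count to get the large
# global batch size the question describes.
global_batch_size = 128 * strategy.num_replicas_in_sync
```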
You are building a TensorFlow text-to-image generative model by using a dataset that contains billions of images with their respective captions. You want to create a low-maintenance, automated workflow that reads the data from a Cloud Storage bucket, collects statistics, splits the dataset into training/validation/test datasets, performs data transformations, trains the model using the training/validation datasets, and validates the model by using the test dataset. What should you do?
A. Use the Apache Airflow SDK to create multiple operators that use Dataflow and Vertex AI services. Deploy the workflow on Cloud Composer.
B. Use the MLFlow SDK and deploy it on a Google Kubernetes Engine cluster. Create multiple components that use Dataflow and Vertex AI services.
C. Use the Kubeflow Pipelines (KFP) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.
D. Use the TensorFlow Extended (TFX) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.

You are developing an ML pipeline using Vertex AI Pipelines. You want your pipeline to upload a new version of the XGBoost model to Vertex AI Model Registry and deploy it to Vertex AI Endpoints for online inference. You want to use the simplest approach. What should you do?
A. Use the Vertex AI REST API within a custom component based on a vertex-ai/prediction/xgboost-cpu image.
B. Use the Vertex AI ModelEvaluationOp component to evaluate the model.
C. Use the Vertex AI SDK for Python within a custom component based on a python:3.10 image.
D. Chain the Vertex AI ModelUploadOp and ModelDeployOp components together.

You work for an online retailer. Your company has a few thousand short-lifecycle products. Your company has five years of sales data stored in BigQuery. You have been asked to build a model that will make monthly sales predictions for each product. You want to use a solution that can be implemented quickly with minimal effort. What should you do?
A. Use Prophet on Vertex AI Training to build a custom model.
B. Use Vertex AI Forecast to build a NN-based model.
C. Use BigQuery ML to build a statistical ARIMA_PLUS model.
D. Use TensorFlow on Vertex AI Training to build a custom model.
(A minimal BigQuery ML ARIMA_PLUS sketch appears after this block of questions.)

You are creating a model training pipeline to predict sentiment scores from text-based product reviews. You want to have control over how the model parameters are tuned, and you will deploy the model to an endpoint after it has been trained. You will use Vertex AI Pipelines to run the pipeline. You need to decide which Google Cloud pipeline components to use. What components should you choose?
A. TabularDatasetCreateOp, CustomTrainingJobOp, and EndpointCreateOp.
B. TextDatasetCreateOp, AutoMLTextTrainingOp, and EndpointCreateOp.
C. TabularDatasetCreateOp, AutoMLTextTrainingOp, and ModelDeployOp.
D. TextDatasetCreateOp, CustomTrainingJobOp, and ModelDeployOp.

Your team frequently creates new ML models and runs experiments. Your team pushes code to a single repository hosted on Cloud Source Repositories. You want to create a continuous integration pipeline that automatically retrains the models whenever there is any modification of the code. What should be your first step to set up the CI pipeline?
A. Configure a Cloud Build trigger with the event set as "Pull Request".
B. Configure a Cloud Build trigger with the event set as "Push to a branch".
C. Configure a Cloud Function that builds the repository each time there is a code change.
D. Configure a Cloud Function that builds the repository each time a new branch is created.
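As noted above, here is a minimal BigQuery ML ARIMA_PLUS sketch for the short-lifecycle-products forecasting scenario, run through the BigQuery Python client; the project, dataset, table, and column names are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# Train one ARIMA_PLUS time series per product directly over the sales history.
train_sql = """
CREATE OR REPLACE MODEL `my_dataset.monthly_sales_arima`
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'month',
  time_series_data_col = 'units_sold',
  time_series_id_col = 'product_id'
) AS
SELECT month, units_sold, product_id
FROM `my_dataset.sales_history`
"""
client.query(train_sql).result()

# Forecast the next three months for every product in one call.
forecast_sql = """
SELECT *
FROM ML.FORECAST(MODEL `my_dataset.monthly_sales_arima`,
                 STRUCT(3 AS horizon, 0.9 AS confidence_level))
"""
for row in client.query(forecast_sql).result():
    print(dict(row))
```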
You have built a custom model that performs several memory-intensive preprocessing tasks before it makes a prediction. You deployed the model to a Vertex AI endpoint, and validated that results were received in a reasonable amount of time. After routing user traffic to the endpoint, you discover that the endpoint does not autoscale as expected when receiving multiple requests. What should you do?
A. Use a machine type with more memory.
B. Decrease the number of workers per machine.
C. Increase the CPU utilization target in the autoscaling configurations.
D. Decrease the CPU utilization target in the autoscaling configurations.

Your company manages an ecommerce website. You developed an ML model that recommends additional products to users in near real time based on items currently in the user's cart. The workflow will include the following processes:
1. The website will send a Pub/Sub message with the relevant data and then receive a message with the prediction from Pub/Sub.
2. Predictions will be stored in BigQuery.
3. The model will be stored in a Cloud Storage bucket and will be updated frequently.
You want to minimize prediction latency and the effort required to update the model. How should you reconfigure the architecture?
A. Write a Cloud Function that loads the model into memory for prediction. Configure the function to be triggered when messages are sent to Pub/Sub.
B. Create a pipeline in Vertex AI Pipelines that performs preprocessing, prediction, and postprocessing. Configure the pipeline to be triggered by a Cloud Function when messages are sent to Pub/Sub.
C. Expose the model as a Vertex AI endpoint. Write a custom DoFn in a Dataflow job that calls the endpoint for prediction.
D. Use the RunInference API with WatchFilePattern in a Dataflow job that wraps around the model and serves predictions.

You are collaborating on a model prototype with your team. You need to create a Vertex AI Workbench environment for the members of your team and also limit access to other employees in your project. What should you do?
A. 1. Create a new service account and grant it the Notebook Viewer role. 2. Grant the Service Account User role to each team member on the service account. 3. Grant the Vertex AI User role to each team member. 4. Provision a Vertex AI Workbench user-managed notebook instance that uses the new service account.
B. 1. Grant the Vertex AI User role to the default Compute Engine service account. 2. Grant the Service Account User role to each team member on the default Compute Engine service account. 3. Provision a Vertex AI Workbench user-managed notebook instance that uses the default Compute Engine service account.
C. 1. Create a new service account and grant it the Vertex AI User role. 2. Grant the Service Account User role to each team member on the service account. 3. Grant the Notebook Viewer role to each team member. 4. Provision a Vertex AI Workbench user-managed notebook instance that uses the new service account.
D. 1. Grant the Vertex AI User role to the primary team member. 2. Grant the Notebook Viewer role to the other team members. 3. Provision a Vertex AI Workbench user-managed notebook instance that uses the primary user's account.

You work at a leading healthcare firm developing state-of-the-art algorithms for various use cases. You have unstructured textual data with custom labels. You need to extract and classify various medical phrases with these labels. What should you do?
A. Use the Healthcare Natural Language API to extract medical entities.
B. Use a BERT-based model to fine-tune a medical entity extraction model.
C. Use AutoML Entity Extraction to train a medical entity extraction model.
D. Use TensorFlow to build a custom medical entity extraction model.
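The AutoML Entity Extraction option in the last question maps to a short Vertex AI SDK flow. A sketch under assumed project, bucket, and file names, and assuming the labeled phrases have been exported in the JSONL import format expected by the Vertex text-extraction schema (SDK class availability may vary by version):

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # hypothetical project

# Import the labeled text into a Vertex AI dataset using the entity-extraction schema.
dataset = aiplatform.TextDataset.create(
    display_name="medical-phrases",
    gcs_source="gs://my-bucket/annotations/phrases.jsonl",  # hypothetical path
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.extraction,
)

# Train an AutoML entity-extraction model on the custom labels.
job = aiplatform.AutoMLTextTrainingJob(
    display_name="medical-entity-extraction",
    prediction_type="extraction",
)
model = job.run(
    dataset=dataset,
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
)
```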
You developed a custom model by using Vertex AI to predict your application's user churn rate. You are using Vertex AI Model Monitoring for skew detection. The training data stored in BigQuery contains two sets of features: demographic and behavioral. You later discover that two separate models trained on each set perform better than the original model. You need to configure a new model monitoring pipeline that splits traffic among the two models. You want to use the same prediction-sampling-rate and monitoring-frequency for each model. You also want to minimize management effort. What should you do?
A. Keep the training dataset as is. Deploy the models to two separate endpoints, and submit two Vertex AI Model Monitoring jobs with appropriately selected feature-thresholds parameters.
B. Keep the training dataset as is. Deploy both models to the same endpoint and submit a Vertex AI Model Monitoring job with a monitoring-config-from-file parameter that accounts for the model IDs and feature selections.
C. Separate the training dataset into two tables based on demographic and behavioral features. Deploy the models to two separate endpoints, and submit two Vertex AI Model Monitoring jobs.
D. Separate the training dataset into two tables based on demographic and behavioral features. Deploy both models to the same endpoint, and submit a Vertex AI Model Monitoring job with a monitoring-config-from-file parameter that accounts for the model IDs and training datasets.

You work for a pharmaceutical company based in Canada. Your team developed a BigQuery ML model to predict the number of flu infections for the next month in Canada. Weather data is published weekly, and flu infection statistics are published monthly. You need to configure a model retraining policy that minimizes cost. What should you do?
A. Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model weekly.
B. Download the weather and flu data each month. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model monthly.
C. Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model every month.
D. Download the weather data each week, and download the flu data each month. Deploy the model to a Vertex AI endpoint with feature drift monitoring, and retrain the model if a monitoring alert is detected.

You are building an MLOps platform to automate your company's ML experiments and model retraining. You need to organize the artifacts for dozens of pipelines. How should you store the pipelines' artifacts?
A. Store parameters in Cloud SQL, and store the models' source code and binaries in GitHub.
B. Store parameters in Cloud SQL, store the models' source code in GitHub, and store the models' binaries in Cloud Storage.
C. Store parameters in Vertex ML Metadata, store the models' source code in GitHub, and store the models' binaries in Cloud Storage.
D. Store parameters in Vertex ML Metadata, and store the models' source code and binaries in GitHub.
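For the artifact-organization question above, Vertex ML Metadata is typically populated through the Vertex AI Experiments API rather than written to directly. A minimal logging sketch with hypothetical project, experiment, and run names:

```python
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",            # hypothetical project ID
    location="us-central1",
    experiment="retraining-pipelines",
)

# Each pipeline execution becomes an experiment run; the parameters and metrics
# logged here are stored in Vertex ML Metadata and can be compared across runs.
aiplatform.start_run("run-2024-01-xgb")
aiplatform.log_params({"learning_rate": 0.05, "max_depth": 6, "train_table": "my_dataset.v3"})
aiplatform.log_metrics({"val_auc": 0.91, "val_logloss": 0.23})
aiplatform.end_run()
```

The model binaries themselves would still live in Cloud Storage (for example under the pipeline's artifact root), with source code versioned in GitHub, which is the split the options describe.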
You work for a telecommunications company. You're building a model to predict which customers may fail to pay their next phone bill. The purpose of this model is to proactively offer at-risk customers assistance such as service discounts and bill deadline extensions. The data is stored in BigQuery, and the predictive features that are available for model training include:
- Customer_id
- Age
- Salary (measured in local currency)
- Sex
- Average bill value (measured in local currency)
- Number of phone calls in the last month (integer)
- Average duration of phone calls (measured in minutes)
You need to investigate and mitigate potential bias against disadvantaged groups, while preserving model accuracy. What should you do?
A. Determine whether there is a meaningful correlation between the sensitive features and the other features. Train a BigQuery ML boosted trees classification model and exclude the sensitive features and any meaningfully correlated features.
B. Train a BigQuery ML boosted trees classification model with all features. Use the ML.GLOBAL_EXPLAIN method to calculate the global attribution values for each feature of the model. If the feature importance value for any of the sensitive features exceeds a threshold, discard the model and train without this feature.
C. Train a BigQuery ML boosted trees classification model with all features. Use the ML.EXPLAIN_PREDICT method to calculate the attribution values for each feature for each customer in a test set. If for any individual customer the importance value for any feature exceeds a predefined threshold, discard the model and train the model again without this feature.
D. Define a fairness metric that is represented by accuracy across the sensitive features. Train a BigQuery ML boosted trees classification model with all features. Use the trained model to make predictions on a test set. Join the data back with the sensitive features, and calculate a fairness metric to investigate whether it meets your requirements.

You recently trained an XGBoost model that you plan to deploy to production for online inference. Before sending a predict request to your model's binary, you need to perform a simple data preprocessing step. This step exposes a REST API that accepts requests in your internal VPC Service Controls and returns predictions. You want to configure this preprocessing step while minimizing cost and effort. What should you do?
A. Store a pickled model in Cloud Storage. Build a Flask-based app, package the app in a custom container image, and deploy the model to Vertex AI Endpoints.
B. Build a Flask-based app, package the app and a pickled model in a custom container image, and deploy the model to Vertex AI Endpoints.
C. Build a custom predictor class based on the XGBoost Predictor from the Vertex AI SDK, package it and a pickled model in a custom container image based on a Vertex built-in image, and deploy the model to Vertex AI Endpoints.
D. Build a custom predictor class based on the XGBoost Predictor from the Vertex AI SDK, and package the handler in a custom container image based on a Vertex built-in container image. Store a pickled model in Cloud Storage, and deploy the model to Vertex AI Endpoints.
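The "custom predictor class" mentioned in the last two options refers to Vertex AI custom prediction routines; the sketch below subclasses the SDK's XGBoost predictor to add a preprocessing hook. The module paths, method names, and the image, directory, and scaling logic reflect one SDK version and an invented transformation, so treat them as assumptions:

```python
from google.cloud.aiplatform.prediction import LocalModel
from google.cloud.aiplatform.prediction.xgboost.predictor import XgboostPredictor


class PreprocessingPredictor(XgboostPredictor):
    """Applies a simple instance-level transformation before the XGBoost call."""

    def preprocess(self, prediction_input):
        # Hypothetical preprocessing: rescale the first feature of every instance.
        instances = prediction_input["instances"]
        prediction_input["instances"] = [
            [float(row[0]) / 100.0, *row[1:]] for row in instances
        ]
        return super().preprocess(prediction_input)


# Package the predictor into a serving container image; the pickled model is
# read from the Cloud Storage artifact directory when the model is deployed.
local_model = LocalModel.build_cpr_model(
    "src/",                                                        # directory containing this predictor
    "us-central1-docker.pkg.dev/my-project/repo/xgb-cpr:latest",   # hypothetical image URI
    predictor=PreprocessingPredictor,
    requirements_path="src/requirements.txt",
)
```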
You work at a bank. You need to develop a credit risk model to support loan application decisions. You decide to implement the model by using a neural network in TensorFlow. Due to regulatory requirements, you need to be able to explain the model's predictions based on its features. When the model is deployed, you also want to monitor the model's performance over time. You decided to use Vertex AI for both model development and deployment. What should you do?
A. Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution drift.
B. Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution skew.
C. Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution drift.
D. Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution skew.
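The sampled Shapley method named in options A and B estimates Shapley feature attributions by Monte Carlo sampling of feature orderings. A toy numpy sketch of the idea (not the Vertex Explainable AI implementation) using an invented linear model:

```python
import numpy as np


def sampled_shapley(predict_fn, instance, baseline, num_paths=50, seed=0):
    """Monte Carlo estimate of Shapley attributions for a single instance.

    For each sampled feature ordering, features are switched one at a time from
    the baseline value to the instance value, and the resulting change in the
    model score is credited to the switched feature. Averaging over orderings
    approximates the Shapley values.
    """
    rng = np.random.default_rng(seed)
    n = len(instance)
    attributions = np.zeros(n)
    for _ in range(num_paths):
        current = baseline.astype(float).copy()
        prev_score = predict_fn(current)
        for feature in rng.permutation(n):
            current[feature] = instance[feature]
            score = predict_fn(current)
            attributions[feature] += score - prev_score
            prev_score = score
    return attributions / num_paths


# Toy check with a linear model: attributions recover weight * (x - baseline).
weights = np.array([2.0, -1.0, 0.5])
predict = lambda x: float(weights @ x)
print(sampled_shapley(predict, np.array([1.0, 2.0, 3.0]), np.zeros(3)))
```

In Vertex Explainable AI, this method is selected through the explanation parameters when the model is uploaded, with a path_count setting controlling how many orderings are sampled.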