From 790202e6671d148b7381565480eb3d72379b68ad Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:41 +0000
Subject: [PATCH 01/11] docs: update installation.mdx for changes #1762453598998

---
 installation.mdx | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/installation.mdx b/installation.mdx
index 1d843eb..e8a2f49 100644
--- a/installation.mdx
+++ b/installation.mdx
@@ -3,11 +3,16 @@ title: Installation
 description: Configure Magemaker for your cloud provider
 ---
 
+## Prerequisites
+
+- **Python 3.11+** is required
+- **Python 3.13 is not supported** due to compatibility issues with Azure SDK
 
-  For Macs, maxOS >= 13.6.6 is required. Apply Silicon devices (M1) must use Rosetta terminal. You can verify, your terminals architecture by running `arch`. It should print `i386` for Rosetta terminal.
+  For Macs, macOS >= 13.6.6 is required. Apple Silicon devices (M1/M2/M3) must use a Rosetta terminal. You can verify your terminal's architecture by running `arch`. It should print `i386` for a Rosetta terminal.
 
+## Installation
 
 Install via pip:
@@ -47,7 +52,7 @@ magemaker --cloud gcp
 ### Azure Configuration
 
 - Follow this detailed guide for setting up Azure credentials:
-  [GCP Setup Guide](/configuration/Azure)
+  [Azure Setup Guide](/configuration/Azure)
 
 Once you have your Azure credentials, you can configure Magemaker by running:

From 26981cd6197814c1d1723cb641e77dea761e564e Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:41 +0000
Subject: [PATCH 02/11] docs: update mint.json for changes #1762453598998

---
 mint.json | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mint.json b/mint.json
index ccb1843..a3737f8 100644
--- a/mint.json
+++ b/mint.json
@@ -64,8 +64,15 @@
       "pages": [
         "concepts/deployment",
         "concepts/models",
+        "concepts/fine-tuning",
         "concepts/contributing"
       ]
+    },
+    {
+      "group": "Reference",
+      "pages": [
+        "reference/yaml-schema"
+      ]
     }
   ],
   "footerSocials": {

From 5c13e525760f3a53630b71c791e7c14ccb9b03e8 Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:42 +0000
Subject: [PATCH 03/11] docs: update quick-start.mdx for changes #1762453598998

---
 quick-start.mdx | 38 ++++++++++++++++++--------------------
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/quick-start.mdx b/quick-start.mdx
index 5853ef8..93c06e6 100644
--- a/quick-start.mdx
+++ b/quick-start.mdx
@@ -54,20 +54,15 @@ Example YAML for AWS deployment:
 
 ```yaml
 deployment: !Deployment
-  destination: aws 
+  destination: aws
   endpoint_name: facebook-opt-test
   instance_count: 1
   instance_type: ml.m5.xlarge
-  num_gpus: null
-  quantization: null
+
 models:
 - !Model
   id: facebook/opt-125m
-  location: null
-  predict: null
   source: huggingface
-  task: text-generation
-  version: null
 ```
 
 For GCP Vertex AI:
 
 ```yaml
 deployment: !Deployment
   destination: gcp
   endpoint_name: facebook-opt-test
-  accelerator_count: 1
+  instance_count: 1
   instance_type: g2-standard-12
   accelerator_type: NVIDIA_L4
-  num_gpus: null
-  quantization: null
+  accelerator_count: 1
+
 models:
 - !Model
   id: facebook/opt-125m
-  location: null
-  predict: null
   source: huggingface
-  task: null
-  version: null
 ```
 
 For Azure ML:
 
 ```yaml
 deployment: !Deployment
   destination: azure
   endpoint_name: facebook-opt-test
   instance_count: 1
   instance_type: Standard_DS3_v2
+
 models:
 - !Model
   id: facebook--opt-125m
-  location: null
-  predict: null
   source: huggingface
-  task: text-generation
-  version: null
 ```
 
 The model IDs for Azure are different from AWS and GCP. Make sure to use the one provided by Azure in the Azure Model Catalog.
@@ -131,6 +118,10 @@ models:
 ### Model Fine-tuning
 
+
+Fine-tuning is currently only available for AWS SageMaker with SageMaker JumpStart models.
+
+
 Fine-tune models using the `train` command:
 
 ```sh
@@ -141,14 +132,21 @@ Example training configuration:
 
 ```yaml
 training: !Training
-  destination: aws # or gcp, azure
-  instance_type: ml.p3.2xlarge # varies by cloud provider
+  destination: aws
+  instance_type: ml.p3.2xlarge
   instance_count: 1
   training_input_path: s3://your-bucket/data.csv
+  output_path: s3://your-bucket/output
   hyperparameters: !Hyperparameters
     epochs: 3
     per_device_train_batch_size: 32
     learning_rate: 2e-5
+
+models:
+- !Model
+  id: tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2
+  version: 1.0.0
+  source: sagemaker
 ```
 
 {/* ### Recommended Models

From 3e41bf430fd4817acc8a1a4b7d206069ed347847 Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:43 +0000
Subject: [PATCH 04/11] docs: update tutorials/deploying-llama-3-to-aws.mdx for changes #1762453598998

---
 tutorials/deploying-llama-3-to-aws.mdx | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/tutorials/deploying-llama-3-to-aws.mdx b/tutorials/deploying-llama-3-to-aws.mdx
index 46f0659..6411a5f 100644
--- a/tutorials/deploying-llama-3-to-aws.mdx
+++ b/tutorials/deploying-llama-3-to-aws.mdx
@@ -32,16 +32,11 @@ deployment: !Deployment
   instance_count: 1
   instance_type: ml.g5.2xlarge
   num_gpus: 1
-  quantization: null
 
 models:
 - !Model
   id: meta-llama/Meta-Llama-3-8B-Instruct
-  location: null
-  predict: null
   source: huggingface
-  task: text-generation
-  version: null
 ```

From 6f16e888ce4de2fe088dd235ec808ff9779a903d Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:44 +0000
Subject: [PATCH 05/11] docs: update tutorials/deploying-llama-3-to-azure.mdx for changes #1762453598998

---
 tutorials/deploying-llama-3-to-azure.mdx | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/tutorials/deploying-llama-3-to-azure.mdx b/tutorials/deploying-llama-3-to-azure.mdx
index 679ba23..1747b10 100644
--- a/tutorials/deploying-llama-3-to-azure.mdx
+++ b/tutorials/deploying-llama-3-to-azure.mdx
@@ -40,16 +40,12 @@ deployment: !Deployment
   destination: azure
   endpoint_name: llama3-endpoint
   instance_count: 1
-  instance_type: Standard_NC24ads_A100_v4 
+  instance_type: Standard_NC24ads_A100_v4
 
 models:
 - !Model
   id: meta-llama-meta-llama-3-8b-instruct
-  location: null
-  predict: null
   source: huggingface
-  task: text-generation
-  version: null
 ```

From 72fe32de6955205ec5abf13d120134e58d17e194 Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:44 +0000
Subject: [PATCH 06/11] docs: update tutorials/deploying-llama-3-to-gcp.mdx for changes #1762453598998

---
 tutorials/deploying-llama-3-to-gcp.mdx | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/tutorials/deploying-llama-3-to-gcp.mdx b/tutorials/deploying-llama-3-to-gcp.mdx
index a94d616..cbb6509 100644
--- a/tutorials/deploying-llama-3-to-gcp.mdx
+++ b/tutorials/deploying-llama-3-to-gcp.mdx
@@ -33,20 +33,15 @@ Example YAML for GCP deployment:
 deployment: !Deployment
   destination: gcp
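  # See reference/yaml-schema.mdx (added later in this series): instance_count
  # sets the number of instances, and accelerator_count sets the number of GPU
  # accelerators attached to each instance.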
  endpoint_name: llama3-endpoint
-  accelerator_count: 1
+  instance_count: 1
   instance_type: n1-standard-8
   accelerator_type: NVIDIA_T4
-  num_gpus: 1
-  quantization: null
+  accelerator_count: 1
 
 models:
 - !Model
   id: meta-llama/Meta-Llama-3-8B-Instruct
-  location: null
-  predict: null
   source: huggingface
-  task: text-generation
-  version: null
 ```
 
 For gated models like Llama from Meta, you must accept the model's terms of use on Hugging Face and add a Hugging Face token to your environment before the deployment can go through.

From 51933f00da65add00de6b8e2f82601c5261d3b46 Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:45 +0000
Subject: [PATCH 07/11] docs: update reference/yaml-schema.mdx for changes #1762453598998

---
 reference/yaml-schema.mdx | 297 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 297 insertions(+)
 create mode 100644 reference/yaml-schema.mdx

diff --git a/reference/yaml-schema.mdx b/reference/yaml-schema.mdx
new file mode 100644
index 0000000..1ff194a
--- /dev/null
+++ b/reference/yaml-schema.mdx
@@ -0,0 +1,297 @@
+---
+title: YAML Configuration Reference
+description: Complete reference for Magemaker YAML configuration files
+---
+
+## Overview
+
+Magemaker uses YAML configuration files for reproducible model deployments and training jobs. This page documents all available configuration options.
+
+## Deployment Configuration
+
+### Deployment Schema
+
+```yaml
+deployment: !Deployment
+  destination: aws|gcp|azure  # Required: Target cloud provider
+  instance_type: string       # Required: Machine/instance type
+  instance_count: integer     # Optional: Number of instances (default: 1)
+  endpoint_name: string       # Optional: Custom endpoint name
+  accelerator_type: string    # Optional: GPU type (GCP only)
+  accelerator_count: integer  # Optional: Number of GPUs (GCP only)
+  min_replica_count: integer  # Optional: Min replicas (GCP only)
+  max_replica_count: integer  # Optional: Max replicas (GCP only)
+  num_gpus: integer           # Optional: Number of GPUs (AWS only)
+  quantization: string        # Optional: Quantization method (AWS only)
+```
+
+### Field Descriptions
+
+#### destination
+- **Type**: String (enum)
+- **Required**: Yes
+- **Values**: `aws`, `gcp`, `azure`
+- **Description**: The cloud provider where the model will be deployed
+
+#### instance_type
+- **Type**: String
+- **Required**: Yes
+- **Description**: The machine/instance type to use for deployment
+- **Examples**:
+  - AWS: `ml.m5.xlarge`, `ml.g5.12xlarge`
+  - GCP: `n1-standard-4`, `g2-standard-12`
+  - Azure: `Standard_DS3_v2`, `Standard_NC24ads_A100_v4`
+
+#### instance_count
+- **Type**: Integer
+- **Required**: No (default: 1)
+- **Description**: Number of instances to deploy
+
+#### endpoint_name
+- **Type**: String
+- **Required**: No
+- **Description**: Custom name for the endpoint. If not provided, a unique name will be generated automatically
+
+#### accelerator_type
+- **Type**: String
+- **Required**: No (GCP only)
+- **Description**: Type of GPU accelerator to attach
+- **Examples**: `NVIDIA_L4`, `NVIDIA_TESLA_T4`, `NVIDIA_A100_80GB`
+
+#### accelerator_count
+- **Type**: Integer
+- **Required**: No (GCP only)
+- **Description**: Number of GPU accelerators to attach
+
+#### min_replica_count / max_replica_count
+- **Type**: Integer
+- **Required**: No (GCP only)
+- **Description**: Minimum and maximum number of replicas for autoscaling
+
+#### num_gpus
+- **Type**: Integer
+- **Required**: No (AWS only)
+- **Description**: Number of GPUs to use on the instance
+
+#### quantization
+- **Type**: String
+- **Required**: No (AWS only)
+- **Description**: Quantization method for model optimization
+- **Examples**: `bitsandbytes`, `gptq`
+
+## Model Configuration
+
+### Model Schema
+
+```yaml
+models:
+- !Model
+  id: string                            # Required: Model identifier
+  source: huggingface|sagemaker|custom  # Required: Model source
+  task: string                          # Optional: Model task type
+  version: string                       # Optional: Model version
+  location: string                      # Optional: S3 URI or local path (custom models only)
+  predict: object                       # Optional: Prediction parameters
+```
+
+### Field Descriptions
+
+#### id
+- **Type**: String
+- **Required**: Yes
+- **Description**: Model identifier
+- **Examples**:
+  - Hugging Face (AWS/GCP): `facebook/opt-125m`, `meta-llama/Meta-Llama-3-8B-Instruct`
+  - Hugging Face (Azure): `facebook--opt-125m`, `meta-llama-meta-llama-3-8b-instruct`
+  - SageMaker: `huggingface-tc-bert-large-cased`, `tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2`
+
+
+Azure uses different model IDs with double hyphens instead of slashes. Check the Azure Model Catalog for the correct ID.
+
+
+#### source
+- **Type**: String (enum)
+- **Required**: Yes
+- **Values**: `huggingface`, `sagemaker`, `custom`
+- **Description**: Where the model comes from
+  - `huggingface`: Models from Hugging Face Hub (all cloud providers)
+  - `sagemaker`: AWS SageMaker JumpStart models (AWS only)
+  - `custom`: Your own fine-tuned models (AWS only)
+
+#### task
+- **Type**: String
+- **Required**: No
+- **Description**: Model task type (auto-detected for Hugging Face models)
+- **Examples**: `text-generation`, `text-classification`, `feature-extraction`
+
+#### version
+- **Type**: String
+- **Required**: No
+- **Description**: Specific model version (primarily for SageMaker models)
+
+#### location
+- **Type**: String
+- **Required**: Only for custom models
+- **Description**: S3 URI or local file path to the model artifacts
+- **Examples**: `s3://my-bucket/models/my-model.tar.gz`, `/path/to/local/model`
+
+#### predict
+- **Type**: Object
+- **Required**: No
+- **Description**: Model inference parameters
+- **Example**:
+  ```yaml
+  predict:
+    temperature: 0.9
+    top_p: 0.9
+    top_k: 20
+    max_new_tokens: 250
+  ```
+
+## Training Configuration
+
+
+Training is currently only available for AWS SageMaker with SageMaker JumpStart models.
+
+
+### Training Schema
+
+```yaml
+training: !Training
+  destination: aws                   # Required: Must be 'aws'
+  instance_type: string              # Required: Training instance type
+  instance_count: integer            # Required: Number of training instances
+  training_input_path: string        # Required: S3 path to training data
+  output_path: string                # Optional: S3 path for model output
+  hyperparameters: !Hyperparameters  # Optional: Training hyperparameters
+    epochs: integer
+    per_device_train_batch_size: integer
+    learning_rate: float
+```
+
+### Field Descriptions
+
+#### destination
+- **Type**: String
+- **Required**: Yes
+- **Values**: `aws` (only)
+- **Description**: Must be AWS for training
+
+#### instance_type
+- **Type**: String
+- **Required**: Yes
+- **Description**: EC2 instance type for training
+- **Examples**: `ml.p3.2xlarge`, `ml.p3.8xlarge`, `ml.p3.16xlarge`
+
+#### instance_count
+- **Type**: Integer
+- **Required**: Yes
+- **Description**: Number of training instances
+
+#### training_input_path
+- **Type**: String
+- **Required**: Yes
+- **Description**: S3 URI to training data
+- **Example**: `s3://my-bucket/training-data/data.csv`
+
+#### output_path
+- **Type**: String
+- **Required**: No
+- **Description**: S3 URI where trained model artifacts will be saved
+- **Example**: `s3://my-bucket/model-output/`
+
+#### hyperparameters
+- **Type**: Object
+- **Required**: No
+- **Description**: Training hyperparameters (overrides model defaults)
+- **Fields**:
+  - `epochs`: Number of training epochs
+  - `per_device_train_batch_size`: Batch size per device
+  - `learning_rate`: Learning rate for optimizer
+
+## Complete Examples
+
+### AWS Deployment
+
+```yaml
+deployment: !Deployment
+  destination: aws
+  endpoint_name: my-bert-model
+  instance_count: 1
+  instance_type: ml.m5.xlarge
+
+models:
+- !Model
+  id: google-bert/bert-base-uncased
+  source: huggingface
+```
+
+### GCP Deployment with GPU
+
+```yaml
+deployment: !Deployment
+  destination: gcp
+  endpoint_name: my-llama-model
+  instance_count: 1
+  instance_type: g2-standard-12
+  accelerator_type: NVIDIA_L4
+  accelerator_count: 1
+  min_replica_count: 1
+  max_replica_count: 3
+
+models:
+- !Model
+  id: meta-llama/Meta-Llama-3-8B-Instruct
+  source: huggingface
+```
+
+### Azure Deployment
+
+```yaml
+deployment: !Deployment
+  destination: azure
+  endpoint_name: my-model-endpoint
+  instance_count: 1
+  instance_type: Standard_DS3_v2
+
+models:
+- !Model
+  id: facebook--opt-125m
+  source: huggingface
+```
+
+### Custom Model Deployment (AWS)
+
+```yaml
+deployment: !Deployment
+  destination: aws
+  instance_type: ml.m5.xlarge
+  instance_count: 1
+
+models:
+- !Model
+  id: google-bert/bert-base-uncased
+  source: custom
+  location: s3://my-bucket/fine-tuned-model.tar.gz
+```
+
+### Training Configuration
+
+```yaml
+training: !Training
+  destination: aws
+  instance_type: ml.p3.2xlarge
+  instance_count: 1
+  training_input_path: s3://my-bucket/training-data.csv
+  output_path: s3://my-bucket/model-output/
+  hyperparameters: !Hyperparameters
+    epochs: 3
+    per_device_train_batch_size: 32
+    learning_rate: 2e-5
+
+models:
+- !Model
+  id: tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2
+  version: 1.0.0
+  source: sagemaker
+```

From f3e8cd79d45ab06af430659b80bd265c95d82827 Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:46 +0000
Subject: [PATCH 08/11] docs: update configuration/AWS.mdx for changes #1762453598998

---
 configuration/AWS.mdx | 2 --
 1 file changed, 2 deletions(-)

diff --git a/configuration/AWS.mdx b/configuration/AWS.mdx
index cdc4b9f..b4fc46f 100644
--- a/configuration/AWS.mdx
+++ b/configuration/AWS.mdx
@@ -4,8 +4,6 @@ title: AWS
 
 ### AWS CLI
 
-To install Azure SDK on MacOS, you need to have the latest OS and you need to use Rosetta terminal. Also, make sure you have the latest version of Xcode tools installed.
-
 Follow this guide to install the latest AWS CLI https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html

From 7e18e40fab5d1a779eea9d72d02fe3c134a9433d Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:47 +0000
Subject: [PATCH 09/11] docs: update concepts/deployment.mdx for changes #1762453598998

---
 concepts/deployment.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/concepts/deployment.mdx b/concepts/deployment.mdx
index 66ca7a9..c224880 100644
--- a/concepts/deployment.mdx
+++ b/concepts/deployment.mdx
@@ -62,7 +62,7 @@ deployment: !Deployment
   destination: gcp
   endpoint_name: opt-125m-gcp
   instance_count: 1
-  machine_type: n1-standard-4
+  instance_type: n1-standard-4
   accelerator_type: NVIDIA_TESLA_T4
   accelerator_count: 1

From 26d67b0bbe9d5544253ec9962fa2db8ec02132b7 Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:47 +0000
Subject: [PATCH 10/11] docs: update concepts/fine-tuning.mdx for changes #1762453598998

---
 concepts/fine-tuning.mdx | 46 ++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/concepts/fine-tuning.mdx b/concepts/fine-tuning.mdx
index 88835aa..745de89 100644
--- a/concepts/fine-tuning.mdx
+++ b/concepts/fine-tuning.mdx
@@ -5,6 +5,10 @@ description: Guide to fine-tuning models with Magemaker
 
 ## Fine-tuning Overview
 
+
+Fine-tuning is currently only available for **AWS SageMaker**. Support for GCP and Azure is coming soon.
+
+
 Fine-tuning allows you to adapt pre-trained models to your specific use case. Magemaker simplifies this process through YAML configuration.
 
 ### Basic Command
@@ -26,10 +30,15 @@ training: !Training
 
 models:
 - !Model
-  id: your-model-id
-  source: huggingface
+  id: tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2
+  version: 1.0.0
+  source: sagemaker
 ```
+
+
+Currently, fine-tuning only supports **SageMaker JumpStart models** (source: `sagemaker`). Support for Hugging Face models is not yet implemented.
+
+
 ### Advanced Configuration
 
 ```yaml
 training: !Training
   destination: aws
   instance_type: ml.p3.2xlarge
   instance_count: 1
   training_input_path: s3://your-bucket/data.csv
+  output_path: s3://your-bucket/output
   hyperparameters: !Hyperparameters
     epochs: 3
     per_device_train_batch_size: 32
     learning_rate: 2e-5
-    weight_decay: 0.01
-    warmup_steps: 500
-    evaluation_strategy: "steps"
-    eval_steps: 500
-    save_steps: 1000
+
+models:
+- !Model
+  id: tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2
+  version: 1.0.0
+  source: sagemaker
 ```
 
 ## Data Preparation
@@ -98,27 +109,20 @@ Popular choices:
 
 ## Hyperparameter Tuning
 
-### Basic Parameters
+### Supported Hyperparameters
+
+The following hyperparameters can be configured (all optional):
 
 ```yaml
 hyperparameters: !Hyperparameters
   epochs: 3
+  per_device_train_batch_size: 32
   learning_rate: 2e-5
-  batch_size: 32
 ```
-
-### Advanced Tuning
-
-```yaml
-hyperparameters: !Hyperparameters
-  epochs: 3
-  learning_rate:
-    min: 1e-5
-    max: 1e-4
-    scaling: log
-  batch_size:
-    values: [16, 32, 64]
-```
+
+
+These hyperparameters override the default values for the SageMaker JumpStart model. Any hyperparameter not specified will use the model's default value.
+
 
 ## Monitoring Training

From 38c70e6023a904165f4e5245d65ec4f042fa2846 Mon Sep 17 00:00:00 2001
From: "docsalot-app[bot]" <207601912+docsalot-app[bot]@users.noreply.github.com>
Date: Thu, 6 Nov 2025 18:26:48 +0000
Subject: [PATCH 11/11] docs: update concepts/models.mdx for changes #1762453598998

---
 concepts/models.mdx | 68 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 56 insertions(+), 12 deletions(-)

diff --git a/concepts/models.mdx b/concepts/models.mdx
index 0161380..fe46cbe 100644
--- a/concepts/models.mdx
+++ b/concepts/models.mdx
@@ -5,12 +5,12 @@ description: Guide to supported models and their requirements
 
 ## Supported Models
 
-
-Currently, Magemaker supports deployment of Hugging Face models only. Support for cloud provider marketplace models is coming soon!
-
+Magemaker supports multiple model sources depending on your cloud provider:
 
 ### Hugging Face Models
 
+Hugging Face models can be deployed to **all three cloud providers** (AWS, GCP, Azure):
+
 
 - LLaMA
 
 - Falcon
 
 - Stable LM
 
 - Mistral
 
+### AWS SageMaker JumpStart Models
+
+
+SageMaker JumpStart models are available when using the **interactive deployment menu** with `magemaker --cloud aws`.
+
+
+AWS SageMaker JumpStart provides pre-trained, open-source models from various frameworks:
+
+- **Hugging Face** models
+- **Meta** models (e.g., Llama)
+- **TensorFlow** models
+- **PyTorch** models
+- **MXNet** models
+
+To deploy a SageMaker JumpStart model:
+1. Run `magemaker --cloud aws`
+2. Select "Deploy a model endpoint"
+3. Choose "Deploy a Sagemaker model"
+4. Search and select from available models
+
+Example YAML configuration for SageMaker models:
+```yaml
+deployment: !Deployment
+  destination: aws
+  instance_type: ml.m5.xlarge
+
+models:
+- !Model
+  id: huggingface-tc-bert-large-cased
+  source: sagemaker
+```
+
+### Custom Models
+
+You can deploy your own fine-tuned models (currently AWS only):
+
+```yaml
+deployment: !Deployment
+  destination: aws
+  instance_type: ml.m5.xlarge
+
+models:
+- !Model
+  id: google-bert/bert-base-uncased # base model
+  source: custom
+  location: s3://your-bucket/model.tar.gz # or local path
+```
+
 ### Future Support
 
-We plan to add support for the following model sources:
+We plan to add support for:
 
-
-  Models from AWS Marketplace and SageMaker built-in algorithms
-
-
-  Models from Vertex AI Model Garden and Foundation Models
-
 
   Models from Azure ML Model Catalog and Azure OpenAI
 
@@ -65,17 +109,17 @@
 #### GCP Vertex AI
 1. **Small Models** (n1-standard-4)
    ```yaml
-   machine_type: n1-standard-4
+   instance_type: n1-standard-4
    ```
 2. **Medium Models** (n1-standard-8 + GPU)
    ```yaml
-   machine_type: n1-standard-8
+   instance_type: n1-standard-8
    accelerator_type: NVIDIA_TESLA_T4
    accelerator_count: 1
    ```
 3. **Large Models** (a2-highgpu-1g)
    ```yaml
-   machine_type: a2-highgpu-1g
+   instance_type: a2-highgpu-1g
    ```
 
 #### Azure ML