diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index a70cce7..53d3da3 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -12,8 +12,11 @@ We love your input! We want to make contributing to Magemaker as easy and transp - Discussing the current state of the code - Submitting a fix - Proposing new features +- Improving the documentation - Becoming a maintainer +--- + ## Ways to Contribute ### 1. Report Issues @@ -21,8 +24,8 @@ We love your input! We want to make contributing to Magemaker as easy and transp If you encounter any bugs or have feature requests: 1. Go to our GitHub repository -2. Click on "Issues" -3. Click "New Issue" +2. Click on **Issues** +3. Click **New Issue** 4. Choose the appropriate template (Bug Report or Feature Request) 5. Fill out the template with as much detail as possible @@ -33,36 +36,158 @@ If you encounter any bugs or have feature requests: ### 2. Submit Pull Requests 1. Fork the repo and create your branch from `main` -2. If you've added code that should be tested, add tests -3. If you've changed APIs, update the documentation -4. Ensure the test suite passes -5. Make sure your code lints +2. If you've added code that should be tested, add **unit and/or integration tests** (see *Testing* below) +3. If you've changed public-facing APIs or added a new feature, update **documentation** (docs live under the `docs/` folder) +4. Ensure the test suite passes: `pytest -q` +5. Run the auto-formatters and linters (see *Code Style*) 6. Issue that pull request! Create a Pull Request to propose and collaborate on changes to a repository. -## Development Process +--- + +## Local Development Setup + + + + ```bash + git clone https://github.com/YOUR_USERNAME/magemaker.git + cd magemaker + ``` + + + ```bash + python -m venv .venv + source .venv/bin/activate # Linux / macOS + .venv\Scripts\activate # Windows + ``` + + + ```bash + pip install -e ".[dev]" + ``` + + + ```bash + pytest -q + ``` + + + If you are working on the FastAPI server (`server.py`) or the OpenAI-compatible proxy, you can spin it up locally: + ```bash + uvicorn server:app --reload --port 8000 + ``` + This requires AWS credentials in a `.env` file (see `configuration/Environment`). + + + +--- + +## Code Style + +We use **Black**, **isort**, and **flake8** to maintain code quality. + +```bash +black . +isort . +flake8 +``` + +A pre-commit configuration is provided. You can install the Git hooks with: -1. Fork the repo -2. Create a new branch: `git checkout -b my-feature-branch` -3. Make your changes -4. Push to your fork and submit a pull request -5. Wait for a review and address any comments +```bash +pre-commit install +``` -## Pull Request Guidelines +This will automatically run the formatters and linters before every commit. -- Update documentation as needed -- Add tests if applicable -- Follow the existing code style -- Keep PRs small and focused -- Write clear commit messages +--- + +## Testing + +All new features **must** include tests. We use **pytest** for our test suite: + +- **Unit tests** live next to the module being tested (e.g. `magemaker/gcp/test_*.py`). +- **Integration tests** should be marked with `@pytest.mark.integration` so they can be skipped in CI if needed. +- When adding a new cloud provider or API route, include at least one happy-path test and one failure test. 
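+
+For example, a cloud-touching test might be marked like this (a sketch; the test name and body are hypothetical):
+
+```python
+import pytest
+
+@pytest.mark.integration
+def test_deploy_creates_endpoint():
+    """Touches real cloud resources; excluded by `pytest -m "not integration"`."""
+    ...
+```
+
+If the `integration` marker isn't already registered, add it to `pytest.ini`/`pyproject.toml` so pytest doesn't warn about an unknown mark.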
+ +Run the full test suite: + +```bash +pytest +``` + +Run only fast unit tests: + +```bash +pytest -m "not integration" +``` + +--- + +## Documentation + +Documentation lives in the `docs/` folder and is built with [Mintlify](https://www.mintlify.com/). + + + + ```bash + npm install -g mintlify + ``` + + + ```bash + mintlify dev + ``` + + + Create or edit `.mdx`/`.md` files under `docs/` following the existing structure. Make sure to add your page to `mint.json` **navigation** if it should appear in the sidebar. + + + +When you introduce a new user-facing capability (e.g., the FastAPI server or a new cloud deployment target), **create a dedicated docs page** and cross-link it from relevant sections. + +--- + +## Pull Request Checklist + + + Code builds & tests pass (`pytest`) + Documentation updated + Linters/formatters run + Commits follow Conventional Commits + + +--- + +## Commit Message Convention + +We follow the [Conventional Commits](https://www.conventionalcommits.org/) specification: + +- `feat:` New feature +- `fix:` Bug fix +- `docs:` Documentation changes +- `style:` Code style (formatting) changes +- `refactor:` Code refactor (no behaviour change) +- `test:` Adding or updating tests +- `chore:` Build processes or auxiliary tooling + +Example: + +```bash +feat(api): add OpenAI-compatible /chat/completions route +``` + +--- ## License -By contributing, you agree that your contributions will be licensed under the Apache 2.0 License. +By contributing, you agree that your contributions will be licensed under the **Apache 2.0 License**. + +--- ## Questions? -Feel free to contact us at [support@slashml.com](mailto:support@slashml.com) if you have any questions about contributing! \ No newline at end of file +Feel free to contact us at [support@slashml.com](mailto:support@slashml.com) or join the discussion on GitHub. diff --git a/README.md b/README.md index 862eb3c..7972a2c 100644 --- a/README.md +++ b/README.md @@ -1,24 +1,80 @@ -### These are docs for the [Magemaker-Docs](https://magemaker.slashml.com) documentation site. +### Magemaker Documentation -The source code of magemaker is located at [Magemaker](https://github.com/slashml/magemaker) +This repository contains the **documentation site** for [Magemaker](https://github.com/slashml/magemaker). +If you are looking for the actual source-code, visit the Magemaker repo above. -### Development +--- -Install the [Mintlify CLI](https://www.npmjs.com/package/mintlify) to preview the documentation changes locally. To install, use the following command +## Local Development Workflow -``` -npm i -g mintlify -``` +Follow the steps below to preview documentation updates or to add new pages. -you need to have Node.js installed to use npm. +### 1. Install Mintlify CLI -Run the following command at the root of your documentation (where mint.json is) +We use the [Mintlify CLI](https://www.npmjs.com/package/mintlify) to run the docs locally. +```bash +npm i -g mintlify # requires Node >= 16 ``` + + +If you receive a "command not found" error after installation, make sure your global npm +bin folder is in your $PATH. + + +### 2. Start the Docs Site + +Run the following command from the root of the documentation repository (the folder that +contains `mint.json`). + +```bash mintlify dev ``` -#### Troubleshooting +This spins-up a hot-reloading server at `http://localhost:3000` where you can preview your +changes. + +### 3. Update / Add Pages + +1. 
Create or edit `.mdx` / `.md` files inside the docs tree (see `mint.json` for the + navigation structure). +2. Follow the existing component conventions (front-matter, ``, ``, etc.). +3. Commit the new/updated files and open a Pull Request. + +### Troubleshooting + +• **`mintlify dev` isn’t running** – Execute `mintlify install` to re-install + dependencies. +• **404 after launch** – Make sure you are in the directory that contains + `mint.json`. +• **Styles/components look off** – Delete the `node_modules` folder and run + `mintlify install` again. + + +--- + +## Contributing Docs + +See the main project’s **Contributing** guide for coding standards. When updating docs: + +1. Ensure every new feature in the code-base has at least one corresponding docs page. +2. Cross-link related pages for better discoverability. +3. Run the full docs site locally (`mintlify dev`) and verify: + • Navigation entry exists. + • No broken links / images. + • Dark-mode renders correctly. + + +Never commit secrets or `.env` files to the documentation repository. + + + +## Related Resources + +• **Production Docs Site:** +• **API Proxy Docs:** Newly added – see "Core Concepts → API Proxy" in the left sidebar for + information on the FastAPI server included with Magemaker. + +--- -- Mintlify dev isn't running - Run `mintlify install` it'll re-install dependencies. -- Page loads as a 404 - Make sure you are running in a folder with `mint.json` \ No newline at end of file +Happy documenting! 🎉 \ No newline at end of file diff --git a/about.mdx b/about.mdx index d9c04a4..2d350f3 100644 --- a/about.mdx +++ b/about.mdx @@ -1,43 +1,65 @@ --- title: About -description: Deploy open source AI models to AWS, GCP, and Azure in minutes +description: Deploy open-source AI models to AWS, GCP, and Azure in minutes "og:title": "Magemaker" --- ## About Magemaker -Magemaker is a Python tool that simplifies the process of deploying open source AI models to your preferred cloud provider. Instead of spending hours digging through documentation, Magemaker lets you deploy Hugging Face models directly to AWS SageMaker, Google Cloud Vertex AI, or Azure Machine Learning. +Magemaker is a Python-based DevOps toolkit that lets you turn any open-source AI model into a production-ready, fully managed endpoint on **AWS SageMaker, GCP Vertex AI, or Azure Machine Learning**—all from a single CLI. -## What we're working on next +Key capabilities: -- More robust error handling for various edge cases -- Verbose logging -- Enabling / disabling autoscaling -- Enhanced multi-cloud support features + + + Deploy to AWS, GCP, or Azure with a single YAML file or interactive CLI. + + + One-command fine-tuning pipeline for Hugging Face models on SageMaker. + + + Built-in FastAPI server (`server.py`) that exposes your SageMaker endpoints via `/chat/completions`—drop-in replacement for the OpenAI API. + + -Do submit your feature requests at https://magemaker.featurebase.app/ +## What’s New 🎉 -## Known issues +1. **OpenAI-Compatible Proxy** + Run `python server.py` to expose any deployed SageMaker endpoint at `/chat/completions` (see the new [API Proxy Guide](/concepts/api-proxy)). +2. **Environment-Driven Config** + Most settings (cloud credentials, custom `CONFIG_DIR`, default region, etc.) are now auto-detected from `.env`. 
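+
+For example, once an endpoint is deployed, any OpenAI-style client can talk to it (a sketch; the model name assumes a matching deployed endpoint):
+
+```bash
+curl http://localhost:8000/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "Hello!"}]}'
+```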
-- Querying within Magemaker currently only works with text-based models -- Deleting a model is not instant, it may show up briefly after deletion -- Deploying the same model within the same minute will break -- Hugging-face models on Azure have different Ids than their Hugging-face counterparts. Follow the steps specified in the quick-start guide to find the relevant models -- For Azure deploying models other than Hugging-face is not supported yet. -- Python3.13 is not supported because of an open-issue by Azure. https://github.com/Azure/azure-sdk-for-python/issues/37600 +## Roadmap +- More robust error handling for edge cases +- Verbose / structured logging +- Autoscaling enable / disable per deployment +- Enhanced multi-cloud feature parity (fine-tuning & managed datasets on GCP/Azure) +- API proxy hardening (streaming, function-calling, rate-limits) -If there is anything we missed, do point them out at https://magemaker.featurebase.app/ +Have a feature request? Add it at https://magemaker.featurebase.app/ +## Known Issues + + +These are temporary limitations. Check the GitHub Issues board for the latest status. + + +- Only text-based pipelines are supported for querying (no vision/multimodal yet) +- Endpoint deletion is asynchronous; the name may linger for ~60 s after deletion +- Deploying the **same** model within the **same minute** can fail due to name collision +- Azure model IDs differ from Hugging Face IDs (see [Quick Start](/quick-start)) +- Azure: only Hugging Face models are currently supported +- Python 3.13 is not yet supported (Azure SDK issue [#37600](https://github.com/Azure/azure-sdk-for-python/issues/37600)) + +If we missed something, let us know on https://magemaker.featurebase.app/ ## License -Distributed under the Apache 2.0 License. See `LICENSE` for more information. +Distributed under the Apache 2.0 License. See `LICENSE` for details. ## Contact -You can reach us, faizan & jneid, at [faizan|jneid@slashml.com](mailto:support@slashml.com). - -You can give feedback at https://magemaker.featurebase.app/ +Questions or feedback? Reach out to **Faizan & Jneid** at . -We'd love to hear from you! We're excited to learn how we can make this more valuable for the community and welcome any and all feedback and suggestions. +We’d love to hear your ideas for making Magemaker even better! diff --git a/concepts/api-proxy.mdx b/concepts/api-proxy.mdx new file mode 100644 index 0000000..8165d6c --- /dev/null +++ b/concepts/api-proxy.mdx @@ -0,0 +1,126 @@ +--- +title: Local API & OpenAI-Compatible Proxy +description: Run Magemaker as a FastAPI server and query your endpoints through a REST or OpenAI-style interface +--- + +## Overview + +Magemaker ships with a lightweight **FastAPI** server (`server.py`) that lets you: + +1. Query SageMaker (or other cloud) endpoints via a simple REST API +2. Expose an **OpenAI-compatible** `/chat/completions` route (powered by [LiteLLM](https://github.com/BerriAI/litellm)) so you can drop Magemaker into any tooling that expects the OpenAI API. + + +This server is optional – you only need it if you want a local HTTP interface or OpenAI proxy. Standard `magemaker --deploy` workflows continue to work without it. + + +--- + +## Running the Server + +```bash +uvicorn server:app --host 0.0.0.0 --port 8000 --reload +``` + +Requirements: + +- A valid `.env` with at least your **AWS credentials** (see [Environment Variables](/configuration/Environment)) because the server uses the same Magemaker session to talk to SageMaker. 
+- Python ≥ 3.11 (same as Magemaker). + +--- + +## REST Endpoints + +| Method | Path | Description | +| ------ | ---- | ----------- | +| `GET` | `/endpoint/{endpoint_name}` | Returns metadata for the specified SageMaker endpoint. | +| `POST` | `/endpoint/{endpoint_name}/query` | Submit a `Query` payload (see schema below) and receive the model response. | +| `POST` | `/chat/completions` | OpenAI-style chat completion endpoint (see *OpenAI Compatibility*). | + +### Query Schema (`POST /endpoint/{endpoint_name}/query`) + +```jsonc +{ + "inputs": "Your prompt here", + "parameters": { + "max_new_tokens": 100, + "temperature": 0.7 + }, + "context": "optional-system-context" +} +``` + +The schema matches `magemaker.schemas.query.Query` and supports text generation as well as other Hugging Face tasks. + +--- + +## OpenAI Compatibility + +The `/chat/completions` route follows the OpenAI v1 spec. Example request: + +```bash +curl http://localhost:8000/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "model": "meta-llama/Meta-Llama-3-8B-Instruct", + "messages": [{"role": "user", "content": "Hello!"}], + "temperature": 0.9 + }' +``` + +How it works: + +1. We look up **all** deployed endpoints that serve the requested `model`. +2. The first matching endpoint is proxied via `litellm.completion()` using the custom `sagemaker/{endpoint_name}` LiteLLM adapter. +3. The JSON response mirrors OpenAI's format, so libraries like `openai` or `LangChain` work out-of-the-box: + +```python +from openai import OpenAI +client = OpenAI(base_url="http://localhost:8000", api_key="unused") + +chat = client.chat.completions.create( + model="meta-llama/Meta-Llama-3-8B-Instruct", + messages=[{"role": "user", "content": "Tell me a joke."}] +) +print(chat.choices[0].message.content) +``` + + +If the model has **no deployed endpoints**, the server raises `NotDeployedException` (HTTP 500). Deploy the model first with `magemaker --deploy`. + + +--- + +## Authentication & Security + +The local server **does not** implement auth by default – it is intended for local prototyping behind your firewall. If you plan to expose it publicly, we recommend putting it behind an API Gateway / reverse proxy with authentication. + +--- + +## Environment Variables + +The server automatically sets `AWS_REGION_NAME` based on your Magemaker session. Ensure the following are available in `.env`: + +```bash +AWS_ACCESS_KEY_ID=... +AWS_SECRET_ACCESS_KEY=... +SAGEMAKER_ROLE=arn:aws:iam::123456789012:role/... +# Optional: Hugging Face token for gated models +HUGGING_FACE_HUB_KEY=hf_... +``` + +--- + +## Troubleshooting + +1. **`NotDeployedException`** – No SageMaker endpoint found for the requested model. Deploy first. +2. **`botocore.exceptions.NoCredentialsError`** – Check AWS credentials in `.env`. +3. **CORS errors** – Use a proxy or set appropriate CORS headers in `server.py` if calling from a browser. + +--- + +## Next Steps + +- Deploy an endpoint with `magemaker --deploy your-config.yaml`. +- Start the local server and integrate Magemaker with any OpenAI-compatible client. +- Contribute! Improvements to the proxy (streaming, async, multi-endpoint routing) are welcome – see the [Contributing](/contributing) guide. 
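+
+---
+
+## Example: Enabling CORS
+
+If you hit the CORS errors mentioned under *Troubleshooting*, here is a minimal sketch using FastAPI's stock middleware (it assumes the FastAPI instance in `server.py` is named `app`, as the `uvicorn server:app` command above implies):
+
+```python
+from fastapi.middleware.cors import CORSMiddleware
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["http://localhost:3000"],  # tighten this for anything non-local
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+```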
diff --git a/concepts/cli-reference.mdx b/concepts/cli-reference.mdx new file mode 100644 index 0000000..026bc54 --- /dev/null +++ b/concepts/cli-reference.mdx @@ -0,0 +1,132 @@ +--- +title: CLI Reference +description: Command-line interface for Magemaker +--- + +## Overview + +`magemaker` ships with a rich CLI that wraps common deployment, training, and resource-management operations. This page documents every top-level flag and sub-command. + +### Basic Syntax + +```bash +magemaker [global-opts] [operation] [operation-opts] +``` + +Global options: + +| Flag | Description | +|------|-------------| +| `--cloud ` | Configure credentials & default region for the provider(s) | +| `--config-dir ` | Override the default `.magemaker_config` folder | +| `--verbose` | Enable debug logging | + + +--- + +## Operations + +### 1. `--deploy ` +Deploy the model(s) described in the YAML file. + +```bash +magemaker --deploy .magemaker_config/bert.yaml +``` + +*Important flags* +| Flag | Description | +|------|-------------| +| `--dry-run` | Validate YAML & quotas without creating resources | +| `--wait` | Block until the endpoint is *InService* | + +### 2. `--train ` +Kick off a fine-tuning job using the training block in the YAML file. + +```bash +magemaker --train .magemaker_config/train-bert.yaml +``` + +### 3. `--delete ` +Delete one or multiple endpoints. + +```bash +magemaker --delete bert-base-uncased-dev +``` + +If no endpoint name is supplied the interactive menu opens so you can select multiple endpoints at once. + +### 4. `--list` +List active endpoints for the configured provider. + +```bash +magemaker --list +``` + +### 5. `--query ` *(non-interactive)* +Send a JSON query to an endpoint (bypasses the interactive prompt). + +```bash +echo '{"inputs":"Hello world"}' | magemaker --query llama3-dev +``` + +--- + +## Exit Codes + +| Code | Meaning | +|------|---------| +| `0` | Success | +| `10` | Validation error (bad YAML, missing env vars, etc.) | +| `20` | Cloud provider error (quota, auth, etc.) | + +--- + +## Environment Variables + +| Variable | Purpose | +|----------|---------| +| `AWS_ACCESS_KEY_ID` | AWS deployments | +| `PROJECT_ID`, `GCLOUD_REGION` | GCP deployments | +| `AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP` | Azure deployments | +| `HUGGING_FACE_HUB_KEY` | Access to gated HF models | + + +Never commit `.env` to source control. + + +--- + +## Examples + +1. **Deploy**, wait until ready, then **query**: + +```bash +magemaker --deploy llama3.yaml --wait \ + && echo '{"inputs":"Tell me a joke"}' | magemaker --query llama3-prod +``` + +2. **CI/CD:** deploy on `main` merge only if YAML passes dry-run validation. + +```yaml +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - name: Validate YAML + run: magemaker --deploy ml.yaml --dry-run + - name: Deploy + run: magemaker --deploy ml.yaml --wait +``` + +## Troubleshooting + +- **`No credentials found`** – run `magemaker --cloud aws` (or gcp/azure) first. +- **`Quota exceeded`** – request a service quota increase or pick a smaller instance. +- **Other issues** – run with `--verbose` and attach logs when opening a GitHub issue. + +--- + +## Changelog + +This page documents CLI features as of **Magemaker v0.4.0**. Use `magemaker --version` to check your local version. 
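+
+---
+
+## Appendix: Scripting with Exit Codes
+
+The exit codes documented above make the CLI easy to script. A sketch (file and endpoint names are hypothetical):
+
+```bash
+magemaker --deploy ml.yaml --dry-run
+case $? in
+  0)  magemaker --deploy ml.yaml --wait ;;
+  10) echo "Fix the YAML/config before deploying" >&2; exit 1 ;;
+  20) echo "Cloud provider error - check auth and quotas" >&2; exit 1 ;;
+esac
+```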
diff --git a/concepts/contributing.mdx b/concepts/contributing.mdx index 8c61908..d01a5ac 100644 --- a/concepts/contributing.mdx +++ b/concepts/contributing.mdx @@ -3,126 +3,141 @@ title: Contributing description: Guide to contributing to Magemaker --- -## Welcome to Magemaker Contributing Guide +## Welcome to the Magemaker Contributing Guide -We're excited that you're interested in contributing to Magemaker! This document will guide you through the process of contributing to the project. +We're excited that you're interested in contributing to Magemaker! This document will guide you through everything you need to know—from cloning the repo to submitting a pull-request. + + + If you discover a bug or would like to request a feature **but don't have time to submit a PR**, please open a GitHub Issue instead. We track all community requests there. + ## Ways to Contribute - Create issues for bugs you encounter while using Magemaker + Create issues for bugs you encounter while using Magemaker. - Suggest new features or improvements + Suggest new features or improvements. - Help improve our documentation + Help improve our documentation. All docs live in the `/docs` folder and are written in MDX. - Submit pull requests with bug fixes or new features + Submit pull requests with bug fixes or new features. +--- + ## Development Setup - - ```bash - git clone https://github.com/YOUR_USERNAME/magemaker.git - cd magemaker - ``` + + + ```bash + git clone https://github.com/YOUR_USERNAME/magemaker.git + cd magemaker + ``` - - ```bash - pip install -e ".[dev]" - ``` + + + We ship optional extras for development: + + ```bash + python -m venv .venv + source .venv/bin/activate + pip install -e ".[dev]" + ``` - - ```bash - git checkout -b feature/your-feature-name - ``` + + + Some tests and the local API proxy rely on cloud credentials. Copy `.env.template` to `.env` and populate the values you need (see [Environment Variables](/configuration/Environment)). + + + + ```bash + git checkout -b feat/my-awesome-change + ``` -## Development Guidelines +--- -### Code Style +## Running the Test-Suite -We use the following tools to maintain code quality: -- Black for Python code formatting -- isort for import sorting -- flake8 for style guide enforcement +We use **pytest** for unit & integration tests. + +```bash +pytest -q +``` + + +Integration tests that touch the cloud are marked with `@pytest.mark.integration`. Use `pytest -m integration` to run them explicitly. + + +--- + +## Linting & Formatting + +We enforce a strict style to keep the codebase painless to review: -Run the following before committing: ```bash black . isort . flake8 ``` -### Testing + +CI will fail if your PR doesn’t pass linting & type-checks. + - -All new features should include tests. We use pytest for our test suite. - +--- + +## Working on the API Proxy 🛰️ + +Magemaker ships a **FastAPI** server (`server.py`) that acts as: + +1. A thin REST wrapper around SageMaker endpoints, and +2. An **OpenAI-compatible** `/chat/completions` proxy. + +### Local Usage -Run tests locally: ```bash -pytest tests/ +uvicorn server:app --reload --host 0.0.0.0 --port 8000 ``` -### Documentation +Make sure your `.env` contains valid AWS creds **and** `AWS_REGION_NAME` (or let Magemaker create it via `magemaker --cloud aws`). See the brand-new [API Proxy docs](/concepts/api-proxy) for a full walkthrough. -When adding new features, please update the relevant documentation: +### Tests -1. Update the README.md if needed -2. 
Add/update docstrings for new functions/classes -3. Create/update relevant .mdx files in the docs directory +All proxy logic lives in `magemaker/sagemaker/query_endpoint.py` and `server.py`. Corresponding unit tests are in `tests/test_openai_proxy.py`. -## Pull Request Process +--- - - - Create a new branch for your changes: - ```bash - git checkout -b feature/your-feature - ``` - - - Make your changes and commit them with clear commit messages: - ```bash - git add . - git commit -m "feat: add new deployment option" - ``` - - - Push your changes to your fork: - ```bash - git push origin feature/your-feature - ``` - - - Open a Pull Request against the main repository - - +## Documentation Workflow -### Pull Request Guidelines +1. All docs pages live in the `/docs` folder and are written in MDX. +2. Install Mintlify to preview docs: - - - Provide a clear description of your changes - - - Include relevant tests for new features - - - Update documentation as needed - - - Keep commits focused and clean - - + ```bash + npm i -g mintlify + mintlify dev + ``` +3. **Every new feature must ship with docs.** + +--- + +## Pull-Request Checklist + +- [ ] Linted & formatted (Black / isort / flake8) +- [ ] All tests pass (`pytest`) +- [ ] Added/updated unit tests +- [ ] Added/updated documentation pages (Mintlify) +- [ ] PR title follows **Conventional Commits** +- [ ] Linked to any relevant issues + +--- ## Commit Message Convention @@ -131,38 +146,41 @@ We follow the [Conventional Commits](https://www.conventionalcommits.org/) speci - `feat:` New feature - `fix:` Bug fix - `docs:` Documentation changes -- `style:` Code style changes -- `refactor:` Code refactoring -- `test:` Adding missing tests -- `chore:` Maintenance tasks +- `style:` Code style changes (non-functional) +- `refactor:` Code refactoring (no feature change) +- `test:` Adding or refactoring tests +- `chore:` Build tasks, CI, etc. Example: + ```bash -feat(deployment): add support for custom docker images +feat(api-proxy): add support for streaming responses ``` -## Getting Help +--- -If you need help with your contribution: +## Getting Help - Join our Discord server for real-time discussions + Join our Discord for real-time discussions and help. - - Start a discussion in our GitHub repository + Start a discussion in our GitHub repo. - - Contact us at support@slashml.com + Contact us at support@slashml.com. +--- + ## Code of Conduct We are committed to providing a welcoming and inclusive experience for everyone. Please read our [Code of Conduct](https://github.com/slashml/magemaker/CODE_OF_CONDUCT.md) before participating. +--- + ## License -By contributing to Magemaker, you agree that your contributions will be licensed under the Apache 2.0 License. \ No newline at end of file +By contributing to Magemaker, you agree that your contributions will be licensed under the **Apache 2.0 License**. diff --git a/concepts/deployment.mdx b/concepts/deployment.mdx index 66ca7a9..b29f6fe 100644 --- a/concepts/deployment.mdx +++ b/concepts/deployment.mdx @@ -1,53 +1,61 @@ --- title: Deployment -description: Learn how to deploy models using Magemaker +description: Learn how to deploy, update, and remove model endpoints with Magemaker --- -## Deployment Methods +## Deployment Workflows -Magemaker offers multiple ways to deploy your models to AWS, GCP and Azure. Choose the method that best fits your workflow. 
+Magemaker supports two primary workflows for spinning-up model endpoints on the three major clouds (AWS, GCP, Azure): -### Interactive Deployment +1. **Interactive CLI** – best for ad-hoc experimentation. +2. **YAML-first (IaC)** – best for reproducibility, CI/CD pipelines, and team hand-offs. -When you run the `magemaker --cloud [aws|gcp|azure|all]` command, you'll get an interactive menu that walks you through the deployment process: +### 1. Interactive CLI + +Run the following command and pick a cloud provider from the prompt: ```sh magemaker --cloud [aws|gcp|azure|all] ``` -This method is great for: +The interactive menu lets you: + +- Deploy a new model +- List active endpoints +- Query an endpoint +- Delete (deactivate) one or more endpoints +- Fine-tune a model (if the selected provider supports it) -- First-time users -- Exploring available models -- Testing different configurations + +The same binary is also used for non-interactive operations (`--deploy`, `--train`, `--delete`, etc.). See the new [CLI Reference](/concepts/cli-reference) for the full command matrix. + -### YAML-based Deployment +### 2. YAML-based Deployment *(recommended)* -For reproducible deployments and CI/CD integration, use YAML configuration files: +YAML files give you **Infrastructure-as-Code** super-powers and plug nicely into GitHub Actions, GitLab CI, or any other pipeline tool. ```sh magemaker --deploy .magemaker_config/your-model.yaml ``` -This is recommended for: +Benefits: -- Production deployments -- CI/CD pipelines -- Infrastructure as Code (IaC) -- Team collaborations +- Version control for model + infra settings +- Repeatable environments (dev / staging / prod) +- Easier peer reviews & audits -## Multi-Cloud Deployment +## Multi-Cloud Examples -Magemaker supports deployment to AWS SageMaker, GCP Vertex AI, and Azure ML. Here's how to deploy the same model (facebook/opt-125m) to different cloud providers: +Below is the **same Hugging Face model** (`facebook/opt-125m`) deployed to the three supported clouds. -### AWS (SageMaker) +### AWS SageMaker ```yaml deployment: !Deployment destination: aws endpoint_name: opt-125m-aws instance_count: 1 - instance_type: ml.m5.xlarge + instance_type: ml.m5.xlarge # CPU; good for small models models: - !Model @@ -55,7 +63,7 @@ models: source: huggingface ``` -### GCP (Vertex AI) +### Google Vertex AI ```yaml deployment: !Deployment @@ -83,19 +91,18 @@ deployment: !Deployment models: - !Model - id: facebook-opt-125m + id: facebook-opt-125m # ⚠️ Azure uses a different HF ID scheme source: huggingface ``` -## YAML Configuration Reference +## YAML Schema Cheat-Sheet -### Basic Deployment +### Minimal Deployment ```yaml deployment: !Deployment destination: aws - endpoint_name: test-bert-uncased - instance_count: 1 + endpoint_name: demo-bert instance_type: ml.m5.xlarge models: @@ -104,125 +111,89 @@ models: source: huggingface ``` -### Advanced Configuration +### Full Deployment (advanced) ```yaml deployment: !Deployment destination: aws - endpoint_name: test-llama3-8b + endpoint_name: llama3-8b-prod + instance_type: ml.g5.12xlarge # 4×A10G instance_count: 1 - instance_type: ml.g5.12xlarge num_gpus: 4 + quantization: bitsandbytes # optional models: - !Model id: meta-llama/Meta-Llama-3-8B-Instruct source: huggingface predict: - temperature: 0.9 + temperature: 0.7 top_p: 0.9 - top_k: 20 - max_new_tokens: 250 + max_new_tokens: 256 ``` -## Cloud-Specific Instance Types + +`quantization` is only respected on AWS today. Support for GCP & Azure is on the roadmap. 
+ -### AWS SageMaker Types +## Instance-Type Cheat-Sheet -Choose your instance type based on your model's requirements: +### AWS SageMaker - Good for smaller models like BERT-base - - 4 vCPU - - 16 GB Memory - - Available in free tier + Ideal for small BERT-size models
+ 4 vCPU / 16 GiB RAM (CPU-only)
- - - Required for larger models like LLaMA - - 48 vCPU - - 192 GB Memory - - 4 NVIDIA A10G GPUs + + Required for 7-13 B parameter LLaMA
+ 4 × A10G GPU / 192 GiB RAM
- - Remember to deactivate unused endpoints to avoid unnecessary charges! - - -### GCP Vertex AI Types +### Google Vertex AI - Good for smaller models - - 4 vCPU - - 15 GB Memory - - Cost-effective option + Entry-level CPU option - - - For larger models - - 12 vCPU - - 85 GB Memory - - 1 NVIDIA A100 GPU + + Single A100 GPU – great for 7-13 B models -### Azure ML Types +### Azure ML - Good for smaller models - - 4 vCPU - - 14 GB Memory - - Balanced performance + 4 vCPU / 14 GiB RAM (CPU-only) - - - For GPU workloads - - 6 vCPU - - 112 GB Memory - - 1 NVIDIA V100 GPU + + 1 × V100 GPU – balanced price/perf -## Deployment Best Practices - -1. Use meaningful endpoint names that include: - - - Model name/version - - Environment (dev/staging/prod) - - Team identifier - -2. Start with smaller instance types and scale up as needed - -3. Always version your YAML configurations - -4. Set up monitoring and alerting for your endpoints - -Make sure you setup budget monitory and alerts to avoid unexpected charges. +Always delete or stop endpoints you’re not actively using—cloud providers bill **by the minute**. +## Best Practices Checklist -## Troubleshooting Deployments - -Common issues and their solutions: - -1. **Deployment Timeout** - - - Check instance quota limits - - Verify network connectivity +1. **Meaningful endpoint names:** `--` +2. **Start small, then scale up** once latency or throughput is a problem. +3. **Version-control YAML** in the same repo as your application. +4. **Enable budget alerts** in your cloud account. +5. **Monitor logs & metrics:** CloudWatch, Stackdriver, or Azure Monitor. -2. **Instance Not Available** +## Troubleshooting - - Try a different region - - Request quota increase - - Use an alternative instance type +| Symptom | Likely Cause | Fix | +|---------|--------------|-----| +| *CreateModel* times out | Quota limits / GPU unavailable | Request quota increase or change region | +| Endpoint stuck in *Creating* | Incompatible instance type | Pick a larger instance or one with GPU | +| 403 from HF Hub | Gated model | Accept license and set `HUGGING_FACE_HUB_KEY` in `.env` | +| Deployment fails after model download | Out-of-memory | Choose instance with more RAM / GPU | -3. **Model Loading Failure** - - Verify model ID and version - - Check instance memory requirements - - Validate Hugging Face token if required - - Endpoing deployed but deployment failed. Check the logs, and do report this to us if you see this issue. + +Still stuck? Join our Discord or file an issue with logs & your YAML file. + diff --git a/concepts/fine-tuning.mdx b/concepts/fine-tuning.mdx index 88835aa..bcd1b3f 100644 --- a/concepts/fine-tuning.mdx +++ b/concepts/fine-tuning.mdx @@ -5,39 +5,54 @@ description: Guide to fine-tuning models with Magemaker ## Fine-tuning Overview -Fine-tuning allows you to adapt pre-trained models to your specific use case. Magemaker simplifies this process through YAML configuration. +Fine-tuning allows you to adapt pre-trained models to your specific use case. +**Today, fine-tuning is only supported on AWS SageMaker** (Vertex AI & Azure ML support are on the roadmap). -### Basic Command + + You’ll need sufficient **SageMaker training quota** in the region where you plan to train. GPU instances (e.g. `ml.p3.*`) are not enabled by default on new AWS accounts. + -```sh +--- + +## Basic Workflow + +1. Prepare & upload your dataset to **S3** (CSV or JSONL – see below). +2. Create a YAML config that contains a `!Training` block and at least one `!Model` block. +3. 
Run: + +```bash magemaker --train .magemaker_config/train-config.yaml ``` -## Configuration +Magemaker will validate the YAML, start an **SageMaker Training Job** and stream logs to your terminal. + +--- + +## Configuration Reference -### Basic Training Configuration +### Minimal Training Configuration ```yaml training: !Training - destination: aws - instance_type: ml.p3.2xlarge - instance_count: 1 - training_input_path: s3://your-bucket/training-data.csv + destination: aws # only "aws" is supported for now + instance_type: ml.p3.2xlarge # 1× V100 GPU + instance_count: 1 # distributed training coming soon + training_input_path: s3://my-bucket/my-data.csv # dataset location models: - !Model - id: your-model-id + id: google-bert/bert-base-uncased # Hugging Face model ID source: huggingface ``` -### Advanced Configuration +### Advanced Example ```yaml training: !Training destination: aws - instance_type: ml.p3.2xlarge + instance_type: ml.p3.8xlarge # 4× V100 GPUs instance_count: 1 - training_input_path: s3://your-bucket/data.csv + training_input_path: s3://my-bucket/data.jsonl hyperparameters: !Hyperparameters epochs: 3 per_device_train_batch_size: 32 @@ -47,84 +62,91 @@ training: !Training evaluation_strategy: "steps" eval_steps: 500 save_steps: 1000 + +# You can fine-tune **any** model that supports text training on HF +models: +- !Model + id: meta-llama/Meta-Llama-3-8B-Instruct + source: huggingface + task: text-generation ``` + + `training_input_path` **must** start with `s3://` – local paths are not yet supported. A helper to automatically upload local files is planned. + + +--- + ## Data Preparation ### Supported Formats - - - Simple tabular data - - Easy to prepare - - Good for classification tasks + + - Simple tabular format – one row per training example + - Ideal for classification & regression - - - - Flexible data format - - Good for complex inputs - - Supports nested structures + + - One JSON object **per line** + - Flexible structure – recommended for generative tasks -### Data Upload +### Uploading to S3 - Format your data according to model requirements + Ensure the dataset matches the expected input format for your model / task. - Use AWS CLI or console to upload data + ```bash + aws s3 cp my-data.csv s3://my-bucket/my-data.csv + ``` - - Specify S3 path in training configuration + + Use the full `s3://…` URI in `training_input_path`. -## Instance Selection +--- -### Training Instance Types +## Choosing Instance Types -Choose based on: -- Dataset size -- Model size -- Training time requirements -- Cost constraints +| Instance | GPUs | Typical Use-case | +|----------|------|------------------| +| `ml.p3.2xlarge` | 1× V100 | small-to-medium datasets | +| `ml.p3.8xlarge` | 4× V100 | larger datasets / faster training | +| `ml.p3.16xlarge` | 8× V100 | distributed training | -Popular choices: -- ml.p3.2xlarge (1 GPU) -- ml.p3.8xlarge (4 GPUs) -- ml.p3.16xlarge (8 GPUs) + + GPU capacity is region-specific. If you hit *CapacityError* try another region or request a quota increase. + -## Hyperparameter Tuning +--- -### Basic Parameters +## Hyperparameter Shortcuts -```yaml -hyperparameters: !Hyperparameters - epochs: 3 - learning_rate: 2e-5 - batch_size: 32 -``` +If you omit `hyperparameters`, Magemaker will call `get_hyperparameters_for_model()` to generate sensible defaults for common text tasks (batch size, learning-rate schedule, etc.). You can override **any** of them in the YAML file. 
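+
+For example, to accept the generated defaults but pin only the values you care about (a sketch; the unlisted defaults stay in effect):
+
+```yaml
+training: !Training
+  destination: aws
+  instance_type: ml.p3.2xlarge
+  instance_count: 1
+  training_input_path: s3://my-bucket/my-data.csv
+  hyperparameters: !Hyperparameters
+    epochs: 5           # overrides the generated default
+    learning_rate: 1e-5 # overrides the generated default
+```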
-### Advanced Tuning +--- -```yaml -hyperparameters: !Hyperparameters - epochs: 3 - learning_rate: - min: 1e-5 - max: 1e-4 - scaling: log - batch_size: - values: [16, 32, 64] -``` +## Monitoring Your Training Job + +1. Real-time logs stream to your terminal. +2. Detailed metrics are pushed to **CloudWatch** → *SageMaker / TrainingJobs* namespace: + - `TrainingLoss` + - `ValidationLoss` + - `GPUUtilization` + +Set up CloudWatch alarms to catch OOM or stalled jobs. + +--- -## Monitoring Training +## Next Steps -### CloudWatch Metrics +- Use the trained model artefact to **deploy a new endpoint** with another YAML file (replace `training:` with `deployment:`). +- Automate training & deployment in CI/CD by chaining `--train` and `--deploy` commands. -Available metrics: -- Loss -- Learning rate -- GPU utilization \ No newline at end of file + + Vertex AI & Azure ML fine-tuning support is under active development. Follow the progress on our [GitHub Issues](https://github.com/slashml/magemaker/issues). + diff --git a/concepts/models.mdx b/concepts/models.mdx index 0161380..2464aaf 100644 --- a/concepts/models.mdx +++ b/concepts/models.mdx @@ -1,205 +1,131 @@ --- title: Models -description: Guide to supported models and their requirements +description: Overview of supported model sources and hardware requirements --- -## Supported Models +## Supported Model Sources - -Currently, Magemaker supports deployment of Hugging Face models only. Support for cloud provider marketplace models is coming soon! - - -### Hugging Face Models +Magemaker can deploy models from **multiple sources**. The exact options depend on the cloud provider. - - - LLaMA - - BERT - - GPT-2 - - T5 + + All public Hugging Face models are supported on AWS, GCP & Azure. + + + Browse and deploy SageMaker JumpStart models (classification, text-gen, vision, etc.) **directly from the CLI**. - - - - Sentence Transformers - - CLIP - - DPR + + Point Magemaker at a local directory or an S3 URI – handy for fine-tuned checkpoints or proprietary models (AWS only for now). -### Future Support + + JumpStart & custom S3 models are currently **AWS-only**. GCP Model Garden / Azure ML Registry integration is in progress. + + +--- -We plan to add support for the following model sources: +## Tasks & Example Models - - - Models from AWS Marketplace and SageMaker built-in algorithms - - - - Models from Vertex AI Model Garden and Foundation Models - - - - Models from Azure ML Model Catalog and Azure OpenAI - - -## Model Requirements - -### Instance Type Recommendations by Cloud Provider - -#### AWS SageMaker -1. **Small Models** (ml.m5.xlarge) - ```yaml - instance_type: ml.m5.xlarge - ``` -2. **Medium Models** (ml.g4dn.xlarge) - ```yaml - instance_type: ml.g4dn.xlarge - ``` -3. **Large Models** (ml.g5.12xlarge) - ```yaml - instance_type: ml.g5.12xlarge - num_gpus: 4 - ``` - -#### GCP Vertex AI -1. **Small Models** (n1-standard-4) - ```yaml - machine_type: n1-standard-4 - ``` -2. **Medium Models** (n1-standard-8 + GPU) - ```yaml - machine_type: n1-standard-8 - accelerator_type: NVIDIA_TESLA_T4 - accelerator_count: 1 - ``` -3. **Large Models** (a2-highgpu-1g) - ```yaml - machine_type: a2-highgpu-1g - ``` - -#### Azure ML -1. **Small Models** (Standard_DS3_v2) - ```yaml - instance_type: Standard_DS3_v2 - ``` -2. **Medium Models** (Standard_NC6s_v3) - ```yaml - instance_type: Standard_NC6s_v3 - ``` -3. 
**Large Models** (Standard_ND40rs_v2) - ```yaml - instance_type: Standard_ND40rs_v2 - ``` +| Task | Popular HF Models | +|------|-------------------| +| Text Generation | `meta-llama/Meta-Llama-3-8B-Instruct`, `gpt2`, `tiiuae/falcon-7b` | +| Text Classification | `facebook/bart-large-mnli`, `distilbert-base-uncased` | +| Embeddings / Feature Extraction | `sentence-transformers/all-MiniLM-L6-v2`, `intfloat/e5-small-v2` | -## Example Deployments +--- + +## Instance Recommendations + +Below are **rule-of-thumb** suggestions. Always check actual GPU/CPU requirements of your model. + +### AWS SageMaker + +| Model Size | Instance | GPUs | Notes | +|-------------------|-------------------------|------|-------| +| ≤ 1 B parameters | `ml.m5.xlarge` | – | CPU-only, cheapest | +| 1–7 B parameters | `ml.g4dn.xlarge` | 1× T4 | good balance | +| 7–70 B parameters | `ml.g5.12xlarge` | 4× A10G | HF Llama-3-8B works here | + +### GCP Vertex AI + +| Model Size | Machine Type | GPU | +|------------|--------------------|----------------| +| small | `n1-standard-4` | – | +| medium | `n1-standard-8` | 1× T4 | +| large | `a2-highgpu-1g` | 1× A100 | -### Example Hugging Face Model Deployment +### Azure ML -Deploy the same Hugging Face model to different cloud providers: +| Model Size | VM Size | GPU | +|------------|------------------------|------------| +| small | `Standard_DS3_v2` | – | +| medium | `Standard_NC6s_v3` | 1× V100 | +| large | `Standard_NC24ads_A100_v4` | 4× A100 | + + + GPU names differ across clouds (A10G ≈ L4 ≈ V100). Always double check memory & cost. + + +--- + +## Example Deployments -AWS SageMaker: ```yaml +# Hugging Face model → AWS models: - !Model id: facebook/opt-125m source: huggingface + deployment: !Deployment destination: aws + instance_type: ml.g4dn.xlarge ``` -GCP Vertex AI: ```yaml +# JumpStart model → AWS (interactive CLI can search for IDs) models: - !Model - id: facebook/opt-125m - source: huggingface + id: jumpstart-lang-textclassification-bert-base + source: sagemaker + deployment: !Deployment - destination: gcp + destination: aws ``` -Azure ML: ```yaml +# Hugging Face model → GCP models: - !Model - id: facebook-opt-125m + id: meta-llama/Meta-Llama-3-8B-Instruct source: huggingface + deployment: !Deployment - destination: azure + destination: gcp + machine_type: a2-highgpu-1g + accelerator_type: NVIDIA_A100 + accelerator_count: 1 ``` - The model ids for Azure are different from AWS and GCP. Make sure to use the one provided by Azure in the Azure Model Catalog. - - To find the relevnt model id, follow the following steps - - - Find the workpsace in the Azure portal and click on the studio url provided. Click on the `Model Catalog` on the left side bar - ![Azure ML Creation](../Images/workspace-studio.png) - - - - Select Hugging-Face from the collections list. The id of the model card is the id you need to use in the yaml file - ![Azure ML Creation](../Images/hugging-face.png) - - - + Azure uses *different* model IDs for Hugging Face. Use the Azure ML **Model Catalog** to copy the correct ID. - -## Model Configuration - -### Basic Parameters - -```yaml -models: -- !Model - id: your-model-id - source: huggingface|sagemaker # we don't support vertex and azure specific models yet - revision: latest # Optional: specify model version -``` - -### Advanced Parameters - -```yaml -models: -- !Model - id: your-model-id - source: huggingface - predict: - temperature: 0.7 - top_p: 0.9 - top_k: 50 - max_new_tokens: 500 - do_sample: true -``` +--- ## Best Practices -1. 
**Model Selection** - - Compare pricing across cloud providers - - Consider data residency requirements - - Test latency from different regions +1. **Start small** – prototype with CPU instances, then scale up. +2. **Monitor costs & quotas** – big GPUs can be expensive and scarce. +3. **Version your YAML** – keep configs in Git for reproducibility. -3. **Cost Management** - - Compare instance pricing - - Make sure you set up the relevant alerting +--- ## Troubleshooting -Common model-related issues: - -1. **Cloud-Specific Issues** - - Check quota limits - - Verify regional availability - - Review cloud-specific logs - -2. **Performance Issues** - - Compare cross-cloud latencies - - Check network connectivity - - Monitor resource utilization - -3. **Authentication Issues** - - Verify cloud credentials - - Check model access permissions - - Validate API keys \ No newline at end of file +| Issue | Possible Cause | Fix | +|-------|----------------|-----| +| `RuntimeError: CUDA out of memory` | Instance has insufficient GPU RAM | Pick a bigger instance / enable quantization | +| Deployment stuck in *Creating* | Region capacity, missing quota | Try another region or request quota increase | +| 403 accessing model | Gated HF model | Accept terms on HF & set `HUGGING_FACE_HUB_KEY` in `.env` | diff --git a/configuration/AWS.mdx b/configuration/AWS.mdx index cdc4b9f..f91d79f 100644 --- a/configuration/AWS.mdx +++ b/configuration/AWS.mdx @@ -2,81 +2,100 @@ title: AWS --- -### AWS CLI +## Overview -To install Azure SDK on MacOS, you need to have the latest OS and you need to use Rosetta terminal. Also, make sure you have the latest version of Xcode tools installed. +This guide walks you through setting up **AWS credentials, IAM roles, and quotas** so Magemaker can: -Follow this guide to install the latest AWS CLI +1. Create or reuse a **SageMaker execution role** (`SAGEMAKER_ROLE`). +2. Spin-up training & inference instances. +3. Access your S3 buckets for data & model artefacts. -https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html + + All steps can be completed via the AWS Console **or** the CLI. Pick whichever you’re more comfortable with. + +--- -Once you have the CLI installed and working, follow these steps - +### 1. Install & Configure the AWS CLI -### AWS Account +Follow the official instructions: + - - -Register for an [AWS account](https://aws.amazon.com/) and sign-in to the [console](https://console.aws.amazon.com/). - +```bash +# After installation +aws configure # enter Access Key, Secret, default region +``` - -From the console, use the Search bar to find and select IAM (***do not use IAM Identity Center***, which is confusingly similar but a totally different system). +We recommend `us-east-1` because most JumpStart models are available there, but any region works as long as you have SageMaker quota. -![Enter image alt description](../Images/muJ_Image_1.png) +--- -You should see the following screen after clicking IAM. +### 2. Create an IAM User (Console) -![Enter image alt description](../Images/ldC_Image_2.png) - + + + Open the console and search for **IAM** (not *IAM Identity Center*). + + + 1. Create a new user (or select an existing one). + 2. Attach the following *managed policies*: + - `AmazonSageMakerFullAccess` + - `IAMFullAccess` (needed once to create the execution role) + - `ServiceQuotasFullAccess` (for quota look-ups) + + + Go to *Security Credentials* → *Access Keys* → **Create access key**. 
+ Store **both** the *Access Key ID* and *Secret Access Key* – you’ll need them for `aws configure` *and* Magemaker. + + + + + Do **not** commit your keys to GitHub. Use a `.env` file (see Environment Variables). + - -1. Select `Users` in the side panel - -![Enter image alt description](../Images/QX4_Image_3.png) +--- -2. Create a user if you don't already have one +### 3. Create / Verify a SageMaker Execution Role -![Enter image alt description](../Images/ly3_Image_4.png) - +Magemaker expects an **ARN** in `SAGEMAKER_ROLE` that has `AmazonSageMakerFullAccess`. - -1. Click on "Add permissions" - -![Enter image alt description](../Images/E7x_Image_5.png) +```bash +aws iam create-role \ + --role-name magemaker-sagemaker-role \ + --assume-role-policy-document file://trust-policy.json -2. Select "Attach policies directly". Under permission policies, search for and tick the boxes for: - - `AmazonSagemakerFullAccess` - - `IAMFullAccess` - - `ServiceQuotasFullAccess` +aws iam attach-role-policy \ + --role-name magemaker-sagemaker-role \ + --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess +``` -Then click Next. +Copy the resulting role ARN into your `.env`. -![Enter image alt description](../Images/01X_Image_6.png) +--- -The final list should look like the following: +### 4. Check / Request Service Quotas -![Enter image alt description](../Images/Dfp_Image_7.png) +1. Open **Service Quotas** in the console. +2. Filter by *SageMaker*. +3. Verify you have at least **2 × `ml.m5.xlarge` instances** (default). +4. Request additional quota for GPU instances (e.g. `ml.g5.*`, `ml.p3.*`) if you plan to deploy large models. -Click "Create user" on the following screen. - + + Approval can take anywhere from minutes to 48 h depending on your AWS account history. + - -1. Click the name of the user you've just created (or one that already exists) -2. Go to "Security Credentials" tab -3. Scroll down to "Access Keys" section -4. Click "Create access key" -5. Select Command Line Interface then click next +--- -![Enter image alt description](../Images/BPP_Image_8.png) +## Final Checklist -Enter a description (this is optional, can leave blank). Then click next. +- [x] AWS CLI installed & `aws configure` completed +- [x] `.env` contains `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION_NAME`, `SAGEMAKER_ROLE` +- [x] SageMaker quota for the instance types you plan to use -![Enter image alt description](../Images/gMD_Image_9.png) +You’re now ready to run: -**Store BOTH the Access Key and the Secret access key for the next step. Once you've saved both keys, click Done.** +```bash +magemaker --cloud aws +``` -![Enter image alt description](../Images/Gjw_Image_10.png) - - \ No newline at end of file +This command validates your credentials, writes/updates the `.env`, and caches your default region. diff --git a/configuration/Azure.mdx b/configuration/Azure.mdx index 3c1104a..df93262 100644 --- a/configuration/Azure.mdx +++ b/configuration/Azure.mdx @@ -1,84 +1,106 @@ --- title: Azure -description: Configure Magemaker for your cloud providers +description: Configure Magemaker for Azure Machine Learning --- -### Azure CLI +## Overview -To install Azure SDK on MacOS, you need to have the latest OS and you need to use Rosetta terminal. Also, make sure you have the latest version of Xcode tools installed. +Magemaker can deploy and query endpoints on **Azure Machine Learning**. Before you start, ensure you have: +1. An Azure subscription with billing enabled. +2. A resource group & AML workspace. +3. 
Quota for the VM sizes required by your model (e.g. `Standard_NC24ads_A100_v4`). -To install the latest Azure CLI, run: + + Python 3.13 is **not** supported by the Azure SDK (see [Azure#37600](https://github.com/Azure/azure-sdk-for-python/issues/37600)). Use Python 3.11 or 3.12. + + +--- + +### 1. Install the Azure CLI + +```bash +brew install azure-cli # macOS +# or follow the official docs for Windows / Linux +``` + +Check login & subscription: + +```bash +az login +az account show # verify the active subscription +``` + +--- + +### 2. Create a Resource Group & Workspace (CLI) + +```bash +# variables – replace <> with your values +AZ_RG=ml-resources +AZ_REGION=eastus +AZ_WS=ml-workspace + +az group create -n $AZ_RG -l $AZ_REGION +az ml workspace create -n $AZ_WS -g $AZ_RG +``` + +You can also do this via the Azure Portal → *Azure Machine Learning* → **Create new workspace**. + +--- + +### 3. Register Required Resource Providers + +The first time you use AML, register providers – Magemaker will fail with `MissingSubscriptionRegistration` if these are not active. ```bash -brew update && brew install azure-cli +az provider register --namespace Microsoft.MachineLearningServices +az provider register --namespace Microsoft.ContainerRegistry +az provider register --namespace Microsoft.KeyVault +az provider register --namespace Microsoft.Storage +az provider register --namespace Microsoft.Insights +az provider register --namespace Microsoft.ContainerService +az provider register --namespace Microsoft.PolicyInsights +az provider register --namespace Microsoft.Cdn ``` -Alternatively, follow this official guide from Azure -- [https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos) - -Once you have installed azure CLI, follow these steps - - -### Azure Account -Step 1: Create azure cloud account - -- [https://azure.microsoft.com/en-ca](null) - - - - - ```bash - az login - ``` - - - ```bash - az account set --subscription - ``` - - - - From the terminal - ```bash - az group create --name --location - ``` - - From the Azure Portal - ![Enter image alt description](../Images/XzN_Image_12.png) - - - - From the terminal - ```bash - az ml workspace create -n -g - ``` - -From the Azure portal -1. Search for `Azure Machine Learning` in the search bar. - ![Azure ML Creation](../Images/AzureML.png) - -2. Inside the `Azure Machine Learning` portal. Click on Create, and select `New Workspce` from the drop down - ![workspace creation](../Images/workspace_creation.png) - - - - ```bash - # Register all required providers: THIS STEP IS IMPORTANT - az provider register --namespace Microsoft.MachineLearningServices - az provider register --namespace Microsoft.ContainerRegistry - az provider register --namespace Microsoft.KeyVault - az provider register --namespace Microsoft.Storage - az provider register --namespace Microsoft.Insights - az provider register --namespace Microsoft.ContainerService - az provider register --namespace Microsoft.PolicyInsights - az provider register --namespace Microsoft.Cdn - ``` - - - Registration can take up to 10 minutes. Check status with: ```bash az - provider show -n Microsoft.MachineLearningServices ``` - - - - +Registration can take **up to 10 minutes** – check status: + +```bash +az provider show -n Microsoft.MachineLearningServices --query "registrationState" +``` + +--- + +### 4. 
Add Environment Variables + +Add the following to `.env` (Magemaker will attempt to create/update the file when you run `--cloud azure`): + +```bash +AZURE_SUBSCRIPTION_ID="" +AZURE_RESOURCE_GROUP="ml-resources" +AZURE_WORKSPACE_NAME="ml-workspace" +AZURE_REGION="eastus" +``` + +Optionally add your Hugging Face token if deploying gated models: + +```bash +HUGGING_FACE_HUB_KEY="hf_…" +``` + +--- + +### 5. Verify Setup with Magemaker + +```bash +magemaker --cloud azure +``` + +The wizard checks credentials, writes the `.env`, and lists your AML workspace details. + +You’re ready to deploy: + +```bash +magemaker --deploy .magemaker_config/llama3-azure.yaml +``` diff --git a/configuration/Environment.mdx b/configuration/Environment.mdx index 0781ec3..fbde10c 100644 --- a/configuration/Environment.mdx +++ b/configuration/Environment.mdx @@ -2,29 +2,52 @@ title: Environment Variables --- -### Required Config File -A `.env` file is automatically created when you run `magemaker --cloud `. This file contains the necessary environment variables for your cloud provider(s). +## Overview -By default, Magemaker will look for a `.env` file in your project root with the following variables based on which cloud provider(s) you plan to use: +Magemaker uses a simple **`.env`** file (loaded via `python-dotenv`) to store cloud credentials and tool settings. The file is created/updated automatically when you run: ```bash -# AWS Configuration -AWS_ACCESS_KEY_ID="your-access-key" # Required for AWS -AWS_SECRET_ACCESS_KEY="your-secret-key" # Required for AWS -SAGEMAKER_ROLE="arn:aws:iam::..." # Required for AWS - -# GCP Configuration -PROJECT_ID="your-project-id" # Required for GCP -GCLOUD_REGION="us-central1" # Required for GCP - -# Azure Configuration -AZURE_SUBSCRIPTION_ID="your-sub-id" # Required for Azure -AZURE_RESOURCE_GROUP="ml-resources" # Required for Azure -AZURE_WORKSPACE_NAME="ml-workspace" # Required for Azure -AZURE_REGION="eastus" # Required for Azure - -# Optional configurations -HUGGING_FACE_HUB_KEY="your-hf-token" # Required for gated HF models like llama +magemaker --cloud ``` -Never commit your .env file to version control! +If you prefer manual setup, copy the template below and adjust per provider. + +--- + +### Template + +```bash +# ─── AWS ────────────────────────────────────────────────────── +AWS_ACCESS_KEY_ID="…" +AWS_SECRET_ACCESS_KEY="…" +AWS_REGION_NAME="us-east-1" # default region for deployments +SAGEMAKER_ROLE="arn:aws:iam::123456789012:role/magemaker-sagemaker-role" + +# ─── GCP ────────────────────────────────────────────────────── +PROJECT_ID="my-gcp-project" +GCLOUD_REGION="us-central1" +# Optional: path to a service-account JSON for Vertex AI queries +GOOGLE_APPLICATION_CREDENTIALS="/path/to/key.json" + +# ─── AZURE ──────────────────────────────────────────────────── +AZURE_SUBSCRIPTION_ID="" +AZURE_RESOURCE_GROUP="ml-resources" +AZURE_WORKSPACE_NAME="ml-workspace" +AZURE_REGION="eastus" + +# ─── GENERAL ───────────────────────────────────────────────── +HUGGING_FACE_HUB_KEY="hf_…" # required for gated models (e.g. Llama 3) +CONFIG_DIR=".magemaker_config" # override default config folder (optional) +``` + + + **Never** commit `.env` to version control. Add it to `.gitignore`. + + +--- + +## Tips + +1. Use **separate AWS accounts** or IAM roles for dev vs prod to keep things isolated. +2. Rotate credentials regularly and prefer short-lived tokens (e.g. AWS SSO). +3. When running Magemaker in CI/CD, inject env vars via your secrets manager instead of committing `.env`. 
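+
+For example, in GitHub Actions the AWS values from the template above can be injected from repository secrets (a sketch; the secret names and install step are assumptions):
+
+```yaml
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    env:
+      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+      AWS_REGION_NAME: us-east-1
+      SAGEMAKER_ROLE: ${{ secrets.SAGEMAKER_ROLE }}
+    steps:
+      - uses: actions/checkout@v4
+      - run: pip install magemaker   # assumes the package is installable in CI
+      - run: magemaker --deploy ml.yaml
+```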
diff --git a/configuration/GCP.mdx b/configuration/GCP.mdx
index c9cd369..1e4adda 100644
--- a/configuration/GCP.mdx
+++ b/configuration/GCP.mdx
@@ -1,38 +1,111 @@
 ---
-title: GCP
+title: GCP (Vertex AI)
+description: Configure Magemaker for Google Cloud Platform
 ---
+
+## Overview
+This page walks you through **all** the steps required to use Magemaker with **Google Cloud Vertex AI**:
+
+1. Creating / selecting a GCP project
+2. Installing & initializing the **gcloud** CLI
+3. Enabling **Vertex AI** and related APIs
+4. Setting the required **environment variables** (`PROJECT_ID`, `GCLOUD_REGION`, …)
+5. (Optional) Creating a dedicated **service-account** for CI/CD deployments
+
+
+Magemaker requires Python 3.11 or 3.12 (3.13 is not yet supported) and uses the official Vertex AI SDK under the hood.
+
+
+---
+
 
-
-Visit [Google Cloud Console](https://cloud.google.com/?hl=en) to create your account.
-
+
+  Visit the Google Cloud Console and sign up / sign in.
+
+
+
+  If this is your first time, the default project is “My First Project”. Use the project picker at the top of the console to create a new project if you prefer a clean slate.
+
+  ![Create GCP Project](../Images/google_new_project.png)
+
+
+
+  1. Follow the official installation guide for your OS.
+ 2. Initialize the SDK: + ```bash + gcloud init + ``` +
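Optionally, sanity-check the CLI before moving on; both are standard gcloud commands:

```bash
gcloud --version                  # confirms the CLI is on your PATH
gcloud config get-value project   # prints the project chosen during `gcloud init`
```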
- - Once you have created your account, create a new project. If this is your first time the default project is "My First Project". You can create a new project by clicking this button and then selecting "New Project". + + During gcloud init you will: + - Log in with your browser + - Pick the project you created earlier - ![Enter image alt description](../Images/google_new_project.png) + Verify the default credentials work: + ```bash + gcloud auth application-default login + ``` + - + + Open APIs & Services → Library and enable **Vertex AI API** for your project. + + ![Enable Vertex AI](../Images/QrB_Image_11.png) - -1. Follow the installation guide at [Google Cloud SDK Installation Documentation](https://cloud.google.com/sdk/docs/install-sdk) -2. Initialize the SDK by running: - ```bash - gcloud init - ``` - + + For training or custom container deployments you may also need Cloud Storage, Artifact Registry, and Cloud Build. + + -3. During initialization: - - Create login credentials when prompted - - Create a new project or select an existing one - To make sure the initialization worked, run: - ```bash - gcloud auth application-default login - ``` + + For automated pipelines: + ```bash + gcloud iam service-accounts create magemaker-sa \ + --description="CI/CD deployments" \ + --display-name="magemaker-sa" - -Navigate to the APIs & Services on the dashboard and enable the Vertex AI API for your project. + gcloud projects add-iam-policy-binding $PROJECT_ID \ + --member="serviceAccount:magemaker-sa@$PROJECT_ID.iam.gserviceaccount.com" \ + --role="roles/aiplatform.admin" + ``` + Download the JSON key and reference it via GOOGLE_APPLICATION_CREDENTIALS. + +
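To confirm the service account works end to end, here is a minimal sketch using the Vertex AI SDK (it assumes `google-cloud-aiplatform` is installed and that the environment variables from the next section are set):

```python
# Illustrative only: list Vertex AI endpoints using the service-account key.
import os

from google.cloud import aiplatform

aiplatform.init(
    project=os.environ["PROJECT_ID"],
    location=os.environ.get("GCLOUD_REGION", "us-central1"),
)
print(aiplatform.Endpoint.list())  # an empty list means auth works; nothing is deployed yet
```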
+ +--- -![Enter image alt description](../Images/QrB_Image_11.png) - +## Environment Variables (.env) +Magemaker loads cloud-specific settings from a **.env** file in your project root. Make sure the following exist for GCP deployments: + +```bash +# GCP Configuration +PROJECT_ID="your-project-id" # gcloud config get-value project +GCLOUD_REGION="us-central1" # default region for Vertex AI + +# (Optional) Path to a service-account JSON key +GOOGLE_APPLICATION_CREDENTIALS="/path/to/magemaker-sa.json" +``` + + +Never commit your .env file to version control! + + +--- + +## Verifying Your Setup +```bash +# 1. Configure Magemaker for GCP +magemaker --cloud gcp + +# 2. List existing Vertex AI endpoints +magemaker --list +``` +If you see an empty list instead of an error, your credentials and environment variables are correct. + +--- -
\ No newline at end of file
+## Next Steps
+• Learn how to deploy a model via YAML.
+• Explore the CLI reference for advanced flags.
+• Try the local API proxy to query endpoints through an OpenAI-compatible API.
diff --git a/getting_started.md b/getting_started.md
index 0bc86fa..b830be2 100644
--- a/getting_started.md
+++ b/getting_started.md
@@ -1,184 +1,88 @@
 # Getting Started with Magemaker
 
-Magemaker is a Python tool that simplifies the process of deploying an open source AI model to your own cloud.
+Magemaker is a Python toolkit that lets you **deploy, query, and fine-tune open-source AI models on your own cloud** (AWS SageMaker, GCP Vertex AI, or Azure ML) in minutes, not hours.
 
-Deploy from an interactive menu in the terminal or from a simple YAML file.
+• **Deploy** from an interactive TUI or a declarative YAML file.
+• **Query** directly from the CLI *or* through an OpenAI-compatible proxy server.
+• **Fine-tune** models with a single `--train` command.
 
-Instead of spending hours digging through documentation to figure out how to get AWS working, Magemaker lets you deploy Hugging Face models directly to AWS SageMaker, Google Cloud Vertex AI, or Azure Machine Learning, from the command line or a simple YAML file.
+---
 
-Choose a model from Hugging Face, and Magemaker will spin up an instance with a ready-to-query endpoint of the model in minutes.
+## Supported Providers
+✅ AWS SageMaker ✅ GCP Vertex AI ✅ Azure ML
 
-## Getting Started
+
+Python 3.11 or 3.12 is required (3.13 is not yet supported due to upstream Azure SDK issues).
+
 
-Magemaker works with the three major cloud providers AWS, Azure and GCP!
+---
 
-To get a local copy up and running follow these simple steps.
+## 1 – Prerequisites
 
-### Prerequisites
+1. Cloud account (AWS / GCP / Azure) with sufficient **instance quotas**.
+2. Corresponding cloud **CLI** installed (aws, gcloud, or az).
+3. (Optional) **Hugging Face token** for gated models like Llama 3.
 
-* Python 3.11 (3.13 is not supported because of azure)
-* Cloud Configuration
-    * An account to your preferred cloud provider, AWS, GCP and Azure.
-    * Each cloud requires slightly different accesses, Magemaker will guide you through getting the necessary credentials to the selected cloud provider
-    * Here's a guide on how to configure AWS and get the credentials [Google Doc](https://docs.google.com/document/d/1NvA6uZmppsYzaOdkcgNTRl7Nb4LbpP9Koc4H_t5xNSg/edit?tab=t.0#heading=h.farbxuv3zrzm)
-    * Quota approval for instances you require for the AI model
-        * By default, you get some free instances, example with AWS you are pre-approved for 2 ml.m5.xlarge instances with 16gb of RAM each
+---
 
-    * An installation and configuration of your selected cloud CLI tool(s)
-        * Magemaker will prompt you to install the CLI of the selected cloud provider, if not installed already.
-        * Magemaker will prompt you to add the necesssary credentials.
-
-* Certain Hugging Face models (e.g. Llama2) require an access token ([hf docs](https://huggingface.co/docs/hub/en/models-gated#access-gated-models-as-a-user))
-
-
-## Installation
-
-1. Install Magemaker using pip:
-
-    ```sh
-    pip install magemaker
-    ```
-
-2. Run Magemaker:
-
-    ```sh
-    magemaker --cloud [aws|gcp|azure|all]
-    ```
-
-    If this is your first time running this command, It will configure the selected cloud so you're ready to start deploying models.
-
-    In the case of AWS, it'll prompt you to enter your Access Key and Secret. You can also specify your AWS region. The default is us-east-1.
You only need to change this if your SageMaker instance quota is in a different region. - - Once configured, it will create a `.env` file and save the credentials there. You can also add your Hugging Face Hub Token to this file if you have one. - - ```sh - HUGGING_FACE_HUB_KEY="KeyValueHere" - ``` - - -
- -## Using Magemaker - -### Interactive deployment - -Run `magemaker --cloud [gcp|azure|aws|all]` to access an interactive menu where you can: +## 2 – Installation +```bash +pip install magemaker +``` -* Choose your cloud provider -* Select from available models -* Configure deployment settings -* Monitor deployment progress +--- -#### YAML-based Deployment -For reproducible deployments, use YAML configuration: +## 3 – Initial Configuration +Run the built-in wizard once per provider: +```bash +magemaker --cloud [aws|gcp|azure|all] ``` -magemaker --deploy .magemaker_config/bert-base-uncased.yaml -``` +The wizard: +• Validates your CLI credentials +• Writes a project-local **.env** with the required keys +• Confirms region and default settings -Following is a sample yaml file for deploying a model the same google bert model mentioned above to AWS: +--- -```yaml -deployment: !Deployment - destination: aws - # Endpoint name matches model_id for querying atm. - endpoint_name: test-bert-uncased - instance_count: 1 - instance_type: ml.m5.xlarge +## 4 – Typical Workflows -models: -- !Model - id: google-bert/bert-base-uncased - source: huggingface +### a) Interactive Deployment +```bash +magemaker --cloud aws # opens an interactive menu ``` +Pick a model, choose an instance type, and watch the progress bar 🚀. -Following is a yaml file for deploying a facebook model to GCP Vertex AI: -```yaml -deployment: !Deployment - destination: gcp - endpoint_name: test-endpoint-12 - accelerator_count: 1 - instance_type: g2-standard-12 - accelerator_type: NVIDIA_L4 - num_gpus: null - quantization: null - -models: -- !Model - id: facebook/opt-125m - location: null - predict: null - source: huggingface - task: null - version: null - -``` -For Azure ML: -```yaml -deployment: !Deployment - destination: azure - endpoint_name: facebook--opt-125m-202410251736 - instance_count: 1 - instance_type: Standard_DS3_v2 -models: -- !Model - id: facebook-opt-125m - location: null - predict: null - source: huggingface - task: text-generation - version: null +### b) YAML-based Deployment (CI-friendly) +```bash +magemaker --deploy .magemaker_config/bert.yaml ``` +See the Deployment and CLI Reference pages for all available options. -#### Fine-tuning a model using a yaml file - -You can also fine-tune a model using a yaml file, by using the `train` option in the command and passing path to the yaml file - -` +### c) Fine-tuning +```bash magemaker --train .magemaker_config/train-bert.yaml -` - -Here is an example yaml file for fine-tuning a hugging-face model: - -```yaml -training: !Training - destination: aws # or gcp, azure - instance_type: ml.p3.2xlarge # varies by cloud provider - instance_count: 1 - training_input_path: s3://your-bucket/data.csv - hyperparameters: !Hyperparameters - epochs: 3 - per_device_train_batch_size: 32 - learning_rate: 2e-5 - ``` +### d) OpenAI-Compatible Proxy +Spin up a local FastAPI server that makes your endpoints accessible via `/chat/completions`: +```bash +uvicorn server:app --reload +``` +Full details on the API Proxy page. -
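Once the proxy is running, any OpenAI-style client, or plain `curl`, can talk to it. A hypothetical request (the `model` value below is a placeholder for one of your deployed endpoint names):

```bash
curl http://localhost:8000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "my-llama-endpoint",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```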
-
- -If you’re using the `ml.m5.xlarge` instance type, here are some small Hugging Face models that work great: -
-
- -**Model: [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)** - -- **Type:** Fill Mask: tries to complete your sentence like Madlibs -- **Query format:** text string with `[MASK]` somewhere in it that you wish for the transformer to fill -- -
-
- -**Model: [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)** - -- **Type:** Feature extraction: turns text into a 384d vector embedding for semantic search / clustering -- **Query format:** "*type out a sentence like this one.*" - -
-
- +--- +## 5 – Cleaning Up +Endpoints keep running until you delete them: +```bash +magemaker --delete my-endpoint +``` +Remember to delete endpoints you no longer need to avoid cloud charges. -## Deactivating Models +--- -Any model endpoints you spin up will run continuously unless you deactivate them! Make sure to delete endpoints you’re no longer using so you don’t keep getting charged for your SageMaker instance. +## Next Steps +1. Dive into the Quick Start for hands-on examples. +2. Browse the full CLI reference. +3. Read about fine-tuning and supported models. diff --git a/installation.mdx b/installation.mdx index 1d843eb..41ff8e8 100644 --- a/installation.mdx +++ b/installation.mdx @@ -1,158 +1,82 @@ --- title: Installation -description: Configure Magemaker for your cloud provider +description: Install Magemaker and configure cloud credentials --- - - For Macs, maxOS >= 13.6.6 is required. Apply Silicon devices (M1) must use Rosetta terminal. You can verify, your terminals architecture by running `arch`. It should print `i386` for Rosetta terminal. +• macOS ≥ 13.6.6 is required. +• Apple Silicon users must run the terminal in **Rosetta** (`arch` should print i386). - -Install via pip: - -```sh +## 1 – Install the Package +```bash pip install magemaker ``` +--- -## Cloud Account Setup - -### AWS Configuration - -- Follow this detailed guide for setting up AWS credentials: - [AWS Setup Guide](/configuration/AWS) - -Once you have your AWS credentials, you can configure Magemaker by running: +## 2 – Configure Cloud Providers +Use the built-in wizard—no manual JSON editing required: ```bash +# AWS magemaker --cloud aws -``` - -It will prompt you for aws credentials and set up the necessary configurations. - - -### GCP (Vertex AI) Configuration -- Follow this detailed guide for setting up GCP credentials: - [GCP Setup Guide](/configuration/GCP) - - -once you have your GCP credentials, you can configure Magemaker by running: - -```bash +# GCP magemaker --cloud gcp -``` - -### Azure Configuration -- Follow this detailed guide for setting up Azure credentials: - [GCP Setup Guide](/configuration/Azure) - - -Once you have your Azure credentials, you can configure Magemaker by running: - -```bash +# Azure magemaker --cloud azure -``` - -### All three cloud providers - -If you have configured all three cloud providers, you can verify your configuration by running: - -```bash +# All at once (creates a consolidated .env) magemaker --cloud all ``` +The wizard will: +1. Verify your CLI installation (aws, gcloud, or az) +2. Prompt for any missing credentials / regions +3. Write a **.env** with the keys shown below -### Required Config File -By default, Magemaker will look for a `.env` file in your project root with the following variables based on which cloud provider(s) you plan to use: +--- +## 3 – Environment Variables (.env) +Magemaker automatically loads a .env from your project root: ```bash -# AWS Configuration -AWS_ACCESS_KEY_ID="your-access-key" # Required for AWS -AWS_SECRET_ACCESS_KEY="your-secret-key" # Required for AWS -SAGEMAKER_ROLE="arn:aws:iam::..." 
# Required for AWS - -# GCP Configuration -PROJECT_ID="your-project-id" # Required for GCP -GCLOUD_REGION="us-central1" # Required for GCP - -# Azure Configuration -AZURE_SUBSCRIPTION_ID="your-sub-id" # Required for Azure -AZURE_RESOURCE_GROUP="ml-resources" # Required for Azure -AZURE_WORKSPACE_NAME="ml-workspace" # Required for Azure -AZURE_REGION="eastus" # Required for Azure - -# Optional configurations -HUGGING_FACE_HUB_KEY="your-hf-token" # Required for gated HF models like llama +# AWS +AWS_ACCESS_KEY_ID="AKIA..." +AWS_SECRET_ACCESS_KEY="..." +SAGEMAKER_ROLE="arn:aws:iam::123456789012:role/MageMakerRole" +AWS_REGION_NAME="us-east-1" + +# GCP +PROJECT_ID="your-project-id" +GCLOUD_REGION="us-central1" +# Optional – service-account key for CI/CD +GOOGLE_APPLICATION_CREDENTIALS="/path/to/key.json" + +# Azure +AZURE_SUBSCRIPTION_ID="..." +AZURE_RESOURCE_GROUP="ml-resources" +AZURE_WORKSPACE_NAME="ml-workspace" +AZURE_REGION="eastus" + +# Hugging Face (optional but required for gated models) +HUGGING_FACE_HUB_KEY="hf_..." ``` -Never commit your .env file to version control! - - - For gated models like llama-3.1 from Meta, you might have to accept terms of use for model on hugging face and adding Hugging face token to the environment are necessary for deployment to go through. - - -{/* ## Verification +Never commit .env to version control. -To verify your configuration: +--- +## 4 – Verify Your Setup ```bash -magemaker verify -``` */} - -## Best Practices - -1. **Resource Management** - - Monitor quota limits - - Clean up unused resources - - Set up cost alerts - -2. **Environment Management** - - - Use separate configurations for dev/prod - - Regularly rotate access keys - - Use environment-specific roles - -3. **Security** - - - Follow principle of least privilege - - Use service accounts where possible - - Enable audit logging - - - -## Troubleshooting - -Common configuration issues: - -1. **AWS Issues** - - - Check IAM role permissions - - Verify SageMaker quota - - Confirm region settings - -2. **GCP Issues** +magemaker --list # should show endpoints or an empty list, not an error +``` - - Verify service account permissions - - Check Vertex AI API enablement - - Confirm project ID +--- -3. **Azure Issues** - - Check resource provider registration status: - ```bash - az provider show -n Microsoft.MachineLearningServices - az provider show -n Microsoft.ContainerRegistry - az provider show -n Microsoft.KeyVault - az provider show -n Microsoft.Storage - az provider show -n Microsoft.Insights - az provider show -n Microsoft.ContainerService - az provider show -n Microsoft.PolicyInsights - az provider show -n Microsoft.Cdn - ``` - - Verify workspace access - - Confirm subscription status - - Ensure all required providers are registered +## 5 – Next Steps +• Follow the Quick Start guide to deploy your first model. +• Explore the CLI reference for advanced flags like --dry-run and --wait. +• Check out the API Proxy page to expose endpoints via an OpenAI-compatible API. 
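If `magemaker --list` fails on AWS, it helps to rule out credential problems first. A minimal sketch, independent of Magemaker, that asks STS which identity your `.env` keys resolve to (assumes `boto3` and `python-dotenv` are installed):

```python
# Illustrative only: verify the AWS keys in .env belong to a real identity.
import boto3
from dotenv import load_dotenv

load_dotenv()  # exposes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY to boto3
identity = boto3.client("sts").get_caller_identity()
print(identity["Arn"])  # e.g. arn:aws:iam::123456789012:user/you
```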
diff --git a/mint.json b/mint.json index ccb1843..c71c3d7 100644 --- a/mint.json +++ b/mint.json @@ -38,9 +38,13 @@ "mode": "auto" }, "navigation": [ - { + { "group": "Getting Started", - "pages": ["about", "installation", "quick-start"] + "pages": [ + "about", + "installation", + "quick-start" + ] }, { "group": "Tutorials", @@ -63,6 +67,8 @@ "group": "Core Concepts", "pages": [ "concepts/deployment", + "concepts/cli-reference", + "concepts/api-proxy", "concepts/models", "concepts/contributing" ] @@ -77,17 +83,29 @@ { "title": "Documentation", "links": [ - { "label": "Getting Started", "url": "/" }, - { "label": "Contributing", "url": "/contributing" } + { + "label": "Getting Started", + "url": "/" + }, + { + "label": "Contributing", + "url": "/contributing" + } ] }, { "title": "Resources", "links": [ - { "label": "GitHub", "url": "https://github.com/slashml/magemaker" }, - { "label": "Support", "url": "mailto:support@slashml.com" } + { + "label": "GitHub", + "url": "https://github.com/slashml/magemaker" + }, + { + "label": "Support", + "url": "mailto:support@slashml.com" + } ] } ] } -} \ No newline at end of file +} diff --git a/quick-start.mdx b/quick-start.mdx index 5853ef8..de30061 100644 --- a/quick-start.mdx +++ b/quick-start.mdx @@ -3,184 +3,88 @@ title: Quick Start "og:title": "Magemaker" --- - Make sure you have followed the [installation](installation) steps before proceeding. - + +Make sure you have completed the installation & initial magemaker --cloud ... configuration before proceeding. + -## Interactive View +--- -1. Run Magemaker with your desired cloud provider: +## 1 – Interactive Mode -```sh +```bash magemaker --cloud [aws|gcp|azure|all] ``` -Supported providers: - -- `--cloud aws` AWS SageMaker deployment -- `--cloud gcp` Google Cloud Vertex AI deployment -- `--cloud azure` Azure Machine Learning deployment -- `--cloud all` Configure all three providers at the same time - - -### List Models - -From the dropdown, select `Show Acitve Models` to see the list of endpoints deployed. - -![Acitve Endpoints](../Images/active-1.png) - -### Delete Models - -From the dropdown, select `Delete a Model Endpoint` to see the list of models endpoints. Press space to select the endpoints you want to delete - -![Delete Endpoints](../Images/delete-1.png) - +The TUI lets you: +• Deploy new models +• List active endpoints +• Query or delete endpoints -### Querying Models - -From the dropdown, select `Query a Model Endpoint` to see the list of models endpoints. Press space to select the endpoints you want to query. Enter the query in the text box and press enter to get the response. 
- -![Query Endpoints](../Images/query-1.png) - - -### YAML-based Deployment (Recommended) +--- -For reproducible deployments, use YAML configuration: +## 2 – YAML-Based Deployment (Recommended) -```sh +```bash magemaker --deploy .magemaker_config/your-model.yaml ``` -Example YAML for AWS deployment: - +Example (AWS): ```yaml deployment: !Deployment - destination: aws - endpoint_name: facebook-opt-test + destination: aws + endpoint_name: bert-base-demo instance_count: 1 instance_type: ml.m5.xlarge - num_gpus: null - quantization: null -models: - - !Model - id: facebook/opt-125m - location: null - predict: null - source: huggingface - task: text-generation - version: null -``` - -For GCP Vertex AI: - -```yaml -deployment: !Deployment - destination: gcp - endpoint_name: facebook-opt-test - accelerator_count: 1 - instance_type: g2-standard-12 - accelerator_type: NVIDIA_L4 - num_gpus: null - quantization: null models: - !Model - id: facebook/opt-125m - location: null - predict: null + id: google-bert/bert-base-uncased source: huggingface - task: null - version: null ``` -For Azure ML: +Full schema & advanced flags are documented in the CLI reference. -```yaml -deployment: !Deployment - destination: azure - endpoint_name: facebook-opt-test - instance_count: 1 - instance_type: Standard_DS3_v2 -models: - - !Model - id: facebook--opt-125m - location: null - predict: null - source: huggingface - task: text-generation - version: null -``` - - The model ids for Azure are different from AWS and GCP. Make sure to use the one provided by Azure in the Azure Model Catalog. - - To find the relevant model id, follow the following steps - - - Find the workpsace in the Azure portal and click on the studio url provided. Click on the `Model Catalog` on the left side bar - ![Azure ML Creation](../Images/workspace-studio.png) - - - - Select Hugging-Face from the collections list. The id of the model card is the id you need to use in the yaml file - ![Azure ML Creation](../Images/hugging-face.png) - - - - +--- +## 3 – Fine-Tuning +```bash +magemaker --train .magemaker_config/train-bert.yaml +``` +See Fine-tuning for details. -### Model Fine-tuning +--- -Fine-tune models using the `train` command: +## 4 – OpenAI-Compatible Proxy +Run a local FastAPI server and plug Magemaker into any OpenAI SDK: -```sh -magemaker --train .magemaker_config/train-config.yaml +```bash +uvicorn server:app --reload # default http://localhost:8000 ``` -Example training configuration: +```python +from openai import OpenAI +client = OpenAI(base_url="http://localhost:8000", api_key="unused") -```yaml -training: !Training - destination: aws # or gcp, azure - instance_type: ml.p3.2xlarge # varies by cloud provider - instance_count: 1 - training_input_path: s3://your-bucket/data.csv - hyperparameters: !Hyperparameters - epochs: 3 - per_device_train_batch_size: 32 - learning_rate: 2e-5 +chat = client.chat.completions.create( + model="meta-llama/Meta-Llama-3-8B-Instruct", + messages=[{"role": "user", "content": "Hello!"}] +) +print(chat.choices[0].message.content) ``` -{/* -### Recommended Models - - - - Fill Mask: tries to complete your sentence like Madlibs. Query format: text - string with [MASK] somewhere in it. - - - - Feature extraction: turns text into a 384d vector embedding for semantic - search / clustering. Query format: "type out a sentence like this one." - - */} +More on the API Proxy page. - - Remember to deactivate unused endpoints to avoid unnecessary charges! 
-
-
-
-## Contact
-
-You can reach us, faizan & jneid, at [support@slashml.com](mailto:support@slashml.com).
+---
+## 5 – Clean Up & Cost Control
+```bash
+magemaker --delete my-endpoint
+```
+Endpoints incur charges while running, so delete what you don't use!
-If anything doesn't make sense or you have suggestions, do point them out at [magemaker.featurebase.app](https://magemaker.featurebase.app/).
+---
-We'd love to hear from you! We're excited to learn how we can make this more valuable for the community and welcome any and all feedback and suggestions.
+## Next Steps
+• Explore the full CLI reference.
+• Read the Deployment guide for best practices.
+• Need help? Join our Discord or open an issue on GitHub.