Commit

Progress.

squidarth committed Oct 9, 2023
1 parent 4929713 commit 1037f45
Showing 9 changed files with 261 additions and 64 deletions.
43 changes: 15 additions & 28 deletions bin/generate_truss_examples.py
@@ -10,14 +10,13 @@
```
"""
import enum
-import itertools
import json
import os
import shutil
import subprocess
import sys
from pathlib import Path
-from typing import Iterator, List, Optional, Tuple
+from typing import List, Optional, Tuple

import yaml

@@ -209,6 +208,14 @@ def _generate_truss_example(truss_directory: str):
description: "{doc_information["description"]}"
---
"""

+    path_in_examples_repo = "/".join(Path(truss_directory).parts[1:])
+    link_to_github = f"""
+<Card
+title="View on Github"
+icon="github" href="{TRUSS_EXAMPLES_REPO}/tree/main/{path_in_examples_repo}">
+</Card>
+"""
files_to_scrape = doc_information["files"]

full_content, code_blocks = zip(
@@ -222,7 +229,7 @@ def _generate_truss_example(truss_directory: str):
file_content = "\n".join(full_content) + _generate_request_example_block(
full_code_block
)
-    example_content = f"""{header}\n{file_content}"""
+    example_content = f"""{header}\n{link_to_github}\n{file_content}"""
path_to_example = Path(example_destination)
path_to_example.parent.mkdir(parents=True, exist_ok=True)

@@ -246,17 +253,6 @@ def _format_group_name(group_name: str) -> str:
return lowercase_name[0].upper() + lowercase_name[1:]


-def _toc_section(
-    example_group_name: str, example_group: Iterator[Tuple[str, ...]]
-) -> dict:
-    return {
-        "group": _format_group_name(example_group_name),
-        "pages": [
-            f"examples/{example[0]}/{example[1]}" for example in list(example_group)
-        ],
-    }


def update_toc(example_dirs: List[str]):
"""
Update the table of contents in the mint.json config file.
@@ -273,21 +269,12 @@ def update_toc(example_dirs: List[str]):

examples_section = [item for item in navigation if item["group"] == "Examples"][0]

-    # Group together by the parent directory. ie:
-    #
-    # * 3_llms/llm
-    # * 3_llms/llm-streaming
-    #
-    # will be grouped together with the key "3_llms". This allows us to have proper
-    # nesting in the table of contents.
-    grouped_examples = itertools.groupby(
-        sorted(transformed_example_paths, key=lambda example: example[0]),
-        key=lambda example: example[0],
-    )

+    # Sort examples by the group name
    examples_section["pages"] = [
-        _toc_section(example_group_name, example_group)
-        for example_group_name, example_group in grouped_examples
+        f"examples/{example_path[0]}/{example_path[1]}"
+        for example_path in sorted(
+            transformed_example_paths, key=lambda example: example[0]
+        )
    ]

serialized_mint_config = json.dumps(mint_config, indent=2)
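For illustration, here is roughly how the flattened `pages` list now behaves. The example paths below are made up, not taken from the repo:

```python
# Hypothetical (group_dir, example_name) tuples, as parsed from the examples repo.
transformed_example_paths = [
    ("3_llms", "llm-streaming"),
    ("1_introduction", "getting-started-bert"),
    ("3_llms", "llm"),
]

# The TOC is now a flat, sorted list of page slugs instead of nested groups.
pages = [
    f"examples/{example_path[0]}/{example_path[1]}"
    for example_path in sorted(transformed_example_paths, key=lambda example: example[0])
]
print(pages)
# ['examples/1_introduction/getting-started-bert',
#  'examples/3_llms/llm-streaming', 'examples/3_llms/llm']
```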
14 changes: 12 additions & 2 deletions docs/examples/1_introduction/getting-started-bert.mdx
@@ -3,6 +3,12 @@ title: "Getting Started: Text Classification"
description: "Building your first Truss"
---


+<Card
+title="View on Github"
+icon="github" href="https://github.com/basetenlabs/truss-examples-2/tree/main/1_introduction/getting-started-bert">
+</Card>

In this example, we go through building your first Truss model. We'll be using the HuggingFace transformers
library to build a text classification model that can detect the sentiment of text.

@@ -63,7 +69,9 @@ such as the name, and the Python version to build with.
```yaml config.yaml
model_name: bert
python_version: py310
-model_metadata: {}
+model_metadata:
+  example_model_input: { "text": "Hello my name is {MASK}" }


```
### Set up python requirements
@@ -135,7 +143,9 @@ class Model:
```yaml config.yaml
model_name: bert
python_version: py310
-model_metadata: {}
+model_metadata:
+  example_model_input: { "text": "Hello my name is {MASK}" }


requirements:
- torch==2.0.1
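Most of the model implementation is collapsed in this diff view. For context, the core of a transformers text-classification Truss looks roughly like this; it is a sketch, not the file's exact code:

```python
from typing import Dict

from transformers import pipeline


class Model:
    def __init__(self, **kwargs) -> None:
        self._model = None

    def load(self):
        # Build a sentiment-analysis pipeline; the default checkpoint is a
        # BERT-family model fine-tuned for binary sentiment.
        self._model = pipeline("text-classification")

    def predict(self, request: Dict):
        # Returns e.g. [{"label": "POSITIVE", "score": 0.99}]
        return self._model(request["text"])
```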
181 changes: 181 additions & 0 deletions docs/examples/2_image_classification/clip.mdx
@@ -0,0 +1,181 @@
---
title: "Image Classification with CLIP"
description: "Deploy a CLIP model to classify images"
---


<Card
title="View on Github"
icon="github" href="https://github.com/basetenlabs/truss-examples-2/tree/main/2_image_classification/clip">
</Card>

In this example, we create a Truss that uses [CLIP](https://openai.com/research/clip) to classify images
against a set of pre-defined labels. The input to this Truss is a downloadable image URL, and the output is a classification.

One of the major things to note about this example is that since the inputs are images, we need
some mechanism for downloading the image. To accomplish this, the user passes a downloadable URL to
the Truss, and the Truss code downloads the image. To do this efficiently, we make use of the
`preprocess` method in Truss.

# Set up imports and constants

For our CLIP Truss, we will be using the Hugging Face transformers library, as well as
`pillow` for image processing.

```python model/model.py
import requests
from typing import Dict
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

```
This is the CLIP model from Hugging Face that we will use for this example.

```python model/model.py
CHECKPOINT = "openai/clip-vit-base-patch32"

```
# Define the Truss

In the `load` method, we load in the pretrained CLIP model from the
Hugging Face checkpoint specified above.

```python model/model.py
class Model:
def __init__(self, **kwargs) -> None:
self._processor = None
self._model = None

def load(self):
"""
Loads the CLIP model and processor checkpoints.
"""
self._model = CLIPModel.from_pretrained(CHECKPOINT)
self._processor = CLIPProcessor.from_pretrained(CHECKPOINT)

```
In the `preprocess` method, we download the image from the URL and preprocess it.
This method is part of the Truss class and is designed for logic involving IO, like,
in this case, downloading an image.

It is called before the `predict` method in a separate thread, and is not subject to the same
concurrency limits as `predict`, so many `preprocess` calls can run in parallel.
This keeps the `predict` method from being unnecessarily blocked on IO-bound
tasks, and helps improve the throughput of the Truss. See our [guide to concurrency](../guides/concurrency)
for more info.

```python model/model.py
def preprocess(self, request: Dict) -> Dict:

image = Image.open(requests.get(request.pop("url"), stream=True).raw)
request["inputs"] = self._processor(
text=["a photo of a cat", "a photo of a dog"], # Define preset labels to use
images=image,
return_tensors="pt",
padding=True
)
return request

```
The `predict` method performs the actual inference, and outputs a probability associated
with each of the labels defined earlier.

```python model/model.py
def predict(self, request: Dict) -> Dict:
"""
This performs the actual classification. The predict method is subject to
the predict concurrency constraints.
"""
outputs = self._model(**request["inputs"])
logits_per_image = outputs.logits_per_image
return logits_per_image.softmax(dim=1).tolist()
```
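To make the output shape concrete, here is a small illustrative snippet pairing the returned probabilities with the preset labels; the numbers are made up:

```python
labels = ["a photo of a cat", "a photo of a dog"]
probabilities = [[0.97, 0.03]]  # hypothetical model output for a cat photo

for row in probabilities:
    # Pick the label with the highest probability.
    label, score = max(zip(labels, row), key=lambda pair: pair[1])
    print(f"Prediction: {label} ({score:.0%})")
# Prediction: a photo of a cat (97%)
```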

# Set up the config.yaml

The main section that needs to be filled out
to run CLIP is the `requirements` section, where we need
to include `transformers` for the model pipeline and `pillow`
for image processing.

```yaml config.yaml
model_name: clip-example
requirements:
- transformers==4.32.0
- pillow==10.0.0
- torch==2.0.1
model_metadata:
example_model_input: {"url": "https://images.pexels.com/photos/1170986/pexels-photo-1170986.jpeg?auto=compress&cs=tinysrgb&w=1600"}
resources:
cpu: "3"
memory: 14Gi
use_gpu: true
accelerator: A10G
```
# Deploy the model
Deploy the CLIP model like you would other Trusses, with:
```bash
$ truss push
```
You can then invoke the model with:
```bash
$ truss predict -d '{"url": "https://source.unsplash.com/gKXKBY-C-Dk/300x300"}' --published
```

<RequestExample>
```python model/model.py
import requests
from typing import Dict
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

CHECKPOINT = "openai/clip-vit-base-patch32"

class Model:
def __init__(self, **kwargs) -> None:
self._processor = None
self._model = None

def load(self):
"""
Loads the CLIP model and processor checkpoints.
"""
self._model = CLIPModel.from_pretrained(CHECKPOINT)
self._processor = CLIPProcessor.from_pretrained(CHECKPOINT)

def preprocess(self, request: Dict) -> Dict:

image = Image.open(requests.get(request.pop("url"), stream=True).raw)
request["inputs"] = self._processor(
text=["a photo of a cat", "a photo of a dog"], # Define preset labels to use
images=image,
return_tensors="pt",
padding=True
)
return request

def predict(self, request: Dict) -> Dict:
"""
This performs the actual classification. The predict method is subject to
the predict concurrency constraints.
"""
outputs = self._model(**request["inputs"])
logits_per_image = outputs.logits_per_image
return logits_per_image.softmax(dim=1).tolist()
```
```yaml config.yaml
model_name: clip-example
requirements:
- transformers==4.32.0
- pillow==10.0.0
- torch==2.0.1
model_metadata:
example_model_input: {"url": "https://images.pexels.com/photos/1170986/pexels-photo-1170986.jpeg?auto=compress&cs=tinysrgb&w=1600"}
resources:
cpu: "3"
memory: 14Gi
use_gpu: true
accelerator: A10G
```
</RequestExample>
14 changes: 12 additions & 2 deletions docs/examples/3_LLMs/llm-with-streaming.mdx
@@ -3,6 +3,12 @@ title: "LLM with Streaming"
description: "Building an LLM with streaming output"
---


+<Card
+title="View on Github"
+icon="github" href="https://github.com/basetenlabs/truss-examples-2/tree/main/3_LLMs/llm-with-streaming">
+</Card>

In this example, we go through a Truss that serves an LLM, and streams the output to the client.

# Why Streaming?
@@ -143,11 +149,13 @@ and a few other related libraries.

```yaml config.yaml
model_name: "LLM with Streaming"
+model_metadata:
+  example_model_input: {"prompt": "what is the meaning of life"}
requirements:
- torch==2.0.1
- peft==0.4.0
- scipy==1.11.1
-- sentencepiece==1.11.1
+- sentencepiece==0.1.99
- accelerate==0.21.0
- bitsandbytes==0.41.1
- einops==0.6.1
@@ -235,11 +243,13 @@ class Model:
```
```yaml config.yaml
model_name: "LLM with Streaming"
+model_metadata:
+  example_model_input: {"prompt": "what is the meaning of life"}
requirements:
- torch==2.0.1
- peft==0.4.0
- scipy==1.11.1
-- sentencepiece==1.11.1
+- sentencepiece==0.1.99
- accelerate==0.21.0
- bitsandbytes==0.41.1
- einops==0.6.1
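Much of this doc is collapsed in the diff view. For context, the streaming pattern such a Truss typically implements looks roughly like the sketch below; the checkpoint, prompt key, and generation arguments are assumptions for illustration, not the file's actual code:

```python
from threading import Thread
from typing import Dict

from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

CHECKPOINT = "tiiuae/falcon-7b"  # placeholder checkpoint


class Model:
    def load(self):
        self._tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
        self._model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)

    def predict(self, request: Dict):
        inputs = self._tokenizer(request["prompt"], return_tensors="pt")
        # The streamer yields decoded text as generate() produces tokens.
        streamer = TextIteratorStreamer(self._tokenizer, skip_prompt=True)
        generation_kwargs = dict(**inputs, streamer=streamer, max_new_tokens=256)

        # Run generation on a background thread so we can start yielding
        # tokens to the client immediately.
        Thread(target=self._model.generate, kwargs=generation_kwargs).start()

        def stream():
            for text in streamer:
                yield text

        return stream()
```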
16 changes: 12 additions & 4 deletions docs/examples/6_high_performance/tgi.mdx
@@ -3,6 +3,12 @@ title: "High Performance LLMs with TGI"
description: "Deploy a language model with TGI"
---


+<Card
+title="View on Github"
+icon="github" href="https://github.com/basetenlabs/truss-examples-2/tree/main/6_high_performance/tgi">
+</Card>

[TGI](https://github.com/huggingface/text-generation-inference/tree/main) is a model server optimized for
language models. In this example, we put together a Truss that serves the model Falcon 7B using TGI.

@@ -24,7 +30,7 @@ The endpoint argument has two options:
Select the model that you'd like to use with TGI:
```yaml config.yaml
-model: tiiuae/falcon-7b
+model_id: tiiuae/falcon-7b
```
The `model_server` parameter allows you to specify a supported backend (in this example, TGI).

@@ -45,7 +51,8 @@ The remaining config options listed are standard Truss Config options.
```yaml config.yaml
environment_variables: {}
external_package_dirs: []
-model_metadata: {}
+model_metadata:
+  example_model_input: {"inputs": "what is the meaning of life"}
model_name: Falcon-TGI
python_version: py39
requirements: []
@@ -73,13 +80,14 @@ $ truss predict -d '{"inputs": "What is a large language model?", "parameters":
build:
arguments:
endpoint: generate_stream
-model: tiiuae/falcon-7b
+model_id: tiiuae/falcon-7b
model_server: TGI
runtime:
predict_concurrency: 128
environment_variables: {}
external_package_dirs: []
-model_metadata: {}
+model_metadata:
+  example_model_input: {"inputs": "what is the meaning of life"}
model_name: Falcon-TGI
python_version: py39
requirements: []
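Likewise, most of this doc is collapsed above. For reference, a minimal Python client for the `generate_stream` endpoint might look like the sketch below, assuming TGI's standard server-sent-events response format; the URL is a placeholder, not a real deployment:

```python
import json

import requests

MODEL_URL = "http://localhost:8080/generate_stream"  # placeholder endpoint

payload = {
    "inputs": "What is a large language model?",
    "parameters": {"max_new_tokens": 64},
}

with requests.post(MODEL_URL, json=payload, stream=True) as response:
    for line in response.iter_lines():
        # TGI streams server-sent events: lines of the form `data:{...}`.
        if not line or not line.startswith(b"data:"):
            continue
        event = json.loads(line[len(b"data:"):])
        print(event["token"]["text"], end="", flush=True)
print()
```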