# Viam Torchvision Module


This is a [Viam module](https://docs.viam.com/extend/modular-resources/) providing a model of [vision service](https://docs.viam.com/services/vision/#api) for [TorchVision's New Multi-Weight Support API](https://pytorch.org/blog/introducing-torchvision-new-multi-weight-support-api/).
<p align="center">
<img src="https://pytorch.org/assets/images/torchvision_gif.gif" width=80%, height=70%>
</p>

<img src="https://pytorch.org/assets/images/torchvision_gif.gif" width=80%, height=70%>
</p>

For a given model architecture (e.g. *ResNet50*), multiple weights can be available. Each of those weights comes with preprocessing and label metadata.

## Getting started

First, [create a machine](https://docs.viam.com/how-tos/configure/) in Viam.

To use this module, follow these instructions to [add a module from the Viam Registry](https://docs.viam.com/modular-resources/configure/#add-a-module-from-the-viam-registry) and select the `viam:vision:torchvision` model from the [`torchvision` module](https://app.viam.com/module/viam/torchvision).

Navigate to the [**CONFIGURE** tab](https://docs.viam.com/configure/) of your [machine](https://docs.viam.com/fleet/machines/) in the [Viam app](https://app.viam.com/).

[Add vision / torchvision to your machine](https://docs.viam.com/configure/#components).

Depending on the type of model configured, the module implements:

- For detectors:
  - `GetDetections()`
  - `GetDetectionsFromCamera()`
- For classifiers:
  - `GetClassifications()`
  - `GetClassificationsFromCamera()`

> [!NOTE]
> See the [vision service API](https://docs.viam.com/services/vision/#api) for more details.
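For reference, here is a minimal sketch of calling these methods with the Viam Python SDK. The service name `my_torchvision` and camera name `cam` are illustrative; the address and API key placeholders come from your machine's **CONNECT** tab in the Viam app.

```python
# Minimal sketch: query a torchvision vision service with the Viam Python SDK.
import asyncio

from viam.robot.client import RobotClient
from viam.services.vision import VisionClient


async def main():
    # Connection details are found on your machine's CONNECT tab.
    machine = await RobotClient.at_address(
        "<MACHINE-ADDRESS>",
        RobotClient.Options.with_api_key(api_key="<API-KEY>", api_key_id="<API-KEY-ID>"),
    )
    vision = VisionClient.from_robot(machine, "my_torchvision")

    # For a classifier model: top-3 classifications from the camera stream.
    classifications = await vision.get_classifications_from_camera("cam", 3)

    # For a detector model: bounding boxes from the camera stream.
    detections = await vision.get_detections_from_camera("cam")

    print(classifications, detections)
    await machine.close()


asyncio.run(main())
```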
## viam:vision:torchvision

To configure the `torchvision` model, use the following template:

```json
{
  "model_name": <string>,
  "labels_confidences": {
    <label1>: <float>,
    <label2>: <float>
  },
  "default_minimum_confidence": <float>
}
```
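For example, a filled-in attributes block for a hypothetical *ResNet50* classifier (the threshold values are illustrative):

```json
{
  "model_name": "resnet50",
  "labels_confidences": {
    "grasshopper": 0.5,
    "cricket": 0.45
  },
  "default_minimum_confidence": 0.25
}
```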

### Attributes

The only **required attribute** to configure your torchvision vision service is a `model_name`:


| Name | Type | Inclusion | Default | Description |
| ------------ | ------ | ------------ | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model_name` | string | **Required** | | Vision model name as expected by the [get_model()](https://pytorch.org/vision/main/models.html#listing-and-retrieving-available-models) method from the torchvision multi-weight API. |
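To see which names `get_model()` accepts, you can list them directly with torchvision — a quick sketch using the multi-weight API:

```python
# Valid values for model_name are the names torchvision itself knows about.
from torchvision.models import get_model, list_models

print(list_models())  # includes e.g. "resnet50", "fasterrcnn_resnet50_fpn", ...

# The same lookup this attribute feeds into:
model = get_model("resnet50", weights="DEFAULT")
```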


### Optional attributes

| Name | Type | Inclusion | Default | Description |
| ---------------------------- | --------------------- | --------- | ----------- | ----------- |
| `weights` | string | Optional | `DEFAULT` | Weights name as expected by the [get_model()](https://pytorch.org/vision/main/models.html#listing-and-retrieving-available-models) method from the torchvision multi-weight API. |
| `default_minimum_confidence` | float | Optional | | Default minimum confidence for filtering all labels that are not specified in `labels_confidences`. |
| `labels_confidences` | dict[str, float] | Optional | | Dictionary of minimum confidence thresholds for specific labels, for example `{"grasshopper": 0.5, "cricket": 0.45}`. A label's threshold overrides `default_minimum_confidence` for that label, even if it is lower. If `labels_confidences` is left blank, no per-label filtering is applied. |
| `use_weight_transform` | bool | Optional | `True` | Loads the preprocessing transform from the weights metadata. |
| `input_size` | List[int] | Optional | `None` | Resizes the input image to the specified size. Overrides the resize from the weights metadata. |
| `mean_rgb` | [float, float, float] | Optional | `[0, 0, 0]` | Specifies the mean values for normalization, in RGB order. |
| `std_rgb` | [float, float, float] | Optional | `[1, 1, 1]` | Specifies the standard deviation values for normalization, in RGB order. |
| `swap_r_and_b` | bool | Optional | `False` | If `True`, swaps the R and B channels in the input image. Use this if the images passed to the model are in the OpenCV (BGR) format. |
| `channel_last` | bool | Optional | `False` | If `True`, the image tensor is converted to channel-last format. |
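As an illustration, a hypothetical detector configured with manual preprocessing instead of the weights transform (all values are made up; the mean/std shown are the common ImageNet statistics):

```json
{
  "model_name": "fasterrcnn_resnet50_fpn",
  "use_weight_transform": false,
  "input_size": [320, 320],
  "mean_rgb": [0.485, 0.456, 0.406],
  "std_rgb": [0.229, 0.224, 0.225],
  "swap_r_and_b": true
}
```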

### Preprocessing transforms behavior and order

- If the weights metadata includes a transform and `use_weight_transform` is `True`, that transform is added to the pipeline.
- If `input_size` is provided, the image is resized to the specified size using `v2.Resize()`.
- If both `mean_rgb` and `std_rgb` are provided, the image is normalized using `v2.Normalize()` with those values.
- If `swap_r_and_b` is `True`, the first and last channels are swapped.
- If `channel_last` is `True`, a transformation is applied to move the channel dimension last: (C, H, W) -> (H, W, C).


#### Full example configuration

The following JSON config file includes these resources:

- the TorchVision module
- a modular resource (the TorchVision vision service)
- a [webcam camera](https://docs.viam.com/components/camera/webcam/)
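A minimal sketch of such a config, assuming the standard Viam machine-config layout and illustrative resource names (`cam`, `my_torchvision`):

```json
{
  "components": [
    {
      "name": "cam",
      "namespace": "rdk",
      "type": "camera",
      "model": "webcam",
      "attributes": {
        "video_path": "video0"
      }
    }
  ],
  "services": [
    {
      "name": "my_torchvision",
      "namespace": "rdk",
      "type": "vision",
      "model": "viam:vision:torchvision",
      "attributes": {
        "model_name": "resnet50"
      }
    }
  ],
  "modules": [
    {
      "type": "registry",
      "name": "viam_torchvision",
      "module_id": "viam:torchvision",
      "version": "latest"
    }
  ]
}
```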


### Resources
- [Table of all available classification weights](https://pytorch.org/vision/main/models.html#table-of-all-available-classification-weights)
- [Quantized models](https://pytorch.org/vision/main/models.html#quantized-models)
**meta.json** (3 additions, 1 deletion):

```diff
@@ -6,7 +6,9 @@
   "models": [
     {
       "api": "rdk:service:vision",
-      "model": "viam:vision:torchvision"
+      "model": "viam:vision:torchvision",
+      "short_description": "Service wrapper for the torchvision computer vision library.",
+      "markdown_link": "README.md#viamvisiontorchvision"
     }
   ],
   "build": {
```