lvhg/torchvision

 
 

Viam Torchvision Module

This Viam module provides a vision service model built on TorchVision's multi-weight support API.

For a given model architecture (e.g. ResNet50), multiple weights can be available. Each of those weights comes with preprocessing and label metadata.

Getting started

First, create a machine in Viam.

To use this module, follow these instructions to add a module from the Viam Registry and select the viam:vision:torchvision model from the torchvision module.

Navigate to the CONFIGURE tab of your machine in the Viam app.

Add vision / torchvision to your machine.

Depending on the type of model configured, the module implements:

  • For detectors:

    • GetDetections()
    • GetDetectionsFromCamera()
  • For classifiers:

    • GetClassifications()
    • GetClassificationsFromCamera()

viam:vision:torchvision

To configure the torchvision model, use the following template:

{
  "model_name": <string>,
  "labels_confidences": {
    <label1>: <float>,
    <label2>: <float>
  },
  "default_minimum_confidence": <float>
}
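One way to fill in the template is to build the attributes as a Python dictionary and serialize it to JSON. This is only a sketch: the values are borrowed from the full example configuration in this README, and the variable names are illustrative.

```python
import json

# Attribute values taken from the full example configuration in this README.
attributes = {
    "model_name": "fasterrcnn_mobilenet_v3_large_320_fpn",
    "labels_confidences": {
        "grasshopper": 0.5,
        "cricket": 0.45,
    },
    "default_minimum_confidence": 0.3,
}

# Serialize to the JSON form that goes into the service's "attributes" field.
attributes_json = json.dumps(attributes, indent=2)
print(attributes_json)
```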

Attributes

The only required attribute to configure your torchvision vision service is a model_name:

| Name | Type | Inclusion | Default | Description |
| --- | --- | --- | --- | --- |
| model_name | string | Required | | Vision model name as expected by the method get_model() from the torchvision multi-weight API. |

Optional attributes

| Name | Type | Inclusion | Default | Description |
| --- | --- | --- | --- | --- |
| weights | string | Optional | DEFAULT | Weights name as expected by the method get_model() from the torchvision multi-weight API. |
| default_minimum_confidence | float | Optional | | Default minimum confidence for filtering all labels that are not specified in labels_confidences. |
| labels_confidences | dict[str, float] | Optional | | Dictionary specifying minimum confidence thresholds for specific labels. Example: {"grasshopper": 0.5, "cricket": 0.45}. A threshold set here overrides default_minimum_confidence for that label, even if it is lower than the default. If labels_confidences is left blank, no filtering on labels is applied. |
| use_weight_transform | bool | Optional | True | Loads the preprocessing transform from the weights metadata. |
| input_size | List[int] | Optional | None | Resizes the image. Overrides the resize from the weights metadata. |
| mean_rgb | [float, float, float] | Optional | [0, 0, 0] | Specifies the mean values for normalization in RGB order. |
| std_rgb | [float, float, float] | Optional | [1, 1, 1] | Specifies the standard deviation values for normalization in RGB order. |
| swap_r_and_b | bool | Optional | False | If True, swaps the R and B channels in the input image. Use this if the images passed as inputs to the model are in the OpenCV format. |
| channel_last | bool | Optional | False | If True, the image tensor is converted to channel-last format. |
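The interaction between labels_confidences and default_minimum_confidence can be sketched as below. This is one plausible reading of the attribute descriptions, not the module's actual code: the function name and the (label, confidence) tuple format are hypothetical, and the sketch assumes a per-label threshold takes precedence while every other label falls back to the default.

```python
def filter_by_confidence(predictions, labels_confidences=None,
                         default_minimum_confidence=0.0):
    """Keep predictions whose confidence meets their label's threshold.

    Assumption: a label listed in labels_confidences uses its own threshold;
    any other label uses default_minimum_confidence.
    """
    labels_confidences = labels_confidences or {}
    kept = []
    for label, confidence in predictions:
        threshold = labels_confidences.get(label, default_minimum_confidence)
        if confidence >= threshold:
            kept.append((label, confidence))
    return kept


# Hypothetical model outputs as (label, confidence) pairs.
predictions = [
    ("grasshopper", 0.48),  # below its per-label threshold of 0.5
    ("cricket", 0.47),      # above its per-label threshold of 0.45
    ("ant", 0.35),          # no per-label threshold, above the default 0.3
    ("bee", 0.10),          # no per-label threshold, below the default 0.3
]

kept = filter_by_confidence(
    predictions,
    labels_confidences={"grasshopper": 0.5, "cricket": 0.45},
    default_minimum_confidence=0.3,
)
print(kept)  # [('cricket', 0.47), ('ant', 0.35)]
```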

Preprocessing transforms behavior and order:

  • If there is a transform in the metadata of the weights and use_weight_transform is True, the weights transform is added to the pipeline.
  • If input_size is provided, the image is resized to the specified size using v2.Resize().
  • If both mean_rgb and std_rgb are provided, the image is normalized using v2.Normalize() with the specified mean and standard deviation values.
  • If swap_r_and_b is True, the first and last channels are swapped.
  • If channel_last is True, the tensor is converted to channel-last format: (C, H, W) -> (H, W, C).
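To make the effect of the last three steps concrete, the sketch below applies them in the order listed above to a tiny image stored as nested Python lists. It deliberately avoids torch so it runs anywhere; the module itself uses v2.Normalize() and tensor operations, and the helper names here are hypothetical. Resizing and the weights transform are omitted.

```python
def normalize(image, mean, std):
    """Apply (x - mean) / std per channel to a (C, H, W) nested list."""
    return [
        [[(px - m) / s for px in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]


def swap_first_and_last_channel(image):
    """Swap the first and last channels, e.g. RGB <-> BGR."""
    return [image[-1], *image[1:-1], image[0]]


def to_channel_last(image):
    """Convert a (C, H, W) nested list to (H, W, C)."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[c][y][x] for c in range(channels)] for x in range(width)]
        for y in range(height)
    ]


def preprocess(image, mean_rgb=(0, 0, 0), std_rgb=(1, 1, 1),
               swap_r_and_b=False, channel_last=False):
    """Normalize, then optionally swap channels, then optionally move C last."""
    image = normalize(image, mean_rgb, std_rgb)
    if swap_r_and_b:
        image = swap_first_and_last_channel(image)
    if channel_last:
        image = to_channel_last(image)
    return image


# A 3-channel image with 1 row and 2 columns, in (C, H, W) layout.
image = [[[10, 20]], [[30, 40]], [[50, 60]]]
result = preprocess(image, swap_r_and_b=True, channel_last=True)
print(result)  # [[[50.0, 30.0, 10.0], [60.0, 40.0, 20.0]]]
```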

Full example configuration

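The following JSON configuration defines a local torchvision module, a vision service that uses it, a webcam, and a transform camera that runs the detector on the webcam feed: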

{
  "modules": [
    {
      "executable_path": "/path/to/run.sh",
      "name": "mytorchvisionmodule",
      "type": "local"
    }
  ],
  "services": [
    {
      "attributes": {
        "model_name": "fasterrcnn_mobilenet_v3_large_320_fpn",
        "labels_confidences": {
          "grasshopper": 0.5,
          "cricket": 0.45
        },
        "default_minimum_confidence": 0.3
      },
      "name": "detector-module",
      "type": "vision",
      "namespace": "rdk",
      "model": "viam:vision:torchvision"
    }
  ],
  "components": [
    {
      "namespace": "rdk",
      "attributes": {
        "video_path": "video0"
      },
      "depends_on": [],
      "name": "cam",
      "model": "webcam",
      "type": "camera"
    },
    {
      "model": "transform",
      "type": "camera",
      "namespace": "rdk",
      "attributes": {
        "source": "cam",
        "pipeline": [
          {
            "attributes": {
              "detector_name": "detector-module",
              "confidence_threshold": 0.5
            },
            "type": "detections"
          }
        ]
      },
      "depends_on": [],
      "name": "detections"
    }
  ]
}
