VISTA-3D: Substantial Variability in Inference Latency #688

drbeh · 2024-10-02T13:40:29Z

Describe the bug

We tried to benchmark the inference latency of VISTA-3D, but the results vary significantly from run to run. Even when we execute multiple trials and report the median, the median itself is unstable: with just five samples, it can vary by more than 100%.
A third-party benchmark study also found that the inference latency distribution is not Gaussian and is occasionally bimodal, with two distinct peaks.
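
To illustrate why a five-sample median can swing this much when the distribution is bimodal, here is a small stdlib-only simulation. The mode centers (1.0 s and 2.2 s), the jitter, and the 40% share of slow runs are made-up values for illustration, not measurements from VISTA-3D.

```python
import random
import statistics


def sample_latency(rng, fast=1.0, slow=2.2, p_slow=0.4, jitter=0.05):
    """Draw one synthetic latency (seconds) from a two-mode mixture.

    All defaults are illustrative assumptions, not measured values.
    """
    center = slow if rng.random() < p_slow else fast
    return rng.gauss(center, jitter)


def median_spread(n_samples, n_repeats=200, seed=0):
    """Repeat an n_samples benchmark n_repeats times.

    Returns the smallest and largest median seen across the repeats.
    """
    rng = random.Random(seed)
    medians = [
        statistics.median(sample_latency(rng) for _ in range(n_samples))
        for _ in range(n_repeats)
    ]
    return min(medians), max(medians)


if __name__ == "__main__":
    for n in (5, 50, 500):
        lo, hi = median_spread(n)
        print(f"n={n:4d}: median ranges from {lo:.2f}s to {hi:.2f}s")
```

With few samples, the median lands in whichever mode happens to hold the majority of that run's samples, so it jumps between the two peaks. Reporting the full distribution or several percentiles may be more informative than a single median in that situation.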

Here are the test cases that we used:

test_case:
  case1:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/256cubic.nii.gz
    prompts: 
      classes: []
    size: [256, 256, 256]

  case2:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/256cubic.nii.gz
    prompts: 
      classes:
      - spleen
    size: [256, 256, 256]

  case3:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512cubic.nii.gz
    prompts: 
      classes: []
    size: [512, 512, 512]

  case4:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512cubic.nii.gz
    prompts: 
      classes:
      - liver
    size: [512, 512, 512]

  case5:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512-768.nii.gz
    prompts: 
      classes: []
    size: [512, 512, 768]

  case6:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512-768.nii.gz
    prompts: 
      classes:
      - heart
    size: [512, 512, 768]
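
For reference, a minimal sketch of a timing harness that could be applied to each case above. It is not the script we actually used. `benchmark` and `summarize` are hypothetical helper names, and the warm-up and run counts are arbitrary. For GPU inference, passing `torch.cuda.synchronize` as `sync` keeps asynchronous kernel launches from being timed incorrectly.

```python
import statistics
import time


def benchmark(fn, n_warmup=3, n_runs=20, sync=None):
    """Call fn() n_warmup times untimed, then time n_runs calls.

    sync is an optional callable that blocks until queued asynchronous
    work has finished (for example torch.cuda.synchronize).
    Returns a list of latencies in seconds.
    """
    def call():
        fn()
        if sync is not None:
            sync()

    for _ in range(n_warmup):
        call()

    latencies = []
    for _ in range(n_runs):
        start = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return min, median, max and sample standard deviation."""
    ordered = sorted(latencies)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Stand-in workload; replace with the real inference call, e.g.
    # benchmark(lambda: run_inference(case), sync=torch.cuda.synchronize)
    # where run_inference is a hypothetical wrapper around the model.
    stats = summarize(benchmark(lambda: sum(range(100_000))))
    print({k: f"{v * 1e3:.3f} ms" for k, v in stats.items()})
```

Reporting min, max and standard deviation alongside the median makes run-to-run variation visible instead of hiding it behind a single number.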

Environment

The baseline and the benchmark are run on different machines, but in the same container.

================================
Printing MONAI config...
================================
MONAI version: 1.4.0rc9
Numpy version: 1.24.4
Pytorch version: 2.5.0a0+872d972e41.nv24.08.01
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: fa1c1af79ef5387434f2a76744f75b5aaca09f0b
MONAI __file__: /usr/local/lib/python3.10/dist-packages/monai/__init__.py

Optional dependencies:
Pytorch Ignite version: 0.4.11
ITK version: 5.4.0
Nibabel version: 5.2.1
scikit-image version: 0.23.2
scipy version: 1.13.1
Pillow version: 10.4.0
Tensorboard version: 2.17.0
gdown version: 5.2.0
TorchVision version: 0.20.0a0
tqdm version: 4.66.4
lmdb version: 1.5.1
psutil version: 5.9.8
pandas version: 2.2.2
einops version: 0.7.0
transformers version: 4.40.2
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: 1.0.0
clearml version: 1.16.3

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies


================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 22.04.5 LTS
Platform: Linux-6.8.0-41-generic-x86_64-with-glibc2.35
Processor: x86_64
Machine: x86_64
Python version: 3.10.12
Process name: pt_main_thread
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: []
Num physical CPUs: 8
Num logical CPUs: 16
Num usable CPUs: 16
CPU usage (%): [4.2, 2.8, 2.5, 2.1, 2.5, 2.6, 2.1, 17.5, 2.6, 2.6, 2.1, 2.0, 2.1, 1.8, 78.3, 7.2]
CPU freq. (MHz): 1839
Load avg. in last 1, 5, 15 mins (%): [3.5, 1.8, 0.6]
Disk usage (%): 38.2
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 125.7
Available memory (GB): 121.3
Used memory (GB): 2.9

================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 12.6
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: 1
cuDNN version: 90400
Current device: 0
Library compiled for CUDA architectures: ['sm_70', 'sm_72', 'sm_75', 'sm_80', 'sm_86', 'sm_87', 'sm_90', 'compute_90']
GPU 0 Name: NVIDIA A40
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 84
GPU 0 Total memory (GB): 44.4
GPU 0 CUDA capability (maj.min): 8.6

yiheng-wang-nv · 2024-10-11T05:54:23Z

Closing as discussed offline; we did not detect unstable inference speed.

Nic-Ma assigned yiheng-wang-nv and KumoLiu Oct 2, 2024

yiheng-wang-nv closed this as completed Oct 11, 2024
