@farazshaikh farazshaikh commented Dec 25, 2025

Problem

Video segmentation with the SAM3 video model causes RAM Out-Of-Memory errors on longer videos (800+ frames) because:

  • init_state() loads ALL video frames into RAM at once via load_resource_as_video_frames
  • For 800 frames at 1024x1024, this pre-allocates ~5GB+ of RAM in a single tensor
  • RAM usage grows linearly with frame count, with no upper bound

Solution: Mega-Batch Processing

Instead of loading all frames at once, process the video in mega-batches of 200 frames:

Total: 800 frames
Mega-batch 1: frames 0-199   -> init_state loads 200 frames -> process -> cleanup
Mega-batch 2: frames 200-399 -> init_state loads 200 frames -> process -> cleanup
Mega-batch 3: frames 400-599 -> init_state loads 200 frames -> process -> cleanup
Mega-batch 4: frames 600-799 -> init_state loads 200 frames -> process -> cleanup
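
The split above can be sketched as a small range generator. This is a minimal illustration, not the code in nodes.py: `iter_mega_batches` is a hypothetical helper name, and only `MEGA_BATCH_SIZE = 200` comes from this PR.

```python
from typing import Iterator, Tuple

MEGA_BATCH_SIZE = 200  # value stated in this PR


def iter_mega_batches(
    total_frames: int, batch_size: int = MEGA_BATCH_SIZE
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) global frame ranges; end is exclusive.

    Hypothetical helper for illustration only.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total_frames, batch_size):
        # The last batch may be shorter than batch_size.
        yield start, min(start + batch_size, total_frames)


if __name__ == "__main__":
    for number, (start, end) in enumerate(iter_mega_batches(800), 1):
        print(f"Mega-batch {number}: frames {start}-{end - 1}")
```

A video whose length is not a multiple of 200 simply gets a shorter final batch.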

Key Features

  1. Bounded RAM usage: Peak RAM for frame loading drops from O(total_frames) to O(MEGA_BATCH_SIZE), about ~1.2GB per 200-frame batch instead of ~5GB for 800 frames
  2. Tracking continuity: The first batch uses the text prompt; each subsequent batch uses box prompts from the previous batch's last frame to re-seed tracking
  3. Per-batch cleanup: After each mega-batch, the inference state is reset, the temp directory is deleted, and gc.collect() is called
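
One way to obtain box prompts for re-seeding is to take the tight bounding box of each object's mask on the last frame of a batch. The sketch below assumes masks are available as 2D grids of 0/1 values. Whether the PR derives boxes from masks or reads them from the model output is not stated here, so `mask_to_box` and `seed_boxes` are hypothetical names.

```python
from typing import List, Optional, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates (assumed convention)
Box = Tuple[int, int, int, int]


def mask_to_box(mask: Sequence[Sequence[int]]) -> Optional[Box]:
    """Return the tight bounding box of the nonzero pixels, or None if empty."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None  # object not visible on this frame
    return (min(xs), min(ys), max(xs), max(ys))


def seed_boxes(last_frame_masks: Sequence[Sequence[Sequence[int]]]) -> List[Box]:
    """Collect one box per object that is still visible on the last frame."""
    boxes = []
    for mask in last_frame_masks:
        box = mask_to_box(mask)
        if box is not None:
            boxes.append(box)
    return boxes
```

Objects with an empty mask on the last frame produce no box, so they would not be re-seeded in the next batch under this scheme.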

Memory Profile

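| Frames | Before (RAM) | After (RAM) | Reduction |
|--------|--------------|-------------|-----------|
| 200    | ~1.2GB       | ~1.2GB      | 1x        |
| 400    | ~2.4GB       | ~1.2GB      | 2x        |
| 800    | ~5GB         | ~1.2GB      | 4x        |
| 1600   | ~10GB (OOM)  | ~1.2GB      | 8x        |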
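
The figures above can be roughly reproduced with back-of-the-envelope arithmetic. The sketch assumes 3 channels and 2 bytes per value (for example float16). The storage dtype is not stated in this PR, so it is an assumption chosen only because it lands close to the reported numbers.

```python
def frames_ram_bytes(
    num_frames: int,
    height: int = 1024,
    width: int = 1024,
    channels: int = 3,        # assumption: RGB frames
    bytes_per_value: int = 2,  # assumption: 2-byte elements, not confirmed by the PR
) -> int:
    """Estimate the size of one dense tensor holding all frames."""
    return num_frames * height * width * channels * bytes_per_value


if __name__ == "__main__":
    mega_batch_size = 200
    for total in (200, 400, 800, 1600):
        before = frames_ram_bytes(total)
        after = frames_ram_bytes(min(total, mega_batch_size))
        print(
            f"{total:>5} frames: before ~{before / 1e9:.1f}GB, "
            f"after ~{after / 1e9:.1f}GB, reduction {before / after:.0f}x"
        )
```

The estimate covers only the frame tensor, so real peak usage will be somewhat higher.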

Technical Details

  • MEGA_BATCH_SIZE = 200: Number of frames per SAM3 initialization
  • FRAME_BATCH_SIZE = 32: Batch size for processing outputs (unchanged)
  • Frame indices are remapped: global -> local for SAM3, local -> global for results
  • last_frame_boxes/last_frame_labels passed between mega-batches for tracking
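
The index remapping and the hand-off between mega-batches can be sketched as an outer loop. `run_mega_batches` and the `process_batch` callback are hypothetical stand-ins for the SAM3 session logic. Only the idea of remapping local to global indices and passing `last_frame_boxes` forward comes from this PR, and `last_frame_labels` is omitted for brevity.

```python
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]

# Hypothetical callback: receives the number of frames in the batch and the
# seed boxes (None for the first batch, which would use the text prompt).
# Returns results keyed by LOCAL frame index plus the boxes on the last frame.
BatchFn = Callable[[int, Optional[List[Box]]], Tuple[Dict[int, str], List[Box]]]


def run_mega_batches(
    total_frames: int, process_batch: BatchFn, batch_size: int = 200
) -> Dict[int, str]:
    results: Dict[int, str] = {}
    last_frame_boxes: Optional[List[Box]] = None
    for start in range(0, total_frames, batch_size):
        end = min(start + batch_size, total_frames)
        # The session only ever sees local indices 0..(end - start - 1).
        local_results, last_frame_boxes = process_batch(end - start, last_frame_boxes)
        for local_idx, value in local_results.items():
            # local -> global: shift by the batch's starting frame.
            results[start + local_idx] = value
    return results
```

Keeping the remapping in one place means the per-batch code never needs to know where in the full video it is running.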

Trade-offs

  • Object IDs may change between mega-batches since each batch is a new SAM3 session
  • Box-prompt seeding maintains spatial tracking but not ID continuity
  • For strict ID continuity, users should keep each video segment short enough to fit in a single mega-batch (200 frames or fewer)

Testing

Tested with 800-frame video tracking 3 objects:

  • Before: OOM crash during init_state
  • After: Completes successfully with stable ~1.2GB RAM usage

Changes

  • src/comfyui_sam3/nodes.py: Refactored _segment_with_video_model method (229 insertions, 229 deletions)

…sing

Problem:
- SAM3 init_state() loads ALL video frames into RAM at once
- For 800 frames at 1024x1024, this consumes ~5GB+ of RAM
- Previous fix only optimized our wrapper, not SAM3's internal frame loading

Solution:
Implement mega-batch processing that reduces peak RAM from O(total_frames) to O(MEGA_BATCH_SIZE):

1. MEGA_BATCH_SIZE = 200 frames per initialization
   - Only 200 frames are saved to temp dir at a time
   - SAM3 init_state only loads 200 frames (~1.2GB instead of ~5GB)

2. Box-prompt seeding between mega-batches
   - First batch: use text prompt for initial detection
   - Subsequent batches: use bounding boxes from last frame
   - This preserves object tracking across batch boundaries

3. Per-batch cleanup
   - Reset inference state after each mega-batch
   - Delete temp directory after each batch
   - gc.collect() to release memory
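
The temp-directory part of the cleanup can be sketched with a try/finally so the directory is removed even when a batch fails. `with_batch_tempdir` and its two callbacks are hypothetical names. The inference-state reset is SAM3-specific and is left as a comment because its API is not shown here.

```python
import gc
import shutil
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def with_batch_tempdir(
    write_frames: Callable[[Path], None],
    run_inference: Callable[[Path], T],
) -> T:
    """Run one mega-batch against a fresh temp directory, then clean up."""
    tmp_dir = Path(tempfile.mkdtemp(prefix="sam3_megabatch_"))
    try:
        write_frames(tmp_dir)          # save only this batch's frames
        return run_inference(tmp_dir)  # init + propagate on this batch only
    finally:
        # (The inference-state reset would go here; API not shown in this PR.)
        shutil.rmtree(tmp_dir, ignore_errors=True)
        gc.collect()  # encourage release of large per-batch objects
```

Note that gc.collect() only helps once nothing else still references the per-batch tensors, so results should be copied out before the batch state is dropped.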

Technical changes:
- New outer loop over mega-batches (0-199, 200-399, etc.)
- Frame indices remapped: global -> local for SAM3, local -> global for results
- last_frame_boxes/labels passed between batches for tracking continuity
- FRAME_BATCH_SIZE=32 still used for processing outputs within each mega-batch

Memory profile:
- Before: 800 frames = ~5GB RAM spike at init_state
- After: 200 frames per batch = ~1.2GB RAM (4x reduction for an 800-frame video)

Tracking continuity:
- Objects detected in batch N are re-seeded in batch N+1 via box prompts
- Object IDs may change between batches but masks remain consistent

@farazshaikh farazshaikh force-pushed the fix/ram-oom-video-segmentation branch from 5655974 to 2eafa2d on December 25, 2025 09:14