Add LaughterSegmentation to contrib#56

Open
jimburtoft wants to merge 3 commits into aws-neuron:main from
jimburtoft:contrib/laughter-segmentation

Description

Add LaughterSegmentation, a Wav2Vec2-based laughter detection model compiled with torch_neuronx.trace(). Includes a self-contained Jupyter notebook with compilation, single-core and DataParallel benchmarks, accuracy validation against CPU reference, and an end-to-end inference demo. Also documents a weight_norm parametrization crash workaround required on SDK 2.28+.

Model Information

Model Name: LaughterSegmentation (omine-me/LaughterSegmentation)
Model Architecture: Wav2Vec2ForAudioFrameClassification (~315M params, FP32)
Purpose: Audio frame classification -- detects laughter segments in speech audio

Checklist

Required Components

  • Accuracy Test (test/integration/test_model.py)
    • 12 tests total: smoke tests, cosine similarity validation across 5 input types, frame-level prediction agreement, DataParallel correctness & speedup, throughput & latency benchmarks
    • All 12 pass on inf2.xlarge (SDK 2.28, PyTorch 2.9)
  • README.md with the following sections:
    • Usage Example: Jupyter notebook workflow with bash quickstart
    • Compatibility Matrix: inf2.xlarge validated, trn2.3xlarge tested
    • Example Checkpoints: Link to HuggingFace model
    • Testing Instructions: pytest and standalone runner commands
  • Source Code (src/)
    • N/A -- this model uses torch_neuronx.trace() directly rather than NxD Inference modeling code. All logic is in the notebook and test file.
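The accuracy checks named above (cosine similarity and frame-level prediction agreement) can be sketched in plain Python. This is an illustrative sketch only, not the code in test_model.py: the helper names `cosine_similarity` and `frame_agreement` are hypothetical, and it assumes each output is a list of per-frame class scores whose argmax gives the predicted label.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two flattened output vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Degenerate case: treat two all-zero vectors as identical.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def frame_agreement(
    ref_frames: Sequence[Sequence[float]],
    test_frames: Sequence[Sequence[float]],
) -> float:
    """Fraction of frames where both outputs pick the same class (argmax)."""
    if len(ref_frames) != len(test_frames):
        raise ValueError("frame counts differ")

    def argmax(scores: Sequence[float]) -> int:
        return max(range(len(scores)), key=lambda i: scores[i])

    matches = sum(
        argmax(r) == argmax(t) for r, t in zip(ref_frames, test_frames)
    )
    return matches / len(ref_frames)


# Toy data standing in for CPU (reference) and Neuron (test) outputs.
cpu_out = [[0.1, 0.9], [0.8, 0.2]]
neuron_out = [[0.2, 0.8], [0.3, 0.7]]
flat_cpu = [s for frame in cpu_out for s in frame]
flat_neuron = [s for frame in neuron_out for s in frame]
print(cosine_similarity(flat_cpu, flat_neuron))
print(frame_agreement(cpu_out, neuron_out))
```

In a real comparison the two inputs would be the CPU reference output and the Neuron output for the same waveform, flattened for the cosine check and kept per-frame for the agreement check.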

Optional Components

  • Unit Tests -- test/unit/ directory created (empty __init__.py, consistent with other contrib models)

Folder Structure

/contrib/models/LaughterSegmentation/
    README.md
    laughter_neuron_inf2.ipynb    # notebook with executed outputs
    /test
        __init__.py
        /unit
            __init__.py
        /integration
            __init__.py
            test_model.py

Testing

How did you test this change?
Ran pytest on a fresh inf2.xlarge instance (sa-east-1) with the Deep Learning AMI Neuron (Ubuntu 24.04) 20260227 and the pre-installed PyTorch inference venv.
source /opt/aws_neuronx_venv_pytorch_inference_vllm_0_13/bin/activate
pip install safetensors pytest
pytest test_model.py --capture=tee-sys -v
Test Results:
test_model.py::TestModelLoads::test_neuron_model_loads PASSED [ 8%]
test_model.py::TestModelLoads::test_neuron_model_runs PASSED [ 16%]
test_model.py::TestAccuracy::test_cosine_similarity[random_normal] PASSED [ 25%]
test_model.py::TestAccuracy::test_cosine_similarity[quiet_noise] PASSED [ 33%]
test_model.py::TestAccuracy::test_cosine_similarity[loud_signal] PASSED [ 41%]
test_model.py::TestAccuracy::test_cosine_similarity[sine_440hz] PASSED [ 50%]
test_model.py::TestAccuracy::test_cosine_similarity[silence] PASSED [ 58%]
test_model.py::TestAccuracy::test_frame_agreement PASSED [ 66%]
test_model.py::TestDataParallel::test_data_parallel_runs PASSED [ 75%]
test_model.py::TestDataParallel::test_data_parallel_speedup PASSED [ 83%]
test_model.py::TestPerformance::test_throughput PASSED [ 91%]
test_model.py::TestPerformance::test_latency PASSED [100%]
======================= 12 passed, 2 warnings in 51.49s =======================
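The five parametrized cases in the cosine-similarity test (random_normal, quiet_noise, loud_signal, sine_440hz, silence) suggest synthetic waveforms along the following lines. This is a hedged sketch: the function name `make_inputs`, the 16 kHz sample rate, the one-second length, and the noise amplitudes are all assumptions chosen for illustration, not values taken from test_model.py.

```python
import math
import random

SAMPLE_RATE = 16_000  # assumption: a common rate for Wav2Vec2 models


def make_inputs(num_samples: int = SAMPLE_RATE, seed: int = 0) -> dict:
    """Build one synthetic waveform per test-case name (amplitudes are guesses)."""
    rng = random.Random(seed)
    return {
        # Unit-variance Gaussian noise.
        "random_normal": [rng.gauss(0.0, 1.0) for _ in range(num_samples)],
        # Low-amplitude noise (scale 0.01 is an assumption).
        "quiet_noise": [rng.gauss(0.0, 0.01) for _ in range(num_samples)],
        # High-amplitude noise (scale 10.0 is an assumption).
        "loud_signal": [rng.gauss(0.0, 10.0) for _ in range(num_samples)],
        # Pure 440 Hz tone.
        "sine_440hz": [
            math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE)
            for n in range(num_samples)
        ],
        # All zeros.
        "silence": [0.0] * num_samples,
    }


inputs = make_inputs(num_samples=160)
for name, wave in inputs.items():
    print(name, len(wave), max(abs(x) for x in wave))
```

Covering quiet, loud, tonal, and silent inputs exercises the compiled model across a range of signal scales rather than on a single random draw.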

Key metrics:

  • Cosine similarity: >= 0.999999 across all inputs
  • Frame agreement: 100%
  • Single core: 101.5 W/s, 9.85 ms p50 latency
  • DataParallel (2 cores): 176.5 W/s, 1.88x speedup
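For context on how p50 latency and throughput figures like these are typically gathered, here is a minimal timing harness using only the standard library. It is a sketch under stated assumptions: `benchmark` is a hypothetical helper, the warmup and iteration counts are arbitrary, and the callable would wrap the compiled model's forward pass in practice (a trivial stand-in is used here).

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 5, iters: int = 50) -> Dict[str, float]:
    """Time repeated calls to fn; report median latency and calls per second."""
    # Warmup calls are excluded from timing (first calls often pay one-off costs).
    for _ in range(warmup):
        fn()

    latencies_ms = []
    start = time.perf_counter()
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed_s = time.perf_counter() - start

    return {
        "p50_ms": statistics.median(latencies_ms),
        "throughput_per_s": iters / elapsed_s,
    }


# Stand-in workload; replace with a call into the compiled model.
result = benchmark(lambda: sum(range(10_000)), warmup=2, iters=20)
print(result)
```

The median is used for p50 so that a few slow outlier iterations do not skew the reported latency.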

Compatibility

Tested with:

  • Neuron SDK Version(s): 2.28
  • Instance Type(s): inf2.xlarge
  • PyTorch Version: 2.9.0
  • Python Version: 3.12.3

Additional Information

  • This model uses torch_neuronx.trace() rather than NxD Inference, since it is an encoder-only classification model (not an autoregressive LLM). It does not have a src/ directory with modeling code.
  • weight_norm workaround: Wav2Vec2 uses weight_norm parametrizations that crash torch_neuronx.trace() on SDK 2.28+. The notebook and test strip parametrizations before tracing. This affects any HuggingFace model using weight_norm.
  • The notebook also runs on trn2.3xlarge by uncommenting the --lnc compiler arg.

Related Issues

None

vLLM Integration

  • This model/feature is intended for use with vLLM
  • Documentation includes vLLM registration instructions

By submitting this PR, I confirm that:

  • I have read and followed the contributing guidelines (../contrib/CONTRIBUTING.md)
  • This is a community contribution and may have limited testing compared to officially-supported models
  • The code follows best practices and is well-documented
  • All required components listed above are included

jimburtoft and others added 3 commits March 4, 2026 22:23
Wav2Vec2-based laughter detection model (315M params) compiled with
torch_neuronx.trace(). Includes single-core and DataParallel benchmarks,
accuracy validation, and end-to-end inference demo on inf2.xlarge.
12/12 tests pass on inf2.xlarge (SDK 2.28, PyTorch 2.9):
- Smoke tests (model loads and runs)
- Accuracy: cosine similarity >= 0.999, 100% frame agreement
- DataParallel: 1.88x speedup on 2 cores (176.5 W/s)
- Performance: 101.5 W/s throughput, 9.85 ms p50 latency
Added maintainer information for Jim Burtoft.