Add LaughterSegmentation to contrib by jimburtoft · Pull Request #56 · aws-neuron/neuronx-distributed-inference

jimburtoft · 2026-03-06T01:08:15Z

Description

Add LaughterSegmentation, a Wav2Vec2-based laughter detection model compiled with torch_neuronx.trace(). Includes a self-contained Jupyter notebook with compilation, single-core and DataParallel benchmarks, accuracy validation against CPU reference, and an end-to-end inference demo. Also documents a weight_norm parametrization crash workaround required on SDK 2.28+.

Model Information

Model Name: LaughterSegmentation (omine-me/LaughterSegmentation)
Model Architecture: Wav2Vec2ForAudioFrameClassification (~315M params, FP32)
Purpose: Audio frame classification -- detects laughter segments in speech audio

Checklist

Required Components

Accuracy Test (test/integration/test_model.py)
- 12 tests total: smoke tests, cosine similarity validation across 5 input types, frame-level prediction agreement, DataParallel correctness & speedup, throughput & latency benchmarks
- All 12 pass on inf2.xlarge (SDK 2.28, PyTorch 2.9)
README.md with the following sections:
- Usage Example: Jupyter notebook workflow with bash quickstart
- Compatibility Matrix: inf2.xlarge validated, trn2.3xlarge tested
- Example Checkpoints: Link to HuggingFace model
- Testing Instructions: pytest and standalone runner commands
Source Code (src/)
- N/A -- this model uses torch_neuronx.trace() directly rather than NxD Inference modeling code. All logic is in the notebook and test file.

Optional Components

Unit Tests -- test/unit/ directory created (empty __init__.py, consistent with other contrib models)

Folder Structure

/contrib/models/LaughterSegmentation/
  README.md
  laughter_neuron_inf2.ipynb  # notebook with executed outputs
  /test
    __init__.py
    /unit
      __init__.py
    /integration
      __init__.py
      test_model.py

Testing

How did you test this change?

Ran pytest on a fresh inf2.xlarge instance (sa-east-1) with the Deep Learning AMI Neuron (Ubuntu 24.04) 20260227 and the pre-installed PyTorch inference venv:

source /opt/aws_neuronx_venv_pytorch_inference_vllm_0_13/bin/activate
pip install safetensors pytest
pytest test_model.py --capture=tee-sys -v

Test Results:

test_model.py::TestModelLoads::test_neuron_model_loads PASSED [ 8%]
test_model.py::TestModelLoads::test_neuron_model_runs PASSED [ 16%]
test_model.py::TestAccuracy::test_cosine_similarity[random_normal] PASSED [ 25%]
test_model.py::TestAccuracy::test_cosine_similarity[quiet_noise] PASSED [ 33%]
test_model.py::TestAccuracy::test_cosine_similarity[loud_signal] PASSED [ 41%]
test_model.py::TestAccuracy::test_cosine_similarity[sine_440hz] PASSED [ 50%]
test_model.py::TestAccuracy::test_cosine_similarity[silence] PASSED [ 58%]
test_model.py::TestAccuracy::test_frame_agreement PASSED [ 66%]
test_model.py::TestDataParallel::test_data_parallel_runs PASSED [ 75%]
test_model.py::TestDataParallel::test_data_parallel_speedup PASSED [ 83%]
test_model.py::TestPerformance::test_throughput PASSED [ 91%]
test_model.py::TestPerformance::test_latency PASSED [100%]
======================= 12 passed, 2 warnings in 51.49s =======================
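
The accuracy tests above (test_cosine_similarity and test_frame_agreement) compare Neuron outputs against a CPU reference. The actual test code is not shown in this PR description, so the following is only a minimal, dependency-free sketch of the two metrics. It assumes each frame carries a list of class logits and that the predicted class is the per-frame argmax; the function names and the sample values are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two flattened output vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Undefined for an all-zero vector; treat two zero vectors as identical.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def frame_agreement(ref_frames, test_frames) -> float:
    """Fraction of frames where both backends predict the same class (argmax)."""
    if len(ref_frames) != len(test_frames) or not ref_frames:
        raise ValueError("need two non-empty frame lists of equal length")

    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)

    same = sum(argmax(r) == argmax(t) for r, t in zip(ref_frames, test_frames))
    return same / len(ref_frames)


# Hypothetical per-frame logits: CPU reference vs. a slightly perturbed copy.
cpu_logits = [[2.0, -1.0], [0.1, 0.9], [-0.5, 0.5]]
neuron_logits = [[1.999, -1.001], [0.1001, 0.8999], [-0.5, 0.5001]]

flat_cpu = [v for frame in cpu_logits for v in frame]
flat_neuron = [v for frame in neuron_logits for v in frame]

similarity = cosine_similarity(flat_cpu, flat_neuron)
agreement = frame_agreement(cpu_logits, neuron_logits)
print(f"cosine similarity: {similarity:.6f}, frame agreement: {agreement:.0%}")
```

Cosine similarity checks that the raw outputs point in the same direction, while frame agreement checks that the final per-frame decisions match, so the two together cover both numerical drift and decision flips.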

Key metrics:

Cosine similarity: >= 0.999999 across all inputs
Frame agreement: 100%
Single core: 101.5 W/s, 9.85 ms p50 latency
DataParallel (2 cores): 176.5 W/s, 1.88x speedup
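
As a quick sanity check on how these figures relate, the arithmetic below uses only the numbers reported above. It assumes that W/s counts inference requests per second and that the single-core benchmark keeps one request in flight at a time; neither assumption is stated in the PR.

```python
# Figures copied from the key metrics above.
single_core_wps = 101.5      # single-core throughput (W/s)
p50_latency_ms = 9.85        # single-core median latency
data_parallel_wps = 176.5    # DataParallel throughput on 2 cores (W/s)
reported_speedup = 1.88
cores = 2

# With one request in flight, throughput is roughly the inverse of latency.
implied_wps = 1000.0 / p50_latency_ms

# Ratio of the two reported throughput figures, and per-core scaling efficiency.
throughput_ratio = data_parallel_wps / single_core_wps
efficiency_from_ratio = throughput_ratio / cores
efficiency_from_reported = reported_speedup / cores

print(f"throughput implied by p50 latency: {implied_wps:.1f} per second")
print(f"ratio of reported throughputs:     {throughput_ratio:.2f}x")
print(f"scaling efficiency (ratio):        {efficiency_from_ratio:.0%}")
print(f"scaling efficiency (reported):     {efficiency_from_reported:.0%}")
```

The single-core throughput matches the inverse of the p50 latency. The ratio of the two throughput figures comes out near 1.74x, somewhat below the reported 1.88x, which suggests the speedup test measures under different conditions than the throughput benchmark; the PR description does not say which.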

Compatibility

Tested with:

Neuron SDK Version(s): 2.28
Instance Type(s): inf2.xlarge
PyTorch Version: 2.9.0
Python Version: 3.12.3

Additional Information

This model uses torch_neuronx.trace() rather than NxD Inference, since it is an encoder-only classification model (not an autoregressive LLM). It does not have a src/ directory with modeling code.
weight_norm workaround: Wav2Vec2 uses weight_norm parametrizations that crash torch_neuronx.trace() on SDK 2.28+. The notebook and test strip parametrizations before tracing. This affects any HuggingFace model using weight_norm.
The notebook also runs on trn2.3xlarge by uncommenting the --lnc compiler arg.

Related Issues

None

vLLM Integration

This model/feature is intended for use with vLLM
Documentation includes vLLM registration instructions

By submitting this PR, I confirm that:

I have read and followed the contributing guidelines (../contrib/CONTRIBUTING.md)
This is a community contribution and may have limited testing compared to officially-supported models
The code follows best practices and is well-documented
All required components listed above are included

12/12 tests pass on inf2.xlarge (SDK 2.28, PyTorch 2.9):
- Smoke tests (model loads and runs)
- Accuracy: cosine similarity >= 0.999, 100% frame agreement
- DataParallel: 1.88x speedup on 2 cores (176.5 W/s)
- Performance: 101.5 W/s throughput, 9.85 ms p50 latency

jimburtoft and others added 3 commits March 4, 2026 22:23

Add LaughterSegmentation contrib model with inf2 benchmark notebook

c26ad67

Wav2Vec2-based laughter detection model (315M params) compiled with torch_neuronx.trace(). Includes single-core and DataParallel benchmarks, accuracy validation, and end-to-end inference demo on inf2.xlarge.

Update README with maintainer details

923fe6c

Added maintainer information for Jim Burtoft.

jimburtoft wants to merge 3 commits into aws-neuron:main from jimburtoft:contrib/laughter-segmentation
