
Feat/quantization paradox verifier#308

Open
jay7-tech wants to merge 3 commits into opencv:main from jay7-tech:feat/quantization-paradox-verifier

Conversation


@jay7-tech jay7-tech commented Mar 23, 2026

Added a small verification check in benchmark.py to catch the "Quantization Paradox".

Sometimes INT8 models actually run slower than FP32 on certain ARM targets due to missing dot-product extensions or ORT threading overhead.

The loop now tracks the mean latency for fp32, fp16, and int8. If you pass --paradox_strict, the build fails when the INT8 model regresses performance compared to FP32, preventing us from merging measurably slower quantized models.
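The strict check described above could look roughly like the sketch below. This is a minimal illustration, not the actual patch: the function name `check_quantization_paradox` and the shape of the latency data are assumptions; only the fp32/fp16/int8 precisions, the mean-latency comparison, and the --paradox_strict fail-the-build behavior come from the PR description.

```python
import statistics

def check_quantization_paradox(latencies_ms, strict=False):
    """Flag the 'Quantization Paradox': INT8 running slower than FP32.

    latencies_ms maps a precision name ("fp32", "fp16", "int8") to a
    list of per-iteration latencies in milliseconds (assumed layout).
    """
    means = {p: statistics.mean(v) for p, v in latencies_ms.items()}
    if "int8" in means and "fp32" in means and means["int8"] > means["fp32"]:
        msg = ("Quantization paradox: INT8 mean latency "
               f"{means['int8']:.2f} ms exceeds FP32 {means['fp32']:.2f} ms")
        if strict:
            # Under --paradox_strict, a non-zero exit fails the CI build.
            raise SystemExit(msg)
        print("WARNING:", msg)
    return means
```

Comparing means keeps the check cheap; a stricter variant could compare medians or require a minimum regression margin to avoid flagging noise.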

Note: I also patched a silent bug in Benchmark.run where _benchmark_results_brief kept accumulating across different models instead of resetting, which was mixing up the stats.
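The fix for that accumulation bug amounts to clearing the per-model results at the top of each run. A minimal sketch, assuming a dict-shaped `_benchmark_results_brief` and a hypothetical `_time` helper (only `Benchmark.run` and the attribute name come from the PR text):

```python
class Benchmark:
    def __init__(self):
        self._benchmark_results_brief = {}

    def run(self, model):
        # Reset per-model stats so results from a previous model
        # do not leak into this run (the silent-accumulation bug).
        self._benchmark_results_brief = {}
        for precision in ("fp32", "fp16", "int8"):
            self._benchmark_results_brief[precision] = self._time(model, precision)
        return self._benchmark_results_brief

    def _time(self, model, precision):
        # Hypothetical timing stub standing in for the real benchmark loop.
        return 0.0
```

Without the reset, calling run() for a second model would report stats that still include entries from the first, which is exactly the mixed-up output the PR describes.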

