Adding contrast-limited adaptive histogram equalization (CLAHE) to DALI image operators #6069
base: main
|
@tonyreina, thank you for your contribution. We appreciate the time you spent diving into DALI and extending it. I haven't delved deeply into the code yet, as I focused more on general remarks - mostly regarding testing, examples, and memory management. Please let us know if you need any guidance in applying the suggestions. |
084b2f1 to 4e087d2
|
!build |
|
CI MESSAGE: [36715300]: BUILD STARTED |
|
Would be nice to add images to the example, to see the effect of this transformation visually |
|
mzient commented on 2025-10-31T12:25:19Z (ReviewNB):
📚1️⃣Welcome to this hands-on tutorial!
Those pictures and the opening sentence are nice on their own, but stand out among our examples. |
|
mzient commented on 2025-10-31T12:25:20Z (ReviewNB):
I think it would be better to demonstrate the processing on an actual image (you can grab one from the DALI_extra repository). The image here stands in stark opposition to the claim made in the introduction, "CLAHE is a powerful technique that improves contrast in images without overamplifying noise": here we can see mostly noise amplification, and the actual contrast between the squares is diminished. This doesn't look like an effective demonstration, since the useful features (the squares) lose contrast and the noise gains contrast. |
Signed-off-by: Tony Reina <tbreina@bu.edu>
Signed-off-by: Tony Reina <tbreina@bu.edu>
|
I think I made all of the recommended changes. The notebook and test examples now use the MRI knee image from DALI_extra. |
Greptile Overview
Greptile Summary
This review covers only the changes made since the last review, not the entire PR. The latest changes focus primarily on formatting corrections and code cleanup addressing previous review feedback. The main updates include: copyright year standardization to 2025 across multiple files, Black formatting improvements throughout test files to adhere to the 100-character line limit, and proper handling of signed/unsigned comparison issues in CUDA code. Additionally, the PR includes improvements to include statements and removes debug output from test files to maintain clean test logs.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| dali/operators/image/clahe/clahe_op.cu | 4/5 | Fixed signed/unsigned comparison warning in clip redistribution loop and improved numeric constants precision |
| dali/test/python/test_clahe.py | 4/5 | Added .as_cpu() calls for proper GPU tensor conversion to numpy arrays and removed debug print statements |
| docs/examples/image_processing/clahe_example.ipynb | 5/5 | Updated copyright year from 2022-2023 to 2022-2025 following repository standards |
| dali/operators/image/clahe/clahe_op.cc | 5/5 | Updated copyright year and maintained existing functionality without changes |
| dali/operators/image/clahe/clahe_cpu.cc | 5/5 | Updated copyright year and maintained existing CPU implementation without changes |
| dali/operators/image/clahe/clahe_test.cc | 5/5 | Updated copyright year and maintained existing test logic without changes |
| dali/test/python/test_dali_cpu_only.py | 5/5 | Applied Black formatting for better code readability while maintaining functionality |
| dali/test/python/test_eager_coverage.py | 5/5 | Applied Black formatting improvements for consistent code style across parameter lists |
| dali/test/python/checkpointing/test_dali_checkpointing.py | 5/5 | Applied Black formatting improvements for multi-line function parameters |
| dali/test/python/test_dali_variable_batch_size.py | 5/5 | Applied Black formatting improvements for better parameter organization |
| docs/examples/image_processing/index.py | 5/5 | Updated copyright year and maintained proper documentation index structure |
| dali/operators/image/CMakeLists.txt | 5/5 | Maintained existing build configuration without changes |
| dali/operators/image/clahe/CMakeLists.txt | 5/5 | Maintained existing build configuration without changes |
Confidence score: 4/5
- This PR addresses formatting and compilation issues from previous reviews with minimal risk
- Score reflects successful resolution of signed/unsigned comparison warnings and proper tensor handling improvements
- Pay close attention to the CUDA implementation file due to the complexity of performance-critical GPU kernels and ensure all compiler warnings are resolved
13 files reviewed, 2 comments
Signed-off-by: Tony Reina <tbreina@bu.edu>
Signed-off-by: Tony Reina <tbreina@bu.edu>
Greptile Overview
Greptile Summary
This review covers only the changes made since the last review, not the entire PR. The most recent changes include:
- Implementation refinements: Improved CUDA optimization patterns with proper C++20 bit_cast for OpenCV compatibility constants, replaced division operations with multiplication by reciprocals for better performance, and enhanced memory coalescing patterns in vectorized kernels.
- Test infrastructure expansion: Added comprehensive test coverage across DALI's testing framework including eager execution tests (test_eager_coverage.py), CPU-only validation (test_dali_cpu_only.py), variable batch size support (test_dali_variable_batch_size.py), and checkpointing exclusions (test_dali_checkpointing.py).
- Documentation integration: Added the CLAHE example notebook to the image processing documentation index with proper operator references, and applied consistent code formatting across multiple test files following the project's Black formatting standards.
- Build system integration: Added CMake configuration for the CLAHE module following DALI's standard patterns for operator integration.
The changes demonstrate good integration of the new CLAHE operator into DALI's established infrastructure, with appropriate test coverage and documentation updates. The GPU implementation shows sophisticated optimization strategies while maintaining OpenCV algorithmic compatibility.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| dali/operators/image/clahe/clahe_op.cu | 3/5 | GPU CLAHE implementation with extensive CUDA optimizations but contains performance concerns around expensive powf operations and sequential algorithms |
| dali/operators/image/clahe/clahe_op.cc | 4/5 | GPU operator backend with proper memory management and comprehensive schema documentation |
| dali/operators/image/clahe/clahe_cpu.cc | 4/5 | CPU implementation using OpenCV with thread safety and proper input validation |
| dali/test/python/operator_1/test_clahe.py | 4/5 | Comprehensive Python test suite with OpenCV validation and device consistency checks |
| docs/examples/image_processing/clahe_example.ipynb | 4/5 | Educational Jupyter notebook with practical examples, but contains executed outputs that should be cleared |
| dali/operators/image/clahe/clahe_test.cc | 4/5 | C++ unit tests comparing CPU/GPU implementations with appropriate tolerances |
| dali/test/python/test_eager_coverage.py | 5/5 | Adds CLAHE to eager execution test coverage with proper integration |
| dali/test/python/test_dali_cpu_only.py | 5/5 | Adds CPU-only testing for CLAHE operator |
| dali/test/python/test_dali_variable_batch_size.py | 5/5 | Validates CLAHE works with dynamic batch sizes |
| dali/test/python/checkpointing/test_dali_checkpointing.py | 5/5 | Properly excludes CLAHE from checkpointing as expected for new operators |
| dali/operators/image/CMakeLists.txt | 5/5 | Simple and correct integration of CLAHE subdirectory into build system |
| dali/operators/image/clahe/CMakeLists.txt | 5/5 | Standard DALI CMake configuration following established patterns |
| docs/examples/image_processing/index.py | 5/5 | Minimal documentation index update adding CLAHE example |
Confidence score: 3/5
- This PR requires careful review due to complex GPU kernel optimizations and potential performance issues in the CUDA implementation
- Score reflects concerns about expensive floating-point operations (powf calls) in device code, sequential algorithms that could benefit from parallelization, and potential memory access inefficiencies
- Pay close attention to the CUDA kernel implementations in clahe_op.cu, particularly the color space conversion routines and CDF redistribution logic which may impact performance
13 files reviewed, no comments
Signed-off-by: Tony Reina <tbreina@bu.edu>
Signed-off-by: Tony Reina <tbreina@bu.edu>
|
I updated the powf calls with a LUT to improve speed. The other greptile comment I think was around the parallelization of the CDF calculation. Happy to add that if it is really necessary. |
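The LUT replacement mentioned in this comment can be illustrated with a minimal Python sketch. It assumes the powf calls sat in the sRGB gamma expansion of a color space conversion, which the thread does not confirm; the names `srgb_to_linear`, `SRGB_TO_LINEAR_LUT`, and `linearize` are hypothetical and do not come from the PR. The idea is that 8-bit input has only 256 possible values, so the power function can be evaluated once per value up front and replaced by a table lookup per pixel.

```python
# Hypothetical sketch: replace a per-pixel pow() with a 256-entry lookup table.
# Assumption: the expensive call was the sRGB -> linear gamma expansion.
# The PR thread does not show the actual kernel code.


def srgb_to_linear(c: float) -> float:
    """Standard sRGB transfer function for a normalized sample c in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


# Built once: uint8 input can only take 256 distinct values.
SRGB_TO_LINEAR_LUT = [srgb_to_linear(v / 255.0) for v in range(256)]


def linearize(pixels):
    """Convert uint8 sRGB samples to linear light using table lookups only."""
    return [SRGB_TO_LINEAR_LUT[p] for p in pixels]
```

In a CUDA kernel the same table would typically live in constant or shared memory; where it is placed in this PR is not stated in the thread.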
Greptile Overview
Greptile Summary
This review covers only the changes made since the last review, not the entire PR. The most recent changes show comprehensive progress in addressing previous review feedback including notebook conversion, memory management improvements, error handling standardization, and formatting fixes. The PR continues to implement CLAHE (Contrast-Limited Adaptive Histogram Equalization) as a new image processing operator in DALI with both CPU (OpenCV-based) and GPU (CUDA-optimized) backends. Key recent improvements include migrating from a Python script to a Jupyter notebook for documentation, replacing manual CUDA memory allocation with DALI's DynamicScratchpad API, standardizing error handling to use DALI_ENFORCE macros, and addressing various CUDA kernel optimization issues.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| dali/operators/image/CMakeLists.txt | 5/5 | Simple addition of clahe subdirectory to build system |
| docs/examples/image_processing/index.py | 5/5 | Updates copyright year and adds CLAHE example to documentation index |
| dali/operators/image/clahe/CMakeLists.txt | 5/5 | Standard CMake configuration for new CLAHE operator module |
| dali/test/python/test_eager_coverage.py | 5/5 | Adds CLAHE to eager execution test coverage with formatting improvements |
| dali/test/python/checkpointing/test_dali_checkpointing.py | 5/5 | Adds CLAHE to unsupported checkpointing operators list with formatting updates |
| dali/test/python/test_dali_cpu_only.py | 4/5 | Extends CPU-only testing to include CLAHE operator |
| dali/test/python/test_dali_variable_batch_size.py | 4/5 | Integrates CLAHE into variable batch size testing framework |
| dali/operators/image/clahe/clahe_cpu.cc | 4/5 | Well-structured CPU implementation using OpenCV with thread-safety |
| dali/operators/image/clahe/clahe_op.cc | 4/5 | Comprehensive GPU operator implementation with proper DALI patterns |
| docs/examples/image_processing/clahe_example.ipynb | 4/5 | Educational notebook with comprehensive examples but contains executed output |
| dali/operators/image/clahe/clahe_test.cc | 4/5 | C++ unit tests with good coverage but could benefit from more comprehensive validation |
| dali/test/python/operator_1/test_clahe.py | 3/5 | Extensive Python test suite but has non-deterministic elements and incomplete GPU feature parity |
| dali/operators/image/clahe/clahe_op.cu | 3/5 | Complex CUDA implementation with performance optimizations but several implementation issues |
Confidence score: 3/5
- This PR requires careful review due to complex CUDA kernel implementations and some remaining issues in the GPU backend
- Score reflects unresolved issues in CUDA kernels including undefined behavior, potential out-of-bounds access, and warp divergence problems that could affect correctness and performance
- Pay close attention to the CUDA kernel implementation in clahe_op.cu which contains several optimization patterns that need validation for correctness
13 files reviewed, no comments
Category:
New feature (non-breaking change which adds functionality)
Description:
This PR adds Contrast-Limited Adaptive Histogram Equalization (CLAHE) to the DALI image operators.
CLAHE performs local histogram equalization with clipping and bilinear blending of lookup tables (LUTs) between neighboring tiles. This technique enhances local contrast while preventing over-amplification of noise. The implementation maintains exact algorithmic compatibility with OpenCV's cv::createCLAHE() while providing significant GPU performance optimizations.
Additional information:
Affected modules and functionalities:
- clahe_op.cc and clahe_op.cu for GPU implementation with CUDA kernels
- clahe_cpu.cc for CPU implementation using OpenCV
Key points relevant for the review:
Tests:
- test_clahe.py with multiple parameter combinations, device testing (CPU/GPU), and API validation
- clahe_test.cc with CPU vs GPU equivalence testing, different tile sizes, clip limits, and error handling
- clahe_example.py demonstrating usage patterns and parameter effects
Checklist
Documentation
- PERFORMANCE_NOTES.md
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A
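To make the terms "clipping" and "bilinear blending of LUTs" from the description concrete, here is a simplified pure-Python sketch for a single tile. It is not the PR's code and does not reproduce OpenCV's exact behavior: the function names `clahe_tile_lut` and `blend_luts` are hypothetical, the clip limit is taken as an absolute per-bin count, and the leftover excess is handed to the first bins rather than spread the way OpenCV does it.

```python
def clahe_tile_lut(tile, clip_limit, bins=256):
    """Build an equalization LUT for one tile of integer samples in [0, bins).

    Assumption: clip_limit is an absolute per-bin count, not a relative factor.
    """
    hist = [0] * bins
    for v in tile:
        hist[v] += 1

    # Clip every bin at the limit and collect what was cut off.
    excess = 0
    for i in range(bins):
        if hist[i] > clip_limit:
            excess += hist[i] - clip_limit
            hist[i] = clip_limit

    # Redistribute the excess evenly; leftover counts go to the first bins.
    share, leftover = divmod(excess, bins)
    for i in range(bins):
        hist[i] += share + (1 if i < leftover else 0)

    # The scaled cumulative histogram is the tile's lookup table.
    scale = (bins - 1) / len(tile)
    lut, running = [], 0
    for count in hist:
        running += count
        lut.append(min(bins - 1, round(running * scale)))
    return lut


def blend_luts(value, luts, fx, fy):
    """Bilinearly blend four neighboring tile LUTs at one pixel.

    luts is (top_left, top_right, bottom_left, bottom_right); fx and fy are
    the pixel's fractional position between the tile centers, each in [0, 1].
    """
    tl, tr, bl, br = (lut[value] for lut in luts)
    top = tl * (1 - fx) + tr * fx
    bottom = bl * (1 - fx) + br * fx
    return round(top * (1 - fy) + bottom * fy)
```

A lower clip limit flattens the histogram before the cumulative sum, which reduces the slope of the LUT around heavily populated bins; that is the mechanism that limits contrast (and noise) amplification in near-uniform regions.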