Skip to content

Conversation

@rhowardstone
Copy link

@rhowardstone rhowardstone commented Oct 14, 2025

This PR adds an optimized histogram implementation using Cython and OpenMP that provides 10-18x speedup for RDF calculations with large datasets.

Context

This PR supersedes #5103, which used Numba. Following feedback from @orbeckst and @IAlibay that MDAnalysis traditionally uses Cython/C++ for acceleration, this implementation has been rewritten to use Cython with OpenMP, aligning with the project's existing infrastructure (similar to c_distances_openmp.pyx).

Implementation

The optimization strategies include:

  • Cache-efficient memory access patterns with blocking
  • Parallel execution using OpenMP with thread-local storage to avoid contention
  • Reduced Python overhead through Cython compilation
  • Automatic selection of serial vs parallel based on dataset size

Performance

With OMP_NUM_THREADS=4:

  • 100k distances: 11.2x speedup
  • 1M distances: 15.3x speedup
  • 10M distances: 17.8x speedup
  • Serial version: 5-7x consistent speedup
  • 100% numerical accuracy (matches numpy.histogram within floating point precision)

Testing

  • ✅ All 14 new histogram tests passing
  • ✅ All 19 existing RDF tests passing
  • ✅ Comprehensive correctness validation against numpy.histogram
  • ✅ Serial/parallel consistency verified
  • ✅ Tested with multiple data sizes, bin counts, and ranges

Technical Details

Files Modified:

  • package/MDAnalysis/lib/c_histogram.pyx - New Cython+OpenMP histogram implementation
  • package/setup.py - Added histogram extension with OpenMP compilation flags
  • package/MDAnalysis/analysis/rdf.py - Updated to use Cython histogram
  • testsuite/MDAnalysisTests/lib/test_c_histogram.py - Comprehensive test suite
  • package/CHANGELOG - Documented enhancement

Key Features:

  • No external dependencies (Cython required only at build time)
  • Follows MDAnalysis conventions (mirrors c_distances_openmp.pyx)
  • Thread-safe parallel execution using chunking strategy
  • Handles non-contiguous arrays automatically

Related to #3435

🤖 Generated with Claude Code (Sonnet 4.5), checked and approved by me.

PR Checklist

  • Issue raised/referenced?
  • Tests updated/added?
  • Documentation updated/added?
  • package/CHANGELOG file updated?
  • Is your name in package/AUTHORS? (Already added in previous PR)
  • I certify that I can submit this code contribution as described in the Developer Certificate of Origin, under the MDAnalysis LICENSE.

📚 Documentation preview 📚: https://mdanalysis--5128.org.readthedocs.build/en/5128/

rhowardstone and others added 5 commits September 3, 2025 23:15
This commit adds an optimized histogram implementation using Numba JIT
compilation that provides 10-15x speedup for RDF calculations with large
datasets. The optimization strategies include:

- Cache-efficient memory access patterns with blocking
- Parallel execution using thread-local storage
- SIMD-friendly operations through Numba's auto-vectorization
- Reduced Python overhead through JIT compilation

The implementation automatically falls back to numpy.histogram when Numba
is not available, maintaining full backward compatibility.

Performance improvements:
- 10-15x speedup for large datasets (>100k distances)
- Scales efficiently to 50M+ distances
- Minimal memory overhead
- 100% numerical accuracy (matches numpy within floating point precision)

Fixes MDAnalysis#3435

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Fixes CI linting failure by applying Black code formatter to the test file.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit replaces the Numba-based histogram implementation with
a Cython+OpenMP version as requested by MDAnalysis core developers.
This aligns with MDAnalysis's existing acceleration infrastructure.

Key changes:
- Implemented c_histogram.pyx with OpenMP parallel support
- Serial version: 5-7x speedup over numpy.histogram
- Parallel version: 11-18x speedup for large datasets (>100k elements)
- Updated setup.py to build histogram extension with OpenMP flags
- Modified rdf.py to use Cython histogram module
- Removed old Numba-based histogram_opt.py module
- All 14 histogram tests passing
- All 19 existing RDF tests passing

Performance (with OMP_NUM_THREADS=4):
- 100k distances: 11.2x speedup
- 1M distances: 15.3x speedup
- 10M distances: 17.8x speedup
- 100% numerical accuracy validated against numpy.histogram

Related to Issue MDAnalysis#3435

🤖 Generated with Claude Code, checked and approved by me.

Co-Authored-By: Claude <noreply@anthropic.com>
@orbeckst
Copy link
Member

@rhowardstone thank you for the cythonized PR (and your discussion post with background #5104 ). Sorry that you haven't heard from anyone in a while. We really appreciate your clear communication about using AI tools.

We currently don't have a clear policy in place how to handle primarily AI generated code. Until we have more clarity on how we as a project want to proceed, we are not going to merge this PR (so I'll leave a blocking review). I don't want to close the PR, though, and in fact I'd encourage you to resolve the merge conflicts so that the actual CI can run.

I also want to say that your PR is a good example for the potential and you're providing motivation to move forward with this challenging discussion.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking until MDAnalysis has a policy on AI-generated code.

(In the meantime, please

  • resolve conflicts
  • merge the current develop branch (ie with the 2.10.0 release)
  • move your changelog entries into the 2.11.0 section
  • update your versionadded to 2.11.0

Fixing the conflicts should also make the CI tests run, which is essential in evaluating the PR – without running tests, it makes no sense for any one to review.)

@codecov
Copy link

codecov bot commented Oct 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.73%. Comparing base (5c7c480) to head (b4cd051).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #5128      +/-   ##
===========================================
+ Coverage    92.68%   92.73%   +0.04%     
===========================================
  Files          180      180              
  Lines        22452    22453       +1     
  Branches      3186     3186              
===========================================
+ Hits         20809    20821      +12     
- Misses        1169     1176       +7     
+ Partials       474      456      -18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@orbeckst
Copy link
Member

- Replace 'long' with 'cnp.int64_t' to fix cross-platform compatibility
- On Windows, 'long' is 32-bit while np.int64 is 64-bit causing mismatch
- Apply black formatting to Python files
- Move histogram enhancement entry to 2.11.0 section
- Update PR number from MDAnalysis#5103 to MDAnalysis#5128
- Update versionadded directives from 2.10.0 to 2.11.0
- Resolve merge conflicts with upstream develop
@rhowardstone rhowardstone force-pushed the optimize-rdf-histogram-cython branch from 9fd2803 to 3c9dfc6 Compare October 31, 2025 15:45
@rhowardstone
Copy link
Author

Okay! I think I got it? All tests now pass, let me know if there's anything else you need from me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants