Improved memory profiling, new features, bugfixes

@jaltmayerpizzorno jaltmayerpizzorno released this 04 Oct 18:26
· 1340 commits to master since this release

Overhauled memory attribution logic:

  • uses Python's custom memory management APIs to efficiently disambiguate native vs. Python memory allocations, supplanting the prior approach that employed periodic call stack sampling.
  • performs immediate lookup of the location in source code responsible for allocation/deallocation, reducing the "smearing" effect in attributions previously caused by delayed attribution.
  • computes average memory consumption (rather than total) for each line of code (using the novel technique of "one-shot" tracing); lines executed many times no longer appear to have consumed large amounts of memory.
  • no longer reports negative memory growth in its output (which occurred when a line freed more memory than it allocated); these negative values had been a source of confusion for some users.
  • this release also resolves a memory leak.
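
To illustrate why per-line averages read differently from totals, here is a minimal sketch. It is not Scalene's implementation and does not model the "one-shot" tracing technique itself; the `attribute` function and the `(line_number, bytes_allocated)` event format are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def attribute(events):
    """Summarize allocation events per source line.

    `events` is an iterable of (line_number, bytes_allocated) pairs.
    This event format is an assumption made for the illustration.
    Returns {line_number: (total_bytes, average_bytes_per_event)}.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for line, nbytes in events:
        totals[line] += nbytes
        counts[line] += 1
    return {line: (totals[line], totals[line] / counts[line]) for line in totals}


# Line 10 runs 1000 times and allocates 1 KiB each time;
# line 20 runs once and allocates 1 MiB.
events = [(10, 1024)] * 1000 + [(20, 1024 * 1024)]
summary = attribute(events)
```

By total, line 10 (1,024,000 bytes) looks nearly as expensive as line 20 (1,048,576 bytes); by average, line 10 drops to 1,024 bytes per execution, which is the distinction the bullet above describes.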

Overhauled internal signal handling:

  • uses signal actors, an actor-based approach that decouples signal handling logic from the main thread, avoiding the risk of races and deadlocks and simplifying the logic.
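
The general idea of handing signals to an actor can be sketched as follows. This is an illustration of the pattern under stated assumptions, not Scalene's code: the `SignalActor` class and its `post`/`close` methods are hypothetical names. The handler does nothing but enqueue the signal number, and a dedicated thread does the real work.

```python
import queue
import threading


class SignalActor:
    """Hypothetical actor: a mailbox plus one worker thread that drains it."""

    def __init__(self, handle):
        # SimpleQueue.put is reentrant, so it is safe to call from a handler.
        self._mailbox = queue.SimpleQueue()
        self._handle = handle
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, signum, frame=None):
        # Same signature as a signal.signal() handler; only enqueues.
        self._mailbox.put(signum)

    def _run(self):
        # All handling logic runs here, off the main thread, one message at a time.
        while True:
            signum = self._mailbox.get()
            if signum is None:  # sentinel posted by close()
                break
            self._handle(signum)

    def close(self):
        self._mailbox.put(None)
        self._thread.join()
```

On a POSIX system the actor could be registered with something like `signal.signal(signal.SIGUSR1, actor.post)`. Because messages are processed sequentially by a single thread, the handling logic needs no locks of its own.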

Bug fixes:

  • fixed missing handling of pynvml.NVMLError_NotSupported exception (issue #262);
  • fixed a cleanup issue after profiling multiprocessor and multithreaded programs;
  • fixed a failure to account for elapsed time when zero frames were recorded (issue #269).

New features:

  • added JSON output option (--json);
  • added programmatic profile control (scalene_profiler.start() and scalene_profiler.stop()).
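
A minimal sketch of how the programmatic control might be used to profile only one region of a program. The `profiled_region` helper and the `RecordingProfiler` stand-in are hypothetical; the helper relies only on the `start()` and `stop()` calls named above. It assumes the program is launched under Scalene and that the profiler object is imported with `from scalene import scalene_profiler`, which may differ between versions.

```python
from contextlib import contextmanager


@contextmanager
def profiled_region(profiler):
    """Hypothetical helper: profile only the enclosed block.

    `profiler` is any object exposing start() and stop(), such as
    scalene_profiler when the program runs under Scalene.
    """
    profiler.start()
    try:
        yield
    finally:
        # Stop profiling even if the block raises.
        profiler.stop()


class RecordingProfiler:
    """Stand-in with the same two method names, so the sketch runs without Scalene."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def stop(self):
        self.calls.append("stop")


def demo():
    fake = RecordingProfiler()
    with profiled_region(fake):
        total = sum(i * i for i in range(1000))
    return fake.calls, total
```

In a real run, `RecordingProfiler()` would be replaced by the imported `scalene_profiler` object, so that only the code inside the `with` block is profiled.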

Miscellaneous:

  • improved documentation.

Note: this release is for macOS and Linux only.