Releases: SciCompKL/OpDiLib

v2.0.0

14 Nov 20:53

Interface Changes

Macros

  • OPDI_CRITICAL_NAME macro no longer accepts arguments other than name
  • new OPDI_CRITICAL_NAME_ARGS macro
  • new OPDI_MASKED, OPDI_END_MASKED macros
  • new OPDI_SINGLE_COPYPRIVATE, OPDI_SINGLE_COPYPRIVATE_NOWAIT macros
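
For context, the OpenMP feature behind the new OPDI_SINGLE_COPYPRIVATE macros is the copyprivate clause of the single construct, which broadcasts a value from the one executing thread to the private copies of all other threads in the team. The following is a minimal sketch in plain OpenMP, without OpDiLib macros, since these notes do not spell out the macros' argument syntax. The function name allThreadsReceive is hypothetical and chosen only for illustration.

```cpp
// Plain OpenMP sketch (no OpDiLib macros): one thread assigns a value inside
// a single construct, and copyprivate broadcasts it to every other thread.
// Returns true if every thread in the team observed the broadcast value.
bool allThreadsReceive(double value) {
  int mismatches = 0;

  #pragma omp parallel reduction(+ : mismatches)
  {
    double local = 0.0;  // private to each thread

    #pragma omp single copyprivate(local)
    {
      local = value;  // executed by exactly one thread of the team
    }

    // The single construct has completed on all threads here, so each
    // private copy of local should hold the broadcast value.
    if (local != value) {
      mismatches += 1;
    }
  }

  return mismatches == 0;
}
```

Compiled without OpenMP support, the pragmas are ignored and the function runs serially with the same result.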

Layer Interfaces

  • introduce OmpLogicInterface::beginSkippedParallelRegion, OmpLogicInterface::endSkippedParallelRegion
  • refactor OmpLogicInterface::resetTask -> OmpLogicInterface::resetImplicitTask
  • changed offset in enums OmpLogicInterface::ScopeEndpoint and OmpLogicInterface::SyncRegionKind
  • introduce BackendInterface::getCriticalIdentifier, BackendInterface::getReductionIdentifier, BackendInterface::getOrderedIdentifier
  • refactor BackendInterface::getNestedLockIdentifier -> BackendInterface::getNestLockIdentifier
  • refactor BackendInterface::getTaskData -> BackendInterface::getImplicitTaskData

Compiler Options

  • introduce the following options
    • OPDI_BACKEND_GENERATE_MASKED_EVENTS
    • OPDI_BACKEND_GENERATE_WORK_EVENTS
    • OPDI_OMPT_BACKEND_IMPLICIT_TASK_END_SOURCE
    • OPDI_OMPT_BACKEND_BARRIER_IMPLEMENTATION_BEHAVIOUR
    • OPDI_OMP_LOGIC_CLEAR_ADJOINTS
    • OPDI_SYNC_REGION_BARRIER_BEHAVIOUR
    • OPDI_SYNC_REGION_BARRIER_IMPLICIT_BEHAVIOUR
    • OPDI_SYNC_REGION_BARRIER_EXPLICIT_BEHAVIOUR
    • OPDI_SYNC_REGION_BARRIER_IMPLEMENTATION_BEHAVIOUR
    • OPDI_SYNC_REGION_BARRIER_REVERSE_BEHAVIOUR

Instrumentation

  • all functions in OmpLogicInstrumentInterface receive AD data as parameters
  • calls to instrumentation occur under the same conditions as AD treatment, like tape activity

Features

  • support for custom mutexes
  • support for masked construct
  • support for broadcasts via single copyprivate
  • provide EmptyTool as a default tool layer implementation
  • support for disabling OpDiLib at runtime: set opdi::tool to an instance of EmptyTool (preferred) or nullptr
  • endpoints of sync regions that result in reverse barriers become configurable
  • optionally use the implicit barriers at the end of parallel regions to produce ImplicitTaskEnd events of non-primary threads
  • expose mechanism to skip AD treatment of parallel regions
  • backend-independent solution for copy operations at the beginning of parallel regions
  • optionally clear adjoints as part of tape resets
  • the OMPT backend's implementation of opdi_* functions can be included as a default runtime implementation
  • syntax checking supports local enabling/disabling via // opdi-syntax-on, // opdi-syntax-off
  • additions to syntax definition
  • modernize declaration of custom reductions
  • faster tests, extended test suite and CI pipeline (see below)
  • changes of OmpLogicOutputInstrument in accordance with other changes in OpDiLib
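
The runtime-disabling item above offers two options, an EmptyTool instance or nullptr. The sketch below illustrates the general difference between a no-op object and a null pointer at call sites. It is an assumption about the usual motivation for such a design, not a statement of OpDiLib's rationale, and all names in it (ToolSketch, CountingToolSketch, EmptyToolSketch, and both functions) are hypothetical stand-ins rather than OpDiLib's actual classes or interface.

```cpp
// Hypothetical stand-ins; none of these names belong to OpDiLib's API.
struct ToolSketch {
  virtual ~ToolSketch() = default;
  virtual void onRegionBegin() = 0;
};

// A tool that does real work, here reduced to counting calls.
struct CountingToolSketch : ToolSketch {
  int calls = 0;
  void onRegionBegin() override { ++calls; }
};

// A no-op tool: every call is valid and does nothing.
struct EmptyToolSketch : ToolSketch {
  void onRegionBegin() override {}
};

// Call site written against a tool that is always a valid object.
// Passing the no-op tool disables the work without any extra branch.
void runRegionAssumingValidTool(ToolSketch& tool) {
  tool.onRegionBegin();
}

// Call site that tolerates a missing tool and therefore has to check for
// nullptr before every use. Returns whether the tool was invoked.
bool runRegionAllowingNull(ToolSketch* tool) {
  if (tool == nullptr) {
    return false;
  }
  tool->onRegionBegin();
  return true;
}
```

With a no-op object, code paths that use the tool stay identical whether or not the library is active; with a null pointer, each use needs its own guard.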

Performance Improvements

  • implicit barriers on parallel regions do not produce reverse barriers
  • generating events for masked constructs and worksharing constructs becomes optional
  • MutexOmpLogic::onMutexReleased no longer acquires an internal mutex
  • reductions in the macro backend no longer require additional internal locks and use the address of the parallel data as wait identifier
  • ordered constructs in the macro backend no longer share their wait identifier and use the address of the parallel data as wait identifier

Refactoring and Code Quality

  • refactor MutexOmpLogic, more expressive names and reduced boilerplate, comments regarding thread-safety
  • refactor ParallelOmpLogic and ImplicitTaskOmpLogic, more expressive names and fewer void* pointers
  • transition from master to masked throughout OpDiLib
  • remove ProbeScopeStatus, resolve order of ImplicitTaskProbe and ReductionProbe via omp_get_level
  • dedicated header file for macros related to thread sanitizer
  • refactoring of further smaller aspects throughout OpDiLib
  • add warnings, errors, and assertions throughout OpDiLib

Fixes

  • ensure that setAdjointAccessMode has no effect when called in the reverse pass
  • address compiler warnings
  • fix bug in application of syntax checking to a single file
  • fix ranges in TestExternalFunctionLocal for small data and/or many threads
  • fix memory leak: implicit tasks free allocated tape positions

Tests

Test Suite

  • test suite moves from C++11 to C++17
  • testing with strict warning levels
  • updated image versions
  • new tests: TestCriticalNameHint, TestExternalFunctionLogicCalls, TestCustomMutex, TestSingleCopyprivate, TestSingleCopyprivateNowait, TestParallelCopyin, TestParallelFirstprivate, TestParallelFirstprivate2, TestParallelForLastprivate, TestOrderedMultiple, TestOrderedNowait, TestOrderedReduction, TestLock2
  • new drivers
    • DriverPrimal for primal computations in the presence of OpDiLib
    • DriverFirstOrderForward for the tapeless forward mode of AD in the presence of OpDiLib
    • DriverFirstOrderReverseVector for vector mode tests
  • drivers specify associated tests explicitly
  • introduce driver-specific flags
  • where possible, prefer OpDiLib's tool interface over direct interaction with the OO AD tool
  • faster tests
    • smaller internal data sets
    • default to -O1
    • automatic deduction of appropriate parallelism
    • tests do not use atomics for reversal of serial parts
  • instrumentation no longer tied to debug mode
  • support for excluding specific tests and drivers from testing
  • container file for thread sanitizer tests

CI Pipeline

  • test different configurations of OpDiLib
    • source of ImplicitTaskEnd events
    • generation of events for masked constructs, worksharing constructs
  • automated syntax checking
  • tests with address sanitizer
  • tests with thread sanitizer

Miscellaneous

  • parallel regions suspend recording on the encountering task's tape
  • barriers always produce events for both endpoints
  • backend layer becomes responsible for producing events for both reduction-related barriers
  • new newsletter address
  • Zenodo publication

v1.7.1

27 Oct 12:15

Features

  • macros that indicate OpDiLib's version

v1.7

07 Feb 16:46

Restructure data of parallel regions and implicit tasks

  • associate tapes, positions, and adjoint access modes with tasks instead of parallel regions
  • extend data of parallel regions by pointers to the parent task and child tasks
  • backend support for querying the current task's data
  • create task data for initial implicit tasks
  • track adjoint access mode entirely via tasks
  • support resetting tasks according to a tape position, to recover previous adjoint access modes

Features

  • track adjoint access mode of initial implicit task
  • support additional sync region and worksharing types
  • support use of mutexes before tool initialization (OMPT backend)
  • support posting messages that are printed during the reverse pass

Instrument

  • implement onSetAdjointAccessMode in OmpLogicOutputInstrument
  • adapt OmpLogicInstrumentInterface and OmpLogicOutputInstrument according to the other changes

Tests

  • run build tests with enabled output instrument
  • tests treat non-empty output on stderr as error (can be disabled)
  • add tests' error output files to gitignore
  • add assertions to adjoint access control tests
  • add TestAdjointAccessControlNested2, TestTaskReset

Fixes

  • fix transport of adjoint access mode into and out of parallel regions
  • fix using directive in second order test driver
  • fix placement of skipParallelHandling decrement

Chore

  • update license headers for 2025
  • update OpDiLib-related publications in README

v1.6

12 Jul 19:13

Features

  • GitLab CI pipeline
  • tracking of passive parallel regions
  • tsan annotations for mutex synchronization in the reverse pass
  • postEvaluate() method in the logic interface

Fixes

  • build flags in the test system for producing reference files
  • memory leaks
    • in the OPDI_SINGLE macro
    • in tests for external functions
    • clearing the tape pool

Chore

  • update links

v1.5.1

16 Jan 10:44

  • license header update for 2024
  • small fix

v1.5

13 Jul 09:09

  • relicense OpDiLib under LGPLv3

v1.4

20 Jun 14:41

  • Extend OpDiLib instrumentation.
    • instrumentation of master constructs
    • instrumentation of passive barriers
    • instrumentation of passive parallel regions
    • revise and extend reverse pass instrumentation
  • Balance the tape activity calls received by thread 0.

v1.3.2

25 Apr 12:33

  • update publication
  • update tests and examples to CoDiPack 2
  • fix OpDiLib finalization order in tests and examples

v1.3.1

03 Jan 17:43

  • update license header and readme
  • improve portability of tests makefile

v1.3

09 Nov 15:44

  • fix issues regarding reduction combined with nowait
  • default adjoint access mode now configurable
  • extend tests, test system maintenance
  • extend instrumentation, instrumentation maintenance
  • various fixes and small changes