Skip to content

Tips for Symbolic Code

Dave Parker edited this page Apr 22, 2019 · 2 revisions

The symbolic engines of PRISM rely on BDD- and MTBDD-based representations of probabilistic models using the CUDD library. There is a C++-based abstraction layer in the dd package, a Java-based wrapper for that in the jdd package and various C++-based implementations of the high-level symbolic algorithms in the mtbdd, hybrid and sparse packages.

On this page, we provide some hints for programming in the symbolic engines.

Symbolic MTBDDs from Java

  • Testing for equality: Use the equals method of JDDNode, which performs an equality test using MTBDD equality. To test against constants 0 (false, empty set, ...) or 1, you can use node.equals(JDD.ZERO) and node.equals(JDD.ONE), without having to bother with reference handling for these constants.
  • In general, to obtain a (referenced) constant node use: JDDNode c = JDD.Constant(5.0);
  • Use the various static JDD methods for performing MTBDD operations.
  • For low-level visualisation, use JDD.PrintMinterms or JDD.ExportDDToDotFile to obtain information of the MTBDD. Use the -extraddinfo command line switch to obtain information about the variable indices and the corresponding variable name in the model.

Reference Counting

The handling of references to the MTBDD nodes is crucial for PRISM/CUDD to work properly. If there are too few references, the CUDD garbage collection can remove nodes that still have references from Java, generally leading to a program crash. If there are too many references, then the associated memory is leaked and not freed for reuse.

  • Many of the methods accepting or returing JDDNode document whether they will dereference arguments and reference the result. It is an ongoing effort to properly annotate the existing code base.
  • Generally, JDDNode returned from JDD methods are referenced and the methods will dereference the argument JDDNode.
  • Collections of MTBDD variables (JDDVars) used as arguments of a method are generally not dereferenced, but should be explicitly cleared (JDDVars.derefAll) when they are no longer needed.
  • Use JDD.Deref to dereference a JDDNode that is no longer needed.
  • Use the copy method of JDDNode if you need a copy. This is preferred over just increasing the reference count (via JDD.Ref) as it allows better reference count debugging (see below).
  • "Lightweight" JDDNodes for getThen and getElse: When you are traversing the MTBDD structure from Java, the JDDNodes returned from the getThen and getElse methods have no increased references, as they generally are protected against garbage collection due to the internal references from their parent nodes. Note that these kinds of nodes should only be used for traversal, equality checks and to obtain the value of a constant node. Please don't call Ref or Deref or use them as arguments in JDD method calls. Instead, if you need to convert such a "lightweight" node to a proper node, call copy() on the node returned by getThen or getElse and use the returned node.

Symbolic MTBDDs from C++

If you have to implement MTBDD handling in C++ code, please refer to the existing implementations for guidance.

It is good practice to check each DdNode pointer returned from the CUDD layer against NULL (indicating "out of memory") and propagate NULL back to the Java layer, where a proper error message will be printed. This avoids user-unfriendly segmentation fault errors when CUDD runs out of memory. There are ongoing efforts to consistently implement this in the existing code base.

Debugging

Reference counting problems are a bit difficult to debug: In case of a reference leak, there is a warning at the end of the PRISM run. In the other case, with too many dereferences, you'll get a crash during garbage collection - but only if there is memory pressure that triggers CUDD to actually perform garbage collection. In both cases, the problematic referencing or dereferencing is not in proximity to the visible effect.

You can use the -dddebug switch to activate an additional tracking of the references that occur from the Java side, which will provide error messages or warnings in case that problems with the reference counting are detected. As additional tracking data structures are used, there will be a performance impact when this kind of debugging is activated.

Additionally, every JDDNode is assigned a unique ID, which is consistent across PRISM invocations with the same arguments (as long as the remainder of PRISM behaves deterministically). This allows you to note the ID of a problematic JDDNode (as obtained by a run with -dddebug) and use this ID in a subsequent PRISM run with the -ddtrace n command line switch (for ID n). This will trace all referencing and dereferencing of this particular JDDNode and the related counts, as well as providing stack traces.

The different trace points, i.e., events, are listed below:

  • trace(ptrToNode, ID=n) The JDDNode with ID=n is the result of a JDD method call.

  • trace(Copied from m, ID=n) The JDDNode with ID=n is the result of a copy() call for node m.

  • trace(Copy to m, ID=n) JDDNode n is copied, resulting in a JDDNode with ID n.

  • trace(Ref, ID=n) JDDNode n is dereferenced once.

  • trace(Deref, ID=n) JDDNode n is dereferenced once.

  • trace(Deref (as method argument), ID=n) JDDNode n is dereferenced as an argument in a JDD method call into CUDD. Note that the CUDD reference count reflects the count before the dereferencing. @]

At each trace point, additional information is printed. As an example consider

 ID = 50, CUDD ptr = 0x7fa05ba81fe0, refs for this JDDNode = 2, refs from Java = 4, refs from CUDD (including internal MTBDD refs) = 7
  • "ID" is the JDDNode ID
  • "CUDD ptr" is the raw C++ pointer to the CUDD MTBDD. This will generally change with each PRISM run due to low-level memory layout of the process.
  • "refs for this JDDNode" is the current number of references attached to this particular JDDNode
  • "refs from Java" is the current number of references to this particular CUDD MTBDD node from any of the JDDNodes
  • "refs from CUDD" is the current number of references as seen by CUDD for this particular CUDD MTBDD node. This is the sum of the refs from Java as well as the internal references from other MTBDD nodes in the shared MTBDD data structure of CUDD.

Additional command line flags are available:

  • -ddtraceall: Trace all JDDNodes, regardless of their IDs.
  • -ddtracefollowcopies: Trace JDDNodes that are the result of a copy() call for a node that is already traced.
  • -dddebugwarnfatal: Treat warnings as fatal, i.e., they generate an error message/exception.
  • -dddebugwarnoff: Deactivate warnings.

Currently, one situation is handled as a warning instead of an error: If you dereference a JDDNode to a reference count below zero, but overall the number of references from Java remains positive, only a warning is generated. This is due to the fact that some of the existing code uses such a pattern, i.e., where referencing and the corresponding dereferencing do not necessarily happen against the same JDDNode object. It's ongoing work to improve these cases.

Note that currently only the reference counts from the Java layer are tracked, if there are problems with the reference counting in C++, e.g., in the algorithm implementations in the mtbdd, hybrid or sparse packages, those will not necessarily be detected.

To summarize: Activate -dddebug. If the PRISM run aborts due to a problem, note the location and the ID of the JDDNode and rerun PRISM with enabled tracing for that JDDNode ID. If PRISM finishes normally but provides a warning about reference leaks, note the JDDNode candidate IDs and rerun PRISM with tracing enabled.

The DebugJDD class provides a command-line interface that allows to run some small test cases using the debug options. Please have a look at the source code in jdd/DebugJDD.java. To run this, you can use

PRISM_MAINCLASS=jdd.DebugJDD bin/prism

to obtain usage instructions, where bin/prism is the usual PRISM command-line startup script.

Sanity checks

There is a framework for performing sanity checks for the symbolic MTBDD operations. These checks can be activated using the -ddsanity command-line switch. The implementation of the checks can be found in jdd.SanityJDD. Note that enabling the checks can significantly slow down execution.

Some examples for checks that are performed:

  • For the JDD methods, some checks on the arguments are performed. As an example, for the methods that expect 0/1-MTBDDs such as And, Or or for the first argument of ITE, it is checked that the arguments are actually 0/1-MTBDDs.
  • For the graph-based methods in mtbdd.PrismMTBDD such as the Prob0, Prob1, Prob0E, etc computations, it is checked that the transition matrix is actually a function over the correct BDD-variables and that the state sets for the until formula are contained inside the reachable state space of the model. The latter is a requirement for those methods to work correctly.

It is ongoing work to add additional sanity checks to the existing code base.

Clone this wiki locally