-
Notifications
You must be signed in to change notification settings - Fork 80
Tips for Symbolic Code
The symbolic engines of PRISM rely on BDD- and MTBDD-based representations of probabilistic models using the CUDD library. There is a C++-based abstraction layer in the
dd
package, a Java-based wrapper for that in the
jdd
package and various C++-based implementations of the high-level symbolic algorithms in the mtbdd,
hybrid and
sparse packages.
On this page, we provide some hints for programming in the symbolic engines.
- Testing for equality: Use the
equalsmethod ofJDDNode, which performs an equality test using MTBDD equality. To test against constants 0 (false, empty set, ...) or 1, you can usenode.equals(JDD.ZERO)andnode.equals(JDD.ONE), without having to bother with reference handling for these constants. - In general, to obtain a (referenced) constant node use:
JDDNode c = JDD.Constant(5.0); - Use the various static
JDDmethods for performing MTBDD operations. - For low-level visualisation, use
JDD.PrintMintermsorJDD.ExportDDToDotFileto obtain information of the MTBDD. Use the-extraddinfocommand line switch to obtain information about the variable indices and the corresponding variable name in the model.
The handling of references to the MTBDD nodes is crucial for PRISM/CUDD to work properly. If there are too few references, the CUDD garbage collection can remove nodes that still have references from Java, generally leading to a program crash. If there are too many references, then the associated memory is leaked and not freed for reuse.
- Many of the methods accepting or returing
JDDNodedocument whether they will dereference arguments and reference the result. It is an ongoing effort to properly annotate the existing code base. - Generally,
JDDNodereturned fromJDDmethods are referenced and the methods will dereference the argumentJDDNode. - Collections of MTBDD variables (
JDDVars) used as arguments of a method are generally not dereferenced, but should be explicitly cleared (JDDVars.derefAll) when they are no longer needed. - Use
JDD.Derefto dereference aJDDNodethat is no longer needed. - Use the
copymethod ofJDDNodeif you need a copy. This is preferred over just increasing the reference count (viaJDD.Ref) as it allows better reference count debugging (see below). - "Lightweight"
JDDNodesforgetThenandgetElse: When you are traversing the MTBDD structure from Java, theJDDNodesreturned from thegetThenandgetElsemethods have no increased references, as they generally are protected against garbage collection due to the internal references from their parent nodes. Note that these kinds of nodes should only be used for traversal, equality checks and to obtain the value of a constant node. Please don't callReforDerefor use them as arguments inJDDmethod calls. Instead, if you need to convert such a "lightweight" node to a proper node, callcopy()on the node returned bygetThenorgetElseand use the returned node.
If you have to implement MTBDD handling in C++ code, please refer to the existing implementations for guidance.
It is good practice to check each DdNode pointer returned from the CUDD layer against NULL (indicating "out of memory") and propagate NULL back to the Java layer, where a proper error message will be printed. This avoids user-unfriendly segmentation fault errors when CUDD runs out of memory. There are ongoing efforts to consistently implement this in the existing code base.
Reference counting problems are a bit difficult to debug: In case of a reference leak, there is a warning at the end of the PRISM run. In the other case, with too many dereferences, you'll get a crash during garbage collection - but only if there is memory pressure that triggers CUDD to actually perform garbage collection. In both cases, the problematic referencing or dereferencing is not in proximity to the visible effect.
You can use the -dddebug switch to activate an additional tracking of the references that occur from the Java side, which will provide error messages or warnings in case that problems with the reference counting are detected. As additional tracking data structures are used, there will be a performance impact when this kind of debugging is activated.
Additionally, every JDDNode is assigned a unique ID, which is consistent across PRISM invocations with the same arguments (as long as the remainder of PRISM behaves deterministically). This allows you to note the ID of a problematic JDDNode (as obtained by a run with -dddebug) and use this ID in a subsequent PRISM run with the -ddtrace n command line switch (for ID n). This will trace all referencing and dereferencing of this particular JDDNode and the related counts, as well as providing stack traces.
The different trace points, i.e., events, are listed below:
-
trace(ptrToNode, ID=n)The JDDNode with ID=n is the result of a JDD method call. -
trace(Copied from m, ID=n)The JDDNode with ID=n is the result of a copy() call for node m. -
trace(Copy to m, ID=n)JDDNode n is copied, resulting in a JDDNode with ID n. -
trace(Ref, ID=n)JDDNode n is dereferenced once. -
trace(Deref, ID=n)JDDNode n is dereferenced once. -
trace(Deref (as method argument), ID=n)JDDNode n is dereferenced as an argument in a JDD method call into CUDD. Note that the CUDD reference count reflects the count before the dereferencing. @]
At each trace point, additional information is printed. As an example consider
ID = 50, CUDD ptr = 0x7fa05ba81fe0, refs for this JDDNode = 2, refs from Java = 4, refs from CUDD (including internal MTBDD refs) = 7
- "ID" is the JDDNode ID
- "CUDD ptr" is the raw C++ pointer to the CUDD MTBDD. This will generally change with each PRISM run due to low-level memory layout of the process.
- "refs for this JDDNode" is the current number of references attached to this particular JDDNode
- "refs from Java" is the current number of references to this particular CUDD MTBDD node from any of the JDDNodes
- "refs from CUDD" is the current number of references as seen by CUDD for this particular CUDD MTBDD node. This is the sum of the refs from Java as well as the internal references from other MTBDD nodes in the shared MTBDD data structure of CUDD.
Additional command line flags are available:
-
-ddtraceall: Trace allJDDNodes, regardless of their IDs. -
-ddtracefollowcopies: TraceJDDNodesthat are the result of acopy()call for a node that is already traced. -
-dddebugwarnfatal: Treat warnings as fatal, i.e., they generate an error message/exception. -
-dddebugwarnoff: Deactivate warnings.
Currently, one situation is handled as a warning instead of an error: If you dereference a JDDNode to a reference count below zero, but overall the number of references from Java remains positive, only a warning is generated. This is due to the fact that some of the existing code uses such a pattern, i.e., where referencing and the corresponding dereferencing do not necessarily happen against the same JDDNode object. It's ongoing work to improve these cases.
Note that currently only the reference counts from the Java layer are tracked, if there are problems with the reference counting in C++, e.g., in the algorithm implementations in the mtbdd, hybrid or sparse packages, those will not necessarily be detected.
To summarize: Activate -dddebug. If the PRISM run aborts due to a problem, note the location and the ID of the JDDNode and rerun PRISM with enabled tracing for that JDDNode ID. If PRISM finishes normally but provides a warning about reference leaks, note the JDDNode candidate IDs and rerun PRISM with tracing enabled.
The DebugJDD class provides a command-line interface that allows to run some small test cases using the debug options. Please have a look at the source code in jdd/DebugJDD.java. To run this, you can use
PRISM_MAINCLASS=jdd.DebugJDD bin/prismto obtain usage instructions, where bin/prism is the usual PRISM command-line startup script.
There is a framework for performing sanity checks for the symbolic MTBDD operations. These checks can be activated using the -ddsanity command-line switch. The implementation of the checks can be found in jdd.SanityJDD. Note that enabling the checks can significantly slow down execution.
Some examples for checks that are performed:
- For the
JDDmethods, some checks on the arguments are performed. As an example, for the methods that expect 0/1-MTBDDs such asAnd,Oror for the first argument ofITE, it is checked that the arguments are actually 0/1-MTBDDs. - For the graph-based methods in
mtbdd.PrismMTBDDsuch as the Prob0, Prob1, Prob0E, etc computations, it is checked that the transition matrix is actually a function over the correct BDD-variables and that the state sets for the until formula are contained inside the reachable state space of the model. The latter is a requirement for those methods to work correctly.
It is ongoing work to add additional sanity checks to the existing code base.