Efficiency Considerations
Multiple efficiency considerations informed the implementation of this architecture.
- **Array-encoded terms** The binary tree corresponding to a Narsese sentence can be encoded in an initially empty array by putting the root node at position `i = 1`. Then, recursively, the left child of the node at position `i` is put at position `i*2` and the right child at position `i*2+1`. Due to the locality of the assigned memory region, this representation is more efficient for CPUs than one which needs to follow pointers to traverse the tree. It needs more memory, but memory nowadays is cheap. A minimal sketch of this indexing scheme follows.
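
  In this sketch, the `Term` type, its size bound, and the helper names are hypothetical rather than ONA's actual definitions:

  ```c
  #include <stdio.h>

  #define TERM_SIZE 64 // assumed bound on the number of tree slots

  typedef struct {
      int atoms[TERM_SIZE]; // index 0 unused so the root sits at index 1
  } Term;

  // Child positions follow directly from the parent index i.
  static int leftChild(int i)  { return i * 2; }
  static int rightChild(int i) { return i * 2 + 1; }

  int main(void) {
      Term t = {0};
      t.atoms[1] = 'A';             // root
      t.atoms[leftChild(1)] = 'B';  // left child, index 2
      t.atoms[rightChild(1)] = 'C'; // right child, index 3
      printf("root=%c left=%c right=%c\n",
             t.atoms[1], t.atoms[2], t.atoms[3]);
      return 0;
  }
  ```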
- **Recursion-free unification** Since the terms are laid out in an array, variable unification can be implemented with a single traversal through the term arrays to be unified. A recursive formulation of unification is more expensive, as it incurs unnecessary stack manipulation for each recursive call, plus pointer indirections when terms are not stored in contiguous memory. While traversing, a hashmap of variable assignments is maintained, which lets the unification fail if an inconsistent assignment is made (see the sketch below).
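
  A minimal sketch of such a single-pass unification, assuming variables are encoded as negative atom ids that bind only to single atoms, and using a flat array in place of the hashmap for brevity (none of these choices reflect ONA's actual encoding):

  ```c
  #include <stdbool.h>

  #define TERM_SIZE 64 // assumed bound, as in the sketch above
  #define VARS_MAX 8   // assumed maximum number of distinct variables

  typedef struct { int atoms[TERM_SIZE]; } Term;

  static bool isVariable(int atom) { return atom < 0; } // assumed encoding

  // One linear pass over both arrays: no recursion, no pointer chasing.
  // Binding a variable to a whole subtree would additionally require
  // copying the corresponding subtree slice, omitted here for brevity.
  bool unify(const Term *pattern, const Term *instance) {
      int assignment[VARS_MAX + 1] = {0}; // 0 means "unassigned"
      for (int i = 1; i < TERM_SIZE; i++) {
          int p = pattern->atoms[i];
          int t = instance->atoms[i];
          if (isVariable(p)) {
              int v = -p; // variable id 1..VARS_MAX
              if (assignment[v] == 0) {
                  assignment[v] = t;   // record first assignment
              } else if (assignment[v] != t) {
                  return false;        // inconsistent assignment
              }
          } else if (p != t) {
              return false;            // constant atoms must match
          }
      }
      return true;
  }
  ```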
- **Pre-allocated memory** Since every data structure is bounded, all memory is allocated at process startup. This avoids expensive memory allocations and releases at runtime (a sketch follows).
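
  In C this amounts to reserving all bounded structures as static storage, roughly as follows (the bounds and type names are illustrative, not ONA's actual configuration constants):

  ```c
  #define CONCEPTS_MAX 16384 // illustrative bound
  #define EVENTS_MAX 1024    // illustrative bound

  typedef struct { double priority; /* further fields */ } Concept;
  typedef struct { double priority; /* further fields */ } Event;

  // The whole footprint is reserved once at program load as static
  // storage; no malloc/free calls occur on the inference hot path.
  static Concept concepts[CONCEPTS_MAX];
  static Event events[EVENTS_MAX];
  ```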
- **Contiguous memory allocation** Concepts are stored as contiguous memory blocks. This allows concept updates to utilize cache locality effectively for higher performance in processing, as illustrated below.
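
  As an illustration, a bulk update over such a contiguous block touches memory strictly sequentially, so each fetched cache line serves several concepts (the `priority` field and the decay operation are hypothetical examples):

  ```c
  typedef struct {
      double priority;
      /* further concept fields would follow here */
  } Concept;

  // Sequential traversal of the contiguous block is cache-friendly:
  // the hardware prefetcher can keep the working set in cache.
  void decayConceptPriorities(Concept *concepts, int n, double factor) {
      for (int i = 0; i < n; i++) {
          concepts[i].priority *= factor;
      }
  }
  ```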
- **Max-heap based Priority Queue** The priority queue is implemented as a bounded max-heap. This data structure allows `O(1)` access to the highest-ranked element. Insertion happens in `O(log(n))`, which allows for a theoretically optimal `O(n*log(n))` re-sort, with `n` being the number of elements in the data structure. A minimal sketch follows.
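
  The sketch uses a 1-indexed array and sift-up insertion; type and function names are hypothetical, not ONA's actual priority queue API:

  ```c
  #include <stdbool.h>

  #define QUEUE_MAX 1024 // illustrative capacity bound

  typedef struct { double priority; int item; } Entry;

  typedef struct {
      Entry entries[QUEUE_MAX + 1]; // 1-indexed; entries[1] is the maximum
      int size;
  } PriorityQueue;

  // O(1): the highest-ranked element is always at the root.
  // Assumes the queue is non-empty.
  Entry PriorityQueue_Peek(const PriorityQueue *q) {
      return q->entries[1];
  }

  // O(log(n)): place the new entry at the end, then sift it up.
  bool PriorityQueue_Push(PriorityQueue *q, Entry e) {
      if (q->size >= QUEUE_MAX) {
          return false; // bounded: the caller must evict an element first
      }
      int i = ++q->size;
      while (i > 1 && q->entries[i / 2].priority < e.priority) {
          q->entries[i] = q->entries[i / 2]; // move the parent down
          i /= 2;
      }
      q->entries[i] = e;
      return true;
  }
  ```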
- **Multi-threading** Belief selection happens within a parallel OpenMP for-loop, allowing the inference to happen in parallel across multiple threads, with only the memory updates being locked to store the results (a sketch of this pattern follows).
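
  A sketch of this pattern with stand-in types and functions (hypothetical; ONA's actual inference and memory routines differ):

  ```c
  #include <omp.h>

  typedef struct { int id; } Belief;
  typedef struct { int id; } Derivation;

  // Stand-in for the inference step: pure, so safe to run in parallel.
  static Derivation Inference_Derive(const Belief *b) {
      Derivation d = { b->id }; // dummy derivation
      return d;
  }

  static int storedCount = 0; // shared state, guarded by the critical section

  static void Memory_Store(const Derivation *d) {
      (void)d;
      storedCount++; // mutation of shared memory, must be serialized
  }

  void processBeliefs(const Belief *beliefs, int n) {
      // Iterations are distributed across the OpenMP thread team;
      // the derivation itself runs fully in parallel.
      #pragma omp parallel for
      for (int i = 0; i < n; i++) {
          Derivation d = Inference_Derive(&beliefs[i]);
          // Only the memory update is locked to store the result.
          #pragma omp critical
          {
              Memory_Store(&d);
          }
      }
  }
  ```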
- **Inference rule code generation** Instead of interpreting inference rules at runtime or hardcoding them in code that is difficult to maintain, the Narsese parser is utilized to support the following inference rule format, established as a C macro: `(Premise1, Premise2, |-, Conclusion, TruthFunction)`. Inference rule code is automatically generated from this description, including complete unrolling of the unification. This allows modern compiler optimizations to make the rule matching for multiple rules more efficient. An example rule in this format is shown below.
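
  For example, the NAL deduction rule could be expressed in this format as follows (the truth function name `Truth_Deduction` is assumed for illustration; the exact identifiers in the code base may differ):

  ```c
  // Two premises, the |- separator, the conclusion, and the truth function:
  ((S --> M), (M --> P), |-, (S --> P), Truth_Deduction)
  ```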