Skip to content

Spotmap

Andrew Strain edited this page Aug 29, 2019 · 44 revisions

Nested boxes

'Spotmap' is made for this project as a kind of spatial index to efficiently link particles to smaller and smaller groups, which are recursively nested within each others spatial bounds. Spotmap was designed as essentially a complicated linked list without much idea of what a "tree" abstract-data-type should generally entail.

A mixture of terminology and a somewhat ad-hoc range of names exists for examples of tree data types ("r-tree s-tree b-tree b+tree b*tree...etc"). Some basic 'data tree' terminology is helpful to share:

Terminology of Tree Graphs

  • graph - A 'graph' is like a wireframe representation of links between things. (From mathematics 'graph theory')

  • tree - A data structure like a linked list, except items can have multiple onward links, enabling a tree shaped graph, having a single root and multiple ends or 'leaf nodes'.

  • nodes - The records in a 'tree' which are each linked to a parent record and can each link to multiple child records.

  • edges - The links between nodes.

  • keys - Nodes references to the data which their trees purpose is to index for efficient searching. These can be to anything: files, particles, text characters, genes or readings. These references to data are distinct from edges (the internal node-to-node links) They link the graph nodes to the raw data. They may be best known as 'keys'.

Indexing Scheme and Metadata

The details of how to 'nodes, edges and keys' are stored or encoded are referred here as the indexing scheme. For example the keys of each node could be stored in fixed arrays or in dynamically sized lists, or they could be keyed only at the leaf nodes, or at every node.

Metadata about the keys which are grouped by a node can be attached to it. Spotmap's nodes each store 'bounding box values' which define the extents of the key datas location in space. This metadata enables the main purpose of the graph which is to enable efficient spacial queries. This is the purpose of a type of tree known as "R-tree", however spotmaps indexing scheme and core implementation could also be suitable for graphing non-spatial information, like files and directories.

The bounding box metadata on each of spotmaps nodes makes them conceptually a cuboid or 'spot' in 3d space. All data keyed by a node and all its child nodes exist within its spot in space.

Characteristics of Spotmap's Indexing Scheme:

  • It has minimal constraints on number of node edges or keys or depth, due to implicit encoding of these details.

  • All node relations and content (edges and keys) are defined by about 5 pointers per node, and by one sorted-list of all key ids which is shared between all nodes.

  • Key data is effectively 100% packed (it is implied in the sorted-list order).

  • Edge information (children pointers) is mostly implied, but can also be padded to allow insertions into the address sequence without complex record shifting.

  • Spotmap is not inherently 'self balancing' (a feature of some tree designs).

Spotmaps tree as an SOC (Structure of Arrays)

The following 6 flat arrays can encode any tree hierarchy of spots:

One sorted-list of all items:

  delineation_sequence[item_id] //sorted list of the keys ids

A collection of spot records having these fields:

  spot.dln_anchor[spot_id] //start of nodes keys in delin_seq  
  spot.dln_span[spot_id]   //number of spots keys (optional)  
  spot.parent[spot_id]     //id of parent spot                
  spot.firstchild[spot_id] //id of *first child* spot           
  spot.depth[spot_id]      //used in maintenance              

Any spot can link any number of child spots and items. Each spot also has computed and stored spatial bounds (high and low) and total mass and center of mass. These details are computed after initial 'loading' of the Tree (creating a fresh hierarchy) and they need to be refreshed in use. spot.depth of each spot is recorded to assist the recurrent task of refreshing spots measurements.

Visualising Spotmaps Node Index Hierarchy

A logging function was made to examine spotmaps internal references. Here Spot ids are printed as chars "1" to "G", horizontally along the order of the delineation_sequence and down through spot-nesting-depth. In this example 50 key items (across) are contained by 43 spot records ("1" to "G") :

<---- length of delineation_seq (total keys) --->
1111111111111111111111111111111111111111111111111
2222222222222222333334444444444444445555555555555
6667777777778888jjjjknooooopppppppqqxxxxxyzzzzABC
   999aaaaaahhhilllm  rrssstttttuu  DDEEE FGGG   
      bccccc               vvvww                 
       dddde                                     
       fffg                                      

Notes

  • The Root spot '1' includes all keys. Spot 2 includes 16 keys, etc...
  • The ids of the spots increase in the order in which they were recursively split into sub-spots.
  • The ids of the spots under any parent or 'trunk' node, form a single unbroken sequence that starts with the trunks firstchild id, and ends on the last spot which does not have trunk as an ancestor - (or on the spot before the next spot whose parent is a sibling of the trunk, or with some other quick trick...)
  • A single unbroken sequence in the 'delineation_sequence' is what defines the individual items of each spot: from spot.dln_anchor to +spot.dln_span ( or to spot.dln_anchor[i+1] ... )
  • Fast memory efficient operations are possible by this implicit ordering of spots relationships by spots id, and of items relationships to multiple spots via pointers to the 'sorted-by-spot-nest' delineation_sequence.

Node Subdivision

Nodes can be subdivided into child nodes by any measure of their key data - as the tree structure does not concern why keys are separated into nodes and sub-nodes. In Spotmap, nodes are measured as 'spots' in 3D space with (the key data of) child nodes fitting inside their parents spot. The process of dividing and subdividing keys into hierarchy of tree nodes has been called 'bulk loading' in physics contexts. In spotmap the creation and linking of each level of child spots from a parent spot is called 'budding', the process of recursively 'budding' an 'ancestor' spot and all offspring until leaf nodes of certain size result is called 'blooming'.

Standard Bulk Loading

  1. Spotmap does a sweep of all 'jotes' (particles) to create the root spot and to intialise an unsorted list of jotes, which becomes the delineation_sequence -once it is sorted into the order which the spots are nested in.
  2. The root spot (and every following spot) is then surveyed to split it into 'child spots'.
  3. The jote-ids in the delineation-sequence are sorted into the series of the child-spots.
  4. Each child-spot's records include its 'dln_anchor' and 'dln_span'.
  5. Spots are surveyed and sorted into 'kid spots' in the dln_sequence, and every spots details are recursively written in the spot record to point to their dln_span. When a kid spot is small enough, it is not further divided, these last spots are 'leaf' nodes.

Budding by Velocity

In spotmap the first generation of nodes from the rootnode is divided by keys velocity, resulting in 10 spots with the fastest moving keys in the first and slowest moving in the last. These 10 spots are subsequently 'bloomed' by space - subdivision by space is the purpose but by dividing a primary generation by velocity, spots containing faster moving keys can be rebloomed more often on subsequent frames, and spots containing slow moving keys can be unaltered and just have their bounding boxes remeasured.

Fractional Reloading

With knowing the unbroken id-range under any spot, spots can be 'rebloomed' and written over the old record.

  • If less spot_ids are used, the id records of the unused are tagged as invalid by setting their spot.parent to 0.
  • If more spot_ids are necessary, a sufficient gap in the id-sequence is made with a shift of all subsequent spots elements and adjustment of spots child and parent values which are past the gap range.

Fractional spotmap reloading is implemented. The whole scheme is laborous to describe fully, here is a summary in haste:

  • The 'bloom ids' under each 'velocity trunk' generally pad themselves to about 125% of the running average size of a 'trunk-bloom' (The more spots, the less bloom-size variance and less padding is necessary)
  • On some update frames, adjacent 'velocity trunks' are resplit together, allowing faster and slower keys to migrate through the order of faster or slower velocity trunks.
  • Velocity trunks need rebloomed by space after 'rebudding' by velocity.
  • On some frames the only update is the fastest velocity trunk is rebloomed by space.
  • 1/10th or 1/5th of the spotmap is effectively rebloomed on every frame.
  • All the valid spot records are quite quickly remeasured on every frame - to update their bounding-box info - this does not involve regraphing the tree, it just involves navigating the existing tree.

In this spotmap rendering it is possible to observe that the spots containing the fast moving particles in the center of the system are remapped much more often than those slower particles.

Uses of spotmap

Spotquest notes describes a use of spotmap.

Tail Notes...

  • Sourcefile: figment/spotmap.js (this file is currently messy)

  • For ideal performance leaf nodes could cluster more efficiently than the present method (simple subdivision by spatial grid until leaf node size is achieved). A method to split nodes of intermediate size into clustered leaf nodes is half developed.

Clone this wiki locally