Incremental flow tree construction

How render object construction works in WebKit

There are three cases to consider:

Adding a DOM node—A RenderObject is created for the DOM node in question and then the addChild method is called on the RenderObject belonging to the DOM node's parent. addChild is a virtual method and will perform different fixups depending on its type. For example, RenderBlocks create anonymous boxes if necessary to accommodate the new RenderObject child.
Removing a DOM node—As above, the RenderObject belonging to the former DOM node's parent is retrieved and removeChild is called on it to remove the child's render object. removeChild is a virtual method and does different things depending on the class's type. For instance, RenderBlock removes anonymous boxes as necessary.
Modifying the style of a DOM node—The RenderObject corresponding to the DOM node is retrieved and the old and new styles are compared to determine a StyleDifference. Then styleDidChange is called on the DOM node. This floods m_needsLayout bits (and possibly other bits) around the tree. m_needsLayout seems to be roughly equivalent to "needs the assign-heights pass to run". Constraint solving passes are skipped as necessary if the appropriate bits are set.

Possible algorithm for Servo

Adding and removing nodes

Observe that what WebKit does works out to a bottom-up traversal: it creates render objects and then stitches them in or out recursively by calling addChild or removeChild. This suggests that an analogous parallel bottom-up computation is possible. Suppose that we have a data type FlowManipulation which looks as follows:

enum FlowManipulation {
    AddFlows([Flow]),
    AddBoxes([Box]),
    RemoveFlows([Flow]),
    RemoveBoxes([Box]),
}

Then for each node we have a kernel function with this type:

Input: A DOM node, associated computed style, and possibly FlowManipulations for kids.
Output: A FlowManipulation instance.

The algorithm for this kernel function is as follows:

Use the display, position, and content properties of the computed style, as well as DOM node properties, to create the appropriate flows or render boxes (if any).
If any flows are to be generated, create a FlowManipulation describing them.
Perform flow manipulations for kids as necessary: for example, if this is a div node that created a BlockFlow, then AddFlows operations will create children of this BlockFlow. Each type of flow will have a different implementation for the various methods: for example, for BlockFlow instances, AddFlows or AddBoxes methods can create anonymous blocks, and RemoveFlows or RemoveBoxes methods remove anonymous boxes. It is important that FlowManipulations be executed in the correct order, and in the right place in the tree: for this, the DOM node that each member of the AddFlows/AddBoxes/RemoveFlows/RemoveBoxes instruction corresponds to needs to be consulted.

We run this kernel function on each node that was added or removed until no more flow manipulations occur. When the document is first created, the root node must always produce an AddFlows manipulation with a single root element.

Modifying nodes

This can just be done sequentially with markDirty style operations, since it's not likely to be high in the profile. (It isn't in Gecko.) We should be able to fairly easily parallelize if we have to. Constraint solving passes will ignore nodes that are not dirty.

IRC conversation, 8/28/2013

[6:33pm] eatkinson: pcwalton: so you want to look into DOM nodes of the FlowManipulation of the children right?
[6:33pm] tjc joined the chat room.
[6:33pm] tjc left the chat room. (Quit: Places to go, people to annoy)
[6:33pm] pcwalton: look into DOM nodes? not sure what you mean
[6:34pm] sanxiyn: eatkinson: Not sending it because 1. not sure having the same box attached to different flows makes sense 2. box builder seems to be planned to be rewritten anyway
[6:34pm] eatkinson: sanxiyn: awesome. i'd be really interested in taking a look at that at some point
[6:35pm] eatkinson: pcwalton: last part of point 3. in "Adding and removing nodes"
[6:39pm] pcwalton: eatkinson: ah right. we look at the DOM node in order to figure out where to place the new flow in the tree
[6:40pm] pcwalton: eatkinson: the reason I proposed doing it this way is that if you have a bunch of flows to be added that are all siblings of each other
[6:40pm] pcwalton: in WebKit you just do insertBefore, insertBefore, insertBefore
[6:40pm] pcwalton: and that's OK because it's sequential
[6:41pm] eatkinson: pcwalton: sounds racy if we do it in parallel, but maybe it doesn't matter if DOM nodes are read-only
[6:41pm] pcwalton: but in Servo if it's a bottom up traversal you don't have access to your flow siblings
[6:41pm] pcwalton: because that'd be racy
[6:41pm] pcwalton: but there is an immutable thing to use to coordinate, and that's the DOM node
[6:41pm] pcwalton: so I figured we'd use that as the point of reference
[6:42pm] pcwalton: eatkinson: but thinking on this later, maybe AddFlow and RemoveFlow doesn't even really make sense at this point
[6:42pm] pcwalton: since AddFlow is kind of silly without an anchor point
[6:42pm] pcwalton: maybe the kernel function should just shove the flow onto the DOM node somehow
[6:43pm] pcwalton: and then the parent's kernel function just iterates over its kids
[6:43pm] pcwalton: and stitches together any new flows it finds
[6:43pm] sanxiyn: How does this work with "containing block" stuff?
[6:43pm] eatkinson: pcwalton: that's kind of what i was thinking
[6:43pm] pcwalton: sanxiyn: good question. WebKit does some extra things here
[6:43pm] pcwalton: we need to think about containing blocks
[6:43pm] pcwalton: eatkinson: that is, the parent just iterates in lockstep over your old flows and the DOM nodes
[6:43pm] pcwalton: and when it finds a mismatch it either stitches in new flows or throws out dead old flows
[6:44pm] pcwalton: until the mismatch is corrected
[6:45pm] sanxiyn: pcwalton: Where should I look at in WebKit? (I tried, but can't really navigate source code of that project)
[6:45pm] sanxiyn: That is, on extra things it does for containing block
[6:45pm] eatkinson: pcwalton: i'll see if i can write up some of my thoughts later tonight. i want to make sure i grok your proposal first though

Alternative proposal (by eatkinson)

Flow tree construction seems to be best implemented with a bottom-up traversal of the DOM tree. However, there are some cases where each DOM node should correspond not to a flow, but some other kind of intermediate information. For example, if an inline node is the parent of an inline block split, we would probably want to store a list of flows - [InlineFlow, BlockFlow, InlineFlow] - which would all be parented by the same BlockFlow once we reach a block DOM node.

The "kernel" function of this traversal would be concerned with merging n sets of this intermediate information into one, depending on the properties of the current node. This may prove tricky, but probably not as tricky as the current inorder construction code. Parallelizing this process should be possible within the parallel traversal framework I am currently working on.

Incremental layout would work at the traversal level by not visiting every node in the tree. We would mark nodes as dirty if their float, display, or position style properties changed, or if their children are dirty. The main problem with this approach is that we will always rebuild the root node, which might prove slow depending on the page. It is also fairly coarse-grained incrementality, as we either totally re-compute a node or don't visit it at all.

Adding or removing a node would be handled by invalidating the parent that just had a child added or removed (as well as any new nodes that were just added).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly