-
Notifications
You must be signed in to change notification settings - Fork 144
Misc
Here is a list some thoughts in a rough state, mostly things learned along the way.
The tessellator-based approach implemented here is meant to be used to render paths covering a large amount of pixels. The idea is that the GPU is a lot better than the CPU at filling a large amount of pixels if we provide it with the information in the right form, so we go through a rather expensive tessellation process on the CPU side in order to be able to fill many pixels on the GPU at a reduced cost. For very small paths (most text in web pages for example), tessellation is probably not going to be the best trade-off, since generating the tessellation is may take as long as rasterizing the shape using a good CPU rasterizer.
Some other path rendering strategies involve less CPU-side processing, and trade it with more complex fragment shaders. It would be interesting to experiment with the various strategies. I assume they'll have different performance profiles depending on the type of shapes and the resolution at which they are being rendered.
The first versions of the fill tessellator were using the connectivity of the edges to determine the type of vertex that the sweep-line was processing at each step. For example, when looking at a vertex V, we'd ask the Path data structure it's previous and next vertices in order to determine the type of event (left, right, start, end, merge, split). working this way is very simple, but it falls down with certain edge cases, such as when two start events are interleaved (TODO, illustration).
The tessellator inherited this strategy from the computational geometry book in which the monotone polygon tessellation algortithm assumes an polygons with no self-intersection and therefore does not have to handle these types of problematic cases.
In order to fix this, the tessellator accumulates all edges that start at the current point and handle them all at once. This is adds a fair amount of complexity, but makes it possible to consider all cases including several pairs of edges intersecting at the same position.
Another benefit of working this way is that the tessellator is now independent from the path data structure. The input is a sorted list of edges that does not need to retain extra information connectivity information, and can be constructed from an iterator or built directly with the PathBuilder interface. As a result, if a program uses the tessellator with a custom path data structure (such as a half-edge mesh), tessellating does not require re-building a particular type of path object, as long as the custom path provides an iterator of path events or SVG events.
That said, all of the other open-source tessellators (including GLU/libtess2 and Skia) that I have seen so far use a half-egde data structure and multiple passes, so it might turn out that doing this differently does not make a big difference.
The tessellator was initially implemented with 32 bits floating point numbers. This caused a lot of bugs in the intersection detection code, and with the way the sweep line accumulates edges at the current position by comparing positions. I would have saved a lot of time and effort if I had followed cairo and skia's footsteps and went with integer or fixed point coordinates from the start.
TODO: a lot of war stories to tell here. Variable precision is terribly bug-prone for this type of algorithm.
Integer and fixed point number arithmetic can also have different kinds of precision issues when performing divisions and multiplications.
The problem with fixed point and integer coordinates is that it is very easy to run into overflows because they can describe a much smaller range than floating point number. There's a trade-off to make between how much precision we want in the fractional part and the size of the range. Currently the tessellator works with 32 bit fixed point precision numbers with 16 bits for fractional part, while the line segment intersection computation is implemented with f64 floating point numbers. There's been experiments with various fixed point precision math and just integer arithmetic, but it's quite hard to not run into overflows with the them when postponing division (otherwise the precision loss is worse than float math so why bother).
The challenge, when trying to not lose any precision, is that every time the internal representations of two fixed point numbers are multiplied, we need to potentially double the amount of bits required to hold the result, which grows quickly over the 64 bits we are comfortable working with. So postponing all divisions to the end of a long series of computation is not always a viable option.
One way to mitigate the overflow issue could be to enforce tiled tessellation, with small enough tiles that the limited fixed point range is not an issue anymore. The tessellated geometry can be a lot bigger than the range can describe, by simply adding tiles.
For reference Skia's tessellator stores positions using 16.16 fixed point numbers and uses doubles to compute edge intersections. Since the resulting intersections are not exact, there are a corrective measures that are taken later in the algorithm to correct cases where the imprecision cause some invariants of the algorithm to be broken. This may turn out to be the best compromise (unfortunately).
Path flattening used to be done in by PathBuilder, which meant that we would build and store a path object containing a lot of small line segments instead of the bezier curves. If all we need to do is iterate over flattened paths, storing the flattened path is not necessary. We can store the compact and resolution-independent bezier path and flatten it on the fly while iterating, which saves memory and does not require throwing away and re-building the path when the zoom level changes.
It would be interesting to render all overlapping shapes in one pass and only write to the frame buffer once, a bit like WebRender does. This would probably be a bit complicated to set up with a tessellator: The tessellated geometry would contain all paths and if one of the path start animating, the whole shape would need to be tessellated again. To do this we'd need to go back to using a path representation that stores the connectivity (half-edge mesh probably) and tessellate everything at once. It would certainly be more expensive on the CPU side (first frame), and cheaper on the GPU side (all frames) as long as all paths stay together. Other rendering approaches like the vector texture would probably play better with the idea of minimal overdraw (maybe).