Skip to content
pcwalton edited this page Dec 5, 2012 · 63 revisions

This is a very high-level overview of Servo's design. Servo remains an early prototype and portions of the design are not yet represented in the actual code. Some important aspects of the system have not been planned in detail.

Overview and goals

Servo is a research project to develop a new Web browser engine. Our goal is to create an architecture that takes advantage of parallelism at many levels while eliminating common sources of bugs and security vulnerabilities associated with incorrect memory management and data races.

Because C++ is poorly suited to preventing these problems, Servo is written in Rust, a new language designed specifically with Servo's requirements in mind. Rust provides a task-parallel infrastructure and a strong type system that enforces memory safety and data race freedom.

Servo is explicitly not aiming to create a full Web browser (except for demonstration purposes). Rather it is focused on creating a solid, embeddable engine. Although Servo is a research project, it is designed to be "productizable"—the code that we write should be of high enough quality that it could eventually be shipped to users.

Strategies for parallelism

Some ideas we are exploring or plan to:

  • Task-based architecture. Major components in the system should be factored into actors with isolated heaps. This will not only provide some incidental parallelism but also encourage loose coupling throughout the system, enabling us to replace components for the purposes of experimentation and research. Implemented.
  • Copy-on-write DOM. In Servo, the DOM is a versioned data structure that is shared between the content (JavaScript) task and the layout task, allowing layout to run even while scripts are reading and writing DOM nodes. Partially implemented.
  • Parallel rendering. Both rendering and compositing are separate threads, decoupled from layout in order to maintain responsiveness. The compositor thread manages its memory manually to avoid garbage collection pauses. Implemented.
  • Tiled rendering. We divide the screen into a grid of tiles and render each one in parallel. Tiling is needed for mobile performance regardless of its benefits for parallelism. Partially implemented.
  • Layered rendering. We divide the display list into subtrees whose contents can be retained on the GPU and render them in parallel. Partially implemented.
  • Selector matching. This is an embarrassingly parallel problem. Unlike Gecko, Servo will will do selector matching in a separate pass from layout so that it may be more easily parallelized. Not implemented.
  • Parallel layout. Rendering all of CSS (floats in particular) is a difficult problem. We are currently deferring this problem to the future. Meyerovich has done lots of interesting research here. Not implemented.
  • Text shaping. A crucial part of inline layout, text shaping is fairly costly and has potential for parallelism across text runs. Not implemented.
  • Parsing. We have various ideas. Parallel JavaScript parsing is potentially the biggest win, but it requires SpiderMonkey changes. Parallel HTML parsing could be beneficial, while parallel CSS parsing is probably easiest but least beneficial. May not be a win because all parallel parsing algorithms are likely to require speculation, which can backfire. Not implemented.
  • Image decoding. Decoding multiple images in parallel is straightforward. Implemented.
  • Decoding of other resources. This is probably less important than image decoding, but anything that needs to be loaded by a page can be done in parallel, e.g. parsing entire style sheets or decoding videos.

Challenges

  • Parallel data structures. We must maintain good straight-line performance and make complex shared types type-safe.
  • Language immaturity. The Rust compiler and language are not yet stable. Rust also has fewer libraries than C++ available; we can bind to C++ libraries, but that involves more work than simply using the C++ header files.
  • Thread-hostile libraries. Some third-party libraries we need don't play well in multi-threaded environments. Fonts in particular have been difficult. Even if libraries are technically thread-safe, often thread safety is achieved through a library-wide mutex lock, harming our opportunities for parallelism.

The task architecture

Diagram

(Diagram)

  • Each box represents a Rust task.
  • Blue boxes represent the primary tasks in the browser pipeline.
  • Gray boxes represent tasks auxiliary to the browser pipeline.
  • White boxes represent worker tasks. Each such box represents several tasks, the precise number of which will vary with the workload.
  • Dashed lines indicate supervisor relationships.
  • Solid lines indicate communication channels.

Description

Each engine instance can for now be thought of as a single tab or window, and manages a pipeline of tasks that accepts input, runs JavaScript against the DOM, performs layout, builds display lists, renders display lists to tiles and finally composites the final image to a surface.

The pipeline consists of three main tasks:

  • Content—Content's primary mission is to create and own the DOM and execute the JavaScript engine. It receives events from multiple sources, including navigation events, and routes them as necessary. When the content task needs to query information about layout it must send a request to the layout task.
  • Layout—Layout takes a snapshot of the DOM, calculates styles, and constructs the main layout data structure, the flow tree. The flow tree is used to calculate the layout of nodes and from there build a display list, which is sent to the render task.
  • Renderer—The renderer receives a display list and renders visible portions to one or more tiles, possibly in parallel.
  • Compositor—The compositor composites the tiles from the renderer and sends to the screen for display. As the UI thread, the compositor is also the first receiver of UI events, which are generally immediately sent to content for processing (although some events, such as scroll events, can be handled initially by the compositor for responsiveness).

Two complex data structures are involved in multi-task communication in this pipeline: the DOM and the display list. The DOM is communicated from content to layout and the display list from layout to the renderer. Figuring out an efficient and type-safe way to represent, share, and/or transmit these two structures is one of the major challenges for the project.

The copy-on-write DOM

Servo's DOM is a tree with versioned nodes that may be shared between a single writer and many readers. The DOM uses a copy-on-write strategy to allow the writer to modify the DOM in parallel with readers. The writer is always the content task and the readers are always layout tasks or subtasks thereof.

DOM nodes are Rust values whose lifetimes are managed by the JavaScript garbage collector. JavaScript accesses DOM nodes directly—there is no XPCOM or similar infrastructure.

The interface to the DOM is not currently type safe—it is possible to manage nodes incorrectly and end up dereferencing bogus pointers. Eliminating this unsafety is a high-priority, and necessary, goal for the project; as DOM nodes have a complex life cycle this will present some challenges.

The display list

Servo's rendering is entirely driven by a display list—a sequence of high-level drawing commands created by the layout task. Servo's display list is deeply immutable so that it may be shared by renderers operating concurrently and is generally self-contained. This is in contrast to WebKit's renderer, which does not use a display list, and Gecko's, which uses a display list but also consults additional information, some directly from the DOM, during rendering.

JavaScript and DOM bindings

We are currently using SpiderMonkey, although pluggable engines is a long-term, low-priority goal. Each content task gets its own JavaScript runtime. DOM bindings use the native JavaScript engine API instead of XPCOM. Automatic generation of bindings from WebIDL is a priority.

Multi-process architecture

Similar to Chromium and WebKit2, we intend to have a trusted application process and multiple, less trusted engine processes. The high-level API will in fact be IPC-based, likely with non-IPC implementations for testing and single-process use-cases, though it is expected most serious uses would use multiple processes. The engine processes will use the operating system sandboxing facilities to restrict access to system resources.

At this time we do not intend to go to the same extreme sandboxing ends as Chromium does, mostly because locking down a sandbox constitutes a large amount of development work (particularly on low-priority platforms like Windows XP and older Linux) and other aspects of the project are higher priority. Rust's type system also adds a significant layer of defense against memory safety vulnerabilities. This alone does not make a sandbox any less important to defend against unsafe code, bugs in the type system, and third-party/host libraries, but it does reduce the attack surface of Servo significantly relative to other browser engines. Additionally, we have performance-related concerns regarding some sandboxing techniques (for example, proxying all OpenGL calls to a separate process).

I/O and resource management

Web pages depend on a wide variety of external resources, with many mechanisms of retrieval and decoding. These resources are cached at multiple levels—on disk, in memory, and/or in decoded form. In a parallel browser setting, these resources must be distributed among concurrent workers.

Traditionally, browsers have been single-threaded, performing I/O on the "main thread", where most computation also happens. This leads to latency problems. In Servo there is no "main thread" and the loading of all external resources is handled by a single resource manager task.

Browsers have many caches, and Servo's task-based architecture means that it will probably have more than extant browser engines (e.g. we might have both a global task-based cache and a task-local cache that stores results from the global cache to save the round trip through the scheduler). Servo should have a unified caching story, with tunable caches that work well in low-memory environments.

Clone this wiki locally