Life of a Pixel 2018

This talk is about how Chrome turns web content into pixels. The entire process is called “rendering”.
We’ll describe what we mean by content, and what we mean by pixels, and then we’ll explain the magic in between.


Chrome’s architecture is constantly evolving. This talk connects high-level concepts (which change slowly) to specific classes (which change frequently).
The details are based primarily on what is currently shipping in the canary channel (M69), but a few of the biggest future refactorings are mentioned in passing.


“Content” is the generic term in Chromium for all of the code inside a webpage or the frontend of a web application.
It’s made of text, markup (surrounding the text), styles (defining how markup is rendered), and script (which can modify all of the above dynamically).
There are other kinds of content, which we won’t cover here.


A real webpage is just thousands of lines of HTML, CSS, and JavaScript delivered in plain text over the network.
There’s no notion of compilation or packaging as you might find on other kinds of software platforms - the webpage’s source code is the input to the renderer.


Architecturally, the “content” namespace in the Chromium C++ codebase is responsible for everything in the red box.
Contrast with tab strip, address bar, navigation buttons, menus, etc. which live outside of “content”.
Key to Chrome’s security model: rendering happens in a sandboxed process.


At the other end of the pipeline we have to get pixels onto the screen using the graphics libraries provided by the underlying operating system.
On most platforms today, that’s a standardized API called “OpenGL”. On Windows there’s an extra translation into DirectX. In the future, we may support newer APIs such as Vulkan.
These libraries provide low-level graphics primitives like “textures” and “shaders”, and let you do things like “draw a polygon at these coordinates into a buffer of virtual pixels”. But obviously they don’t understand anything about the web or HTML or CSS.


So the goal of rendering can be stated as: turn HTML / CSS / JavaScript into the right OpenGL calls to display the pixels.
But keep in mind a second goal as we describe the pipeline: We also want the right intermediate data structures to update the rendering efficiently after it’s produced, and answer queries about it from script or other parts of the system.


We break the pipeline into multiple “lifecycle stages”, generating those intermediate outputs.
We’ll first describe each stage of a working pipeline, then come back to the notion of efficient updating and introduce some concepts for optimization.


HTML tags impose a semantically meaningful hierarchical structure on the document. For example, a <body> may contain two paragraphs, each with text. So the first step is to parse those tags to build an object model that reflects this structure.


If you’ve taken computer science classes you may recognize this as a “tree”.
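As a toy illustration (these are not Blink's actual DOM classes, just a sketch of the idea), the parsed object model is nodes with children:

```python
# Toy sketch of the object model the parser builds. Class and tag names
# here are illustrative, not Blink's real DOM types.
class Node:
    def __init__(self, tag, text=None):
        self.tag = tag
        self.text = text
        self.children = []

    def append(self, child):
        self.children.append(child)
        return child

# <body><p>hello</p><p>world</p></body>
body = Node("body")
p1 = body.append(Node("p"))
p1.append(Node("#text", "hello"))
p2 = body.append(Node("p"))
p2.append(Node("#text", "world"))

def depth(node):
    # The tree is three levels deep: body -> p -> text.
    return 1 + max((depth(c) for c in node.children), default=0)
```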


The DOM serves double duty as both the internal representation of the page, and the API exposed to script for querying or modifying the rendering.
The JavaScript engine (V8) exposes DOM web APIs as thin wrappers around the real DOM tree through a system called “bindings”.
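A rough sketch of the wrapper idea, with invented class names (the real bindings are generated C++ glue between V8 objects and Blink's DOM):

```python
# Toy "bindings" sketch: script holds a thin wrapper whose methods forward
# to the real internal node. All names here are made up for illustration.
class InternalNode:                 # stands in for the real DOM node
    def __init__(self):
        self.children = []

class NodeWrapper:                  # stands in for the script-visible object
    def __init__(self, impl):
        self._impl = impl           # reference to the internal node
    def append_child(self, wrapper):
        # The wrapper does no work itself; it mutates the real tree.
        self._impl.children.append(wrapper._impl)
        return wrapper

document_body = NodeWrapper(InternalNode())
document_body.append_child(NodeWrapper(InternalNode()))
```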


Having built the DOM tree, the next step is to process the CSS styles.
A CSS selector selects a subset of DOM elements that its property declarations should apply to.


Style properties are the knobs by which web authors can influence the rendering of DOM elements.
There are hundreds of style properties.


Furthermore, it’s not trivial to determine which elements a style rule selects.
Some elements may be selected by more than one rule, with conflicting declarations for a particular style property.
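The conflict-resolution idea can be sketched as picking a winner by specificity and source order (real CSS specificity and the cascade have more components — origins, importance, inheritance — and the rule set below is made up):

```python
# Toy cascade sketch: among rules matching an element, the declaration with
# the highest (specificity, source order) wins for a given property.
# Specificity is (ids, classes, tags), compared lexicographically.
rules = [
    {"selector": "p",      "specificity": (0, 0, 1), "order": 0, "color": "black"},
    {"selector": ".note",  "specificity": (0, 1, 0), "order": 1, "color": "green"},
    {"selector": "#intro", "specificity": (1, 0, 0), "order": 2, "color": "blue"},
]

def winning_color(matching_rules):
    best = max(matching_rules, key=lambda r: (r["specificity"], r["order"]))
    return best["color"]
```

For an element matched by all three rules, the id selector wins; drop the id rule and the class selector wins.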


The layout stage runs after the style recalc stage.
First, the layout tree is constructed. Then, we walk the layout tree, filling in the geometry data, and processing side effects of the geometry.


Today, layout objects contain both inputs and outputs of the layout stage, without a clean separation between them.
For example, the LayoutObject acquires ownership of its element’s ComputedStyle object.
A new layout system called LayoutNG is expected to simplify the architecture, and make it easier to build new layout algorithms.
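A minimal sketch of the idea of a layout pass, assuming simple vertical block stacking (real layout also handles text, floats, flex, tables, and much more):

```python
# Toy block layout: children stack vertically. The input is each box's
# style (here just a height); the output is its geometry (y offset, height)
# plus the parent's resulting content height.
def layout(children_heights):
    y = 0
    boxes = []
    for h in children_heights:
        boxes.append({"y": y, "height": h})
        y += h                      # the next sibling starts below this one
    return boxes, y

boxes, content_height = layout([20, 50, 30])
```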


Now that we understand the geometry of our layout objects, it’s time to paint them.
Paint records paint operations into a list of display items.
A paint operation might be something like “draw a rectangle at these coordinates, in this color”.
There may be multiple display items for each layout object, corresponding to different parts of its visual appearance, like the background, foreground, outline, etc.
This is just a recording that can be played back later. We’ll see why that’s useful in a bit.
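The recording idea can be sketched like this (op names are invented, not Skia's or Blink's API):

```python
# Toy display list: paint doesn't draw pixels, it records operations that
# can be replayed later by raster.
display_list = []

def fill_rect(x, y, w, h, color):
    display_list.append(("fill_rect", x, y, w, h, color))

def draw_text(x, y, text):
    display_list.append(("draw_text", x, y, text))

# One layout object can emit several display items:
fill_rect(0, 0, 200, 50, "lightgray")   # its background
draw_text(5, 20, "hello")               # its foreground text
```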


It’s important to paint elements in the right order, so that they stack correctly when they overlap.
The order can be controlled by style.


It’s even possible for an element to be partly in front of and partly behind another element.
That’s because paint runs in multiple phases, and each paint phase does its own traversal of a subtree.
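A toy sketch of why phases interleave output: each phase traverses the subtree in turn, so one element's foreground can land after another element's background (the phase names below are simplified):

```python
# Toy paint phases: element B's background is painted before A's foreground,
# even though A precedes B in the tree, because each phase does its own pass.
elements = ["A", "B"]
ops = []
for phase in ["background", "foreground"]:
    for el in elements:
        ops.append((phase, el))
```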


The paint operations in the display item list are executed by a process called rasterization.
Each pixel in the resulting bitmap holds values for four color channels (red, green, blue, and alpha).


The rastered bitmap is stored in memory, typically GPU memory referenced by an OpenGL texture object.
The GPU can also run the commands that produce the bitmap (“accelerated rasterization”).
Note that these pixels are not yet on the screen!
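A toy raster step, replaying a single fill-rect op into an RGBA buffer (real raster goes through Skia, often on the GPU, and the op shape here is invented):

```python
# Toy rasterization: replay a "fill rect" paint op into an RGBA bitmap.
# Each pixel is a (r, g, b, a) tuple; the buffer starts fully transparent.
W, H = 4, 4
bitmap = [[(0, 0, 0, 0)] * W for _ in range(H)]

def raster_fill_rect(x, y, w, h, rgba):
    for row in range(y, y + h):
        for col in range(x, x + w):
            bitmap[row][col] = rgba

raster_fill_rect(1, 1, 2, 2, (255, 0, 0, 255))  # an opaque red square
```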


Rasterization issues OpenGL calls through a library called Skia. Skia provides a layer of abstraction around the hardware, and understands more complex things like paths and Bezier curves.
Skia is open-source and maintained by Google. It ships in the Chrome binary but lives in a separate code repository. It’s also used by other products such as the Android OS.
Skia’s GPU-accelerated codepath builds its own buffer of drawing operations, which is flushed at the end of the raster task.


Recall that the renderer process is sandboxed, so it can’t make system calls directly.
GL calls issued by Skia are actually proxied into a different process using a “command buffer”.
The GPU process receives the command buffer and issues the “real” GL calls through a set of function pointers.
Besides escaping the renderer sandbox, isolating GL in the GPU process protects us from unstable or insecure graphics drivers.
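The proxying idea can be sketched with made-up command encodings on both sides of the process boundary (the real command buffer serializes GL calls into shared memory):

```python
# Toy command buffer: the sandboxed "renderer" side records GL-like calls
# instead of making them; the "GPU process" side decodes the buffer and
# dispatches through a table of function pointers.
command_buffer = []

def proxy_call(name, *args):            # renderer side: no real GL here
    command_buffer.append((name, args))

executed = []
dispatch_table = {                      # GPU-process side
    "glClear": lambda mask: executed.append(("glClear", mask)),
    "glDrawArrays": lambda mode, first, count:
        executed.append(("glDrawArrays", mode, first, count)),
}

proxy_call("glClear", 0x4000)
proxy_call("glDrawArrays", "TRIANGLES", 0, 3)

for name, args in command_buffer:       # "flush": replay in the GPU process
    dispatch_table[name](*args)
```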


Those GL function pointers are initialized by dynamic lookup from the system’s shared OpenGL library - or the ANGLE library on Windows.
ANGLE is another library built by Google; its job is to translate OpenGL to DirectX, which is Microsoft’s API for accelerated graphics on Windows.
There are also OpenGL drivers for Windows, but historically they have not been very high quality.


Moving raster to the GPU process will improve performance.
It’s also needed to support Vulkan.


We have now gone all the way from content to pixels in memory.
But note that the rendering is not static.
Running the full pipeline on every change would be expensive, so we need ways to update the rendering efficiently.


Change is modelled as animation frames.
Each frame is a complete rendering of the state of the content at a particular point in time.

Certain style properties cause a layer to be created for a layout object.
If a layout object doesn’t have a layer, it paints into the layer of the nearest ancestor that has one.
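The ancestor walk can be sketched as follows (the tree shape and the reason for the layer are made up for illustration):

```python
# Toy layer assignment: an object paints into its own layer if it has one,
# otherwise into the layer of its nearest ancestor that does.
tree = {
    "root": {"parent": None,   "has_layer": True},
    "div":  {"parent": "root", "has_layer": True},   # e.g. has a transform
    "span": {"parent": "div",  "has_layer": False},
}

def layer_for(name):
    while not tree[name]["has_layer"]:
        name = tree[name]["parent"]   # walk up to the nearest layered ancestor
    return name
```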


Building the layer tree is a new lifecycle stage on the main thread. Today, this happens before paint, and each layer is painted separately.


After paint is finished, the commit updates a copy of the layer tree on the compositor thread, to match the state of the tree on the main thread.


Recall: raster is the step after paint, which turns paint ops into bitmaps.
Layers can be large - rastering the whole layer is expensive, and unnecessary if only part of it is visible.
So the compositor thread divides the layer into tiles.
Tiles are the unit of raster work. Tiles are rastered with a pool of dedicated raster threads. Tiles are prioritized based on their distance from the viewport.
(Not shown: a layer actually has multiple tilings for different resolutions.)
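A toy sketch of tiling and prioritization, with a made-up tile size and a viewport at the layer origin (real tile sizes and priority heuristics differ):

```python
# Toy tiling: split a layer into fixed-size tiles, then prioritize tiles
# by their squared distance from the viewport origin.
TILE = 256
layer_w, layer_h = 1024, 768

tiles = [(x, y) for y in range(0, layer_h, TILE)
                for x in range(0, layer_w, TILE)]

viewport = (0, 0)
tiles.sort(key=lambda t: (t[0] - viewport[0]) ** 2 + (t[1] - viewport[1]) ** 2)
```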


Once all the tiles are rastered, the compositor thread generates “draw quads”. A quad is like a command to draw a tile in a particular location on the screen, taking into account all the transformations applied by the layer tree. Each quad references the tile’s rastered output in memory (remember, no pixels are on the screen yet).
The quads are wrapped up in a compositor frame object which gets submitted to the browser process.
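Quad generation can be sketched as pairing each tile's rastered texture with a screen position (field names and the transform, a simple offset here, are invented):

```python
# Toy draw quads: each rastered tile becomes a command saying "draw this
# texture at this screen position", after applying the layer's transform
# (modelled here as a plain translation).
layer_offset = (100, 50)
rastered_tiles = {(0, 0): "tex_a", (256, 0): "tex_b"}

quads = [{"texture": tex,
          "screen_x": x + layer_offset[0],
          "screen_y": y + layer_offset[1]}
         for (x, y), tex in rastered_tiles.items()]

compositor_frame = {"quads": quads}     # what gets submitted onward
```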


The compositor thread has two copies of the tree, so that it can raster tiles from a new commit while drawing the previous commit.


The browser process runs a component called the display compositor, inside a service called “viz” (short for visuals).
The display compositor aggregates compositor frames submitted from all the renderer processes, along with frames from the browser UI outside of the WebContents. Then it issues the GL calls to draw the quad resources, which go to the GPU process just like the GL calls from the raster workers.
On most platforms the display compositor’s output is double-buffered, so the quads draw into a backbuffer, and a “swap” command makes it visible.
(On OS X we do something a little different with CoreAnimation.)
Finally our pixels are on the screen. :)
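The double buffering at the end can be sketched as a swap of two buffers (a deliberately tiny model of the real swap):

```python
# Toy double buffering: quads draw into the backbuffer while the
# frontbuffer is on screen; "swap" exchanges their roles.
buffers = {"front": "frame 1", "back": "frame 2 (just drawn)"}

def swap():
    buffers["front"], buffers["back"] = buffers["back"], buffers["front"]

swap()  # the newly drawn frame becomes visible
```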


Reprinted from: Life of a Pixel 2018