perf: eliminate overdraw for opaque image fills #1327

grebmeg · 2025-12-17T22:46:54Z

This change mirrors what we do for solid colors, where we clear commands if a solid color covers the entire wide tile. We now apply the same approach to fully opaque images.

The benchmark scene below shows roughly a 30% performance improvement for this case, although the exact gain depends on how many images overlap across full wide tiles.

Low quality:

Medium quality:

I’ll also open a follow-up PR to address the fact that has_opacities currently returns true for all images.
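To illustrate the idea described above, here is a minimal sketch of overdraw elimination for a single wide tile. The `Paint`, `Cmd`, and `WideTile` types and the `fill_full_tile` method are hypothetical simplifications made up for this example, not the actual vello_common types. Storing an opaque solid color as a tile background is also an assumption of the sketch.

```rust
/// Hypothetical paint type for this sketch (not vello_common's real paint type).
#[derive(Clone, Debug, PartialEq)]
enum Paint {
    Solid { rgba: [u8; 4] },
    Image { id: u32, has_opacities: bool, sampler_alpha: f32 },
}

/// Hypothetical, simplified command for one wide tile.
#[derive(Debug, PartialEq)]
enum Cmd {
    Fill { paint: Paint },
}

/// Hypothetical wide tile: a background color plus queued commands.
#[derive(Default)]
struct WideTile {
    bg: [u8; 4],
    cmds: Vec<Cmd>,
}

impl WideTile {
    /// Record a fill that covers the whole wide tile.
    fn fill_full_tile(&mut self, paint: Paint) {
        match &paint {
            // Opaque solid color: everything underneath is hidden, so drop the
            // queued commands, keep the color as background, and return early.
            Paint::Solid { rgba } if rgba[3] == 255 => {
                self.cmds.clear();
                self.bg = *rgba;
                return;
            }
            // Opaque image: earlier commands are hidden as well, but the image
            // itself still has to be drawn, so fall through to the push below.
            Paint::Image { has_opacities, sampler_alpha, .. }
                if !*has_opacities && *sampler_alpha == 1.0 =>
            {
                self.cmds.clear();
            }
            // Anything translucent must blend with what is already queued.
            _ => {}
        }
        self.cmds.push(Cmd::Fill { paint });
    }
}
```

With this shape, stacking several opaque full-tile images leaves only the topmost fill in the command list, which is where the saving for overlapping images would come from.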

nicoburns · 2025-12-21T20:54:24Z

Sounds like a big win for large images (which I am very happy to see, because this is a case (the main case?) where Vello renderers seem to me to be noticeably slower than alternatives).

grebmeg · 2025-12-21T21:10:33Z

> Sounds like a big win for large images

Yeah, I think it’s a really nice win, though I should point out that it mainly applies to overlapping images.

> (which I am very happy to see, because this is a case (the main case?) where Vello renderers seem to me to be noticeably slower than alternatives).

Oh, I thought we didn’t have image benchmarks in the Blend2D performance suite, but after reading the options documentation more carefully, I realized that Pattern_NN and Pattern_BI correspond to nearest-neighbor filtering and bilinear interpolation, respectively. And yes, for large images, the simple FillRectA doesn’t look great compared to the others. Blend2D looks insanely fast; we need this!

LaurenzV · 2025-12-21T21:15:31Z

In the Blend2D suite, it's using transparent images, so it probably wouldn't make a difference there. But yes, there is probably more that can be done to optimize the performance of images. linebender/fearless_simd#171 might also help there.

LaurenzV · 2025-12-21T21:18:20Z

Oh and another thing, in that particular case the images only have 1x scaling and are pixel-aligned and I believe Blend2D has a special case for that, that's why we are much slower there. If you take a look at FillRectU and FillRectRot for example, the story looks different already.

nicoburns · 2025-12-21T21:54:29Z

> Oh and another thing, in that particular case the images only have 1x scaling and are pixel-aligned and I believe Blend2D has a special case for that, that's why we are much slower there. If you take a look at FillRectU and FillRectRot for example, the story looks different already.

I think I've also seen it discussed that a lot of renderers generate mipmaps for images, which would presumably greatly help performance in cases with large downscaling factors. https://servo.org, for example, features several images whose native dimensions in some cases exceed 4000px, some of which are rendered downscaled by a factor of ~10x.

LaurenzV · 2025-12-21T22:08:02Z

I'm not sure this would help with performance though? I actually think it would be more costly, because you need to compute the mipmaps, which you don't need currently. When rendering we only sample the affected pixels, so the size of the image doesn't really make a difference here, I think.
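For a rough sense of the up-front cost mentioned here, the sketch below counts the pixels in a conventional mip chain, where each level halves the previous one down to 1x1. That layout is an assumption for illustration and the function names are hypothetical; nothing here reflects how Vello would or does implement mipmaps.

```rust
/// Dimensions of every mip level below the base image, halving each side
/// (rounding down, minimum 1) until the level is 1x1.
fn mip_chain(width: u32, height: u32) -> Vec<(u32, u32)> {
    let mut levels = Vec::new();
    let (mut w, mut h) = (width, height);
    while w > 1 || h > 1 {
        w = (w / 2).max(1);
        h = (h / 2).max(1);
        levels.push((w, h));
    }
    levels
}

/// Total number of pixels that must be computed and stored for the extra
/// levels, not counting the base image itself.
fn mip_chain_pixels(width: u32, height: u32) -> u64 {
    mip_chain(width, height)
        .iter()
        .map(|&(w, h)| u64::from(w) * u64::from(h))
        .sum()
}
```

For a 4096x4096 image this gives 12 extra levels totalling 5,592,405 pixels, about a third of the 16,777,216 base pixels, all of which would have to be produced before the first draw.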

nicoburns · 2025-12-21T22:32:33Z

> When rendering we only sample the affected pixels

Hmm... I had assumed that we would be doing what https://en.wikipedia.org/wiki/Image_scaling describes as "box sampling" when downscaling with a scale factor > 2x. But perhaps we're currently just dropping pixels?

LaurenzV · 2025-12-21T22:37:59Z

If you are using NN, then yes, the pixels will just be dropped right now. For bilinear/bicubic we do still sample neighboring pixels, but that isn't enough when the downscale factor is large.
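The difference between dropping pixels and box sampling can be seen on a single row of grayscale pixels. This is only an illustrative sketch with hypothetical function names and an integer downscale factor; it is not how Vello's samplers are implemented.

```rust
/// Nearest-neighbor downscale of one row: keeps a single source pixel per
/// output pixel (the one near the middle of its footprint) and drops the rest.
fn downscale_nearest(src: &[u8], factor: usize) -> Vec<u8> {
    assert!(factor > 0);
    (0..src.len() / factor)
        .map(|i| src[i * factor + factor / 2])
        .collect()
}

/// Box-sampled downscale of one row: every output pixel is the rounded
/// average of all `factor` source pixels in its footprint.
fn downscale_box(src: &[u8], factor: usize) -> Vec<u8> {
    assert!(factor > 0);
    src.chunks_exact(factor)
        .map(|chunk| {
            let sum: u32 = chunk.iter().map(|&p| u32::from(p)).sum();
            ((sum + factor as u32 / 2) / factor as u32) as u8
        })
        .collect()
}
```

On an alternating row such as `[0, 255, 0, 255, 0, 255, 0, 255]` with a factor of 4, the nearest-neighbor version returns `[0, 0]`, while the box version returns `[128, 128]`, which is much closer to how the pattern looks from a distance.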

LaurenzV · 2025-12-29T08:49:05Z

sparse_strips/vello_bench/src/integration.rs

I'm wondering whether integration (as in integration tests) would be a better name? But not sure, up to you.

I think you made a good point, “integration” is a more descriptive name here. Thank you!

Fixed.

LaurenzV · 2025-12-29T08:55:47Z

sparse_strips/vello_bench/src/scene.rs

+                let scale = width / original_width;
+                let height = original_height * scale;
+
+                renderer.set_transform(Affine::IDENTITY);


(I think this could be omitted since we don't ever change the transform, right?)

Yep, right.

Fixed.

LaurenzV · 2025-12-29T08:57:18Z

sparse_strips/vello_bench/src/scene.rs

+                let alpha = u16::from(rgba[3]);
+                let premultiply = |component| (alpha * u16::from(component) / 255) as u8;
+                vello_common::color::PremulRgba8 {
+                    r: premultiply(rgba[0]),
+                    g: premultiply(rgba[1]),
+                    b: premultiply(rgba[2]),
+                    a: alpha as u8,
+                }


Suggested change

-                let alpha = u16::from(rgba[3]);
-                let premultiply = |component| (alpha * u16::from(component) / 255) as u8;
-                vello_common::color::PremulRgba8 {
-                    r: premultiply(rgba[0]),
-                    g: premultiply(rgba[1]),
-                    b: premultiply(rgba[2]),
-                    a: alpha as u8,
-                }
+                AlphaColor::from_rgba8(rgba[0], rgba[1], rgba[2], rgba[3])
+                    .premultiply()
+                    .to_rgba8()

Thanks!

Fixed.
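For reference, the arithmetic in the original closure can be written as a standalone sketch. The function names are hypothetical and use only the standard library. The rounding variant is included just to show that the two approaches can differ by one; no claim is made here about which behavior the color crate's `premultiply()` uses.

```rust
/// Premultiply one 8-bit channel by an 8-bit alpha, truncating the result
/// like the closure in the diff above.
fn premultiply_trunc(component: u8, alpha: u8) -> u8 {
    // 255 * 255 = 65025 fits in u16, so the product cannot overflow.
    (u16::from(alpha) * u16::from(component) / 255) as u8
}

/// Same computation, but rounding to the nearest integer instead.
fn premultiply_round(component: u8, alpha: u8) -> u8 {
    // 65025 + 127 still fits in u16.
    ((u16::from(alpha) * u16::from(component) + 127) / 255) as u8
}

/// Premultiply a straight-alpha RGBA8 pixel using the truncating variant.
fn premultiply_rgba(rgba: [u8; 4]) -> [u8; 4] {
    let a = rgba[3];
    [
        premultiply_trunc(rgba[0], a),
        premultiply_trunc(rgba[1], a),
        premultiply_trunc(rgba[2], a),
        a,
    ]
}
```

For example, a channel value of 3 at alpha 128 truncates to 1 but rounds to 2.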

LaurenzV · 2025-12-29T09:08:04Z

sparse_strips/vello_common/src/coarse.rs

+                                    && !img.has_opacities
+                                    && img.sampler.alpha == 1.0


I think once we implement has_opacities, we should always set it to true when img.sampler.alpha != 1, so that we don't need the second check, img.sampler.alpha == 1.

Good point! Since #1329 implements tracking has_opacities, I’ll update it there. Thank you!
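One way to picture the suggestion is to fold the sampler alpha into the flag when the paint is built, so callers test a single condition. The `ImagePaint` struct, its constructor, and `can_clear_underlying` are hypothetical names for this sketch and are not taken from vello_common or from #1329.

```rust
/// Hypothetical image paint; only the `has_opacities` name follows the diff.
struct ImagePaint {
    has_opacities: bool,
}

impl ImagePaint {
    /// Treat the image as having opacities if any pixel is translucent or
    /// if the sampler applies an overall alpha other than 1.0.
    fn new(pixels_have_alpha: bool, sampler_alpha: f32) -> Self {
        Self {
            has_opacities: pixels_have_alpha || sampler_alpha != 1.0,
        }
    }

    /// With the alpha folded in, the coarse check needs only one condition.
    fn can_clear_underlying(&self) -> bool {
        !self.has_opacities
    }
}
```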

LaurenzV · 2025-12-29T09:08:52Z

sparse_strips/vello_common/src/coarse.rs

+                                    {
+                                        ranges.clear();
+                                    }
+                                    // Fall through to emit the fill command below.


Can you maybe add to the comment something like "as opposed to solid paints, where we have a return statement"?

LaurenzV · 2025-12-29T09:12:27Z

sparse_strips/vello_cpu/src/dispatch/multi_threaded.rs

        };
        task_sender.send(task).unwrap();
-        self.run_coarse(true);
+        // TODO: Support encoded_paints in multithreading.


Can you clarify this comment? This means we don't support clearing commands with indexed paints, but indexed paints themselves still work, right?

Correct. Indexed paints should render fine; the missing part is the overdraw elimination optimization for opaque indexed paints, since opacity can’t be determined without encoded_paints.

Todo comment updated.

grebmeg mentioned this pull request Dec 18, 2025

perf(vello_common): track has_opacities to skip alpha blending #1329

Merged

grebmeg mentioned this pull request Dec 22, 2025

Reduce the memory footprint of wide tile commands #1325

Merged

grebmeg force-pushed the gemberg/perf/image-rendering-improvements branch from 2d4526c to 422ece4 Compare December 22, 2025 02:35

LaurenzV approved these changes Dec 29, 2025

grebmeg added 2 commits December 30, 2025 11:05

perf: eliminate overdraw for opaque image fills

885c7cc

.

5e74593

grebmeg force-pushed the gemberg/perf/image-rendering-improvements branch from 422ece4 to 5e74593 Compare December 29, 2025 23:37

grebmeg enabled auto-merge December 29, 2025 23:44

grebmeg added this pull request to the merge queue Dec 29, 2025

Merged via the queue into main with commit 1e44c05 Dec 29, 2025
17 checks passed

grebmeg deleted the gemberg/perf/image-rendering-improvements branch December 29, 2025 23:53


4 participants
