Skip to content

Conversation

@grebmeg
Copy link
Collaborator

@grebmeg grebmeg commented Dec 17, 2025

This change mirrors what we do for solid colors, where we clear commands if a solid color covers the entire wide tile. We now apply the same approach to fully opaque images.

The benchmark scene below shows roughly a 30% performance improvement for this case, although the exact gain depends on how many images overlap across full wide tiles.

bench_output
  • Low quality:
image
  • Medium quality:
image

I’ll also open a follow-up PR to update this where has_opacities currently returns true for all images.

@nicoburns
Copy link
Contributor

nicoburns commented Dec 21, 2025

Sounds like a big win for large images (which I am very happy to see, because this is a case (the main case?) where Vello renderers seem to me to be noticably slower than alternatives).

@grebmeg
Copy link
Collaborator Author

grebmeg commented Dec 21, 2025

Sounds like a big win for large images

Yeah, I think it’s a really nice win, though I should point out that it mainly applies to overlapping images.

(which I am very happy to see, because this is a case (the main case?) where Vello renderers seem to me to be noticably slower than alternatives).

Oh, I thought we didn’t have image benchmarks in the Blend2 performance suite, but after reading the options documentation more carefully, I realized that Pattern_NN and Pattern_BI correspond to nearest-neighbor filtering and bilinear interpolation, respectively. And yes, for large images, the simple FillRectA doesn’t look great compared to the others. Blend2 looks insanely fast, we need this!

image

@LaurenzV
Copy link
Collaborator

In the Blend2D suite, it's using transparent images so it probably wouldn't make a difference there. But yes, I there probably is more that can be done to optimize the performance of images. linebender/fearless_simd#171 might also help there.

@LaurenzV
Copy link
Collaborator

Oh and another thing, in that particular case the images only have 1x scaling and are pixel-aligned and I believe Blend2D has a special case for that, that's why we are much slower there. If you take a look at FillRectU and FillRectRot for example, the story looks different already.

@nicoburns
Copy link
Contributor

Oh and another thing, in that particular case the images only have 1x scaling and are pixel-aligned and I believe Blend2D has a special case for that, that's why we are much slower there. If you take a look at FillRectU and FillRectRot for example, the story looks different already.

I think I've also seen discussed that a lot of renderers generate mipmaps for images, which would presumably greatly help performance in cases with large downscaling factors. https://servo.org for example features several images with native dimensions of in some cases over 4000px, some of which are rendered downscaled by a factor of ~10x.

@LaurenzV
Copy link
Collaborator

I'm not sure this would help with performance though? I actually think it would be more costly, because you need to compute the mipmaps, which you don't need currently. When rendering we only sample the affected pixels, so the size of the image doesn't really make a difference here, I think.

@nicoburns
Copy link
Contributor

When rendering we only sample the affected pixels

Hmm... I had assumed that we would be doing what https://en.wikipedia.org/wiki/Image_scaling describes as "box sampling" for when downscaling with a scale factor > 2x. But perhaps we're currently just dropping pixels?

@LaurenzV
Copy link
Collaborator

If you are using NN, then yes, the pixels will just be dropped right now. For bilinear/bicubic we do still sample neighboring pixels, but that isn't enough if the downscale factor is larger.

@grebmeg grebmeg force-pushed the gemberg/perf/image-rendering-improvements branch from 2d4526c to 422ece4 Compare December 22, 2025 02:35
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether integration (similar as in integration tests) would be a better name? But not sure, up to you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you made a good point, “integration” is a more descriptive name here. Thank you!

Fixed.

let scale = width / original_width;
let height = original_height * scale;

renderer.set_transform(Affine::IDENTITY);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think this could be omitted since we don't ever change the transform, right?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, right.

Fixed.

Comment on lines 82 to 89
let alpha = u16::from(rgba[3]);
let premultiply = |component| (alpha * u16::from(component) / 255) as u8;
vello_common::color::PremulRgba8 {
r: premultiply(rgba[0]),
g: premultiply(rgba[1]),
b: premultiply(rgba[2]),
a: alpha as u8,
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let alpha = u16::from(rgba[3]);
let premultiply = |component| (alpha * u16::from(component) / 255) as u8;
vello_common::color::PremulRgba8 {
r: premultiply(rgba[0]),
g: premultiply(rgba[1]),
b: premultiply(rgba[2]),
a: alpha as u8,
}
AlphaColor::from_rgba8(rgba[0], rgba[1], rgba[2], rgba[3])
.premultiply()
.to_rgba8()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Fixed.

Comment on lines 1241 to 1242
&& !img.has_opacities
&& img.sampler.alpha == 1.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think once we implement has_opacities we should always set it to true in case img.sampler.alpha != 1, so that we don't need the second check img.sampler.alpha == 1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Since #1329 implements tracking has_opacities, I’ll update it there. Thank you!

{
ranges.clear();
}
// Fall through to emit the fill command below.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you maybe add to the comment something like "as opposed to solid paints, where we have a return statement"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

};
task_sender.send(task).unwrap();
self.run_coarse(true);
// TODO: Support encoded_paints in multithreading.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify this comment? This means we don't support clearing commands with indexed paints, but indexed paints themselves still work, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Indexed paints should render fine, the missing part is the overdraw elimination optimization for opaque indexed paints, since opacity can’t be determined without encoded_paints.

Todo comment updated.

@grebmeg grebmeg force-pushed the gemberg/perf/image-rendering-improvements branch from 422ece4 to 5e74593 Compare December 29, 2025 23:37
@grebmeg grebmeg enabled auto-merge December 29, 2025 23:44
@grebmeg grebmeg added this pull request to the merge queue Dec 29, 2025
Merged via the queue into main with commit 1e44c05 Dec 29, 2025
17 checks passed
@grebmeg grebmeg deleted the gemberg/perf/image-rendering-improvements branch December 29, 2025 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants