-
Notifications
You must be signed in to change notification settings - Fork 210
perf: eliminate overdraw for opaque image fills #1327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Sounds like a big win for large images (which I am very happy to see, because this is a case (the main case?) where Vello renderers seem to me to be noticably slower than alternatives). |
Yeah, I think it’s a really nice win, though I should point out that it mainly applies to overlapping images.
Oh, I thought we didn’t have image benchmarks in the Blend2 performance suite, but after reading the options documentation more carefully, I realized that
|
|
In the Blend2D suite, it's using transparent images so it probably wouldn't make a difference there. But yes, I there probably is more that can be done to optimize the performance of images. linebender/fearless_simd#171 might also help there. |
|
Oh and another thing, in that particular case the images only have 1x scaling and are pixel-aligned and I believe Blend2D has a special case for that, that's why we are much slower there. If you take a look at FillRectU and FillRectRot for example, the story looks different already. |
I think I've also seen discussed that a lot of renderers generate mipmaps for images, which would presumably greatly help performance in cases with large downscaling factors. https://servo.org for example features several images with native dimensions of in some cases over 4000px, some of which are rendered downscaled by a factor of ~10x. |
|
I'm not sure this would help with performance though? I actually think it would be more costly, because you need to compute the mipmaps, which you don't need currently. When rendering we only sample the affected pixels, so the size of the image doesn't really make a difference here, I think. |
Hmm... I had assumed that we would be doing what https://en.wikipedia.org/wiki/Image_scaling describes as "box sampling" for when downscaling with a scale factor > 2x. But perhaps we're currently just dropping pixels? |
|
If you are using NN, then yes, the pixels will just be dropped right now. For bilinear/bicubic we do still sample neighboring pixels, but that isn't enough if the downscale factor is larger. |
2d4526c to
422ece4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether integration (similar as in integration tests) would be a better name? But not sure, up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you made a good point, “integration” is a more descriptive name here. Thank you!
Fixed.
| let scale = width / original_width; | ||
| let height = original_height * scale; | ||
|
|
||
| renderer.set_transform(Affine::IDENTITY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I think this could be omitted since we don't ever change the transform, right?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, right.
Fixed.
| let alpha = u16::from(rgba[3]); | ||
| let premultiply = |component| (alpha * u16::from(component) / 255) as u8; | ||
| vello_common::color::PremulRgba8 { | ||
| r: premultiply(rgba[0]), | ||
| g: premultiply(rgba[1]), | ||
| b: premultiply(rgba[2]), | ||
| a: alpha as u8, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| let alpha = u16::from(rgba[3]); | |
| let premultiply = |component| (alpha * u16::from(component) / 255) as u8; | |
| vello_common::color::PremulRgba8 { | |
| r: premultiply(rgba[0]), | |
| g: premultiply(rgba[1]), | |
| b: premultiply(rgba[2]), | |
| a: alpha as u8, | |
| } | |
| AlphaColor::from_rgba8(rgba[0], rgba[1], rgba[2], rgba[3]) | |
| .premultiply() | |
| .to_rgba8() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Fixed.
| && !img.has_opacities | ||
| && img.sampler.alpha == 1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think once we implement has_opacities we should always set it to true in case img.sampler.alpha != 1, so that we don't need the second check img.sampler.alpha == 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Since #1329 implements tracking has_opacities, I’ll update it there. Thank you!
| { | ||
| ranges.clear(); | ||
| } | ||
| // Fall through to emit the fill command below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe add to the comment something like "as opposed to solid paints, where we have a return statement"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| }; | ||
| task_sender.send(task).unwrap(); | ||
| self.run_coarse(true); | ||
| // TODO: Support encoded_paints in multithreading. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify this comment? This means we don't support clearing commands with indexed paints, but indexed paints themselves still work, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Indexed paints should render fine, the missing part is the overdraw elimination optimization for opaque indexed paints, since opacity can’t be determined without encoded_paints.
Todo comment updated.
422ece4 to
5e74593
Compare

This change mirrors what we do for solid colors, where we clear commands if a solid color covers the entire wide tile. We now apply the same approach to fully opaque images.
The benchmark scene below shows roughly a 30% performance improvement for this case, although the exact gain depends on how many images overlap across full wide tiles.
I’ll also open a follow-up PR to update this where
has_opacitiescurrently returnstruefor all images.