-
Notifications
You must be signed in to change notification settings - Fork 211
Reduce the memory footprint of wide tile commands #1325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hey @LaurenzV, awesome work! I’m adding myself as a reviewer to indicate that I plan to review the PR, but I’ll likely be able to get to it on Monday. 🙏 |
| pub strips: Box<[Strip]>, | ||
| #[cfg(feature = "multithreading")] | ||
| /// The index of the thread that owns the alpha buffer. | ||
| /// Always 0 in single-threaded mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I’m calculating the struct size correctly, it’s now 32 bytes instead of 24 bytes in single-threaded mode, so we’re paying an additional 8 bytes even though this field isn’t used there. While we probably won’t have thousands of clips simultaneously, would it still make sense to keep this behind a feature gate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really a problem though? That struct is only used to keep track of the clip stack, and realistically you aren't going to have more than a few nested clip path, so the impact on memory is very negligible I think (unlike the wide tile commands, where you could end up with thousands or even tens of thousands of commands depending on what you are drawing)? I could add it back, but the main reason I removed it is that it keeps the code simpler because we don't have to slap on the cfg attribute everywhere we use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it’s not a problem, I just think paying for something you don’t use isn’t ideal, even if the cost is negligible and the code is slightly simpler.
| tiles, | ||
| width, | ||
| height, | ||
| props: Props::default(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to assign reasonable starting capacities for vectors in Props, specifically for FillProps? I think we can assume it will need to grow during the first render since it’s the primary data-holding structure. It might also be worth checking whether this shows any impact in benchmarks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions for that? I think the impact will be negligible since, as before, we aren't likely pushing a lot of elements into it, but if you have a preference for the starting capacity I can add that. I don't think it would have a performance impact because the benchmarks (or at least the Blend2D benchmarks) runs with a "warm" render context, i.e. a render context that has been previously rested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries then, choosing the right capacity size isn’t easy. Regarding the warm render context, are you saying the render context isn’t recreated from scratch, but instead reset after each render?
| pub strips: Box<[Strip]>, | ||
| #[cfg(feature = "multithreading")] | ||
| /// The index of the thread that owns the alpha buffer. | ||
| /// Always 0 in single-threaded mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it’s not a problem, I just think paying for something you don’t use isn’t ideal, even if the cost is negligible and the code is slightly simpler.
| tiles, | ||
| width, | ||
| height, | ||
| props: Props::default(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries then, choosing the right capacity size isn’t easy. Regarding the warm render context, are you saying the render context isn’t recreated from scratch, but instead reset after each render?
| /// The index of the thread that owns the alpha buffer | ||
| /// containing the mask values at `alpha_idx`. | ||
| /// Always 0 in single-threaded mode. | ||
| pub thread_idx: u8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can consider this negligible in ClipProps as well? And for FillProps, not having thread_idx wouldn’t actually change the struct size, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, yes. As mentioned though, I can change it back in case you consider it important. 😄
That's how it's currently done in the Blend2D benchmark suite, yep! |
|
@grebmeg Thanks a lot!! I'll try to take a look at your other PRs soon. |
This PR reduces the memory footprint of wide tile commands from 48 bytes to 16 bytes. This is achieved by introducing a number of changes:
This leads to some nice speedups in many cases:

