Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocation reduction #39

Merged
merged 2 commits into from
Sep 3, 2024
Merged

Conversation

aconz2
Copy link
Contributor

@aconz2 aconz2 commented Aug 16, 2024

Summary of the PR

I stumbled upon this code in profiling cloud hypervisor and seeing some vec things show up. I can't measure any real difference in my cloud hypervisor testing but there is a measurable difference in the tests. Not sure if you want it but figured I'd open a PR

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

  • All commits in this PR have Signed-Off-By trailers (with
    git commit -s), and the commit message has max 60 characters for the
    summary and max 75 characters for each description line.
  • All added/changed functionality has a corresponding unit/integration
    test.
  • All added/changed public-facing functionality has entries in the "Upcoming
    Release" section of CHANGELOG.md (if no such section exists, please create one).
  • Any newly added unsafe code is properly documented.

@aconz2
Copy link
Contributor Author

aconz2 commented Aug 16, 2024

see the commit message for some more details/numbers

Copy link
Collaborator

@rbradford rbradford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - thanks!

src/aml.rs Show resolved Hide resolved
src/aml.rs Show resolved Hide resolved
@@ -273,13 +269,10 @@ impl<'a> Aml for Package<'a> {
child.to_aml_bytes(&mut bytes);
}

let mut pkg_length = create_pkg_length(bytes.len(), true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree that reversing and inserting at 0 is usually equivalent to appending. But I think @timw-rivos must have had a reason for doing it this way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I just thought it through once and didn't look at it again ;)

@@ -87,6 +81,10 @@ impl AmlSink for alloc::vec::Vec<u8> {
fn byte(&mut self, byte: u8) {
self.push(byte);
}

fn vec(&mut self, v: &[u8]) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should rename this API to e.g.

AmlSink::sink_{byte, word, dword, qword, slice} @sboeuf @timw-rivos WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is private to this library, but it is more clear

@aconz2 aconz2 force-pushed the allocation_reduction branch 2 times, most recently from 047ceac to 5901f5b Compare August 17, 2024 23:21
timw-rivos
timw-rivos previously approved these changes Aug 19, 2024
aconz2 added 2 commits August 26, 2024 14:11
Signed-off-by: Andrew Consroe <aconz2@gmail.com>
Add AmlSink::vec specialization for Vec so we can memcpy

Convert calls like: `for byte in bytes { sink.byte(byte); }` into
`sink.vec(bytes)`

Remove all calls to Vec::insert by reordering appends or doing a single
memmove and slice copy

Evaluation done by timing the tests and tracking allocations with
valgrind's dhat, both for heap tracking and memcopy tracking.

after/before for total blocks allocated is 6588/8029 = 82%
after/before fastest run time is 4.6/5.9 = 78%

```
set -e
for x in main allocation_reduction; do
    git checkout $x
    echo "=== $x ==="
    cargo test --release --tests --no-run &>/dev/null
    exe=$(find target/release/deps -executable -name 'acpi_tables*')
    sha256sum ${exe}
    taskset -c 0 hyperfine --shell=none "${exe} --test-threads 1"

    valgrind --tool=dhat --mode=heap --dhat-out-file=/dev/null ${exe} \
      --test-threads 1 > /dev/null
    valgrind --tool=dhat --mode=copy --dhat-out-file=/dev/null ${exe} \
      --test-threads 1 > /dev/null
done
```

And the output (with manual removal of extraneous output)

```
=== main ===
Benchmark 1: target/release/deps/acpi_tables-44d55ae4eaed1824
--test-threads 1
  Time (mean ± σ):    6.2 ms ±   0.1 ms    [User: 2.0 ms, System: 2.9
ms]
  Range (min … max):  5.9 ms …   6.7 ms    474 runs

DHAT --mode=heap

Total:     2,452,242 bytes in 8,029 blocks
At t-gmax: 75,063 bytes in 654 blocks
At t-end:  0 bytes in 0 blocks
Reads:     2,953,771 bytes
Writes:    2,194,154 bytes

DHAT --mode=copy

Total:     488,009 bytes in 4,448 blocks

=== allocation_reduction ===
Benchmark 1: target/release/deps/acpi_tables-44d55ae4eaed1824
--test-threads 1
  Time (mean ± σ):    5.0 ms ±   0.2 ms    [User: 1.0 ms, System: 2.8
ms]
  Range (min … max):  4.6 ms …   5.5 ms    587 runs

DHAT --mode=heap

Total:     2,410,141 bytes in 6,588 blocks
At t-gmax: 75,575 bytes in 654 blocks
At t-end:  0 bytes in 0 blocks
Reads:     2,920,895 bytes
Writes:    2,160,911 bytes

DHAT --mode=copy

Total:     1,149,621 bytes in 25,894 blocks
```

Signed-off-by: Andrew Consroe <aconz2@gmail.com>
@aconz2
Copy link
Contributor Author

aconz2 commented Aug 26, 2024

was out last week, just fixed the cargo fmt errors

@timw-rivos timw-rivos self-requested a review August 26, 2024 19:24
Copy link
Collaborator

@rbradford rbradford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution and patience through the review process!

@rbradford rbradford merged commit 849d595 into rust-vmm:main Sep 3, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants