-
Thank you for your issue! We don't have infrastructure for continuous profiling (open to contributions), so I couldn't say. I'd recommend running your program through a profiler. Also, generally it's not worth benchmarking debug builds; make sure you're benchmarking release builds with debug info.
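For reference, the usual way to get a release build with debug info (so the profiler can resolve symbols) is to enable debuginfo in Cargo's release profile:

```toml
# Cargo.toml — keep optimizations, but emit debug symbols for profiling
[profile.release]
debug = true
```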
-
Thanks for the feedback. Here are runs on release (with debug info) for both versions; flamegraph SVGs enclosed.

On 0.10.6:

```
sudo flamegraph -- target/release/examples/my_test
dtrace: description 'profile-997 ' matched 1 probe
1000 iterations took : 6.907084ms
```

On 0.11.0:

```
sudo flamegraph -- target/release/examples/my_test
dtrace: description 'profile-997 ' matched 1 probe
1000 iterations took : 2.10861775s
```
-
If you look at the Decoder implementation, the `decode_sequence_of` and `decode_set_of` methods rely on parsing until the decoder returns an error, hence the paths generating the backtraces are part of the "normal flow". As an example:

```rust
fn decode_sequence_of<D: Decode>(
    &mut self,
    tag: Tag,
    _: Constraints,
) -> Result<Vec<D>, Self::Error> {
    self.parse_constructed_contents(tag, true, |decoder| {
        let mut items = Vec::new();
        while let Ok(item) = D::decode(decoder) {
            items.push(item);
        }
        Ok(items)
    })
}
```

My five cents.
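To illustrate why this pattern is sensitive to error-construction cost: the terminating `Err` is hit on every single `SEQUENCE OF`, so anything expensive done while building the error (such as capturing a backtrace) lands in the hot path. A minimal self-contained sketch of the same control flow, with invented names (`MiniDecoder`, `decode_u8`, `EndOfInput` are illustrative, not from rasn):

```rust
// Cheap sentinel error: constructing it does no work (no backtrace capture).
#[derive(Debug)]
struct EndOfInput;

// Toy decoder over a byte slice; stands in for the real Decoder.
struct MiniDecoder<'a> {
    input: &'a [u8],
    pos: usize,
}

impl<'a> MiniDecoder<'a> {
    fn decode_u8(&mut self) -> Result<u8, EndOfInput> {
        match self.input.get(self.pos) {
            Some(&b) => {
                self.pos += 1;
                Ok(b)
            }
            // This Err is reached on every sequence, by design of the loop below.
            None => Err(EndOfInput),
        }
    }
}

// Same shape as decode_sequence_of above: loop until the decoder errors.
fn decode_sequence_of(decoder: &mut MiniDecoder) -> Vec<u8> {
    let mut items = Vec::new();
    while let Ok(item) = decoder.decode_u8() {
        items.push(item);
    }
    items
}

fn main() {
    let mut d = MiniDecoder { input: &[1, 2, 3], pos: 0 };
    assert_eq!(decode_sequence_of(&mut d), vec![1, 2, 3]);
    println!("ok");
}
```

If the error type instead captured a backtrace in its constructor, that cost would be paid once per decoded sequence even on fully valid input.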
-
Released the patch that fixes the biggest issue. @Arnix, if you don't mind, it would be great if you could post a new flame graph with 0.11.1 so that we can see what's taking up time there.
-
Here we go!

```
sudo flamegraph -- target/release/examples/my_test
dtrace: description 'profile-997 ' matched 1 probe
1000 iterations took : 347.906083ms
```
-
#197 should restore the performance completely for now.
-
First of all, thanks for your hard work. I have started on a project to handle incoming ASN.1-encoded CDRs for ingestion in a data lake. When upgrading to 0.11.0, the performance dropped like a stone.
To take away complexity, I made a small test based on the size_compare example.
Building and running this on 0.10.6 (debug build) yields the following result:

```
1000 iterations took : 115.877791ms
```

Building and running on 0.11.0 yields this:

```
1000 iterations took : 32.089247875s
```
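Timing lines of the shape above can be produced with a loop like the following. This is a minimal sketch; the actual test body isn't shown here, so the decoded value is a placeholder and `std::hint::black_box` stands in for the real decode call:

```rust
use std::time::Instant;

fn main() {
    let start = Instant::now();
    for _ in 0..1000 {
        // Placeholder for the real decode of one sample CDR; black_box
        // keeps the optimizer from eliding the work being timed.
        std::hint::black_box(0u32);
    }
    println!("1000 iterations took : {:?}", start.elapsed());
}
```

Note that wall-clock timing like this is sensitive to build profile, which is why the debug-vs-release distinction above matters so much.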
Any help pointing out whether I am missing something, or whether something fundamental changed (for the worse), is greatly appreciated.
Thanks
Arnix