
spilled() is very slow #361

Open

Brogolem35 opened this issue Oct 18, 2024 · 1 comment

Comments


Brogolem35 commented Oct 18, 2024

I am the author of the markov_str library, and I am trying out SmallVec to speed up the creation of my Markov chains. While benchmarking with cargo flamegraph I ran into this issue: equivalence checks between SmallVecs take too long, and almost all of the time is spent in spilled().

Example:

MarkovChain type:

pub struct MarkovChain {
	items: HashMap<SmallVec<[Spur; 4]>, ChainItem>,
	state_size: usize,
	regex: Regex,
	cache: Rodeo,
}

benchmarked function:

	/// Adds text as training data. The tokens will be created with the regex of the MarkovChain.
	pub fn add_text(&mut self, text: &str) {
		let tokens: Vec<Spur> = self
			.regex
			.find_iter(text)
			.map(|t| self.cache.get_or_intern(t.as_str()))
			.collect();

		// vec.windows(0) panics for some reason.
		if tokens.is_empty() {
			return;
		}

		// Creating a preallocated buffer and filling and cleaning it instead of creating a new one every loop is way more efficient.
		let mut prevbuf: SmallVec<[Spur; 4]> = SmallVec::with_capacity(self.state_size);
		for win in tokens.windows(tokens.len().min(self.state_size + 1)) {
			let rel = win.last().unwrap();

			for i in 1..win.len() {
				prevbuf.clear();
				for t in win.iter().rev().skip(1).take(i).rev() {
					prevbuf.push(*t);
				}

				match self.items.raw_entry_mut().from_key(&prevbuf) {
					RawEntryMut::Occupied(mut view) => {
						view.get_mut().add(*rel);
					}
					RawEntryMut::Vacant(view) => {
						view.insert(prevbuf.clone(), ChainItem::new(*rel));
					}
				}
			}
		}
	}
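
For anyone who wants to reproduce this outside markov_str, here is a stripped-down, self-contained sketch of the same hot path (this is not the markov_str code: synthetic u32 tokens stand in for lasso::Spur, and the plain entry() API stands in for the raw-entry lookup). Hashing and comparing the SmallVec keys resolves the slice each time, and smallvec 1.x branches on spilled() when doing so:

// Self-contained sketch (not the markov_str code): a HashMap keyed by
// SmallVec, exercising the same hash + equality path as add_text above.
// Every lookup hashes and compares the key slice, and smallvec 1.x checks
// spilled() each time it resolves the slice (inline buffer vs heap).
use smallvec::SmallVec;
use std::collections::HashMap;

fn main() {
	// Synthetic "tokens" standing in for interned lasso::Spur values.
	let tokens: Vec<u32> = (0..1_000_000u32).map(|i| i % 1024).collect();
	let mut counts: HashMap<SmallVec<[u32; 4]>, u64> = HashMap::new();

	for win in tokens.windows(4) {
		let key: SmallVec<[u32; 4]> = SmallVec::from_slice(win);
		// entry() hashes the key and, on collision, compares slices.
		*counts.entry(key).or_insert(0) += 1;
	}
	println!("distinct keys: {}", counts.len());
}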

Crates used:

hashbrown = "0.15.0"
lasso = {version = "0.7.3", features = ["ahasher", "inline-more"]}
rand = "0.8.5"
regex = "1.11.0"
smallvec = "1.13.2"

Flamegraph output:
[flamegraph image attached: most of the time is attributed to spilled()]

The code and sample data I used for this benchmark can be found in the Brogolem35/markov_str repository.

Brogolem35 changed the title from "spilled() way slower than it needs to be" to "spilled() is very slow" on Oct 18, 2024
mbrubeck (Collaborator) commented

I can reproduce this profile locally in a release + debuginfo build, using Rust 1.82.0-x86_64-unknown-linux-gnu.

I suspect the debuginfo is misleading, since the spilled function itself is just a single integer comparison. In the optimized build, inlining may have combined it with code from other functions. Instruction-level profiling might be helpful.
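
For context, spilled() itself boils down to something like the following (a rough, self-contained illustration, not the actual smallvec source):

// Rough illustration only, not the real smallvec internals: spilled() is
// essentially one comparison of the stored capacity against the inline
// capacity, so the cost in the profile likely belongs to inlined callers
// (slice resolution, Eq, Hash) rather than to spilled() itself.
struct MiniSmallVec<const N: usize> {
	capacity: usize,
	// inline buffer / heap pointer union elided
}

impl<const N: usize> MiniSmallVec<N> {
	#[inline]
	fn spilled(&self) -> bool {
		self.capacity > N
	}
}

fn main() {
	let v = MiniSmallVec::<4> { capacity: 3 };
	assert!(!v.spilled()); // still using the inline buffer
}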

The slowdown may be somewhat expected. SmallVec is generally cheaper to create and destroy than Vec, but it can be more expensive to access the contents. If you can use const generics to make your maximum size a compile-time constant, then a crate like arrayvec might be worth testing as another alternative.
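
A rough sketch of what that could look like (not tested against markov_str; Spur and ChainItem below are stand-ins for the real types, and the maximum state size becomes a const generic parameter):

use arrayvec::ArrayVec;
use std::collections::HashMap;

type Spur = u32;      // stand-in for lasso::Spur
struct ChainItem;     // stand-in for markov_str's ChainItem

// MAX_STATE is the compile-time upper bound on state_size.
pub struct MarkovChain<const MAX_STATE: usize> {
	// ArrayVec always stores its elements inline, so slice access and
	// equality checks never branch on a spilled()-style check.
	items: HashMap<ArrayVec<Spur, MAX_STATE>, ChainItem>,
	state_size: usize, // runtime value, must be <= MAX_STATE
}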
