
spilled() is very slow #361

Open

Brogolem35 opened this issue Oct 18, 2024 · 1 comment

Comments


Brogolem35 commented Oct 18, 2024

I am the author of the markov_str library, and I am trying out SmallVec to speed up the creation of my Markov chains. While benchmarking with cargo flamegraph I ran into this issue: equivalence checks between SmallVecs take too long, and almost all of the time is spent in spilled().

Example:

MarkovChain type:

pub struct MarkovChain {
	items: HashMap<SmallVec<[Spur; 4]>, ChainItem>,
	state_size: usize,
	regex: Regex,
	cache: Rodeo,
}

benchmarked function:

	/// Adds text as training data. The tokens will be created with the regex of the MarkovChain.
	pub fn add_text(&mut self, text: &str) {
		let tokens: Vec<Spur> = self
			.regex
			.find_iter(text)
			.map(|t| self.cache.get_or_intern(t.as_str()))
			.collect();

		// vec.windows(0) panics for some reason.
		if tokens.is_empty() {
			return;
		}

		// Creating a preallocated buffer and filling and cleaning it instead of creating a new one every loop is way more efficient.
		let mut prevbuf: SmallVec<[Spur; 4]> = SmallVec::with_capacity(self.state_size);
		for win in tokens.windows(tokens.len().min(self.state_size + 1)) {
			let rel = win.last().unwrap();

			for i in 1..win.len() {
				prevbuf.clear();
				for t in win.iter().rev().skip(1).take(i).rev() {
					prevbuf.push(*t);
				}

				match self.items.raw_entry_mut().from_key(&prevbuf) {
					RawEntryMut::Occupied(mut view) => {
						view.get_mut().add(*rel);
					}
					RawEntryMut::Vacant(view) => {
						view.insert(prevbuf.clone(), ChainItem::new(*rel));
					}
				}
			}
		}
	}
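
For anyone who wants to reproduce this outside markov_str, here is a stripped-down, self-contained sketch of the same hot path (this is not the markov_str code: synthetic u32 tokens stand in for lasso::Spur, and the plain entry() API stands in for the raw-entry lookup). Hashing and comparing the SmallVec keys resolves the slice each time, and smallvec 1.x branches on spilled() when doing so:

// Self-contained sketch (not the markov_str code): a HashMap keyed by
// SmallVec, exercising the same hash + equality path as add_text above.
// Every lookup hashes and compares the key slice, and smallvec 1.x checks
// spilled() each time it resolves the slice (inline buffer vs heap).
use smallvec::SmallVec;
use std::collections::HashMap;

fn main() {
	// Synthetic "tokens" standing in for interned lasso::Spur values.
	let tokens: Vec<u32> = (0..1_000_000u32).map(|i| i % 1024).collect();
	let mut counts: HashMap<SmallVec<[u32; 4]>, u64> = HashMap::new();

	for win in tokens.windows(4) {
		let key: SmallVec<[u32; 4]> = SmallVec::from_slice(win);
		// entry() hashes the key and, on collision, compares slices.
		*counts.entry(key).or_insert(0) += 1;
	}
	println!("distinct keys: {}", counts.len());
}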

Crates used:

hashbrown = "0.15.0"
lasso = {version = "0.7.3", features = ["ahasher", "inline-more"]}
rand = "0.8.5"
regex = "1.11.0"
smallvec = "1.13.2"

Flamegraph output:
[flamegraph image attached: most of the time is attributed to spilled()]

The code and sample data I used for this benchmark can be found in the Brogolem35/markov_str repository.

Brogolem35 changed the title from "spilled() way slower than it needs to be" to "spilled() is very slow" on Oct 18, 2024
mbrubeck (Collaborator) commented

I can reproduce this profile locally in a release + debuginfo build, using Rust 1.82.0-x86_64-unknown-linux-gnu.

I suspect the debuginfo is misleading, since the spilled function itself is just a single integer comparison. In the optimized build, inlining may have combined it with code from other functions. Instruction-level profiling might be helpful.
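
For context, spilled() itself boils down to something like the following (a rough, self-contained illustration, not the actual smallvec source):

// Rough illustration only, not the real smallvec internals: spilled() is
// essentially one comparison of the stored capacity against the inline
// capacity, so the cost in the profile likely belongs to inlined callers
// (slice resolution, Eq, Hash) rather than to spilled() itself.
struct MiniSmallVec<const N: usize> {
	capacity: usize,
	// inline buffer / heap pointer union elided
}

impl<const N: usize> MiniSmallVec<N> {
	#[inline]
	fn spilled(&self) -> bool {
		self.capacity > N
	}
}

fn main() {
	let v = MiniSmallVec::<4> { capacity: 3 };
	assert!(!v.spilled()); // still using the inline buffer
}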

The slowdown may be somewhat expected. SmallVec is generally cheaper to create and destroy than Vec, but it can be more expensive to access the contents. If you can use const generics to make your maximum size a compile-time constant, then a crate like arrayvec might be worth testing as another alternative.
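
A rough sketch of what that could look like (not tested against markov_str; Spur and ChainItem below are stand-ins for the real types, and the maximum state size becomes a const generic parameter):

use arrayvec::ArrayVec;
use std::collections::HashMap;

type Spur = u32;      // stand-in for lasso::Spur
struct ChainItem;     // stand-in for markov_str's ChainItem

// MAX_STATE is the compile-time upper bound on state_size.
pub struct MarkovChain<const MAX_STATE: usize> {
	// ArrayVec always stores its elements inline, so slice access and
	// equality checks never branch on a spilled()-style check.
	items: HashMap<ArrayVec<Spur, MAX_STATE>, ChainItem>,
	state_size: usize, // runtime value, must be <= MAX_STATE
}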
