diff --git a/README.md b/README.md
index b039086..c83fc3b 100644
--- a/README.md
+++ b/README.md
@@ -78,7 +78,9 @@ By default Spellbook prefers boxed slices (`Box<[T]>`) and boxed strs (`Box<str>
 
 ##### Flag sets
 
-Words in the dictionary are associated with any number of flags, like `adventure/DRSMZG` mentioned above. The order of the flags as written in the dictionary isn't important. We need a way to look up whether a flag exists in that set quickly. The right tool for the job might seem like a `HashSet<Flag>` or a `BTreeSet<Flag>`. Those are mutable though so they carry some extra overhead. A dictionary contains many many flag sets and the overhead adds up. So what we use instead is a sorted `Box<[Flag]>`. To look up a flag we use `slice::binary_search` which works as well as a `BTreeSet`.
+Words in the dictionary are associated with any number of flags, like `adventure/DRSMZG` mentioned above. The order of the flags as written in the dictionary isn't important. We need a way to look up whether a flag exists in that set quickly. The right tool for the job might seem like a `HashSet<Flag>` or a `BTreeSet<Flag>`. Those are mutable though so they carry some extra overhead. A dictionary contains many many flag sets and the overhead adds up. So what we use instead is a sorted `Box<[Flag]>` and look up flags with `slice::binary_search`.
+
+Binary searching a small slice is typically a tiny bit slower than `slice::contains` but we prefer `slice::binary_search` for its consistent performance. See [`examples/bench-slice-contains.rs`](./examples/bench-slice-contains.rs) for more details.
 
 ##### Flags
 
diff --git a/examples/bench-slice-contains.rs b/examples/bench-slice-contains.rs
index 6278bb7..991a90e 100644
--- a/examples/bench-slice-contains.rs
+++ b/examples/bench-slice-contains.rs
@@ -1,129 +1,208 @@
-//! A benchmark for the different possible strategies of looking up a flag in a flagset.
-//!
-//! Originally I thought that binary search in a sorted flagset would clearly be better but it's
-//! actually typically 1-2ns worse (24ns total) for these cases. Flagsets are probably always
-//! small enough that binary search adds more overhead than it's worth.
-//!
-//! TODO: measure a histogram of flagset lengths in real `.dic` files. If binary search is causing
-//! more harm than good, let's switch `FlagSet::contains` to use `slice::contains`. Maybe drop the
-//! FlagSet wrapper struct completely and just use `Box<[Flag]>`.
+/*
+A benchmark for the different possible strategies of looking up a flag in a flagset.
+
+Originally I thought that binary search in a sorted flagset would clearly be better but it's
+actually typically 1-2ns worse (24ns total) for common cases. When flagsets are small enough,
+binary search adds more overhead than it's worth.
+
+I took a histogram of the length of flagsets used in LibreOffice/dictionaries:
+
+```text
+# of samples: 10352117
+31.812710385711444'th percentile of data is 0 with 3293289 samples
+68.81390540698101'th percentile of data is 1 with 3830407 samples
+79.5990230790475'th percentile of data is 2 with 1116488 samples
+86.29727619964109'th percentile of data is 3 with 693411 samples
+90.34976130969153'th percentile of data is 4 with 419518 samples
+93.03862195529669'th percentile of data is 5 with 278354 samples
+95.01828466583213'th percentile of data is 6 with 204937 samples
+95.84846268642443'th percentile of data is 7 with 85941 samples
+96.32064629872325'th percentile of data is 8 with 48881 samples
+96.89803544531036'th percentile of data is 9 with 59772 samples
+97.34328736817794'th percentile of data is 10 with 46093 samples
+97.78151657289035'th percentile of data is 11 with 45366 samples
+98.05298761596299'th percentile of data is 12 with 28103 samples
+98.37843795621707'th percentile of data is 13 with 33691 samples
+98.85777952470977'th percentile of data is 14 with 49622 samples
+99.00363374950264'th percentile of data is 15 with 15099 samples
+99.37310407137014'th percentile of data is 16 with 38248 samples
+99.53481012627658'th percentile of data is 17 with 16740 samples
+99.67957278689953'th percentile of data is 18 with 14986 samples
+99.71078379427126'th percentile of data is 19 with 3231 samples
+99.73314636996471'th percentile of data is 20 with 2315 samples
+99.79239995065744'th percentile of data is 21 with 6134 samples
+99.81532279822571'th percentile of data is 22 with 2373 samples
+99.833589593317'th percentile of data is 23 with 1891 samples
+99.85790346071242'th percentile of data is 24 with 2517 samples
+99.86442386615221'th percentile of data is 25 with 675 samples
+99.9408623376262'th percentile of data is 26 with 7913 samples
+99.94625253945642'th percentile of data is 27 with 558 samples
+99.949546551686'th percentile of data is 28 with 341 samples
+99.9525024688187'th percentile of data is 29 with 306 samples
+99.95853022140302'th percentile of data is 30 with 624 samples
+99.96063607086357'th percentile of data is 31 with 218 samples
+99.96498300782342'th percentile of data is 33 with 450 samples
+99.9685474961305'th percentile of data is 35 with 369 samples
+99.97166763088168'th percentile of data is 37 with 323 samples
+99.97396667754045'th percentile of data is 39 with 238 samples
+99.97595660868207'th percentile of data is 41 with 206 samples
+99.97818803632146'th percentile of data is 43 with 231 samples
+99.97989783152566'th percentile of data is 45 with 177 samples
+99.98148204855104'th percentile of data is 47 with 164 samples
+99.98326912263454'th percentile of data is 49 with 185 samples
+99.98466014246168'th percentile of data is 51 with 144 samples
+99.98588694467036'th percentile of data is 53 with 127 samples
+99.98694952926054'th percentile of data is 55 with 110 samples
+99.98779959693267'th percentile of data is 57 with 88 samples
+99.98877524278367'th percentile of data is 59 with 101 samples
+99.9897895280743'th percentile of data is 61 with 105 samples
+99.99072653448565'th percentile of data is 63 with 97 samples
+99.99233007123084'th percentile of data is 67 with 166 samples
+99.99370177133817'th percentile of data is 71 with 142 samples
+99.994802995368'th percentile of data is 75 with 114 samples
+99.9955178250014'th percentile of data is 79 with 74 samples
+99.99609741659604'th percentile of data is 83 with 60 samples
+99.9966866680506'th percentile of data is 87 with 61 samples
+99.99709238216685'th percentile of data is 91 with 42 samples
+99.99747877656328'th percentile of data is 95 with 40 samples
+99.99779755194034'th percentile of data is 99 with 33 samples
+99.9981163273174'th percentile of data is 103 with 33 samples
+99.99845442241427'th percentile of data is 107 with 35 samples
+99.99872489849177'th percentile of data is 111 with 28 samples
+99.99893741540981'th percentile of data is 115 with 22 samples
+99.99911129288822'th percentile of data is 119 with 18 samples
+99.99918857176749'th percentile of data is 123 with 8 samples
+99.99926585064678'th percentile of data is 127 with 8 samples
+99.99943006826526'th percentile of data is 135 with 17 samples
+99.99956530630402'th percentile of data is 143 with 14 samples
+99.99966190490312'th percentile of data is 151 with 10 samples
+99.99974884364232'th percentile of data is 159 with 9 samples
+99.99983578238152'th percentile of data is 167 with 9 samples
+99.99985510210135'th percentile of data is 175 with 2 samples
+99.99988408168107'th percentile of data is 183 with 3 samples
+99.99994204084054'th percentile of data is 191 with 6 samples
+99.99996136056035'th percentile of data is 199 with 2 samples
+99.99997102042026'th percentile of data is 207 with 1 samples
+99.99998068028017'th percentile of data is 215 with 1 samples
+99.99999034014009'th percentile of data is 231 with 1 samples
+100'th percentile of data is 271 with 1 samples
+```
+
+Most words have exactly one flag. Any empty flagset is the second most popular. A quite vast
+majority (90%) has four or fewer and we hit the 99th percentile with 15 flags in a flagset. This
+breakdown also changes between dictionaries: en_US for example uses only small flagsets (around
+ten at most).
+
+Given that `contains` is faster than `binary_search` for up to the 99th percentile it might seem
+worthwhile to switch to `contains`. `binary_search` though has much more predictable performance
+when we hit these outliers that live in the low hundreds of flags.
+
+```text
+$ cargo run --release --example bench-slice-contains
+Starting: Running benchmark(s). Stand by!
+
+•••••••••••••••••
+
+Method                                                              Mean                Samples
+-----------------------------------------------------------------------------------------------
+lookup non-existing flag high in many flags (contains)          89.93 ns    4,810,921/5,000,000
+lookup non-existing flag high in many flags (binary_search)     25.79 ns    4,999,695/5,000,000
+-----------------------------------------------------------------------------------------------
+lookup non-existing flag low in many flags (contains)           60.22 ns    4,997,489/5,000,000
+lookup non-existing flag low in many flags (binary_search)      24.97 ns    4,999,760/5,000,000
+-----------------------------------------------------------------------------------------------
+lookup existing flag in many flags (contains)                   50.24 ns    4,994,203/5,000,000
+lookup existing flag in many flags (binary_search)              24.84 ns    4,991,224/5,000,000
+-----------------------------------------------------------------------------------------------
+lookup non-existing flag high in few flags (contains)           22.72 ns    4,999,801/5,000,000
+lookup non-existing flag high in few flags (binary_search)      23.66 ns    4,999,788/5,000,000
+-----------------------------------------------------------------------------------------------
+lookup existing flag in few flags (contains)                    22.71 ns    4,999,821/5,000,000
+lookup existing flag in few flags (binary_search)               23.16 ns    4,999,821/5,000,000
+-----------------------------------------------------------------------------------------------
+lookup non-existing flag high in empty flags (contains)         22.49 ns    4,999,827/5,000,000
+lookup non-existing flag high in empty flags (binary_search)    22.95 ns    4,999,814/5,000,000
+```
+
+I think the tradeoff is worthwhile: we pay around 1 extra nanosecond on average but have no
+degenerate cases.
+
+*/
 
 use brunch::Bench;
 use std::hint::black_box;
 
 type Flag = std::num::NonZeroU16;
 
+const fn flag_n(n: u16) -> Flag {
+    assert!(n != 0);
+
+    unsafe { Flag::new_unchecked(n) }
+}
+
 const fn flag(ch: char) -> Flag {
     assert!(ch as u32 != 0);
 
     unsafe { Flag::new_unchecked(ch as u16) }
 }
 
-// en_US.dic `advise/LDRSZGB`
-const MANY_FLAGS_UNSORTED: &[Flag] = &[
-    flag('L'),
-    flag('D'),
-    flag('R'),
-    flag('S'),
-    flag('Z'),
-    flag('G'),
-    flag('B'),
-];
-const MANY_FLAGS_SORTED: &[Flag] = &[
-    flag('B'),
-    flag('D'),
-    flag('G'),
-    flag('L'),
-    flag('R'),
-    flag('S'),
-    flag('Z'),
-];
-
-// en_US.dic `advent/SM`
-const FEW_FLAGS_UNSORTED: &[Flag] = &[flag('S'), flag('M')];
-const FEW_FLAGS_SORTED: &[Flag] = &[flag('M'), flag('S')];
+// be_BY.dic (Belarusian) `абвал/2,9,10,12,13,16,17,22,23,62,67,68,69,70,74,250,270,290,296,297,298,299,300,322,335,363,364,365,367,368,398,399,400,403,408,423,424,425,426,427,479,514,520,521,522,523,524,525,526,527,528,529,530,543,577,585,633,634,635,639,640,641,642,643,647,648,649,650,652,726,747,773,774,775,778,794,836,838,1076,1082,1087,1088,1089,1090,1091,1092,1093,1094,1095,1096,1097,1175,1276,1695,1696,1697,1704,1705,1706,1707,1708,1709,1710,1711,1902,1903,1904,1905,1906,1907,1908,1909,1910,1911,1912,1992,1993,2055,2056,2057,2058,2059,2060,2130,2429,2668,2875,2876,2877,2878,2879,3185,3186,3187,3188,3189,3190,3191,3192,3193,3194,3316,3317,3318,3600,3726,3949,4381,4382,4383,4384,4385,4386`
+#[rustfmt::skip]
+const MANY_FLAGS: &[Flag] = &[flag_n(2), flag_n(9), flag_n(10), flag_n(12), flag_n(13), flag_n(16), flag_n(17), flag_n(22), flag_n(23), flag_n(62), flag_n(67), flag_n(68), flag_n(69), flag_n(70), flag_n(74), flag_n(250), flag_n(270), flag_n(290), flag_n(296), flag_n(297), flag_n(298), flag_n(299), flag_n(300), flag_n(322), flag_n(335), flag_n(363), flag_n(364), flag_n(365), flag_n(367), flag_n(368), flag_n(398), flag_n(399), flag_n(400), flag_n(403), flag_n(408), flag_n(423), flag_n(424), flag_n(425), flag_n(426), flag_n(427), flag_n(479), flag_n(514), flag_n(520), flag_n(521), flag_n(522), flag_n(523), flag_n(524), flag_n(525), flag_n(526), flag_n(527), flag_n(528), flag_n(529), flag_n(530), flag_n(543), flag_n(577), flag_n(585), flag_n(633), flag_n(634), flag_n(635), flag_n(639), flag_n(640), flag_n(641), flag_n(642), flag_n(643), flag_n(647), flag_n(648), flag_n(649), flag_n(650), flag_n(652), flag_n(726), flag_n(747), flag_n(773), flag_n(774), flag_n(775), flag_n(778), flag_n(794), flag_n(836), flag_n(838), flag_n(1076), flag_n(1082), flag_n(1087), flag_n(1088), flag_n(1089), flag_n(1090), flag_n(1091), flag_n(1092), flag_n(1093), flag_n(1094), flag_n(1095), flag_n(1096), flag_n(1097), flag_n(1175), flag_n(1276), flag_n(1695), flag_n(1696), flag_n(1697), flag_n(1704), flag_n(1705), flag_n(1706), flag_n(1707), flag_n(1708), flag_n(1709), flag_n(1710), flag_n(1711), flag_n(1902), flag_n(1903), flag_n(1904), flag_n(1905), flag_n(1906), flag_n(1907), flag_n(1908), flag_n(1909), flag_n(1910), flag_n(1911), flag_n(1912), flag_n(1992), flag_n(1993), flag_n(2055), flag_n(2056), flag_n(2057), flag_n(2058), flag_n(2059), flag_n(2060), flag_n(2130), flag_n(2429), flag_n(2668), flag_n(2875), flag_n(2876), flag_n(2877), flag_n(2878), flag_n(2879), flag_n(3185), flag_n(3186), flag_n(3187), flag_n(3188), flag_n(3189), flag_n(3190), flag_n(3191), flag_n(3192), flag_n(3193), flag_n(3194), flag_n(3316), flag_n(3317), flag_n(3318), flag_n(3600), flag_n(3726), flag_n(3949), flag_n(4381), flag_n(4382), flag_n(4383), flag_n(4384), flag_n(4385), flag_n(4386)];
+
+// en_US.dic `advent/SM` (sorted)
+const FEW_FLAGS: &[Flag] = &[flag('M'), flag('S')];
 
 const EMPTY_FLAGS: &[Flag] = &[];
 
-const UNKNOWN_FLAG_HIGH: Flag = flag('Z');
-const UNKNOWN_FLAG_LOW: Flag = flag('A');
+const UNKNOWN_FLAG_HIGH: Flag = flag_n(5000);
+const UNKNOWN_FLAG_LOW: Flag = flag_n(1);
 
-const D_FLAG: Flag = flag('D');
+const FLAG_S: Flag = flag('S');
+const FLAG_1709: Flag = flag_n(1709);
 
 const SAMPLES: u32 = 5_000_000;
 
 brunch::benches!(
-    Bench::new("contains: lookup non-existing high in many flags unsorted")
+    Bench::new("lookup non-existing flag high in many flags (contains)")
         .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_UNSORTED).contains(black_box(&UNKNOWN_FLAG_HIGH))),
-    Bench::new("contains: lookup non-existing high in many flags sorted")
+        .run(|| black_box(MANY_FLAGS).contains(&black_box(UNKNOWN_FLAG_HIGH))),
+    Bench::new("lookup non-existing flag high in many flags (binary_search)")
         .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_SORTED).contains(black_box(&UNKNOWN_FLAG_HIGH))),
-    Bench::new("contains: lookup non-existing low in many flags unsorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_UNSORTED).contains(black_box(&UNKNOWN_FLAG_LOW))),
-    Bench::new("contains: lookup non-existing low in many flags sorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_SORTED).contains(black_box(&UNKNOWN_FLAG_LOW))),
-    Bench::new("contains: lookup 'D' in many flags unsorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_UNSORTED).contains(black_box(&D_FLAG))),
-    Bench::new("contains: lookup 'D' in many flags sorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_SORTED).contains(black_box(&D_FLAG))),
-    // ---
-    Bench::new("contains: lookup non-existing high in few flags unsorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_UNSORTED).contains(&UNKNOWN_FLAG_HIGH)),
-    Bench::new("contains: lookup non-existing high in few flags sorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_SORTED).contains(&UNKNOWN_FLAG_HIGH)),
-    Bench::new("contains: lookup non-existing low in few flags unsorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_UNSORTED).contains(&UNKNOWN_FLAG_LOW)),
-    Bench::new("contains: lookup non-existing low in few flags sorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_SORTED).contains(&UNKNOWN_FLAG_LOW)),
-    Bench::new("contains: lookup 'D' in few flags unsorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_UNSORTED).contains(black_box(&D_FLAG))),
-    Bench::new("contains: lookup 'D' in few flags sorted")
-        .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_SORTED).contains(black_box(&D_FLAG))),
-    // ---
-    Bench::new("contains: lookup non-existing high in empty flags")
+        .run(|| black_box(MANY_FLAGS).binary_search(&black_box(UNKNOWN_FLAG_HIGH))),
+    Bench::spacer(),
+    Bench::new("lookup non-existing flag low in many flags (contains)")
         .with_samples(SAMPLES)
-        .run(|| black_box(EMPTY_FLAGS).contains(&UNKNOWN_FLAG_HIGH)),
-    Bench::new("contains: lookup non-existing low in empty flags")
+        .run(|| black_box(MANY_FLAGS).contains(&black_box(UNKNOWN_FLAG_LOW))),
+    Bench::new("lookup non-existing flag low in many flags (binary_search)")
         .with_samples(SAMPLES)
-        .run(|| black_box(EMPTY_FLAGS).contains(&UNKNOWN_FLAG_LOW)),
-    // ------
+        .run(|| black_box(MANY_FLAGS).binary_search(&black_box(UNKNOWN_FLAG_LOW))),
     Bench::spacer(),
-    // ------
-    Bench::new("binary_search: lookup non-existing high in many flags sorted")
+    Bench::new("lookup existing flag in many flags (contains)")
         .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_SORTED).binary_search(&UNKNOWN_FLAG_HIGH)),
-    Bench::new("binary_search: lookup non-existing low in many flags sorted")
+        .run(|| black_box(MANY_FLAGS).contains(&black_box(FLAG_1709))),
+    Bench::new("lookup existing flag in many flags (binary_search)")
         .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_SORTED).binary_search(&UNKNOWN_FLAG_LOW)),
-    Bench::new("binary_search: lookup 'D' in many flags sorted")
+        .run(|| black_box(MANY_FLAGS).binary_search(&black_box(FLAG_1709))),
+    Bench::spacer(),
+    Bench::new("lookup non-existing flag high in few flags (contains)")
         .with_samples(SAMPLES)
-        .run(|| black_box(MANY_FLAGS_SORTED).binary_search(black_box(&D_FLAG))),
-    // ---
-    Bench::new("binary_search: lookup non-existing high in few flags sorted")
+        .run(|| black_box(FEW_FLAGS).contains(&black_box(UNKNOWN_FLAG_HIGH))),
+    Bench::new("lookup non-existing flag high in few flags (binary_search)")
         .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_SORTED).binary_search(&UNKNOWN_FLAG_HIGH)),
-    Bench::new("binary_search: lookup non-existing low in few flags sorted")
+        .run(|| black_box(FEW_FLAGS).binary_search(&black_box(UNKNOWN_FLAG_HIGH))),
+    Bench::spacer(),
+    Bench::new("lookup existing flag in few flags (contains)")
         .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_SORTED).binary_search(&UNKNOWN_FLAG_LOW)),
-    Bench::new("binary_search: lookup 'D' in few flags sorted")
+        .run(|| black_box(FEW_FLAGS).contains(&black_box(FLAG_S))),
+    Bench::new("lookup existing flag in few flags (binary_search)")
         .with_samples(SAMPLES)
-        .run(|| black_box(FEW_FLAGS_SORTED).binary_search(black_box(&D_FLAG))),
-    // ---
-    Bench::new("binary_search: lookup non-existing high in empty flags")
+        .run(|| black_box(FEW_FLAGS).binary_search(&black_box(FLAG_S))),
+    Bench::spacer(),
+    Bench::new("lookup non-existing flag high in empty flags (contains)")
         .with_samples(SAMPLES)
-        .run(|| black_box(EMPTY_FLAGS).binary_search(&UNKNOWN_FLAG_HIGH)),
-    Bench::new("binary_search: lookup non-existing low in empty flags")
+        .run(|| black_box(EMPTY_FLAGS).contains(&black_box(UNKNOWN_FLAG_HIGH))),
+    Bench::new("lookup non-existing flag high in empty flags (binary_search)")
         .with_samples(SAMPLES)
-        .run(|| black_box(EMPTY_FLAGS).binary_search(&UNKNOWN_FLAG_LOW)),
+        .run(|| black_box(EMPTY_FLAGS).binary_search(&black_box(UNKNOWN_FLAG_HIGH))),
 );