Security: 16-bit pasta IDs allow enumeration and collisions #322

@ZigZagT

Summary

Pasta IDs are generated as 16-bit random numbers (u16), giving only 65,536 possible values. This causes two problems:

  1. Enumeration: --private (unlisted) pastas are trivially discoverable by brute-force
  2. Collisions: no uniqueness check, so new pastas silently shadow existing ones

Root Cause

// src/endpoints/create.rs:125
let mut new_pasta = Pasta {
    id: rand::thread_rng().gen::<u16>() as u64,
    // ...
};

The ID is used directly without checking for duplicates:

// src/endpoints/create.rs:377-378
let mut pastas = data.pastas.lock().unwrap();
pastas.push(new_pasta);

The encoding layer (animal names or --hash-ids) does not add entropy — both are reversible mappings of the same 16-bit number:

  • Animal names: 64 words, base-64 encoding -> 1-3 word slugs cover the full u16 space
  • Hash IDs: same unsalted Harsh instance, deterministically reversible
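To illustrate why the encoding adds no entropy, here is a minimal sketch of a base-64 digit mapping and its inverse. The function names `to_digits` and `from_digits` are hypothetical, and the digit order is an assumption; the project's real animal-name code may differ in detail, but any such mapping is reversible without a secret.

```rust
/// Size of the word list, per the issue text (64 animal names).
const BASE: u64 = 64;

/// Split an ID into base-64 digits, most significant first.
/// Each digit would index into the word list (hypothetical layout).
fn to_digits(mut id: u64) -> Vec<u8> {
    let mut digits = Vec::new();
    loop {
        digits.push((id % BASE) as u8);
        id /= BASE;
        if id == 0 {
            break;
        }
    }
    digits.reverse();
    digits
}

/// Invert `to_digits`. No key or salt is involved, so anyone who knows
/// the word list can map a slug back to its numeric ID and vice versa.
fn from_digits(digits: &[u8]) -> u64 {
    digits
        .iter()
        .fold(0u64, |acc, &d| acc * BASE + u64::from(d))
}
```

Calling `to_digits(65535)` yields three digits, and `from_digits` recovers the original value, so an attacker can generate every valid slug from a plain counter.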

Enumeration

The --private flag makes pastas "unlisted" (hidden from the listing page). The only access control is knowing the URL. With 65,536 possible IDs:

  • An attacker iterates all values: for id in 0..65536 { GET /upload/{encode(id)} }
  • At 100 requests/second, full enumeration takes ~11 minutes
  • Every existing pasta (public, unlisted, burn-after-read) is found
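The time estimate above is simple division; a small sketch makes it easy to re-run with other request rates. The helper name `enumeration_seconds` is made up for illustration, and the model assumes a constant request rate with no rate limiting.

```rust
/// Seconds needed to request every ID in `id_space` once,
/// assuming a constant rate of `requests_per_second`.
fn enumeration_seconds(id_space: u64, requests_per_second: u64) -> f64 {
    id_space as f64 / requests_per_second as f64
}
```

For the current 16-bit space, `enumeration_seconds(1 << 16, 100)` is 655.36 seconds, or just under 11 minutes.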

Collisions

With only 65,536 possible IDs, collision probability grows fast. The chance that at least two pastas share an ID is 1 - (65535/65536) * (65534/65536) * ... * ((65536-N+1)/65536) for N pastas:

Active pastas | Collision probability
------------- | ---------------------
100           | ~7%
256           | ~39%
300           | ~50%
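These figures can be reproduced by evaluating the product directly. The following is a sketch only; `collision_probability` is a hypothetical helper and assumes IDs are drawn uniformly and independently.

```rust
/// Probability that at least two of `n` uniformly random IDs drawn from
/// `space` possible values coincide (the birthday problem).
fn collision_probability(n: u64, space: u64) -> f64 {
    if n > space {
        // Pigeonhole: more IDs than values guarantees a collision.
        return 1.0;
    }
    // Probability that all n draws are distinct:
    // (space/space) * ((space-1)/space) * ... * ((space-n+1)/space)
    let mut p_unique = 1.0_f64;
    for k in 0..n {
        p_unique *= (space - k) as f64 / space as f64;
    }
    1.0 - p_unique
}
```

With `space = 65_536`, the function returns roughly 0.07 for 100 pastas, 0.39 for 256, and 0.50 for 300.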

A collision means a new pasta silently shadows an older one — the older pasta becomes inaccessible. No error or warning is shown.

Impact

  • --private (unlisted) pastas provide no real privacy
  • Burn-after-read pastas can be found and consumed before the intended recipient
  • Data loss from collisions on any deployment with more than a few hundred pastas
  • Combined with the forgeable owner_token issue, an attacker can enumerate AND repeatedly read burn-after-read pastas

Suggested Fix

Change gen::<u16>() as u64 to gen_range(0..=9_007_199_254_740_991) (2^53 - 1), and add a uniqueness check before insertion.
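A rough sketch of the retry loop, not the actual patch: `fresh_id` and `MAX_SAFE_ID` are hypothetical names, and the random source is passed in as a closure so the example stays free of external crates. In the real handler the candidate would come from the `gen_range(0..=9_007_199_254_740_991)` call proposed above, and `existing` would be the IDs of the pastas held under the mutex.

```rust
/// Upper bound for new IDs: 2^53 - 1.
const MAX_SAFE_ID: u64 = (1 << 53) - 1;

/// Draw candidate IDs from `next_random` until one is not already in use.
/// `next_random` is a stand-in for the real RNG (an assumption of this sketch).
fn fresh_id(existing: &[u64], mut next_random: impl FnMut() -> u64) -> u64 {
    loop {
        // MAX_SAFE_ID is 53 one-bits, so masking a uniform u64
        // yields a uniform value in 0..=MAX_SAFE_ID.
        let candidate = next_random() & MAX_SAFE_ID;
        if !existing.contains(&candidate) {
            return candidate;
        }
        // Collision with an existing pasta: draw again.
    }
}
```

The linear scan mirrors the `pastas.iter().any(...)` check; with a 53-bit space the loop will almost always exit on the first iteration.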

Why 2^53 and not larger?

The id field is u64 throughout the codebase. Using the full u64 range hits compatibility limits:

  • SQLite: id INTEGER PRIMARY KEY uses signed i64. Values above i64::MAX (2^63 - 1) would overflow.
  • JSON / JavaScript: Number.MAX_SAFE_INTEGER is 2^53 - 1 (9,007,199,254,740,991). IDs above this lose precision when parsed by JS clients (the web frontend, any API consumers).
  • Animal names: with u64, URLs grow to 10-11 animal words. With 2^53, they stay around 8-9 — longer than today but still reasonable.
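The slug lengths follow from each word carrying 6 bits (64 words). A small sketch, assuming the encoder emits no leading zero-words; `words_needed` is a hypothetical helper, not a function from the codebase.

```rust
/// Number of base-64 "digits" (animal words) needed to write `id`,
/// assuming 64 words in the list and no leading zero-words.
fn words_needed(id: u64) -> u32 {
    // Significant bits in `id`; treat 0 as needing one bit (one word).
    let bits = (u64::BITS - id.leading_zeros()).max(1);
    // Each word encodes 6 bits, so round up bits / 6.
    (bits + 5) / 6
}
```

Under these assumptions, `words_needed(u16::MAX as u64)` is 3, `words_needed((1 << 53) - 1)` is 9, and `words_needed(u64::MAX)` is 11.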

Is 2^53 secure?

At 100 requests/second, enumerating 2^53 values takes ~2.8 million years. Collision probability among 1 million active pastas is ~0.006%. More than sufficient.
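For reference, the arithmetic behind these two numbers, using the standard birthday-bound approximation with d = 2^53 possible IDs and N = 10^6 pastas (uniform, independent IDs assumed):

```latex
\begin{align*}
T &= \frac{2^{53}}{100\ \text{s}^{-1}}
   \approx 9.0 \times 10^{13}\ \text{s}
   \approx 2.85 \times 10^{6}\ \text{years} \\[4pt]
P(\text{collision}) &\approx 1 - e^{-N(N-1)/(2d)}
   \approx \frac{N^{2}}{2d}
   = \frac{10^{12}}{2 \cdot 2^{53}}
   \approx 5.6 \times 10^{-5}
\end{align*}
```

So the collision probability at one million pastas comes out well under 0.01%.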

Why not UUID?

Changing id from u64 to UUID would require:

  • Rewriting the SQLite schema (INTEGER PRIMARY KEY -> TEXT)
  • Rewriting all animal-name and hashid encoding/decoding (both take u64)
  • Changing all endpoint handlers (URL -> u64 lookup)
  • Changing filesystem paths (attachments/{id_as_animals()}/)
  • Changing readonly key material (encrypt(id.to_string(), ...))
  • Breaking all existing deployments (database migration, orphaned attachment directories)

Capping to 2^53 - 1 solves the security problem with a one-line change to ID generation (plus the uniqueness check) and is fully backwards compatible.

Backwards Compatibility

Fully backwards compatible. No changes to:

  • Pasta struct or database schema — the id field remains u64. SQLite INTEGER PRIMARY KEY and JSON serialization work unchanged.
  • Existing pastas — old pastas keep their small IDs. They coexist with new larger IDs. No migration needed.
  • Existing links — all URLs to existing pastas continue to work. The animal-name and hashid decoding functions accept any u64 value, regardless of magnitude.
  • Encoding functions: to_animal_names(u64) and to_hashids(u64) are unchanged. Larger IDs produce longer slugs (8-9 animal words instead of 1-3), but the encoding is the same algorithm.
  • Uniqueness check — the new while pastas.iter().any(|p| p.id == new_pasta.id) loop runs inside the existing mutex lock, so it correctly checks against both old and new pastas.

The only observable difference: new pastas get longer URLs. Existing pastas and links are unaffected.

Potentially Related Issues

Neither frames the 16-bit space as a security vulnerability.


This issue was drafted with AI assistance (Claude). If any facts are incorrect or misrepresented, please point them out and I'll correct them.
