
Conversation

yakovsh (Contributor) commented Feb 5, 2026

This PR adds support for grouping fragments into a smaller number of files by combining fragments whose hashes share the same first N characters into the same file (issues #74 and #739). I tried to keep the changes minimally invasive and added some tests. Let me know if anything needs to be adjusted or changed.
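
For reference, here is a minimal sketch of the grouping idea described above, assuming each fragment is identified by a hex hash string. The names (`group_fragments`, `prefix_len`) are illustrative only and do not correspond to Pagefind's internal API.

```rust
use std::collections::BTreeMap;

/// Group fragment hashes by their first `prefix_len` characters.
/// Fragments whose hashes share a prefix end up in the same output file.
/// (Illustrative names only; this is not Pagefind's internal API.)
fn group_fragments(hashes: &[String], prefix_len: usize) -> BTreeMap<String, Vec<String>> {
    let mut groups: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for hash in hashes {
        let prefix: String = hash.chars().take(prefix_len).collect();
        groups.entry(prefix).or_default().push(hash.clone());
    }
    groups
}

fn main() {
    let hashes = vec![
        "en_a1b2c3".to_string(),
        "en_a1f9e8".to_string(),
        "en_b77d01".to_string(),
    ];
    // With a 4-character prefix, the first two hashes land in the same file.
    for (prefix, members) in group_fragments(&hashes, 4) {
        println!("{prefix}: {} fragment(s)", members.len());
    }
}
```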

yakovsh requested a review from bglw as a code owner on February 5, 2026 at 04:08
bglw (Member) commented Feb 8, 2026

👋 Thanks for the PR!

This is a feature we need, though this isn't quite how I'd like to approach it. Some notes:

  1. I don't love shipping options that are tied to implementation details rather than to what someone actually wants to achieve. Rather than --fragment-group-len, we should just specify something like --max-fragments.
  2. This does require more explicit grouping when indexing, but one bonus would be storing the shorter hashes in the meta index file. For large sites this meta file gets somewhat large, so reducing the length and quantity of the fragment hashes stored in it is an extra win.
  3. The grouped (multi-page) fragments should be cached by the JS in the browser, rather than loading and parsing the chunk and discarding the rest each time.
  4. The test should cover a scenario where the grouping actually has an effect.

Let me know how you want to proceed with this one: whether you want to reshape this PR, or whether you'd rather I take a crack at this feature after 1.5.0 ships :)

yakovsh (Contributor, Author) commented Feb 10, 2026

With the "--max-fragments" approach, we would first need to know how many total pages there are, then decide how many fragments to generate. If "--max-fragments" is larger than the total number of pages, I assume things can stay as they are today. Otherwise, once we know the total page count, each fragment would hold roughly pages / max_fragments pages.

(An alternative approach would be to specify how many pages we store per fragment, something like "--pages-per-fragment", which could be a little cleaner but is probably too tied to implementation details.)
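
A rough sketch of the arithmetic described above, assuming --max-fragments caps the number of fragment files; pages_per_fragment is a hypothetical helper, not anything in the PR.

```rust
/// If the site has fewer pages than the cap, behaviour stays as it is today;
/// otherwise pages are spread across at most `max_fragments` files.
/// (Hypothetical helper for illustration, not part of this PR.)
fn pages_per_fragment(total_pages: usize, max_fragments: usize) -> usize {
    if total_pages <= max_fragments {
        1 // one page per fragment file, as today
    } else {
        // Ceiling division so the cap is never exceeded.
        (total_pages + max_fragments - 1) / max_fragments
    }
}

fn main() {
    assert_eq!(pages_per_fragment(500, 1_000), 1); // below the cap: unchanged
    assert_eq!(pages_per_fragment(10_000, 256), 40); // 10,000 pages into at most 256 files
}
```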

Then, instead of using page hashes, the fragments could be named like "en_001.pf_fragment", "en_002.pf_fragment", etc., and the metadata that ties pages to fragments could be numeric ("1, 2, 3") instead of the hashes used today?

Is this how you understand it?

bglw (Member) commented Feb 10, 2026

No, we do still need hashes. In fact, an issue I didn't highlight with this PR is that reducing the hash prefix too far will cause cache collisions. One of the jobs of the hashes is to allow indefinite caching of Pagefind assets, since they naturally cache-bust when content changes. An en_001.pf_fragment file would thus be cached ~forever and would serve stale data. These files will need to be named with a hash of all the fragments within (or a hash of the hashes, etc.).
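
A minimal sketch of deriving a group filename from its members' hashes, so the name changes whenever any member changes. DefaultHasher is just a stand-in here; Pagefind's actual hashing would differ, and the names are illustrative only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Name a grouped fragment file from the hashes of the fragments inside it,
/// so the filename changes whenever any member's content changes.
/// (DefaultHasher is not stable across Rust releases and is used here
/// purely for illustration.)
fn group_file_hash(member_hashes: &[String]) -> String {
    let mut sorted = member_hashes.to_vec();
    sorted.sort(); // order-independent: the same members always yield the same name
    let mut hasher = DefaultHasher::new();
    for h in &sorted {
        h.hash(&mut hasher);
    }
    format!("{:016x}", hasher.finish())
}

fn main() {
    let group = vec!["a1b2c3".to_string(), "d4e5f6".to_string()];
    println!("en_{}.pf_fragment", group_file_hash(&group));
}
```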

yakovsh (Contributor, Author) commented Feb 10, 2026

I hadn't thought of that, since in my use case I refresh the cache manually via a CDN. Let me try to reshape the PR.

