@pingSubhajit
Contributor

Summary

This release introduces pluggable advanced chunking methods as a major feature and fixes a CLI crash on Windows.

Changes

Feature: Advanced Chunking Methods (#43)

  • Replaced word-based chunking with token-based recursive chunking using js-tiktoken as the new default.
  • Introduced pluggable chunker plugins (semantic, code, agentic, markdown, and hierarchical), each with a dedicated registry entry, plus shared utilities for LLM, text, and optional-dependency handling.
  • Added CLI support for selecting and configuring chunker plugins via unrag add.
  • Wired advanced chunkers into the /install wizard and the doctor command for validation.
  • Added comprehensive documentation for all new chunking methods under docs/(unrag)/chunking/.
  • Updated agent skills references and core type definitions to reflect the new chunking architecture.
  • Added js-tiktoken as a new runtime dependency.
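
For readers unfamiliar with the approach, here is a minimal sketch of token-based recursive chunking. It is not the code in packages/unrag/registry/core/chunking.ts: the names recursiveChunk, ChunkOptions, and TokenCounter, the separator list, and the whitespace-based wordCount stand-in are all assumptions made for illustration. The token counter is injected so the sketch runs without dependencies.

```typescript
// Hypothetical names throughout; this is an illustrative sketch, not unrag's API.
type TokenCounter = (text: string) => number;

interface ChunkOptions {
  maxTokens: number;         // upper bound on tokens per chunk
  countTokens: TokenCounter; // injected so any tokenizer can be plugged in
  separators?: string[];     // tried in order, coarsest first
}

const DEFAULT_SEPARATORS = ["\n\n", "\n", ". ", " "];

function recursiveChunk(text: string, opts: ChunkOptions, level = 0): string[] {
  const { maxTokens, countTokens } = opts;
  const separators = opts.separators ?? DEFAULT_SEPARATORS;
  const trimmed = text.trim();
  if (trimmed === "") return [];
  if (countTokens(trimmed) <= maxTokens) return [trimmed];

  const sep = separators[level];
  if (sep === undefined) {
    // No separators left: halve by code points as a last resort.
    const chars = Array.from(trimmed);
    if (chars.length <= 1) return [trimmed];
    const mid = Math.ceil(chars.length / 2);
    return [
      ...recursiveChunk(chars.slice(0, mid).join(""), opts, level),
      ...recursiveChunk(chars.slice(mid).join(""), opts, level),
    ];
  }

  const pieces = trimmed.split(sep).filter((p) => p.trim() !== "");
  if (pieces.length <= 1) return recursiveChunk(trimmed, opts, level + 1);

  // Greedily merge adjacent pieces while the merged text stays within budget.
  // The separator is kept inside merged chunks and dropped at chunk boundaries.
  const chunks: string[] = [];
  let current = "";
  for (const piece of pieces) {
    const candidate = current === "" ? piece : current + sep + piece;
    if (countTokens(candidate) <= maxTokens) {
      current = candidate;
      continue;
    }
    if (current !== "") chunks.push(current.trim());
    current = "";
    if (countTokens(piece) <= maxTokens) {
      current = piece;
    } else {
      // A single piece is still too large: retry with the next, finer separator.
      chunks.push(...recursiveChunk(piece, opts, level + 1));
    }
  }
  if (current !== "") chunks.push(current.trim());
  return chunks;
}

// Stand-in counter that treats whitespace-separated words as tokens.
// With js-tiktoken one would likely pass something such as
// (t) => enc.encode(t).length, where enc comes from getEncoding(...);
// which encoding unrag uses is not stated in this PR.
const wordCount: TokenCounter = (t) => t.split(/\s+/).filter(Boolean).length;

const sample = "one two three four\n\nfive six seven eight nine ten\n\neleven";
const sampleChunks = recursiveChunk(sample, { maxTokens: 4, countTokens: wordCount });
```

Injecting the counter keeps the splitting logic independent of the tokenizer, which is one plausible way to swap a word-based count for a token-based one without touching the recursion.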

Fix: Windows CLI Upgrade Crash

  • Resolved a path resolution conflict that caused the upgrade CLI command to crash on Windows.

Files Changed

  • 63 files changed (+7296, -854)
  • Core chunking engine: packages/unrag/registry/core/chunking.ts, types.ts, ingest.ts
  • New chunker plugins: packages/unrag/registry/chunkers/{agentic,code,hierarchical,markdown,semantic}/
  • Shared chunker utilities: packages/unrag/registry/chunkers/_shared/
  • CLI commands and registry: packages/unrag/cli/
  • Install wizard: apps/web/app/install/
  • Documentation: apps/web/content/docs/(unrag)/chunking/
  • Upgrade snapshot fix: packages/unrag/cli/lib/upgrade/snapshot.ts
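
The one-directory-per-chunker layout above suggests a registry keyed by chunker name. A rough sketch of that idea follows, assuming a hypothetical Chunker interface and ChunkerRegistry class; the actual contract in packages/unrag/registry/ may look quite different, and the heading-based markdown splitter is only a toy example.

```typescript
// Hypothetical contract; the real unrag chunker interface may differ.
interface Chunk {
  index: number;
  content: string;
}

interface Chunker {
  name: string;
  chunk(text: string): Chunk[];
}

class ChunkerRegistry {
  private chunkers = new Map<string, Chunker>();

  register(chunker: Chunker): void {
    if (this.chunkers.has(chunker.name)) {
      throw new Error(`Chunker "${chunker.name}" is already registered`);
    }
    this.chunkers.set(chunker.name, chunker);
  }

  resolve(name: string): Chunker {
    const found = this.chunkers.get(name);
    if (!found) {
      const available = Array.from(this.chunkers.keys()).join(", ");
      throw new Error(`Unknown chunker "${name}". Available: ${available}`);
    }
    return found;
  }

  names(): string[] {
    return Array.from(this.chunkers.keys());
  }
}

// Toy markdown chunker: starts a new chunk at every ATX heading line.
const markdownChunker: Chunker = {
  name: "markdown",
  chunk(text) {
    const sections: string[] = [];
    let current: string[] = [];
    for (const line of text.split("\n")) {
      if (/^#{1,6}\s/.test(line) && current.length > 0) {
        sections.push(current.join("\n"));
        current = [];
      }
      current.push(line);
    }
    if (current.length > 0) sections.push(current.join("\n"));
    return sections
      .map((s) => s.trim())
      .filter((s) => s !== "")
      .map((content, index) => ({ index, content }));
  },
};

const registry = new ChunkerRegistry();
registry.register(markdownChunker);
const mdChunks = registry.resolve("markdown").chunk("# A\ntext a\n## B\ntext b");
```

Resolving by name and failing loudly on unknown names is the kind of check a validation command could reuse, though whether doctor works this way is not stated in this PR.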

pingSubhajit and others added 3 commits January 26, 2026 12:34
* feat: build architectural design for advanced chunking methods
* feat: Implement token-based recursive chunking with `js-tiktoken` as the new default, removing word-based chunking and updating chunking configuration and documentation.
* feat: Introduce pluggable chunker plugins including semantic, code, agentic, markdown, and hierarchical methods with CLI support.
* chore: update .gitignore
* lint: fix linter issues
* feat: update chunking docs and add chunkers in install wizard
* lint: fix linter issues
* docs: update agent skills to introduce new chunking methods
* feat: wire chunking in /install wizard & doctor command
* docs: update chunking docs to match the documentation tone
* feat: wire advanced chunkers to installation wizard
* fix: lint and type errors
* feat: add announcement for advanced chunking

---------

Co-authored-by: Subhajit Kundu <subha60kundu@gmail.com>
@pingSubhajit pingSubhajit self-assigned this Jan 26, 2026

vercel bot commented Jan 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| unrag-example-support-ticket-search | Ready | Preview, Comment | Jan 26, 2026 7:19am |
| unrag-web | Ready | Preview, Comment | Jan 26, 2026 7:19am |

@pingSubhajit pingSubhajit merged commit 6e69c5d into main Jan 26, 2026
4 checks passed
@pingSubhajit pingSubhajit deleted the release/v0.3.3 branch January 26, 2026 07:22