
feat(memory): JSONL-backed session persistence#732

Open
is-Xiaoen wants to merge 10 commits into sipeed:main from is-Xiaoen:feat/jsonl-memory-store

Conversation


@is-Xiaoen is-Xiaoen commented Feb 24, 2026

Summary

Implements pkg/memory/ — a new session persistence layer using append-only JSONL files. This is a revised approach based on the feedback from #719, where @yinwm pointed out that JSONL fits this use case better than SQLite.

  • Zero new dependencies — pure stdlib, no go.mod changes
  • Append-only writes — AddMessage is a single file append, no full-file rewrite
  • Logical truncation — TruncateHistory updates a skip offset in .meta.json instead of rewriting the message file
  • Physical compaction — Compact rewrites the JSONL file to reclaim disk space after repeated truncations
  • Agent-readable — .jsonl files work directly with read_file, tail, grep, and agent skills
  • Crash-safe — meta is always written before JSONL rewrites; incomplete lines from interrupted writes are silently skipped; TruncateHistory always reconciles meta.Count against the actual file

File layout

sessions/
├── telegram_123456.jsonl       # one message per line, append-only
├── telegram_123456.meta.json   # summary, skip offset, timestamps

Store interface

The Store interface maps 1:1 to the current SessionManager API. Each method is an atomic operation — no separate Save() call. It returns structured data (messages + summary separately), which keeps the storage layer clean and lets the prompt builder handle provider-specific caching optimizations downstream.

type Store interface {
    AddMessage(ctx context.Context, key, role, content string) error
    AddFullMessage(ctx context.Context, key string, msg providers.Message) error
    GetHistory(ctx context.Context, key string) ([]providers.Message, error)
    GetSummary(ctx context.Context, key string) (string, error)
    SetSummary(ctx context.Context, key, summary string) error
    TruncateHistory(ctx context.Context, key string, keepLast int) error
    SetHistory(ctx context.Context, key string, history []providers.Message) error
    Compact(ctx context.Context, key string) error
    Close() error
}

Crash safety design

The two-file design (.jsonl + .meta.json) introduces crash windows between the two writes. The ordering is carefully chosen to always degrade toward "more data visible" rather than data loss:

| Operation | Crash scenario | Effect | Recovery |
|---|---|---|---|
| addMsg | JSONL written, meta not updated | meta.Count stale by 1 | TruncateHistory always re-counts actual lines |
| SetHistory | Meta written (Skip=0), JSONL not rewritten | Old messages temporarily visible | Next SetHistory corrects |
| Compact | Meta written (Skip=0), JSONL not rewritten | Truncated messages reappear | Next Compact/TruncateHistory corrects |

Concurrency

Session locking uses a fixed [64]sync.Mutex sharded array (FNV hash). Memory is O(1) regardless of total session count — important for a long-running daemon.
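The sharded-lock idea can be sketched as below; this is an illustrative reconstruction from the description (FNV hash mod 64), with names assumed, not the PR's actual code:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// lockPool is a fixed-size sharded mutex pool: memory stays O(1) no
// matter how many distinct session keys the daemon ever sees.
type lockPool struct {
	shards [64]sync.Mutex
}

// shardFor maps a session key to one of the 64 shards via FNV-1a.
// Unrelated keys may collide on a shard, which only serializes their
// operations; it never affects correctness.
func (p *lockPool) shardFor(key string) *sync.Mutex {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &p.shards[h.Sum32()%64]
}

func main() {
	var p lockPool
	mu := p.shardFor("telegram:123")
	mu.Lock()
	// ... critical section for this session's files ...
	mu.Unlock()
	// The mapping is deterministic: same key, same shard.
	fmt.Println(p.shardFor("telegram:123") == p.shardFor("telegram:123")) // true
}
```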

Migration

MigrateFromJSON() reads legacy sessions/*.json files, writes them via SetHistory (atomic replace), and renames originals to .json.migrated. Idempotent — safe to retry after crash.

Why JSONL over SQLite

Per @yinwm's review on #719:

  • JSONL append is already atomic at the OS level — no transaction machinery needed
  • Agent tools can directly read session files, which matters for an AI agent framework
  • Zero external dependencies aligns with the "pico" philosophy
  • Covers 90%+ of use cases; SQLite can be added later behind the same Store interface if complex queries are actually needed

Type of change

  • New feature (non-breaking change which adds functionality)

Test plan

33 tests total (all passing, 0 lint issues):

  • 20 unit tests: basic roundtrip, ordering, empty sessions, tool calls, tool call IDs, summary CRUD, truncation (keep N / keep 0 / keep more than exist / stale meta.Count), SetHistory replace + skip reset, colon-in-key filename mapping, crash recovery with partial lines, persistence across instances, session isolation
  • 3 compaction tests: compact removes skipped lines from disk, no-op when skip=0, append after compact works correctly
  • 2 concurrency tests: 10-goroutine concurrent writes; simulated #704 race ("tool_call_ids did not have response messages", HTTP 400 — summarizer goroutine vs main loop)
  • 3 benchmarks: AddMessage throughput, GetHistory at 100 and 1000 messages
  • 8 migration tests: basic, tool calls, multiple files, invalid JSON skip, rename to .migrated, idempotent, colon-in-key, retry after crash (no duplicates)
$ go test ./pkg/memory/... -v -count=1
ok  	github.com/sipeed/picoclaw/pkg/memory	6.517s

$ golangci-lint run ./pkg/memory/...
0 issues.

Scope

This PR only adds new files under pkg/memory/ — no existing code is modified. Wiring into AgentLoop will be a separate PR.

Closes #711

Introduce a backend-agnostic Store interface in pkg/memory/ that maps
one-to-one with the current SessionManager API. Each method is atomic
— no separate Save() call needed.

Refs sipeed#711
Add JSONLStore that persists sessions as .jsonl files (one message per
line) plus .meta.json for summary and truncation offset.

Key design decisions:
- Append-only writes — no full-file rewrites on AddMessage
- Logical truncation via skip offset instead of physical deletion
- Per-session mutex for safe concurrent access
- Crash recovery: malformed trailing lines are silently skipped
- Atomic metadata writes using temp+rename

Zero new dependencies — pure stdlib.

Refs sipeed#711
Cover all Store interface methods plus edge cases:
- Basic roundtrip, ordering, empty session, tool calls
- Logical truncation (keep last N, keep zero, keep more than exist)
- SetHistory replacing all + resetting skip offset
- Crash recovery with partial JSON lines
- Persistence across store instances
- Concurrent add+read (10 goroutines x 20 msgs)
- Simulated sipeed#704 race (summarizer vs main loop)
- Benchmarks for AddMessage and GetHistory (100/1000 msgs)
Read existing sessions/*.json files, convert to JSONL format, and
rename originals to .json.migrated as backup. The migration is
idempotent — second runs skip already-migrated files.

Session keys are read from JSON content (not filenames) so that
sanitized names like telegram_123 correctly map back to telegram:123.
Address file growth concern from sipeed#711 review: logical truncation via
skip offset is fast but leaves dead lines on disk indefinitely.

Compact() rewrites the JSONL file keeping only active messages, using
the same temp+rename pattern for crash safety. No-op when skip == 0.
The caller (lifecycle manager or agent loop) decides when to trigger
compaction — e.g. when skipped lines exceed active lines.
dir string

mu sync.Mutex
locks map[string]*sync.Mutex
Collaborator


This is usually fine for small tools, but if picoclaw is handling tens of thousands of sessions, it's advisable to consider using an LRU cache to limit the number of locks, or adding a cleanup mechanism to the Close logic. A simpler approach is to directly use sync.Map's LoadOrStore.

Contributor Author


Good call — switched to sync.Map with LoadOrStore. Cleaner and removes the separate mutex entirely. Pushed in 5d73ee2.

// Allow up to 1 MB per line for messages with large content.
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)

for scanner.Scan() {
Collaborator


Currently, GetHistory performs an O(N) scan of the entire JSONL file, which will degrade performance as session files grow. We can achieve O(1) access by using byte offsets.

1. Update the metadata schema — add ActiveOffset to track the byte position of the first valid message:

type sessionMeta struct {
    // ...
    ActiveOffset int64 `json:"active_offset"`
}

2. Update TruncateHistory — calculate the byte offset of the N-th message from the end during truncation.

3. Optimize GetHistory — use f.Seek to jump directly to the active conversation:

// In GetHistory
if meta.ActiveOffset > 0 {
    // Start scanning from here...
    if _, err := f.Seek(meta.ActiveOffset, io.SeekStart); err != nil {
        return nil, err
    }
}

Why: This prevents unnecessary CPU/Memory overhead from parsing discarded JSON lines, especially for long-lived sessions.

Contributor Author


Makes sense. I went with a slightly different approach for now: added a skip parameter to readMessages so GetHistory and Compact both skip the first N lines without unmarshaling them. This avoids the CPU cost on truncated lines while keeping the implementation straightforward.

For the byte-offset Seek approach — I think that's a solid next step if sessions get really long. The trade-off right now is that TruncateHistory stays O(1) (just a metadata write), whereas computing byte offsets during truncation would make it O(N). Combined with Compact bounding the file size, the skip-scan approach should hold up well in practice. Happy to add ActiveOffset later if we see it becoming a bottleneck.

Pushed in 5d73ee2.


// readMessages reads all valid JSON lines from a .jsonl file.
// Malformed trailing lines (e.g. from a crash) are silently skipped.
func readMessages(path string) ([]providers.Message, error) {
Collaborator


Or we could specify an offset as a param here.

Contributor Author


Done — readMessages now takes a skip int parameter. Lines before the offset are scanned but not unmarshaled, saving the JSON parsing overhead. See 5d73ee2.

return nil
}

all, err := readMessages(s.jsonlPath(sessionKey))
Collaborator


We could save the last 20 messages and use seek to skip the front messages.

Contributor Author


Addressed — Compact now passes meta.Skip to readMessages, so the skipped front lines are scanned without unmarshaling. Same commit (5d73ee2).

…dMessages

Address review feedback from @Zhaoyikaiii:

- Replace map[string]*sync.Mutex + separate mu with sync.Map.LoadOrStore
  for simpler, lock-free session lock management.

- Add skip parameter to readMessages so callers (GetHistory, Compact)
  can skip truncated lines without paying the json.Unmarshal cost.

- Add countLines helper for TruncateHistory's count reconciliation,
  avoiding full deserialization when only the line count is needed.
Address feedback from @yinwm for long-running daemon use:

- Replace sync.Map with a fixed-size sharded lock array (64 mutexes).
  Keys are mapped via FNV hash, so memory is O(1) regardless of how
  many sessions are created over the process lifetime.

- Increase scanner buffer cap from 1 MB to 10 MB. Tool results
  (read_file on large files, web search responses) can easily exceed
  1 MB. The scanner still starts at 64 KB and only grows as needed.
Contributor Author

is-Xiaoen commented Feb 26, 2026

sync.Map → sharded lock array: Replaced with a fixed [64]sync.Mutex pool, keys mapped via FNV hash. Memory is O(1) regardless of session count — no growth over the daemon's lifetime.

Scanner buffer 1MB → 10MB: Tool results (read_file on large files, web search dumps, etc.) can exceed 1MB easily. Bumped the cap to 10MB. The scanner still starts at 64KB and grows lazily, so normal messages don't pay for it.

A crash between the JSONL append and the meta update in addMsg can
leave meta.Count stale (e.g. file has 101 lines but meta says 100).
The previous code only reconciled when Count==0, so a nonzero stale
count was silently trusted, causing keepLast/skip to be calculated
against the wrong total.

Now TruncateHistory always counts the actual lines on disk. This is
cheap (scan without unmarshal) and TruncateHistory is not a hot path.
In SetHistory and Compact, the JSONL file was rewritten before updating
the meta file. If the process crashed between the two writes, the meta
still had a large Skip value pointing past the now-shorter JSONL file,
causing GetHistory to return empty — effectively data loss.

Reverse the order: write meta (with Skip=0) first, then rewrite JSONL.
On crash between the two writes, the old uncompacted file is still
intact and GetHistory reads from line 1, returning stale-but-complete
data. The next operation self-corrects.
MigrateFromJSON previously called AddFullMessage in a loop, then
renamed the .json file to .json.migrated. If the process crashed
after appending some messages but before the rename, a retry would
re-read the same .json and append all messages again — duplicating
whatever was written before the crash.

Switch to SetHistory which atomically replaces the session contents.
A retry after crash overwrites the partial data instead of appending.
@is-Xiaoen
Contributor Author

This round focuses on three crash safety fixes:

1. TruncateHistory's reconciliation logic was not robust enough

Previously the file's lines were only counted when meta.Count == 0. But addMsg writes the JSONL and the meta in two separate steps, and a crash in between leaves meta.Count too small (e.g. the file has 101 lines but meta says 100). A nonzero stale count was trusted as-is, so the skip computed from keepLast came out wrong.

Now TruncateHistory reconciles via countLines on every call. It is not a hot path anyway, and countLines only scans lines without unmarshaling.

2. The write order in SetHistory / Compact risked data loss

Previously the JSONL was rewritten first and the meta updated second. If the process crashed in between, the meta still held the old Skip=90 while the JSONL file only had 10 lines left — GetHistory would start reading at line 91 and return empty.

The order is now reversed: write the meta (Skip=0) first, then rewrite the JSONL. The worst case for a crash in between is that the old file is still intact while the meta says Skip=0, so GetHistory returns some already-truncated messages, but no data is lost. The next operation self-corrects.

3. Migration duplicated data on crash retry

Previously migration called AddFullMessage per message and only renamed .json to .json.migrated after everything was written. If the process crashed after writing 50 messages (before the rename), a retry would write all 100 again, ending up with 150.

Switched to atomic replacement via SetHistory, so a retry overwrites the partial data instead of appending.

Each fix is verified by a corresponding test case.


@nikolasdehor nikolasdehor left a comment


Excellent work. The design is well-reasoned, the crash safety analysis is thorough, and the test coverage (33 tests including concurrency and migration edge cases) is impressive. A few observations:

1. Lock shard collision could cause correctness issues with countLines.
FNV hash mod 64 means different session keys can share the same mutex shard. If two unrelated sessions happen to share a shard, their operations serialize correctly (just slower). However, countLines in TruncateHistory counts lines for one session while another session sharing the same shard is blocked -- this is fine because the lock prevents concurrent writes to the same session. The shard collision only causes unnecessary serialization between unrelated sessions. No bug here, just confirming the design works.

2. addMsg has a crash window between JSONL append and meta write.
As documented in the PR, if the process crashes after appending to the JSONL but before updating .meta.json, meta.Count becomes stale. The comment says TruncateHistory always re-counts. But GetHistory uses meta.Skip from the potentially-stale meta -- this is still correct because the skip offset has not changed, and the new message appears at the end. Good. However, note that meta.Count itself is never re-reconciled during AddMessage -- it just increments from the (possibly stale) value. So after a crash + recovery, meta.Count could be permanently off by 1 unless TruncateHistory is called. This is harmless since Count is only used by TruncateHistory (which re-counts), but worth documenting.

3. sanitizeKey is a lossy mapping.
As the test TestMigrateFromJSON_ColonInKey correctly notes, telegram:123 and telegram_123 map to the same file. This means a malicious or misconfigured channel name could collide with another. For a personal assistant framework this is acceptable, but consider adding a comment in sanitizeKey noting this is an intentional tradeoff.

4. rewriteJSONL does not call f.Sync() before rename.
On Linux, os.Rename does not guarantee that the file data is flushed to disk. If the OS crashes (not just the process) between f.Close() and after os.Rename, the file could be zero-length or corrupt. Adding f.Sync() before f.Close() in rewriteJSONL would make it crash-safe against OS crashes too. For a "pico" tool this is probably overkill, but since the PR explicitly discusses crash safety, worth mentioning.

5. No fsync on the JSONL append in addMsg either.
Same concern as above -- f.Write + f.Close() does not guarantee durability against power loss. The message could be lost entirely (not just truncated). Again, probably acceptable for the use case.

6. readMessages silently skips corrupt lines.
This is correct JSONL recovery behavior, but it means a partially-written line (from a crash) is silently dropped. If this happens to be a critical user message, it is lost. Consider logging a warning when a non-empty line fails to unmarshal, so operators know data was lost.

Overall this is one of the cleanest new-package PRs I have seen on this project. The interface design is right, the crash safety reasoning is thorough, and the tests are comprehensive.

Successfully merging this pull request may close these issues:

[Feature] JSONL-backed session persistence with Store interface