Skip to content

feat: Add state to the IR Zstd writer.#19

Open
davemarco wants to merge 9 commits intoy-scope:mainfrom
davemarco:stateful
Open

feat: Add state to the IR Zstd writer.#19
davemarco wants to merge 9 commits intoy-scope:mainfrom
davemarco:stateful

Conversation

@davemarco
Copy link
Contributor

@davemarco davemarco commented Feb 2, 2026

Description

This PR adds explicit state to the writer to handle two cases safely: avoiding double-closes during S3 upload retries, and preventing writes after a failure in the uploader thread.

Closed state

This PR includes the “closed” state change from #8. Since this PR is already introducing writer state logic, it made sense to bring that change into this PR.

The closed state prevents double-closing an IR stream during S3 upload retries (e.g., when the first upload attempt fails).

Corrupted state

This PR adds a “corrupted” state to remove the panics introduced in #8, which were brought up as undesirable during the review.

Those panics were originally added as a workaround for a synchronization issue: the uploader thread could fail while closing or resetting the stream, and the main thread might later try to write without realizing the writer was no longer usable.

Now, if the uploader thread encounters an error, it sets the writer to corrupted under a mutex. Any later write from the main thread returns an error instead of proceeding or panicking. This allows us to remove panic calls that would previously bring down the entire program, and instead fail safely at the writer level. I considered other designs to remove the panics, but I think this is a clean approach.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

TBU

Summary by CodeRabbit

New Features

  • Added ability to query the current state of writers (Open, Closed, or Corrupted).
  • Introduced new method to directly access memory writer output.

Bug Fixes

  • Strengthened state validation to prevent invalid operations on writers in improper states.
  • Enhanced error handling across all operations to properly mark writers as corrupted when failures occur.

@coderabbitai
Copy link

coderabbitai bot commented Feb 2, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@davemarco davemarco changed the title latest feat: Add state to the IR Zstd writer. Feb 2, 2026
@davemarco davemarco marked this pull request as ready for review February 20, 2026 03:40
@davemarco davemarco requested a review from a team as a code owner February 20, 2026 03:40
@davemarco davemarco marked this pull request as draft February 20, 2026 03:43
@davemarco davemarco marked this pull request as ready for review February 20, 2026 03:55
@davemarco davemarco requested a review from davidlion February 20, 2026 03:55
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/irzstd/disk.go (1)

179-197: ⚠️ Potential issue | 🟠 Major

Mark writer Corrupted if IR serializer close fails

If irWriter.Serializer.Close() fails, the writer remains Open, which can allow further writes on a partially closed stream. Marking it Corrupted prevents unsafe reuse.

🐛 Proposed fix
 	if w.irWriter != nil {
 		err := w.irWriter.Serializer.Close()
 		if err != nil {
+			w.state = Corrupted
 			return fmt.Errorf("error could not close irWriter: %w", err)
 		}
 		w.irWriter = nil
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/irzstd/disk.go` around lines 179 - 197, The Close routine currently
leaves w.state as Open if w.irWriter.Serializer.Close() returns an error,
allowing further unsafe writes; update Close (around the call to
w.irWriter.Serializer.Close() in disk.go) to set w.state = Corrupted before
returning the error so the writer cannot be reused, and still nil out w.irWriter
after handling the error; ensure this change touches the block that calls
w.irWriter.Serializer.Close(), and keep existing error wrapping (errorf) and the
prior flushIrBuffer() behavior intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/irzstd/disk.go`:
- Around line 146-150: WriteIrZstd currently checks w.state == Open but can
still dereference a nil w.irWriter if RecoverWriter set irWriter to nil; add an
explicit nil guard at the start of WriteIrZstd that returns an error (like
fmt.Errorf("cannot write: irWriter is nil, call CloseStreams or Reset")) when
w.irWriter == nil, so callers cannot panic; reference the WriteIrZstd method,
the RecoverWriter behavior, and the CloseStreams()/Reset() helpers when adding
this guard.

In `@internal/irzstd/state.go`:
- Around line 15-26: The String() method for WriterState uses
writerStateNames[s] which returns an empty string for unknown values; update
WriterState.String to perform a safe map lookup (e.g., value, ok :=
writerStateNames[s]) and return the mapped name when ok is true, otherwise
return a clear fallback like "WriterState(<value>)" or "UnknownWriterState"
including the numeric s to aid diagnostics while preserving behavior for known
states.

In `@internal/irzstd/writer.go`:
- Around line 59-63: The GetState() method documentation refers to the state as
"Closed" but the WriterState enum uses the identifier StreamsClosed; update the
comment block above GetState() to use the exact enum wording (StreamsClosed)
and, if helpful, briefly map human-readable text to the enum (e.g.,
"StreamsClosed (all streams closed)") so docs and the WriterState enum remain
consistent; locate the GetState() declaration and adjust its comment
accordingly.

---

Outside diff comments:
In `@internal/irzstd/disk.go`:
- Around line 179-197: The Close routine currently leaves w.state as Open if
w.irWriter.Serializer.Close() returns an error, allowing further unsafe writes;
update Close (around the call to w.irWriter.Serializer.Close() in disk.go) to
set w.state = Corrupted before returning the error so the writer cannot be
reused, and still nil out w.irWriter after handling the error; ensure this
change touches the block that calls w.irWriter.Serializer.Close(), and keep
existing error wrapping (errorf) and the prior flushIrBuffer() behavior intact.

davemarco and others added 4 commits February 26, 2026 14:09
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant