Merged
**CLAUDE.md** (3 changes: 1 addition & 2 deletions)

````diff
@@ -96,9 +96,8 @@ The tokenizer project is organized into modular packages with clear separation o
 - `tokens/` - Special token handling
 
 4. **llama3/cmd/llama3/** - Llama3-specific CLI commands
-- `encode.go` - Text encoding command
+- `encode.go` - Text encoding command (with memory-efficient streaming for stdin)
 - `decode.go` - Token decoding command
-- `stream.go` - Streaming tokenization command
 - `info.go` - Tokenizer information command
 
 ### Key Architectural Decisions
````
**cmd/tokenizer/README.md** (21 changes: 7 additions & 14 deletions)

````diff
@@ -78,27 +78,21 @@ echo "128000 9906 11 1917 0 128001" | tokenizer llama3 decode
 # Round-trip encoding and decoding
 tokenizer llama3 "test" | tokenizer llama3 decode
 
-# Stream large files (automatic)
+# Process large files efficiently (automatic memory-efficient streaming)
 cat large_file.txt | tokenizer llama3
-
-# Stream large files (explicit)
-cat large_file.txt | tokenizer llama3 stream
 ```
 
-### Streaming Mode
+### Processing Large Files
 
-For processing large files or real-time input:
+The tokenizer automatically uses memory-efficient streaming when processing piped input:
 
 ```bash
-# Automatic streaming (detects piped input)
+# Process large files with O(1) memory usage
 tokenizer llama3 < input.txt
 cat large_file.txt | tokenizer llama3
 
-# Explicit streaming with options
-tokenizer llama3 stream --buffer-size=8192 --max-buffer=2097152 < large_file.txt
-
-# Stream without special tokens
-tokenizer llama3 stream --bos=false --eos=false < input.txt
+# Process without special tokens
+tokenizer llama3 --bos=false --eos=false < input.txt
 ```
 
 ## Available Tokenizers
@@ -108,9 +102,8 @@
 Meta's Llama 3 tokenizer with 128,256 tokens (128,000 regular + 256 special tokens).
 
 **Commands:**
-- `encode` - Convert text to token IDs
+- `encode` - Convert text to token IDs (memory-efficient for stdin)
 - `decode` - Convert token IDs to text
-- `stream` - Process text in streaming mode
 - `info` - Display tokenizer information
 
 ## Examples
````
**llama3/IMPLEMENTATION.md** (4 changes: 2 additions & 2 deletions)

````diff
@@ -204,7 +204,7 @@ type Scanner interface {
 }
 ```
 
-Create a scanner with `tokenizer.NewScanner(reader)` or `NewScannerOptions` for custom configuration.
+Create a scanner with `tokenizer.NewScanner(reader, opts...)`, passing options for custom configuration.
 
 ### Pipeline Interfaces
 
@@ -285,7 +285,7 @@ if err := scanner.Err(); err != nil {
 
 Custom buffer configuration:
 ```go
-scanner := tokenizer.NewScannerOptions(reader,
+scanner := tokenizer.NewScanner(reader,
 	llama3.WithBufferSize(8192),
 	llama3.WithMaxBuffer(1024*1024),
 	llama3.WithEncodeOptions(&llama3.EncodeOptions{
````