Releases · KRLabsOrg/squeez
v0.1.3
Fixed
- Model-agnostic inference: replaced the hardcoded Qwen ChatML template with `tokenizer.apply_chat_template()`; the local transformers backend now works with any model family
- vLLM model name bug: the server model name from the env var was not being passed through to API calls, causing 400 errors on providers like Groq
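The model-agnostic fix can be sketched as follows. This is a minimal illustration, not squeez's actual code: `build_prompt` is a hypothetical helper, while `tokenizer.apply_chat_template()` is the real Hugging Face transformers API that replaces the hardcoded ChatML template.

```python
def build_prompt(tokenizer, system: str, user: str) -> str:
    """Render chat messages with the tokenizer's own chat template.

    Any model family works, because the template ships with the
    tokenizer instead of being hardcoded to Qwen's ChatML format.
    """
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
    # add_generation_prompt=True appends the assistant header so the
    # model continues with its answer; tokenize=False returns a string.
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
```

Because the template comes from the tokenizer, swapping in a Llama or Mistral checkpoint needs no prompt-format changes.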
Added
- Pooled line classifier backend: new `pooled` backend for sentence-level classification
- LoRA auto-detection: the transformers backend auto-detects and loads LoRA/PEFT checkpoints
- Batch extraction: `extract_many()` with concurrent requests for remote backends
- Encoder model: token-level line classification with mmBERT
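Only the name `extract_many()` comes from the release notes; the sketch below is an assumption about how concurrent requests to a remote backend might be issued, not squeez's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_many(extract, queries, outputs, max_workers=8):
    """Run one extraction per (query, tool_output) pair concurrently.

    `extract` is any callable that hits a remote backend; threads fit
    here because the work is network-bound, not CPU-bound.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(extract, q, o) for q, o in zip(queries, outputs)]
        # Collecting results in submission order keeps outputs aligned
        # with inputs even if requests finish out of order.
        return [f.result() for f in futures]
```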
Changed
- Inference refactor: shared `_build_messages()` is now used by both the vLLM and transformers backends
- Query tag: renamed `<task>` to `<query>` in prompt formatting
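The tag rename amounts to a change in how the prompt is assembled; a minimal sketch, in which only the `<query>` tag is documented and the `<tool_output>` wrapper is a hypothetical placeholder:

```python
def format_prompt(query: str, tool_output: str) -> str:
    """Wrap the user's query in <query> tags (renamed from <task> in v0.1.3)."""
    # <tool_output> is a hypothetical placeholder tag for illustration;
    # only the <task> -> <query> rename is stated in the release notes.
    return f"<query>{query}</query>\n<tool_output>{tool_output}</tool_output>"
```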
Full Changelog: v0.1.2...v0.1.3
PyPI: https://pypi.org/project/squeez/0.1.3/
v0.1.2
What's Changed
Fixed
- Chat template: switched from custom tokens to the Qwen ChatML format (`<|im_start|>`/`<|im_end|>`), which is critical for correct fine-tuning and inference
- Evaluation metrics: replaced line-number-based metrics with span-level exact match, precision/recall/F1, partial overlap, and empty accuracy
- Dataset metadata: recomputed all metadata fields from actual response content
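The span-level metrics can be sketched as below. The `(start, end)` line-span representation is an assumption, and this sketch covers only exact-span matching for precision/recall/F1; the partial-overlap and empty-accuracy metrics are omitted.

```python
def span_metrics(pred, gold):
    """Span-level exact match plus precision/recall/F1.

    pred and gold are sets of (start, end) line spans. Exact match is
    1.0 only when both sets agree entirely; P/R/F1 count spans that
    match exactly (true positives).
    """
    pred, gold = set(pred), set(gold)
    if not pred and not gold:
        # Both empty: the model correctly extracted nothing.
        return {"exact": 1.0, "precision": 1.0, "recall": 1.0, "f1": 1.0}
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"exact": float(pred == gold), "precision": precision,
            "recall": recall, "f1": f1}
```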
Changed
- Training: added Unsloth support for memory-efficient LoRA training on Qwen 3.5
- Training config: tuned for A100 80GB (effective batch size 32, max_length 16384)
- Data splits: `download_data.py` now creates train/dev/test splits (dev is used for checkpoint selection)
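A three-way split like this is usually done deterministically so reruns reproduce the same partition. A minimal sketch; the fractions, seed, and function name are assumptions, not what `download_data.py` actually does.

```python
import random


def make_splits(samples, dev_frac=0.05, test_frac=0.05, seed=42):
    """Deterministic train/dev/test split; dev is for checkpoint selection."""
    rng = random.Random(seed)  # fixed seed -> reproducible shuffle
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_frac)
    n_test = int(len(shuffled) * test_frac)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test
```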
Added
- Data quality: manually reviewed the test split (55/436 samples corrected) and curated tracebacks on the train split (123/7,148 corrected)
- Documentation: new data quality page documenting the full QA pipeline
Full Changelog: v0.1.1...v0.1.2
v0.1.1
What's Changed
- Optional heavy dependencies: `torch`, `transformers`, `peft`, and `datasets` are no longer required for API-only usage
  - `pip install squeez`: lightweight, just `openai` + `pyyaml` (for vLLM/Groq/OpenAI-compatible servers)
  - `pip install squeez[local]`: adds `torch`, `transformers`, `peft` for local inference
  - `pip install squeez[train]`: adds `datasets` for training
  - `pip install squeez[all]`: everything
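Extras like these are declared under `[project.optional-dependencies]` in `pyproject.toml`. The fragment below is a sketch of what such a layout could look like, not squeez's actual packaging file:

```toml
[project]
name = "squeez"
# Always installed: enough for API-only usage against remote servers.
dependencies = ["openai", "pyyaml"]

[project.optional-dependencies]
local = ["torch", "transformers", "peft"]
train = ["datasets"]
all = ["torch", "transformers", "peft", "datasets"]
```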
Full Changelog: v0.1.0...v0.1.1
v0.1.0 - Initial Release
First release of squeez — squeeze verbose LLM agent tool output down to only the relevant lines.
Highlights
- Two inference backends: local transformers and vLLM/OpenAI-compatible server (Groq, etc.)
- CLI + Python API: `cat output.txt | squeez "Fix the bug"`, or use `ToolOutputExtractor` directly
- LoRA fine-tuning: train on the SWE-bench tool output dataset (7,148 train / 436 eval samples)
- Flexible configuration: CLI args > env vars > config file (`squeez.yaml`)
- Subcommands: `squeez extract`, `squeez train`, `squeez eval`, `squeez pipeline`
- Claude Code integration: use as a tool in your CLAUDE.md
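The configuration precedence (CLI args > env vars > config file) can be sketched as below; the function name, env-var prefix, and dict-based interfaces are hypothetical, not squeez's actual internals.

```python
import os


def resolve_setting(name, cli_args, config, env=os.environ, prefix="SQUEEZ_"):
    """Return the first value found: CLI flag, then env var, then squeez.yaml.

    cli_args and config are plain dicts here for illustration; a None
    CLI value means the flag was not passed on the command line.
    """
    if cli_args.get(name) is not None:
        return cli_args[name]
    env_key = prefix + name.upper()  # e.g. "model" -> "SQUEEZ_MODEL"
    if env_key in env:
        return env[env_key]
    return config.get(name)
```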
Install

`pip install squeez`