Skip to content

Latest commit

 

History

History
67 lines (57 loc) · 9.36 KB

File metadata and controls

67 lines (57 loc) · 9.36 KB

AGENTS

This file helps AI agents discover and understand how to work with this repository.

Discovery

  • Primary entry points: README.md, include/, src/, and tests/ describe the architecture and entry points for this library. Use rg to locate interesting symbols before jumping into implementation.
  • Python bindings: The new python/ directory holds the pybind11-based module and tests/python/test_bindings.py exercises it; toggle T81LIB_BUILD_PYTHON_BINDINGS when configuring CMake to build the module.
  • Build tooling: The project uses CMake. Inspect CMakeLists.txt and related files in cmake/ or docs/ for build and test instructions before making changes.

Agent guidelines

  • Follow the existing coding style in include/t81/core/ and use ASCII-only edits unless a file already includes other Unicode characters.
  • Prefer rg for searching and avoid destructive operations (git reset --hard, etc.).
  • Respect non-AI manual edits in the working tree; do not revert unless asked.

Suggested workflow

  1. Run any relevant unit tests in tests/unit/ via CTest or the provided scripts whenever you touch critical paths to verify behavior.
  2. Document significant algorithm changes in docs/ or README.md as appropriate.
  3. Mention new files or important updates back in this file so future agents can find your work quickly.

Recent updates

  • Reworked the top-level CMakeLists.txt, rewrote run-tests.sh to execute configure/build/tests, and reordered tests/unit/test_limb_basic.cpp so t81/t81lib.hpp is included before the SIMD helpers to keep limb defined.
  • Balanced ternary bigint logic in include/t81/core/bigint.hpp now normalizes signed limbs more efficiently and fixes ~/division helpers so later agents can spot the modern bitwise/division flow.
  • tests/unit/test_numeric_types.cpp now exercises Complex, Polynomial, and F2m helpers so the umbrella numeric helpers stay locked down.
  • README.md now documents the high-level helpers (Float, Ratio, Complex, Polynomial, F2m, Fixed<N>, Modulus, and MontgomeryInt) plus the t81::Int alias exposed through t81/t81lib.hpp.
  • include/t81/t81lib.hpp now exposes Float::from_string, a RatioFloat conversion, the Int81 Fixed<48> alias, and std::hash hooks for limb/bigint so hashing and string-based floats land in the umbrella header.
  • README.md plus the umbrella header now document the FloatN template, ternary _t3 literal, R3 NTT helpers, and std::formatter specializations so overlined ternary floats behave nicely in std::format, and t81::Vector provides a ready-to-use coefficient container with arithmetic helpers.
  • README.md/umbrella header now mention t81::Matrix<Element, R, C> and how it complements Vector for linear algebra over FloatN/Fixed scalars.
  • F2m now lives in include/t81/gf2m.hpp (still re-exported through t81/t81lib.hpp), and Fixed<N> gained balanced / and % helpers so division/magnitude math stays accessible in the umbrella header.
  • Added t81::linalg::gemm_ternary and the Python binding t81lib.gemm_ternary so packed ternary GEMMs with alpha/beta semantics are exposed across the C++/Python API surface.
  • Documented the t81.torch/t81.nn PyTorch helpers in README.md and docs/index.md, pointing to the examples/demo_llama_conversion.py, examples/scaling_laws_ternary.py, and examples/ternary_sparse_preview.py demos so future agents can locate the torch bridge.
  • Added production-ready Python bindings (python/bindings.cpp) plus packaging helpers (setup.py, pyproject.toml) that expose Limb/BigInt helpers, Montgomery contexts, NumPy quantization utilities, and a tutorial notebook examples/ternary_quantization_demo.ipynb.
  • Added t81.hardware.TernaryEmulator, documentation for hardware simulation, and examples/ternary_hardware_sim_demo.ipynb so agents can explore ternary gate/circuit modeling, fuzzy AI decisions, and power-aware PyTorch inference workflows.
  • Added docs/references/cli-usage.md (linked from docs/index.md) to cover t81-convert, t81-gguf, and t81-qat usage with the CPU/offloading tips we surfaced for low-memory Apple Silicon.
  • Added a unified t81 console script that exposes convert/gguf subcommands while preserving the legacy t81-convert/t81-gguf wrappers, plus updated docs/tests to reference the new entry point.
  • Added docs/diagrams/cli-workflows-mermaid.md to visualize the t81-convert, t81-gguf, and t81-qat workflows for future contributors looking at the CLI surface.
  • Extended examples/ternary_qat_inference_comparison.py so it now runs train + validation loops, logs compression ratios + per-step losses, and correlates the ternary threshold history with measured GEMM latencies.
  • Added scripts/quantize_measure.py, which chains t81-convertAutoModel.from_pretrained_t81 → latency/compression stats so you can automate quantize→measure in other pipelines.
  • Added docs/references/hardware-emulation.md to explain how t81.hardware.TernaryEmulator, the Python quantization helpers, and the CLI automation fit together for energy-aware AI reasoning.
  • Added scripts/quantize_energy_benchmark.py to orchestrate quantize→latency+energy benchmarks, logging compression, timing, and emulator energy stats into CSV/JSON outputs for reuse in reports.
  • Added examples/quantization_config_report.py so you can sweep synthetic datasets (dims, thresholds, sizes) and capture accuracy, latency, and storage comparisons for multi-module configs before quantizing real models.
  • Added t81/cli_validator.py plus a --validate flag for t81-convert/t81-gguf so the CLI reruns gguf.read_gguf (and llama.cpp’s gguf_validate/gguf_to_gguf when available) to ensure exported GGUF bundles stay compatible before a run returns success.
  • Added t81/cli_progress.py plus progress logging to t81-convert, t81-gguf, and t81-qat so the CLIs print bar/percentage updates while converting, exporting, or fine-tuning checkpoints.
  • Documented the automation scripts (scripts/quantize_measure.py, scripts/quantize_energy_benchmark.py) plus the CLI telemetry/progress experience so future agents can quickly measure quantization impact, latency, and hardware energy from the console.
  • Added examples/cli-examples.md with ready-to-copy CLI snippets showing conversion, GGUF export, and QAT flows for the three helpers.
  • Updated README.md to highlight the CLI docs/diagrams/examples so newcomers can find the new references through the main overview.
  • Added docs/ROADMAP.md to capture an executive summary, analysis, and next-step recommendations for steering t81lib toward wider adoption and smoother contributions.
  • Added mkdocs.yml, docs/python-api.md, and docs/python-cookbook.md so MkDocs + mkdocstrings can publish the Python API reference and cookbook, and linked them from docs/index.md.
  • Expanded python/t81/__init__.py so the higher-level t81 package re-exports the compiled binding helpers (t81lib, BigInt, Limb, gemm_ternary, etc.) while staying import-safe when the extension is unavailable.
  • Added scripts/ternary_quantization_benchmark.py plus BENCHMARKS.md so contributors can reproduce a Fashion-MNIST FP32/PTQ/QAT benchmark and log accuracy/latency/storage for each mode; README now links the benchmark doc.
  • Rewrote pyproject.toml with valid TOML sections so editable installs (and pip install -e '.[torch]') can parse the metadata cleanly before building the extension.
  • Restructured README.md into a onboarding-focused front door and added companion docs (docs/use-cases.md, docs/hardware.md, docs/api-overview.md, docs/python-install.md, docs/torch.md, docs/gpu.md, examples/README.md) so heavy reference material lives outside the visitor-facing overview.
  • Added optional CUDA/ROCm toggles plus a GPU dispatcher sketch (include/t81/linalg/gemm_gpu.hpp, src/linalg/{gemm_cuda.cu,gemm_dispatch.cpp,gemm_rocm.cpp}) so future teams can wire the new where/clamp/lerp/addcmul helpers into GPU kernels, introduced t81::TensorMetadata + Python helpers (python/bindings.cpp) that extract metadata from NumPy/Torch tensors, and expanded tests/python/test_gpu_ops.py to cover the metadata-backed bindings on both CPU and GPU paths.
  • Enhanced tests/python/test_gguf.py with quant-parameterized round-trip checks, metadata assertions, and a regression case for invalid quant identifiers to spotlight the GGUF helpers before future agents touch them.
  • Hardened the SIMD detection helpers in include/t81/core/detail/simd.hpp with CPUID/xgetbv fallbacks, documented the add_trytes_* overflow semantics, and made NEON runtime checks opt-out via T81_DISABLE_NEON.
  • Added the compression-first GGUF export profile (metadata + CLI flags), plus scripts/gguf_benchmark.py and CLI docs that walk FP16 to ternary GGUF before/after measurements.
  • Added examples/ternary_phi3_ptq_qat_demo.ipynb to showcase Phi-3-mini PTQ/QAT size, latency, and perplexity comparisons in one compact notebook.
  • Added Metal pack/quantize kernels (src/linalg/pack_kernel.metal, src/linalg/pack_metal.mm) plus include/t81/linalg/pack_gpu.hpp and Python binding dispatch so PTQ packing can run on Apple Metal when enabled.
  • Documented GGUF helper APIs (read_gguf, repack_gguf, dequantize_gguf) plus the experimental TQ1_1 note in the GGUF and Python docs.