Working version with nvidia geforce 1050 ti 4GB #299
Open
ssingh10 wants to merge 10 commits into karpathy:master
Made-with: Cursor
IgorTavcar added a commit to IgorTavcar/autoresearch that referenced this pull request on Mar 17, 2026
…context mgmt, low-VRAM, eval guide

PR karpathy#291: Data integrity verification for downloads
Adds Content-Length size verification and Parquet metadata validation (pq.read_metadata) before committing downloaded shards. Catches truncated or corrupted files from network interruptions before they get sealed with a SHA-256 hash. Layered on top of our existing atomic .tmp rename and SHA-256 sidecar verification.

PR karpathy#282: Bake reflection into the experiment loop
Adds musings.md initialization to setup, plus pre-experiment rationale (step 2: explain the idea and its ML grounding) and post-experiment reflection (step 9: record outcome and interpretation). Leaves a learning trail for humans and may improve agent idea generation quality.

Issue karpathy#298: Subagent delegation for context window preservation
Adds a "Context management" section to program.md with a subagent prompt template. The main agent holds research state; subagents handle mechanical steps (commit, train, extract metrics). Verbose output dies with the subagent, keeping the primary context clean over 50+ experiment runs.

PR karpathy#299: Low-VRAM auto-detection (cherry-picked universal parts)
Adds VRAM detection: GPUs with < 6GB automatically get reduced hyperparameters (batch=32, seq=256, depth=4, SSSL window pattern). Introduces TRAIN_SEQ_LEN variable used throughout model config, dataloader, and evaluation. Also adds seq_len and max_steps optional parameters to evaluate_bpb() for flexible eval on constrained hardware. Skipped: hardware-specific torch/kernels downgrades, 1050 Ti tuning.

PR karpathy#303: Guide for evaluating experiment results at scale
New docs/evaluating-results.md covering noise floor estimation (awk one-liner for median pairwise delta), when to trust an improvement (1.5x noise floor rule), Pareto efficiency analysis, and useful one-liners for results.tsv at scale.

Optional: PR karpathy#276: Deterministic keep/discard policy engine
Standalone contrib/policy_engine.py (60 lines) + test suite (9 tests). Evaluates experiments by val_bpb improvement vs complexity tradeoff. NOT wired into the training loop; available as an optional decision aid. Placed in contrib/ to signal its optional nature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
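
For readers following the low-VRAM part of the commit message above, here is a minimal sketch of what "GPUs with < 6GB automatically get reduced hyperparameters" could look like. It is not the code from either repository. The names `LowVramOverrides`, `select_overrides`, and `detect_total_vram_bytes` are hypothetical, and treating 1 GB as 1024**3 bytes is an assumption. Only the threshold and the four reduced values come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold quoted in the commit message ("GPUs with < 6GB").
LOW_VRAM_THRESHOLD_GB = 6.0


@dataclass(frozen=True)
class LowVramOverrides:
    """Reduced settings listed in the commit message (field names are hypothetical)."""
    device_batch_size: int = 32
    train_seq_len: int = 256
    depth: int = 4
    window_pattern: str = "SSSL"


def select_overrides(total_vram_bytes: Optional[int]) -> Optional[LowVramOverrides]:
    """Return the reduced settings for small GPUs, or None to keep the defaults."""
    if total_vram_bytes is None:
        return None  # no GPU information available; leave defaults untouched
    vram_gb = total_vram_bytes / 1024**3  # assumption: binary gigabytes
    if vram_gb < LOW_VRAM_THRESHOLD_GB:
        return LowVramOverrides()
    return None


def detect_total_vram_bytes() -> Optional[int]:
    """Query total memory of CUDA device 0 via PyTorch, if PyTorch and CUDA are present."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    overrides = select_overrides(detect_total_vram_bytes())
    print(overrides if overrides is not None else "using default hyperparameters")
```

Keeping the decision in a pure function (`select_overrides`) separate from the hardware query makes the threshold logic testable on machines without a GPU.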
PR to get autoresearch working with commodity hardware
commit   val_bpb   memory_gb  status   description
beffca5  1.802948  0.2        keep     baseline
7a9ac45  1.812531  0.4        discard  depth 6 (worse than baseline)
488bb6c  1.709017  0.2        keep     batch 2 (improved gradient signal)
cbf0477  1.644740  0.3        keep     batch 4 (further improvement)
57a782f  1.625506  0.6        keep     batch 8
dcd8e21  1.610689  1.0        keep     batch 16
d1a4f1c  1.581806  1.8        keep     batch 32
e49bf5d  1.569257  1.8        keep     MATRIX_LR 0.05
c0213f7  0.000000  0.0        crash    seq 384 (TOTAL_BATCH_SIZE divisibility)
aedf903  1.576799  1.8        discard  MATRIX_LR 0.06 (worse)
682811e  1.563258  1.8        keep     WARMUP_RATIO 0.1
86e99fc  1.570927  1.8        discard  EMBEDDING_LR 0.8 (worse)
44ac775  1.715129  2.5        discard  depth 5 (worse)
efc797c  1.563657  1.8        discard  WARMUP_RATIO 0.15 (slightly worse)
75d1c1d  1.564486  1.8        discard  WEIGHT_DECAY 0.1 (worse)
cc05e34  1.616739  1.8        discard  TOTAL_BATCH 32K (fewer steps)
b5a12d4  1.566375  1.8        discard  ADAM_BETAS (0.85, 0.95)
b9492b9  1.562771  1.8        keep     WINDOW_PATTERN SSSL
909f61e  1.568223  1.8        discard  WARMUP_RATIO 0.12
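
The c0213f7 crash ("seq 384 (TOTAL_BATCH_SIZE divisibility)") is the kind of failure a pre-flight check can catch before a run starts. The sketch below assumes the training script requires the total batch size in tokens to be an exact multiple of per-step tokens (device batch size times sequence length), which is what the crash description suggests. The function name `grad_accum_steps` and the total of 65536 tokens are hypothetical and chosen only for illustration; the actual TOTAL_BATCH_SIZE used in this PR is not stated here.

```python
def grad_accum_steps(total_batch_tokens: int, device_batch_size: int, seq_len: int) -> int:
    """Return the number of gradient accumulation steps, or raise if it is not a whole number."""
    tokens_per_step = device_batch_size * seq_len
    if total_batch_tokens % tokens_per_step != 0:
        raise ValueError(
            f"total batch of {total_batch_tokens} tokens is not divisible by "
            f"{device_batch_size} x {seq_len} = {tokens_per_step} tokens per step"
        )
    return total_batch_tokens // tokens_per_step


if __name__ == "__main__":
    total = 65536  # hypothetical power-of-two total batch size, in tokens
    print(grad_accum_steps(total, 32, 256))  # 32 * 256 = 8192 divides 65536
    try:
        grad_accum_steps(total, 32, 384)  # 32 * 384 = 12288 = 3 * 4096
    except ValueError as err:
        print(f"rejected: {err}")
```

If the total batch size is a power of two, as assumed here, a sequence length of 384 with batch 32 can never divide it, because 32 x 384 = 12288 contains a factor of 3.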