Open
70 commits
6e2bd3e
fix: transformers engine was patched
arm-diaz Jun 4, 2025
eef633a
refactor: notebook rename model to llama 3 8b
arm-diaz Jun 4, 2025
57deb0f
commit updated fine-tune artifacts
nithiyn Jun 5, 2025
e7822d7
Update README.md and pin build
nithiyn Jun 5, 2025
6b5a6cc
Update README.md
nithiyn Jun 5, 2025
bc1c0dc
refactor: CLI & bash scripting
arm-diaz Jun 6, 2025
a038daa
Delete logs/benchmarks directory
arm-diaz Jun 6, 2025
e82a194
fix: gitignore - logging
arm-diaz Jun 6, 2025
8995087
commit minor updates to Readme and shell
nithiyn Jun 6, 2025
298418f
docs: add comments
arm-diaz Jun 6, 2025
79a1a99
Merge branch 'agents' of https://github.com/arm-diaz/nki-llama into a…
nithiyn Jun 7, 2025
56dae34
feat: calculate score for finetuning
arm-diaz Jun 10, 2025
9fc4afd
Merge branch 'agents' of https://github.com/arm-diaz/nki-llama into a…
arm-diaz Jun 10, 2025
c36c14d
docs: update docs
arm-diaz Jun 10, 2025
c04d6cd
chore: bump vllm to upstream and update configs
nithiyn Jun 10, 2025
99bc3ba
fix:update req.txt path
nithiyn Jun 10, 2025
20067f2
fix: handle cache during inference and improve readme
arm-diaz Jun 11, 2025
1f7a1b3
feat: improve CLI
arm-diaz Jun 11, 2025
9b5f858
fix: commit check for tf<4.50
nithiyn Jun 11, 2025
416fb3d
Merge branch 'agents' of https://github.com/arm-diaz/nki-llama into a…
nithiyn Jun 11, 2025
86dac49
feat: reasoning bench scripts
nithiyn Jun 11, 2025
3c877c4
feat: nki scores & llama.py
arm-diaz Jun 12, 2025
4803f6f
fix: rename function
arm-diaz Jun 12, 2025
1a7bc1b
add step to check and downgrade transformers
nithiyn Jun 13, 2025
33105d7
commit config update
nithiyn Jun 13, 2025
7dea685
docs: improve docs and workflow
arm-diaz Jun 13, 2025
e3d4b8f
Merge branch 'agents' of https://github.com/arm-diaz/nki-llama into a…
arm-diaz Jun 13, 2025
e1f9413
fix: docs
arm-diaz Jun 13, 2025
fb5a898
fix: vllm reinstall rem, check tf and add env vars
nithiyn Jun 13, 2025
d3c20a2
Merge branch 'agents' of https://github.com/arm-diaz/nki-llama into a…
nithiyn Jun 13, 2025
256f942
fix: set chat to false for base model
nithiyn Jun 13, 2025
d5d619d
refactor: clean up env variables
arm-diaz Jun 14, 2025
45db233
fix: commit updates for env var mapping and docs
nithiyn Jun 15, 2025
93f5041
Update nki-llama.sh
nithiyn Jun 16, 2025
1fa2e86
commit vllm updates
nithiyn Jun 17, 2025
d803b1e
fix: vllm path
nithiyn Jun 17, 2025
2a1d6c3
fix: uninstall prev vllm wheels
nithiyn Jun 17, 2025
ccd9f60
fix: use neuron fork
nithiyn Jun 17, 2025
49715f1
fix: downgrade tf to 4.48.8
nithiyn Jun 17, 2025
b554084
add dir name
nithiyn Jun 17, 2025
f1fc5a0
test with 2.22
nithiyn Jun 18, 2025
3f868d9
fix: disable reasoning parser for vllm 2.22 compatibility
nithiyn Jun 19, 2025
b28f87b
fix:reasoning bench standalone script
nithiyn Jun 19, 2025
21de053
feat: reasoning bench datasets update
nithiyn Jun 23, 2025
44dcb30
chore:update reasoning bench doc
nithiyn Jun 23, 2025
ba462f8
chore: fix docs
nithiyn Jun 23, 2025
57a22f6
Update reasoning-score-guide.md
nithiyn Jun 23, 2025
ba4a97a
Added Lora Merge Script
Jun 25, 2025
fb65ff9
chore: update docs and env
nithiyn Jun 26, 2025
ca9ef49
fix: nki score normalization
nithiyn Jun 26, 2025
5c5af13
feat: updated handler
nithiyn Jun 27, 2025
10ac55f
Added Neuron Profile Test
Jul 3, 2025
17cbb97
docs: add documentation for each path
arm-diaz Jul 9, 2025
8e8fad2
docs: add documentation for each path
arm-diaz Jul 9, 2025
e0286b4
docs: add documentation for each path
arm-diaz Jul 9, 2025
473f961
docs: add documentation for each path
arm-diaz Jul 9, 2025
4767568
docs: add documentation for each path
arm-diaz Jul 9, 2025
cf3b8e0
Self-Attention Path Updated
Arhamama-AMZ Jul 10, 2025
0b75910
docs: add hyperlink for deployment.yaml
arm-diaz Jul 11, 2025
14f9ec3
docs: add hyperlink for deployment.yaml
arm-diaz Jul 11, 2025
7a2c19f
feat: Llama 1B/8B Implementation
Jul 11, 2025
b0444fa
docs: delete unecessary content
arm-diaz Jul 11, 2025
62d7ecf
feat: self-attention score implementation
Arhamama-AMZ Jul 15, 2025
6d1962e
feat: Doc implementation and script integration for self attention
Arhamama-AMZ Jul 15, 2025
87d9fc1
docs: improve docs & deployment
arm-diaz Jul 22, 2025
b1923ba
fix: add a fixed release for the dependecies with neuron
arm-diaz Jul 22, 2025
aec05d1
fix: inference benchmark - missing env variable for model IDs
Jul 23, 2025
70cc058
docs: improve error handling & documentation
Jul 23, 2025
9b280fc
docs: add support for trn1.2xlarge in cloudformation
Jul 23, 2025
5b9316f
fix: env variables for nki status
Jul 24, 2025
31 changes: 21 additions & 10 deletions .env.example
@@ -1,13 +1,24 @@
-# Model configuration
-## HuggingFace Model ID (https://huggingface.co/meta-llama/Meta-Llama-3-8B)
+# Example environment file for NKI-LLAMA
+# Copy this to .env and update with your values
+
+# Hugging Face Configuration
+HF_TOKEN=your_huggingface_token_here
 MODEL_ID=meta-llama/Meta-Llama-3-8B
-## Short name for model ID
-MODEL_NAME=meta-llama-3-8b
+MODEL_NAME=llama-3-8b
+
+# Inference Configuration
+INFERENCE_PORT=8080
+MAX_MODEL_LEN=8192 # used by vllm- ensure it is the same as seq len
+SEQ_LEN=8192 #used by main.py
+
+MAX_NUM_SEQS=4
+TENSOR_PARALLEL_SIZE=8
+
+# Dataset Configuration
+DATASET_NAME=databricks/databricks-dolly-15k

-# Server configurations
-PORT=8080
-MAX_MODEL_LEN=2048
-TENSOR_PARALLEL_SIZE=32
+# Neuron Configuration
+NEURON_RT_NUM_CORES=8

-# HuggingFace token for downloading models
-HF_TOKEN=your_token_here
+# Jupyter Configuration
+JUPYTER_PORT=8888
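Review note: the new `.env.example` asks that `MAX_MODEL_LEN` match `SEQ_LEN`, but nothing in this diff enforces it. A minimal sketch of such a check follows. The helper names `parse_env` and `check_lengths` are hypothetical and not part of this PR, and stripping inline comments at the first `#` is a simplifying assumption about how the project reads the file (it would truncate any value that itself contains `#`).

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: anything after the first '#' is an inline comment.
        value = value.split("#", 1)[0].strip()
        values[key.strip()] = value
    return values


def check_lengths(env: dict[str, str]) -> None:
    """Raise ValueError if MAX_MODEL_LEN and SEQ_LEN disagree."""
    max_len = int(env["MAX_MODEL_LEN"])
    seq_len = int(env["SEQ_LEN"])
    if max_len != seq_len:
        raise ValueError(
            f"MAX_MODEL_LEN={max_len} does not match SEQ_LEN={seq_len}"
        )


# The two lines below are copied from the new .env.example in this diff.
SAMPLE = """\
MAX_MODEL_LEN=8192 # used by vllm- ensure it is the same as seq len
SEQ_LEN=8192 #used by main.py
"""

env = parse_env(SAMPLE)
check_lengths(env)  # passes silently because both values are 8192
```

A check of this kind could run before the server starts, so a mismatch fails early instead of surfacing during inference.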
10 changes: 10 additions & 0 deletions .gitignore
@@ -267,6 +267,16 @@ test/inference/output
**/neuronxcc-*
global_metric_store.json
benchmark_report.json
benchmark_inference.json
cached_requirements.txt
benchmark_finetuning.json
benchmark_results.json
**/logs/
compiled_merged_model/
compiled_model/
merged_model/
src/self-attention/config
requirements.txt.**
model_env.sh

# End of https://www.toptal.com/developers/gitignore/api/macos,windows,linux,jupyternotebooks,python
127 changes: 0 additions & 127 deletions Makefile

This file was deleted.