perf(bpe): compress vocabulary contents to reduce space usage and improve locality #7
# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.
# rust-clippy is a tool that runs a bunch of lints to catch common
# mistakes in your Rust code and help improve your Rust code.
# More details at https://github.com/rust-lang/rust-clippy
# and https://rust-lang.github.io/rust-clippy/

name: CI

on:
  pull_request:
  push:
    paths-ignore:
      - '**.md'
      - 'LICENSE'

jobs:
  rust-clippy-analyze:
    name: Run rust-clippy analyzing
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Check format
        run: cargo fmt --check

      - name: Download tokenizer.model
        run: wget https://huggingface.co/TinyLlama/TinyLlama_v1.1/resolve/main/tokenizer.model
        # run on windows: wget -Uri https://huggingface.co/TinyLlama/TinyLlama_v1.1/resolve/main/tokenizer.model -OutFile tokenizer.model

      - name: Run test
        run: cargo test

      - name: Install required cargo
        run: cargo install clippy-sarif sarif-fmt

      - name: Run rust-clippy
        run:
          cargo clippy
          --all-features
          --message-format=json | clippy-sarif | tee rust-clippy-results.sarif | sarif-fmt
        continue-on-error: true

      - name: Upload analysis results to GitHub
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: rust-clippy-results.sarif
          wait-for-processing: true