feat: support downloading multiple model files #3216

umialpha · 2024-09-28T19:47:12Z

Changes

Add two more more fields for ModelInfo

pub struct ModelInfo {
pub urls_sha256: Option<Vec<String>>,
pub entrypoint: String,
}

urls and urls_sha256 are for multiple files besides the original use(find the download host)
entrypoint is for the entrypoint of the models. it is set to "model.gguf" for legacy case, if there are multiple parts of model, entrypoint must be set.

If urls are set, the filename are inferenced by the function tryget_download_filename

example:

{
    "license_name": "Apache 2.0",
    "license_url": "https://choosealicense.com/licenses/apache-2.0/",
    "prompt_template": "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>",
    "name": "Qwen2.5-Coder-7B-Instruct-GGUF",
    "provider_url": "https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF",
    "urls": [
      "https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF/resolve/main/qwen2.5-coder-7b-instruct-q8_0-00001-of-00003.gguf",
      "https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF/resolve/main/qwen2.5-coder-7b-instruct-q8_0-00002-of-00003.gguf",
      "https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF/resolve/main/qwen2.5-coder-7b-instruct-q8_0-00003-of-00003.gguf"
    ],
    "urls_sha256":[
   "e2fc5918a2b579d8e03a3752ad74dd191bc0f43204c90a29070f273f5283fee1",
   "912b7876d43dc19bbcf09368f4472f6cfea3458067a5bcaa660a68a9958276db",
   "478f6a6b37072eeda02a98a59b6ef0b1a9131c9eae9a1181b6077f5e255fa6b2",
],
    "sha256": "e2fc5918a2b579d8e03a3752ad74dd191bc0f43204c90a29070f273f5283fee1",
"entrypoint": "qwen2.5-coder-7b-instruct-q8_0-00001-of-00003.gguf"
  }

umialpha · 2024-09-28T19:58:55Z

@zwpaper @wsxiaoys PTAL

umialpha · 2024-09-28T20:06:36Z

related registry-tabby pr
TabbyML/registry-tabby#16

wsxiaoys · 2024-09-29T06:02:57Z

I would not recommend reusing the urls field for a partitioned model file. Instead:

Create a new partitioned_urls field for models with partitions.
For each partition, have a corresponding sha256 value.

umialpha · 2024-09-29T07:06:47Z

I would not recommend reusing the urls field for a partitioned model file. Instead:

Create a new partitioned_urls field for models with partitions.

For each partition, have a corresponding sha256 value.

I considered adding partitioned_urls and partitioned_sha256, but this requires considering:

The structure of partitioned_urls is the same with urls, which may lead to confusion.
If we continue supporting different download_hosts semantics, the content of partitioned_urls are much like urls.

So, I've expanded the semantic of urls to include various partitioned_sha256 for different download_hosts, then filtering them based on download_hosts and override variables.

fn filter_download_urls(model_info: &ModelInfo) -> Vec<String> {
    let download_host = tabby_common::env::get_download_host();
    model_info
        .urls
        .iter()
        .flatten()
        .filter_map(|f| {
            if f.contains(&download_host) {
                if let Some(mirror_host) = tabby_common::env::get_huggingface_mirror_host() {
                    Some(f.replace("huggingface.co", &mirror_host))
                } else {
                    Some(f.to_owned())
                }
            } else {
                None
            }
        })
        .collect()

Let me know if partitioned_urls is necessary and how to maintain compatibility with existing urls. Adding partitioned_urls is not troublesome for me.

wsxiaoys · 2024-10-04T08:48:35Z

I still recommend adding partitioned_urls (perhaps using a typed struct that contains both the URL and the SHA256 value).
With proper refactoring, the mirror host semantics can also be applied to partitioned_urls.

umialpha · 2024-10-06T04:27:09Z

ok i will work on this when i'm back from vacation Meng Zhang ***@***.***> 于 2024年10月4日周五下午5:48写道：

…

1. I still recommend adding partitioned_urls (perhaps using a typed struct that contains both the URL and the SHA256 value). 2. With proper refactoring, the mirror host semantics can also be applied to partitioned_urls. — Reply to this email directly, view it on GitHub <#3216 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADGXMPU4AGYPZIKZUKMCUI3ZZZI7RAVCNFSM6AAAAABPA4NTCWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGOJTGE3TMMJSGU> . You are receiving this because you authored the thread.Message ID: ***@***.***>

zwpaper · 2024-10-09T04:30:39Z

Hi @umialpha, sorry for the late reply and thanks for raising the PR!

before heading to the code changes, may I ask some questions to make sure we are on the same page.

I would also stand for creating a new field partitioned_urls
with 1., we should also support multiple hosts for each part of the partitioned_urls
what is the filenames for partitioned model? it seems that 00001-00003 is some kinds of standard the splitted gguf models used, maybe we can keep this convention
like 3., llama-cpp-server use the first part, 00001-00003, as default entrypoint of splitted models, maybe we can also follow this and drop the entry point field?

WDYT @wsxiaoys ?

umialpha · 2024-10-10T03:55:03Z

Hi @umialpha, sorry for the late reply and thanks for raising the PR!

before heading to the code changes, may I ask some questions to make sure we are on the same page.

I would also stand for creating a new field partitioned_urls

with 1., we should also support multiple hosts for each part of the partitioned_urls

what is the filenames for partitioned model? it seems that 00001-00003 is some kinds of standard the splitted gguf models used, maybe we can keep this convention

like 3., llama-cpp-server use the first part, 00001-00003, as default entrypoint of splitted models, maybe we can also follow this and drop the entry point field?

WDYT @wsxiaoys ?

@zwpaper The filename is inferred, first by looking for "CONTENT_DISPOSITION" in the HTTP header, and second by deducing from the URL.

I reviewed the llma.cpp documentation and found only one relevant content, here. It doesn't provide a detailed explanation of name. I am not sure whether it is a good idea to couple our implementation with other internal implementations.

leili added 2 commits September 29, 2024 02:35

finish main logic

53ee0ae

add ut

0a553be

[autofix.ci] apply automated fixes

34f5195

zwpaper mentioned this pull request Oct 10, 2024

feat(download): allow fetching model files with multiple partitions #3258

Merged

wsxiaoys marked this pull request as draft October 16, 2024 02:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: support downloading multiple model files #3216

feat: support downloading multiple model files #3216

umialpha commented Sep 28, 2024 •

edited

Loading

umialpha commented Sep 28, 2024

umialpha commented Sep 28, 2024

wsxiaoys commented Sep 29, 2024

umialpha commented Sep 29, 2024 •

edited

Loading

wsxiaoys commented Oct 4, 2024

umialpha commented Oct 6, 2024 via email •

edited

Loading

zwpaper commented Oct 9, 2024

umialpha commented Oct 10, 2024 •

edited

Loading

feat: support downloading multiple model files #3216

Are you sure you want to change the base?

feat: support downloading multiple model files #3216

Conversation

umialpha commented Sep 28, 2024 • edited Loading

Changes

umialpha commented Sep 28, 2024

umialpha commented Sep 28, 2024

wsxiaoys commented Sep 29, 2024

umialpha commented Sep 29, 2024 • edited Loading

wsxiaoys commented Oct 4, 2024

umialpha commented Oct 6, 2024 via email • edited Loading

zwpaper commented Oct 9, 2024

umialpha commented Oct 10, 2024 • edited Loading

umialpha commented Sep 28, 2024 •

edited

Loading

umialpha commented Sep 29, 2024 •

edited

Loading

umialpha commented Oct 6, 2024 via email •

edited

Loading

umialpha commented Oct 10, 2024 •

edited

Loading