Quickstart example not working #489
What's the error you are seeing? And logs?
Just by executing this:

model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/predibase/lorax:main --model-id $model

I get the following error:

2024-05-27T11:40:55.235184Z INFO lorax_launcher: Args { model_id: "mistralai/Mistral-7B-Instruct-v0.1", adapter_id: None, source: "hub", default_adapter_source: None, adapter_source: "hub", revision: None, validation_workers: 2, sharded: None, embedding_model: None, num_shard: None, quantize: None, compile: false, speculative_tokens: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_active_adapters: 1024, adapter_cycle_time_s: 2, adapter_memory_fraction: 0.1, hostname: "252cfb445bd6", port: 80, shard_uds_path: "/tmp/lorax-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, json_output: false, otlp_endpoint: None, cors_allow_origin: [], cors_allow_header: [], cors_expose_header: [], cors_allow_method: [], cors_allow_credentials: None, watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false, download_only: false }
2024-05-27T11:40:55.235284Z INFO download: lorax_launcher: Starting download process.
2024-05-27T11:40:58.738448Z ERROR download: lorax_launcher: Download encountered an error:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
response.raise_for_status()
File "/opt/conda/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 124, in download_weights
_download_weights(model_id, revision, extension, auto_convert, source, api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/weights.py", line 447, in download_weights
model_source.weight_files()
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/hub.py", line 179, in weight_files
return weight_files(self.model_id, self.revision, extension, self.api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/hub.py", line 69, in weight_files
filenames = weight_hub_files(model_id, revision, extension, api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/hub.py", line 34, in weight_hub_files
info = api.model_info(model_id, revision=revision)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 1922, in model_info
hf_raise_for_status(r)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-6654714a-7fc4308c45580b7328298ca4;876ae9f0-b94e-4384-a4a9-fd3139261aa7)
Cannot access gated repo for url https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1.
Access to model mistralai/Mistral-7B-Instruct-v0.1 is restricted. You must be authenticated to access it.
Error: DownloadError

If I add my token and execute it this way:

model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data
docker run --gpus all --shm-size 1g -p 8080:80 -e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN -v $volume:/data \
    ghcr.io/predibase/lorax:main --model-id $model

I get the following error:

2024-05-27T11:45:16.081927Z INFO lorax_launcher: Args { model_id: "mistralai/Mistral-7B-Instruct-v0.1", adapter_id: None, source: "hub", default_adapter_source: None, adapter_source: "hub", revision: None, validation_workers: 2, sharded: None, embedding_model: None, num_shard: None, quantize: None, compile: false, speculative_tokens: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_active_adapters: 1024, adapter_cycle_time_s: 2, adapter_memory_fraction: 0.1, hostname: "8b4a73dd40ee", port: 80, shard_uds_path: "/tmp/lorax-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, json_output: false, otlp_endpoint: None, cors_allow_origin: [], cors_allow_header: [], cors_expose_header: [], cors_allow_method: [], cors_allow_credentials: None, watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false, download_only: false }
2024-05-27T11:45:16.082040Z INFO download: lorax_launcher: Starting download process.
Error: DownloadError
2024-05-27T11:45:18.784725Z ERROR download: lorax_launcher: Download encountered an error:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
response.raise_for_status()
File "/opt/conda/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 124, in download_weights
_download_weights(model_id, revision, extension, auto_convert, source, api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/weights.py", line 447, in download_weights
model_source.weight_files()
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/hub.py", line 179, in weight_files
return weight_files(self.model_id, self.revision, extension, self.api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/hub.py", line 69, in weight_files
filenames = weight_hub_files(model_id, revision, extension, api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/hub.py", line 34, in weight_hub_files
info = api.model_info(model_id, revision=revision)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 1922, in model_info
hf_raise_for_status(r)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 403 Client Error. (Request ID: Root=1-6654724e-68506e57290cd75960e9177c;090b4284-8d6f-441f-916f-2fa3e5ab57c3)
Cannot access gated repo for url https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1.
Access to model mistralai/Mistral-7B-Instruct-v0.1 is restricted and you are not in the authorized list. Visit https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1 to ask for access.

It's fine since I don't have access to that model, but I would use another example for a quickstart. I tried with Phi-3, since it's not a gated model, but it didn't work either:

$ export model=microsoft/Phi-3-small-8k-instruct
$ docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/predibase/lorax:main --model-id $model
2024-05-27T11:52:11.862867Z INFO lorax_launcher: Args { model_id: "microsoft/Phi-3-small-8k-instruct", adapter_id: None, source: "hub", default_adapter_source: None, adapter_source: "hub", revision: None, validation_workers: 2, sharded: None, embedding_model: None, num_shard: None, quantize: None, compile: false, speculative_tokens: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_active_adapters: 1024, adapter_cycle_time_s: 2, adapter_memory_fraction: 0.1, hostname: "99cd7f793e22", port: 80, shard_uds_path: "/tmp/lorax-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, json_output: false, otlp_endpoint: None, cors_allow_origin: [], cors_allow_header: [], cors_expose_header: [], cors_allow_method: [], cors_allow_credentials: None, watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false, download_only: false }
2024-05-27T11:52:11.862989Z INFO download: lorax_launcher: Starting download process.
2024-05-27T11:52:14.407194Z INFO lorax_launcher: hub.py:121 Download file: model-00001-of-00004.safetensors
2024-05-27T11:52:31.144230Z INFO lorax_launcher: hub.py:130 Downloaded /data/models--microsoft--Phi-3-small-8k-instruct/snapshots/1adb635233ffce9e13385862a4111606d4382762/model-00001-of-00004.safetensors in 0:00:16.
2024-05-27T11:52:31.144324Z INFO lorax_launcher: hub.py:150 Download: [1/4] -- ETA: 0:00:48
2024-05-27T11:52:31.156145Z INFO lorax_launcher: hub.py:121 Download file: model-00002-of-00004.safetensors
2024-05-27T11:53:19.520065Z INFO lorax_launcher: hub.py:130 Downloaded /data/models--microsoft--Phi-3-small-8k-instruct/snapshots/1adb635233ffce9e13385862a4111606d4382762/model-00002-of-00004.safetensors in 0:00:48.
2024-05-27T11:53:19.520245Z INFO lorax_launcher: hub.py:150 Download: [2/4] -- ETA: 0:01:05
2024-05-27T11:53:19.520607Z INFO lorax_launcher: hub.py:121 Download file: model-00003-of-00004.safetensors
2024-05-27T11:54:38.927900Z INFO lorax_launcher: hub.py:130 Downloaded /data/models--microsoft--Phi-3-small-8k-instruct/snapshots/1adb635233ffce9e13385862a4111606d4382762/model-00003-of-00004.safetensors in 0:01:19.
2024-05-27T11:54:38.928020Z INFO lorax_launcher: hub.py:150 Download: [3/4] -- ETA: 0:00:48
2024-05-27T11:54:38.928307Z INFO lorax_launcher: hub.py:121 Download file: model-00004-of-00004.safetensors
2024-05-27T11:54:46.533325Z INFO lorax_launcher: hub.py:130 Downloaded /data/models--microsoft--Phi-3-small-8k-instruct/snapshots/1adb635233ffce9e13385862a4111606d4382762/model-00004-of-00004.safetensors in 0:00:07.
2024-05-27T11:54:46.533440Z INFO lorax_launcher: hub.py:150 Download: [4/4] -- ETA: 0
2024-05-27T11:54:47.004372Z INFO download: lorax_launcher: Successfully downloaded weights.
2024-05-27T11:54:47.004674Z INFO shard-manager: lorax_launcher: Starting shard rank=0
2024-05-27T11:54:52.749530Z ERROR lorax_launcher: server.py:265 Error when initializing model
Traceback (most recent call last):
File "/opt/conda/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in __call__
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 84, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 318, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
> File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 252, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/__init__.py", line 328, in get_model
raise ValueError(f"Unsupported model type {model_type}")
ValueError: Unsupported model type phi3small
2024-05-27T11:54:53.511126Z ERROR shard-manager: lorax_launcher: Shard complete standard error output:
Traceback (most recent call last):
File "/opt/conda/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 84, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 318, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 252, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/__init__.py", line 328, in get_model
raise ValueError(f"Unsupported model type {model_type}")
ValueError: Unsupported model type phi3small
rank=0
2024-05-27T11:54:53.609061Z ERROR lorax_launcher: Shard 0 failed to start
2024-05-27T11:54:53.609080Z INFO lorax_launcher: Shutting down shards
Error: ShardCannotStart

Is there any model I could use to start playing around with LoRAX?
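For context on the Phi-3 failure above: the server reads the `model_type` field from the checkpoint's config.json and dispatches to a per-architecture implementation, and `phi3small` is not in its registry. A simplified sketch of that dispatch (an assumption for illustration, not LoRAX's actual code or its real list of supported types):

```python
# Illustrative subset only -- NOT the real LoRAX registry.
SUPPORTED_MODEL_TYPES = {"llama", "mistral", "phi"}

def get_model(model_type: str) -> str:
    """Dispatch on config.json's model_type; unknown types fail fast."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"Unsupported model type {model_type}")
    return f"<{model_type} implementation>"
```

This is why the download succeeds but the shard still cannot start: the weights are fine, the architecture is simply not implemented.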
@jmorenobl You just need to log in to Hugging Face Hub, open the model page, accept the EULA, and then run LoRAX by passing your own Hugging Face token.
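For readers hitting the same errors: the two runs above fail differently. The first returns 401 (no token reached the Hub at all), the second 403 (the token is valid, but the account has not accepted the model's EULA). A minimal sketch of that distinction as a hypothetical diagnostic helper (not part of LoRAX or huggingface_hub):

```python
# Hypothetical helper mapping the HTTP status codes seen in the logs
# above to the likely fix; for illustration only.
def diagnose_gated_repo(status_code: int) -> str:
    if status_code == 401:
        # First run: no -e HUGGING_FACE_HUB_TOKEN was passed to the container.
        return "Unauthorized: pass a valid HUGGING_FACE_HUB_TOKEN into the container"
    if status_code == 403:
        # Second run: token accepted, but the EULA was never agreed to on the Hub.
        return "Forbidden: token is valid, but accept the model's EULA on its Hub page first"
    return "Not a gated-repo auth error"
```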
System Info
Pre-built Docker image on g4dn.xlarge with Deep Learning OSS Nvidia Driver AMI GPU PyTorch 2.2.0 (Amazon Linux 2)
Reproduction
Execute the example on the home page:
model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/predibase/lorax:main --model-id $model
Expected behavior
A server starts successfully, serving Mistral.