$ cog run python -m transformers.models.llama.convert_llama_weights_to_hf --input_dir unconverted-weights --model_size 7B --output_dir weights
⚠ Cog doesn't know if CUDA 11.7 is compatible with PyTorch 1.13.1. This might cause CUDA problems.
Building Docker image from environment in cog.yaml...
[+] Building 5.5s (22/22) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 2.25kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 203B 0.0s
=> resolve image config for docker.io/docker/dockerfile:1.2 2.9s
=> [auth] docker/dockerfile:pull token for registry-1.docker.io 0.0s
=> CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2fe6cbdf49fd92b95912df1cf7d313c3e2230a333fdbcc 0.0s
=> [internal] load metadata for docker.io/nvidia/cuda:11.7.0-cudnn8-devel-ubuntu22.04 1.8s
=> [auth] nvidia/cuda:pull token for registry-1.docker.io 0.0s
=> [internal] load build context 0.4s
=> => transferring context: 42.00kB 0.4s
=> [stage-0 1/12] FROM docker.io/nvidia/cuda:11.7.0-cudnn8-devel-ubuntu22.04@sha256:de480887e91e99fffd701a96cf96b88a4ee8449481a6d9eec5849092934ffd2e 0.0s
=> CACHED [stage-0 2/12] RUN --mount=type=cache,target=/var/cache/apt set -eux; apt-get update -qq; apt-get install -qqy --no-install-recommends curl; rm -rf /var/lib/apt/l 0.0s
=> CACHED [stage-0 3/12] RUN --mount=type=cache,target=/var/cache/apt apt-get update -qq && apt-get install -qqy --no-install-recommends make build-essential libssl-dev 0.0s
=> CACHED [stage-0 4/12] RUN curl -s -S -L https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer | bash && git clone https://github.com/momo-l 0.0s
=> CACHED [stage-0 5/12] COPY .cog/tmp/build1947155000/cog-0.0.1.dev-py3-none-any.whl /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 6/12] RUN --mount=type=cache,target=/root/.cache/pip pip install /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 7/12] COPY .cog/tmp/build1947155000/requirements.txt /tmp/requirements.txt 0.0s
=> CACHED [stage-0 8/12] RUN --mount=type=cache,target=/root/.cache/pip pip install -r /tmp/requirements.txt 0.0s
=> CACHED [stage-0 9/12] RUN pip install git+https://github.com/huggingface/transformers.git@786092a35e18154cacad62c30fe92bac2c27a1e1 0.0s
=> CACHED [stage-0 10/12] RUN mkdir /gc && cd /gc && curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-426.0.0-linux-x86_64.tar.gz && tar - 0.0s
=> CACHED [stage-0 11/12] RUN pip install google-cloud-storage 0.0s
=> CACHED [stage-0 12/12] WORKDIR /src 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:5131a0ce7c4c0513b4efc7f5408b7a69b9e90b30e0b636e75b00bfec33ae4aae 0.0s
=> => naming to docker.io/library/cog-cog-llama-base 0.0s
=> exporting cache 0.0s
=> => preparing build cache for export 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
Running 'python -m transformers.models.llama.convert_llama_weights_to_hf --input_dir unconverted-weights --model_size 7B --output_dir weights' in Docker with the current directory mounted as a volume...
Fetching all parameters from the checkpoint at unconverted-weights/7B.
Loading the checkpoint in a Llama model.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [01:35<00:00, 2.90s/it]
Saving in the Transformers format.
Fetching the tokenizer from unconverted-weights/tokenizer.model.
$ cog predict -i prompt="Simply put, the theory of relativity states that"
⚠ Cog doesn't know if CUDA 11.7 is compatible with PyTorch 1.13.1. This might cause CUDA problems.
Building Docker image from environment in cog.yaml...
[+] Building 4.8s (22/22) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 2.25kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 35B 0.0s
=> resolve image config for docker.io/docker/dockerfile:1.2 2.7s
=> [auth] docker/dockerfile:pull token for registry-1.docker.io 0.0s
=> CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2fe6cbdf49fd92b95912df1cf7d313c3e2230a333fdbcc 0.0s
=> [internal] load metadata for docker.io/nvidia/cuda:11.7.0-cudnn8-devel-ubuntu22.04 1.7s
=> [auth] nvidia/cuda:pull token for registry-1.docker.io 0.0s
=> [stage-0 1/12] FROM docker.io/nvidia/cuda:11.7.0-cudnn8-devel-ubuntu22.04@sha256:de480887e91e99fffd701a96cf96b88a4ee8449481a6d9eec5849092934ffd2e 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 42.00kB 0.1s
=> CACHED [stage-0 2/12] RUN --mount=type=cache,target=/var/cache/apt set -eux; apt-get update -qq; apt-get install -qqy --no-install-recommends curl; rm -rf /var/lib/apt/l 0.0s
=> CACHED [stage-0 3/12] RUN --mount=type=cache,target=/var/cache/apt apt-get update -qq && apt-get install -qqy --no-install-recommends make build-essential libssl-dev 0.0s
=> CACHED [stage-0 4/12] RUN curl -s -S -L https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer | bash && git clone https://github.com/momo-l 0.0s
=> CACHED [stage-0 5/12] COPY .cog/tmp/build1140255148/cog-0.0.1.dev-py3-none-any.whl /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 6/12] RUN --mount=type=cache,target=/root/.cache/pip pip install /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 7/12] COPY .cog/tmp/build1140255148/requirements.txt /tmp/requirements.txt 0.0s
=> CACHED [stage-0 8/12] RUN --mount=type=cache,target=/root/.cache/pip pip install -r /tmp/requirements.txt 0.0s
=> CACHED [stage-0 9/12] RUN pip install git+https://github.com/huggingface/transformers.git@786092a35e18154cacad62c30fe92bac2c27a1e1 0.0s
=> CACHED [stage-0 10/12] RUN mkdir /gc && cd /gc && curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-426.0.0-linux-x86_64.tar.gz && tar - 0.0s
=> CACHED [stage-0 11/12] RUN pip install google-cloud-storage 0.0s
=> CACHED [stage-0 12/12] WORKDIR /src 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:396358402a8fcf74a30d0fc6b03c39669b099a73ae3b6f831f0fd01e557f218f 0.0s
=> => naming to docker.io/library/cog-cog-llama-base 0.0s
=> exporting cache 0.0s
=> => preparing build cache for export 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
Starting Docker image cog-cog-llama-base and running setup()...
deserializing weights
WARNING: Omitting file://llama_weights/llama-7b because it is a container, and recursion is not enabled.
ERROR: (gcloud.storage.cp) The following URLs matched no objects or files:
- llama_weights/llama-7b
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.8.16/lib/python3.8/site-packages/cog/server/worker.py", line 185, in _setup
    run_setup(self._predictor)
  File "/root/.pyenv/versions/3.8.16/lib/python3.8/site-packages/cog/predictor.py", line 81, in run_setup
    predictor.setup(weights=weights)
  File "predict.py", line 18, in setup
    self.model = load_tensorizer(
  File "/src/config.py", line 57, in load_tensorizer
    f"gcloud storage cp command failed with return code {res.returncode}: {res.stderr.decode('utf-8')}"
AttributeError: 'NoneType' object has no attribute 'decode'
ⅹ Model setup failed
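A note on the final `AttributeError`: it is a secondary failure in the error-reporting path, not the root cause. The root cause is that `gcloud storage cp` could not copy `llama_weights/llama-7b` (it is a directory, and recursion was not enabled, so likely a `--recursive`/`-r` flag is missing). The crash then happens because the code in `config.py` formats `res.stderr.decode('utf-8')`, but `subprocess.run` leaves `CompletedProcess.stderr` as `None` unless stderr is explicitly captured. A minimal sketch of that behavior (assuming `load_tensorizer` shells out via `subprocess.run`; the exact call in `config.py` is not shown in this log):

```python
import subprocess

# Without capturing output, CompletedProcess.stderr is None,
# so res.stderr.decode("utf-8") raises AttributeError and
# masks the real gcloud failure message.
res = subprocess.run(["true"])
assert res.stderr is None

# Capturing stderr makes it a bytes object that .decode() can handle,
# so the original error message survives into the exception text.
res = subprocess.run(["sh", "-c", "echo oops >&2; exit 1"], capture_output=True)
print(res.returncode, res.stderr.decode("utf-8"))
```

So adding `capture_output=True` (or `stderr=subprocess.PIPE`) to the `subprocess.run` call in `load_tensorizer` would at least surface the underlying gcloud error instead of this `AttributeError`.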