Classification using Vision Model Pixtral-12b #1284
Unanswered
RajSimpi9988 asked this question in Q&A
I tried the image-classification example from the documentation, using the exact code from the classification guide, but I could not get it to work:
```python
from io import BytesIO
from urllib.request import urlopen

import outlines
import torch
from outlines import models
from PIL import Image
from transformers import AutoProcessor, BitsAndBytesConfig, LlavaNextForConditionalGeneration

model = models.transformers_vision(
    "mistral-community/pixtral-12b",
    model_class=LlavaNextForConditionalGeneration,
    device="cuda:1",
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        "quantization_config": BitsAndBytesConfig(load_in_8bit=True),
    },
)

processor = AutoProcessor.from_pretrained("mistral-community/pixtral-12b")

def img_from_url(url):
    img_byte_stream = BytesIO(urlopen(url).read())
    return Image.open(img_byte_stream).convert("RGB")

pattern = "Mercury|Venus|Earth|Mars|Saturn|Jupiter|Neptune|Uranus|Pluto"
planet_generator = outlines.generate.regex(model, pattern)

answer = planet_generator(
    "What planet is this: ",
    [img_from_url("https://upload.wikimedia.org/wikipedia/commons/e/e3/Saturn_from_Cassini_Orbiter_%282004-10-06%29.jpg")],
)
print(answer)
```
I keep getting this error:
```
You are using a model of type llava to instantiate a model of type llava_next. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 100%|██████████| 6/6 [00:15<00:00,  2.63s/it]
Some weights of LlavaNextForConditionalGeneration were not initialized from the model checkpoint at mistral-community/pixtral-12b and are newly initialized: ['image_newline']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/home/raj/fd/workspace/outlines/test.py", line 32, in <module>
    planet_generator(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/outlines/generate/api.py", line 556, in __call__
    completions = self.model.generate(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/outlines/models/transformers_vision.py", line 56, in generate
    generated_ids = self._generate_output_seq(prompts, inputs, **generation_kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/outlines/models/transformers.py", line 350, in _generate_output_seq
    output_ids = self.model.generate(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/transformers/generation/utils.py", line 2215, in generate
    result = self._sample(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/transformers/generation/utils.py", line 3206, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/transformers/models/llava_next/modeling_llava_next.py", line 850, in forward
    if pixel_values is not None and pixel_values.size(0) > 0:
AttributeError: 'list' object has no attribute 'size'
```
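If it helps narrow things down: the final frame fails because `forward` calls `pixel_values.size(0)` on a plain Python list rather than a batched tensor, so it seems the image inputs are not being converted to a tensor before reaching the model. A minimal stand-in (no transformers needed, with a hypothetical placeholder for the image batch) reproduces the same `AttributeError`:

```python
# Stand-in for the failing check in modeling_llava_next.py's forward:
# the model expects pixel_values to be a tensor (which has .size()),
# but here it is a plain Python list, which has no such method.
pixel_values = [object()]  # hypothetical placeholder for the image batch

try:
    pixel_values.size(0)
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'size'
```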