Classification using Vision Model Pixtral-12b #1284
Unanswered
RajSimpi9988 asked this question in Q&A
I tried the image-classification example from the documentation, using the exact code from the classification guide, but I could not get it to work:
```python
from io import BytesIO
from urllib.request import urlopen

import outlines
import torch
from outlines import models
from PIL import Image
from transformers import AutoProcessor, BitsAndBytesConfig, LlavaNextForConditionalGeneration

model = models.transformers_vision(
    "mistral-community/pixtral-12b",
    model_class=LlavaNextForConditionalGeneration,
    device="cuda:1",
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        "quantization_config": BitsAndBytesConfig(load_in_8bit=True),
    },
)

processor = AutoProcessor.from_pretrained("mistral-community/pixtral-12b")

def img_from_url(url):
    img_byte_stream = BytesIO(urlopen(url).read())
    return Image.open(img_byte_stream).convert("RGB")

pattern = "Mercury|Venus|Earth|Mars|Saturn|Jupiter|Neptune|Uranus|Pluto"
planet_generator = outlines.generate.regex(model, pattern)

answer = planet_generator(
    "What planet is this: ",
    [img_from_url("https://upload.wikimedia.org/wikipedia/commons/e/e3/Saturn_from_Cassini_Orbiter_%282004-10-06%29.jpg")],
)
print(answer)
```
I keep getting this error:
```
You are using a model of type llava to instantiate a model of type llava_next. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 100%|██████████| 6/6 [00:15<00:00,  2.63s/it]
Some weights of LlavaNextForConditionalGeneration were not initialized from the model checkpoint at mistral-community/pixtral-12b and are newly initialized: ['image_newline']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/home/raj/fd/workspace/outlines/test.py", line 32, in <module>
    planet_generator(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/outlines/generate/api.py", line 556, in __call__
    completions = self.model.generate(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/outlines/models/transformers_vision.py", line 56, in generate
    generated_ids = self._generate_output_seq(prompts, inputs, **generation_kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/outlines/models/transformers.py", line 350, in _generate_output_seq
    output_ids = self.model.generate(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/transformers/generation/utils.py", line 2215, in generate
    result = self._sample(
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/transformers/generation/utils.py", line 3206, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/raj/fd/workspace/pyenv/pixtral/lib/python3.10/site-packages/transformers/models/llava_next/modeling_llava_next.py", line 850, in forward
    if pixel_values is not None and pixel_values.size(0) > 0:
AttributeError: 'list' object has no attribute 'size'
```
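If it helps narrow things down: the final frame fails because `forward` calls `pixel_values.size(0)` on a plain Python list rather than a batched tensor, so it seems the image inputs are not being converted to a tensor before reaching the model. A minimal stand-in (no transformers needed, with a hypothetical placeholder for the image batch) reproduces the same `AttributeError`:

```python
# Stand-in for the failing check in modeling_llava_next.py's forward:
# the model expects pixel_values to be a tensor (which has .size()),
# but here it is a plain Python list, which has no such method.
pixel_values = [object()]  # hypothetical placeholder for the image batch

try:
    pixel_values.size(0)
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'size'
```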