from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, BloomModel
import torch
from tqdm import tqdm
from time import time, sleep

model_str = 'bigscience/bloom'
tokenizer = AutoTokenizer.from_pretrained(model_str)

# Loading through OVModelForCausalLM runs out of memory:
ov_model = OVModelForCausalLM.from_pretrained(model_str, from_transformers=True,
                                              device_map='auto',
                                              torch_dtype=torch.bfloat16,
                                              low_cpu_mem_usage=True)

# Loading the same checkpoint through the plain transformers interface works:
#model = BloomModel.from_pretrained(model_str, device_map='auto',
#                                   torch_dtype=torch.bfloat16,
#                                   low_cpu_mem_usage=True)
#model.eval()

print('# [INFO] model loading complete, wait for 3 sec to start inference')
sleep(3.)

inputs = tokenizer('Hello, my dog is cute', return_tensors='pt')
n_tokens = len(inputs['input_ids'][0])
avg_latency = 0.

#with torch.inference_mode(), torch.cpu.amp.autocast():
for t in tqdm(range(100)):
    t0 = time()
    #outputs = model(**inputs)
    outputs = ov_model(**inputs)
    if t > 9:  # skip the first 10 iterations as warm-up
        avg_latency += time() - t0

# average over the 90 timed iterations
print(f'# [INFO] avg latency: {avg_latency / 90:.4f} s for {n_tokens} input tokens')
DRAM is 512 GB. The Hugging Face interface has a torch_dtype parameter, and with it the model can be loaded on CPU, but the OVModelForCausalLM interface results in OOM. Is there a parameter similar to torch_dtype in OVModelForCausalLM?
Currently the torch_dtype parameter is ignored, but enabling the model to be loaded in bf16 before exporting it to the OpenVINO format is something we plan to integrate in the future. Thanks for letting us know that this is an important feature for users!
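Until that lands, one possible workaround is to cast the checkpoint to bf16 through the transformers loader first, save it locally, and then export that smaller checkpoint to OpenVINO. This is a minimal, untested sketch; the local path ./bloom-bf16 is hypothetical, and whether it avoids the OOM depends on how the export path materializes the weights:

import torch
from transformers import AutoModelForCausalLM
from optimum.intel.openvino import OVModelForCausalLM

model_str = 'bigscience/bloom'
bf16_dir = './bloom-bf16'  # hypothetical local path for the converted checkpoint

# load the checkpoint in bf16 on CPU (the path that reportedly fits in 512 GB)
model = AutoModelForCausalLM.from_pretrained(model_str,
                                             torch_dtype=torch.bfloat16,
                                             low_cpu_mem_usage=True)
model.save_pretrained(bf16_dir)
del model  # free the PyTorch copy before exporting

# export the pre-converted bf16 checkpoint to the OpenVINO format
ov_model = OVModelForCausalLM.from_pretrained(bf16_dir, from_transformers=True)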