OVModelForCausalLM OOM #217

Closed
Zjq9409 opened this issue Mar 6, 2023 · 2 comments

Comments


Zjq9409 commented Mar 6, 2023

from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, BloomModel

import torch
from tqdm import tqdm
from time import time, sleep

model_str = 'bigscience/bloom'
tokenizer = AutoTokenizer.from_pretrained(model_str)

# Export the model to the OpenVINO format while loading
ov_model = OVModelForCausalLM.from_pretrained(model_str, from_transformers=True,
        device_map='auto',
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True)

# The equivalent PyTorch path loads fine on CPU:
# model = BloomModel.from_pretrained(model_str, device_map='auto',
#                                    torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
# model.eval()

print('# [INFO] model loading complete, wait for 3 sec to start inference')
sleep(3.)

inputs = tokenizer('Hello, my dog is cute', return_tensors='pt')
n_tokens = len(inputs['input_ids'][0])

# Average latency over the run, skipping the first 10 warm-up iterations
# (for the PyTorch path: with torch.inference_mode(), torch.cpu.amp.autocast():)
avg_latency = 0.
for t in tqdm(range(100)):
    t0 = time()
    # outputs = model(**inputs)
    outputs = ov_model(**inputs)
    if t > 9:
        avg_latency += time() - t0
avg_latency /= 90

My machine has 512 GB of DRAM. The Hugging Face transformers interface has a torch_dtype parameter, and with it the model can be loaded on CPU, but the OVModelForCausalLM interface results in OOM. Is there a parameter similar to torch_dtype in OVModelForCausalLM?

@echarlaix
Collaborator

Hi @Zjq9409,

Currently the torch_dtype parameter is ignored, but enabling the model to be loaded in bf16 before it is exported to the OpenVINO format is something we plan to integrate in the future. Thanks for letting us know that this is an important feature for users!
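
In the meantime, a minimal sketch of the pure-PyTorch bf16 path that stays within memory on CPU (BloomForCausalLM and the generate call are substitutions for illustration; the original snippet used BloomModel):

import torch
from transformers import AutoTokenizer, BloomForCausalLM

model_str = 'bigscience/bloom'
tokenizer = AutoTokenizer.from_pretrained(model_str)

# Loading in bf16 roughly halves the memory footprint compared to fp32
model = BloomForCausalLM.from_pretrained(
    model_str,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)
model.eval()

inputs = tokenizer('Hello, my dog is cute', return_tensors='pt')
with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))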

@echarlaix
Collaborator

This option was added in #778
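
For reference, a minimal sketch of how this might look once the option is available (the keyword names below are assumptions based on this thread, not verified against #778):

import torch
from optimum.intel.openvino import OVModelForCausalLM

# Hypothetical usage after #778: load the PyTorch weights in bf16
# before the OpenVINO export (keyword name assumed, not verified)
ov_model = OVModelForCausalLM.from_pretrained(
    'bigscience/bloom',
    export=True,                  # newer equivalent of from_transformers=True
    torch_dtype=torch.bfloat16,   # assumption: dtype for the intermediate PyTorch load
)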
