I am running the IndicTrans models from HuggingFace and need to return multiple translations per source sentence (e.g., 8). I am using the code provided in the repository's README.md; whenever I increase the num_return_sequences parameter, the postprocessing step hangs indefinitely with no error message.
Expected behavior: the postprocessing step handles multiple return sequences and returns the output without hanging.
Actual behavior: when num_return_sequences is increased, the postprocessing step hangs with no further message or output.
Here is the code I’m using (adapted from the README.md):
import torch
from IndicTransToolkit import IndicProcessor
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import time
ip = IndicProcessor(inference=True)
tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indictrans2-en-indic-dist-200M", trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained("ai4bharat/indictrans2-en-indic-dist-200M", trust_remote_code=True)
sentences = [
    "This is a test sentence.",
    "This is another longer different test sentence.",
    "Please send an SMS to 9876543210 and an email on newemail123@xyz.com by 15th October, 2023.",
]
batch = ip.preprocess_batch(sentences, src_lang="eng_Latn", tgt_lang="hin_Deva")
batch = tokenizer(batch, padding="longest", truncation=True, max_length=256, return_tensors="pt")
num_return_sequences = 2
print(f"num_return_sequences == {num_return_sequences}")
with torch.inference_mode():
    outputs = model.generate(**batch, num_beams=5, num_return_sequences=num_return_sequences, max_length=256)
with tokenizer.as_target_tokenizer():
    # This scoping is absolutely necessary: it instructs the tokenizer to decode using the target vocabulary.
    # Without it, the output is de-tokenized with the source vocabulary instead, producing gibberish/unexpected predictions.
    outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=True)
start = time.time()
print("Starting postprocessing")
outputs = ip.postprocess_batch(outputs, lang="hin_Deva")
end = time.time()
total_time = end - start
print(f"Postprocessing took {total_time:.2f} seconds")
print(outputs)
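For context: with num_return_sequences=k, model.generate returns k * len(sentences) sequences (standard Hugging Face behavior, with the k candidates for input i contiguous in the output), so batch_decode yields 6 strings here while only 3 sentences went through preprocess_batch. A hypothetical helper, purely for illustration, to regroup the decoded candidates per source sentence:

def group_candidates(decoded, k):
    # Standard HF ordering: the k candidates for input i sit at
    # positions i*k .. i*k + k - 1 of the decoded list.
    return [decoded[i:i + k] for i in range(0, len(decoded), k)]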
I'm using Python 3.10.12, with the following libraries installed:
torch==2.4.1
transformers==4.45.2
Thanks!
Hi @onadegibert, thank you for reaching out. The processor is indeed not designed to handle multiple return sequences from generate: the post-processing currently assumes a 1-to-1 input-to-output mapping when patching the placeholders. We will try to incorporate this request in the next release. If you have a fix before then, please feel free to open a PR.
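In the meantime, one possible workaround (an untested sketch, not an official fix) is to keep the 1-to-1 assumption intact by repeating each source sentence before preprocessing and generating a single sequence per expanded input. Note that plain beam search would decode every repeated input identically, so this sketch uses sampling to get diverse candidates:

k = 8  # desired translations per source sentence
expanded = [s for s in sentences for _ in range(k)]
batch = ip.preprocess_batch(expanded, src_lang="eng_Latn", tgt_lang="hin_Deva")
batch = tokenizer(batch, padding="longest", truncation=True, max_length=256, return_tensors="pt")
with torch.inference_mode():
    outputs = model.generate(**batch, do_sample=True, top_p=0.9, num_return_sequences=1, max_length=256)
with tokenizer.as_target_tokenizer():
    outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=True)
translations = ip.postprocess_batch(outputs, lang="hin_Deva")
# translations[i*k:(i+1)*k] are the k candidates for sentences[i]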