Hi there! I am a new and frequent user of this great package, which also comes with a few inevitable GitHub issues 😅
When I initialize the pipeline as follows:
name = "absa/classifier-rest-0.2"
model = absa.BertABSClassifier.from_pretrained(name)
tokenizer = BertTokenizer.from_pretrained(name)
reference_recognizer = absa.aux_models.BasicReferenceRecognizer()
professor = absa.Professor(reference_recognizer)
nlp = absa.Pipeline(model=model, tokenizer=tokenizer, professor=professor)
I receive the following error:
TypeError Traceback (most recent call last)
/tmp/ipykernel_514/72277120.py in <module>
2 model = absa.BertABSClassifier.from_pretrained(name)
3 tokenizer = BertTokenizer.from_pretrained(name)
----> 4 reference_recognizer = absa.aux_models.BasicReferenceRecognizer()
5 professor = absa.Professor(reference_recognizer)
6 nlp = absa.Pipeline(model=model, tokenizer=tokenizer, professor=professor)
TypeError: __init__() missing 1 required positional argument: 'weights'
I realise this is because the `BasicReferenceRecognizer` needs to be trained in order to select weights. This leads me to two questions/issues:
- The `BasicReferenceRecognizer` class has no `train` method. Is there another way in which to train it, or any ways to load a pretrained model from the package? From the unit tests for the `BasicReferenceRecognizer` I found there were two pre-trained models, `'absa/basic_reference_recognizer-rest-0.1'` and `'absa/basic_reference_recognizer-lapt-0.1'`, but on trying to initialize with these I received an `ImportError`.
- I also tried directly initializing the `BasicReferenceRecognizer` with `weights=(-0.025, 44)` as is done in this line. However, upon making predictions I get an error in the `Pipeline` at the `postprocess` step:
TypeError Traceback (most recent call last)
/tmp/ipykernel_514/3162923628.py in <module>
3 for row in df.itertuples():
4 print(row)
----> 5 prediction = predict(row.Review, row.Aspect)
6 sentiment = get_sentiment(prediction)
7 certainty_score = get_certainty_score(prediction)
/tmp/ipykernel_514/1002360698.py in predict(text, aspect)
16 output_batch = nlp.predict(input_batch)
17 predictions = nlp.review(tokenized_examples, output_batch)
---> 18 completed_task = nlp.postprocess(task, predictions)
19 completed_subtask = completed_task.subtasks[aspect]
20 return completed_subtask
/pyenv/versions/3.8.5/envs/seo-advice-page/lib/python3.8/site-packages/aspect_based_sentiment_analysis/pipelines.py in postprocess(task, batch_examples)
301 aspect, = {e.aspect for e in examples}
302 scores = np.max([e.scores for e in examples], axis=0)
--> 303 scores /= np.linalg.norm(scores, ord=1)
304 sentiment_id = np.argmax(scores).astype(int)
305 aspect_document = CompletedSubTask(
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
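I believe that this error comes from a type mismatch between `int` and `float`: the in-place division on line 303 produces floats (typecode 'd'), but `scores` is an integer array (typecode 'l'), so the result cannot be written back into it. If instead I initialize with `weights = (1,1)`, for example, I receive no error.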
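For reference, here is a minimal sketch that reproduces the same kind of casting error with plain NumPy, outside the package. It assumes that `scores` reaches the division as an integer array, which is what the typecode 'l' in the traceback suggests; I have not confirmed where in the pipeline that happens. The helper `normalize_l1` is a hypothetical name for illustration and is not part of the package.

```python
import numpy as np


def normalize_l1(scores):
    """Return scores divided by their L1 norm, always as a float array.

    Hypothetical helper for illustration; not part of the package.
    """
    scores = np.asarray(scores, dtype=float)  # cast first so the division is float / float
    return scores / np.linalg.norm(scores, ord=1)


# Assumption: an integer array standing in for `scores` in postprocess.
int_scores = np.array([1, 2, 1])

try:
    # Same pattern as line 303 of pipelines.py: in-place true division.
    # The float result cannot be stored back into an integer array.
    int_scores /= np.linalg.norm(int_scores, ord=1)
    in_place_failed = False
except TypeError:
    in_place_failed = True

# Casting to float before dividing avoids the error.
normalized = normalize_l1(int_scores)
```

If the assumption holds, casting `scores` to float before the in-place division (or using a non-in-place division) might be enough to avoid the error, though the maintainers will know better whether an integer array at that point indicates a problem further upstream.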
I wanted to flag these issues for your awareness. Thank you very much for any advice you can provide 😄