Commit

Fix dataset reader
Riccorl committed Aug 10, 2020
1 parent 009c3e3 commit 7fe37c8
Showing 3 changed files with 8 additions and 9 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -3,13 +3,14 @@

# Semantic Role Labeling with BERT

Semantic Role Labeling based on [AllenNLP implementation](https://demo.allennlp.org/semantic-role-labeling) of [Shi et al, 2019](https://arxiv.org/abs/1904.05255). It uses [VerbAtlas](http://verbatlas.org/) inventory and it's trained also on predicate disambiguation, in addition to arguments identification and disambiguation.
Semantic Role Labeling based on [AllenNLP implementation](https://demo.allennlp.org/semantic-role-labeling) of [Shi et al, 2019](https://arxiv.org/abs/1904.05255). Can be trained using both PropBank and [VerbAtlas](http://verbatlas.org/) inventories and implements also the predicate disambiguation task, in addition to arguments identification and disambiguation.

### To-Dos

- [x] Works with both PropBank and VerbAtlas (infer inventory from dataset reader)
- [ ] Compatibility with all models from Huggingface's Transformers.
- Now works only with models that accept 1 as token type id
- [ ] Predicate identification (without using spacy)

### Infos

2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@

setuptools.setup(
name="transformer_srl", # Replace with your own username
version="2.2rc13",
version="2.2rc14",
author="Riccardo Orlando",
author_email="orlandoricc@gmail.com",
description="SRL Transformer model",
12 changes: 5 additions & 7 deletions transformer_srl/dataset_readers.py
@@ -1,5 +1,7 @@
import logging
import logging
from typing import Dict, List, Iterable, Tuple, Any
from typing import Dict, Tuple, List

from allennlp.common.file_utils import cached_path
from allennlp.data.dataset_readers.dataset_reader import DatasetReader
@@ -13,15 +15,10 @@
from allennlp.data.tokenizers import Token
from allennlp_models.common.ontonotes import Ontonotes, OntonotesSentence
from allennlp_models.structured_prediction import SrlReader
from conllu import parse_incr
from overrides import overrides
from transformers import AutoTokenizer

from typing import Dict, Tuple, List
import logging

from conllu import parse_incr


logger = logging.getLogger(__name__)

"""
@@ -357,7 +354,6 @@ def _convert_tags_to_wordpiece_tags(self, tags: List[str], offsets: List[int]) -
return ["O"] + new_tags + ["O"]

def _get_predicate_labels(self, sentence, verb_indicator):
frames = [f if v == 1 else "O" for f, v in zip(frame_labels, verb_indicator)]
labels = []
for i, v in enumerate(verb_indicator):
if v == 1:
@@ -367,6 +363,8 @@ def _get_predicate_labels(self, sentence, verb_indicator):
else sentence.predicate_framenet_ids[i]
)
labels.append(label)
else:
labels.append("O")
return labels
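
The change to `_get_predicate_labels` appears to make the method return one label per token: the predicate's frame label where `verb_indicator` is 1, and `"O"` everywhere else. The following is a minimal standalone sketch of that alignment idea, not the repository's code. The function name `predicate_labels`, the `frame_labels` argument, and the sample frame name are hypothetical and used only for illustration.

```python
from typing import List


def predicate_labels(frame_labels: List[str], verb_indicator: List[int]) -> List[str]:
    """Return one label per token: the frame label at predicate positions, "O" elsewhere.

    Hypothetical helper for illustration; `frame_labels` stands in for whatever
    per-token frame annotation the real reader takes from the sentence object.
    """
    labels = []
    for frame, is_verb in zip(frame_labels, verb_indicator):
        if is_verb == 1:
            # Predicate position: keep its frame label.
            labels.append(frame)
        else:
            # Non-predicate position: pad with "O" so the output stays
            # aligned with the token sequence.
            labels.append("O")
    return labels


# Made-up three-token sentence whose second token is the predicate.
example = predicate_labels(["_", "EAT_BITE", "_"], [0, 1, 0])
```

Without the `else` branch the returned list would contain only the predicate positions, so its length would no longer match the number of tokens.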

