Lawfluence - LegalEval Subtask A

This project implements SemEval-2023 LegalEval Subtask A. It was submitted as a final project for the course Wi23/24 Advanced Natural Language Processing at the University of Potsdam.

1. Task

The task outlined by SemEval-2023 is as follows: given an annotated corpus of legal documents to train on, assign each sentence in a set of documents one of 13 distinct Rhetorical Role (RR) labels. Rhetorical roles are the sentence categories relevant to judgement texts. All 13 RR labels are listed in the appendix.

2. Data

The data consists of Indian court judgement texts. The documents were annotated at the sentence level with respect to the 13 RRs.

2.1 Input Data Format

The top-level structure of each JSON file is a list, where each entry represents a judgement-labels data point. Each data point is a dict with the following keys:

id: a unique id for this data point. This is useful for evaluation.
annotations: a list of dicts. The items in each dict are:
- result: a list of dictionaries, each pairing a sentence's text with its labels. The keys are:
  - id: unique id of each sentence
  - value: a dictionary with the following keys:
    - start: integer. Starting index of the text
    - end: integer. End index of the text
    - text: string. The actual text of the sentence
    - labels: list. The labels that correspond to the text
data: the actual text of the judgement.
meta: a string. It gives the category of the case (Criminal, Tax, etc.).
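
To make the nesting concrete, here is a minimal sketch that reads a document in this format and flattens it into (document id, sentence text, label) triples. The sample document is hand-made for illustration and is not taken from the dataset, and `extract_sentences` is a hypothetical helper, not a function from this repository.

```python
import json

# Hand-made sample following the structure described above (not real data).
sample_json = """
[
  {
    "id": 1,
    "annotations": [
      {
        "result": [
          {"id": "s1",
           "value": {"start": 0, "end": 22,
                     "text": "IN THE COURT OF APPEAL",
                     "labels": ["PREAMBLE"]}},
          {"id": "s2",
           "value": {"start": 23, "end": 48,
                     "text": "The appeal was dismissed.",
                     "labels": ["RPC"]}}
        ]
      }
    ],
    "data": "IN THE COURT OF APPEAL The appeal was dismissed.",
    "meta": "Criminal"
  }
]
"""


def extract_sentences(documents):
    """Yield (document id, sentence text, label) for every annotated sentence."""
    for doc in documents:
        for annotation in doc["annotations"]:
            for item in annotation["result"]:
                value = item["value"]
                # `labels` is a list; this sketch assumes one label per sentence
                # and takes the first entry.
                yield doc["id"], value["text"], value["labels"][0]


documents = json.loads(sample_json)
pairs = list(extract_sentences(documents))

for doc_id, text, label in pairs:
    print(doc_id, label, text)
```

In this sample, `start` and `end` index into the `data` string, so `data[start:end]` reproduces the sentence text.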

3. Training the model

The models in this project are trained (and evaluated) from the terminal using the flags below, assuming you are in the project's directory.

3.1 Model flags

Choose one of the following flags to denote model architecture:

--bilstm OR --cnn_bilstm

3.2 Training flags

Choose one of the following flags to denote training:

--default_train OR --advanced_train OR --grid_search OR --advanced_grid_search

'--default_train' runs the standard training process with standard BERT embeddings.

'--advanced_train' trains the model separately on standard BERT and LegalBERT embeddings, and applies custom class weights to address the class imbalance in the dataset.

'--grid_search' trains the model using grid search in default mode (standard BERT embeddings). The model is trained iteratively over different sets of hyper-parameters.

'--advanced_grid_search' trains the model using grid search with the advanced training feature (both standard BERT and LegalBERT embeddings, plus custom class weights). The model is trained iteratively over different sets of hyper-parameters.
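
The README does not state how the custom class weights are computed. One common choice, shown here purely as an assumption, is inverse-frequency weighting, where a class with fewer sentences receives a larger weight. The function name and the toy label list are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class: total / (number of classes * class count).

    This is the common "balanced" heuristic; it is an assumption here,
    not necessarily the weighting used in this project.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {
        label: total / (num_classes * count)
        for label, count in counts.items()
    }


# Toy, imbalanced label distribution (illustrative only).
labels = ["ANALYSIS"] * 6 + ["FAC"] * 3 + ["RATIO"] * 1
weights = inverse_frequency_weights(labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")
```

Weights like these are typically passed to a weighted loss function so that errors on rare classes cost more than errors on frequent ones.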

The parameters used for training can be found in 'main.py' on lines 79-95.
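
As an illustration of what training "iteratively over different sets of hyper-parameters" means for the grid search flags, the sketch below enumerates every combination in a small search space and keeps the best-scoring one. The search space values, `iter_configurations`, `grid_search`, and `dummy_score` are all hypothetical; in a real run the scoring function would train a model and return a validation metric.

```python
from itertools import product

# Hypothetical search space; the values actually searched live in main.py.
search_space = {
    "learning_rate": [0.0005, 0.0001],
    "dropout": [0.25, 0.5],
    "hidden_size": [256, 512],
}


def iter_configurations(space):
    """Yield one dict per combination of hyper-parameter values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def grid_search(space, score_fn):
    """Score every configuration and return the best (config, score) pair."""
    best_config, best_score = None, float("-inf")
    for config in iter_configurations(space):
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def dummy_score(config):
    """Placeholder for a real training run (returns a made-up score)."""
    return config["hidden_size"] / 1000 - config["dropout"]


configs = list(iter_configurations(search_space))
best_config, best_score = grid_search(search_space, dummy_score)
print(len(configs), "configurations tried; best:", best_config)
```

The number of runs is the product of the list lengths, so the cost of grid search grows quickly with every added hyper-parameter.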

The general form of the command is:

legaleval-subtask-a-main % python main.py --(model flag) --(training flag)

For example:

legaleval-subtask-a-main % python main.py --bilstm --advanced_train
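
For readers curious how a command line like this can be validated, the following is a minimal sketch using Python's argparse with two mutually exclusive groups, one for the model flag and one for the training flag. It is not copied from 'main.py'; it only assumes, based on the "choose one" wording above, that exactly one flag from each group is expected.

```python
import argparse


def build_parser():
    """Build a parser that requires one model flag and one training flag."""
    parser = argparse.ArgumentParser(
        description="Sketch of the flag layout described in this README."
    )

    # Exactly one model architecture must be chosen.
    model_group = parser.add_mutually_exclusive_group(required=True)
    model_group.add_argument("--bilstm", action="store_true")
    model_group.add_argument("--cnn_bilstm", action="store_true")

    # Exactly one training mode must be chosen.
    train_group = parser.add_mutually_exclusive_group(required=True)
    for flag in ("--default_train", "--advanced_train",
                 "--grid_search", "--advanced_grid_search"):
        train_group.add_argument(flag, action="store_true")

    return parser


# Parse the example command from above instead of reading sys.argv.
args = build_parser().parse_args(["--bilstm", "--advanced_train"])
print(args)
```

With this layout, passing both `--bilstm` and `--cnn_bilstm`, or omitting the training flag, makes argparse exit with a usage error.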

4. Evaluating performance

4.1 Hyperparameters

The hyperparameters for the current run are displayed immediately after execution:

Working with: 
{'epochs': 100, 'learning_rate': 0.0005, 'learning_rate_floor': 5e-06, 'dropout': 0.25, 
'hidden_size': 512, 'num_layers': 1}
Model type: BiLSTM
Train type: advanced

4.2 Evaluation metrics

Once training is complete, the accuracy and F1 score for each class are printed, along with their averages. For example:

Accuracies: [0.81181619 0.33333333 0.42105263 0.82233503 0.74509804 0.95604396
 0.99204771 0.         0.47311828 0.52054795 0.53246753 0.82178218
 0.625     ] 

 Average accuracy (standard BERT): 0.619587909791633

F1 Scores: [0.7856008  0.26548668 0.29090904 0.83076918 0.7524752  0.93548382
 0.99106251 0.         0.63007155 0.52413788 0.42487042 0.87830683
 0.56603768] 
 Average F1: 0.6057855070889212

Accuracies Legal BERT: [0.67161227 0.04761905 0.04761905 0.49598163 0.36842105 0.45238095
 0.81405896 0.         0.25757576 0.38636364 0.125      0.47058824
 0.09615385] 

 Average accuracy Legal BERT: 0.32564418676524204

F1 Scores Legal BERT: [0.68378646 0.02247187 0.06060601 0.59586202 0.31818177 0.16379307
 0.75978831 0.         0.24999995 0.2931034  0.01612902 0.57142852
 0.12345674] 
 Average F1 Legal BERT: 0.2968159349436752

Note: because of the '--advanced_train' flag, two distinct training runs were executed, one with standard BERT embeddings and one with LegalBERT embeddings.
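
The averages in the output above equal the unweighted mean of the 13 per-class values. The sketch below shows one way such per-class numbers can be computed from gold and predicted labels. It assumes that per-class accuracy means the fraction of a class's gold sentences that were predicted correctly; the README does not define it, and 'evaluation_functions.py' may compute it differently. The function name and the toy labels are hypothetical.

```python
def per_class_scores(gold, predicted, classes):
    """Return {class: {"accuracy": ..., "f1": ...}} for the given labels.

    "accuracy" here is the share of gold sentences of a class that were
    predicted correctly (an assumption, see the text above).
    """
    scores = {}
    for cls in classes:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == cls and p == cls)
        fp = sum(1 for g, p in pairs if g != cls and p == cls)
        fn = sum(1 for g, p in pairs if g == cls and p != cls)

        accuracy = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        if precision + accuracy:
            f1 = 2 * precision * accuracy / (precision + accuracy)
        else:
            f1 = 0.0
        scores[cls] = {"accuracy": accuracy, "f1": f1}
    return scores


# Toy example (illustrative only).
gold = ["FAC", "FAC", "ANALYSIS", "ANALYSIS", "RPC"]
predicted = ["FAC", "ANALYSIS", "ANALYSIS", "ANALYSIS", "FAC"]
classes = ["FAC", "ANALYSIS", "RPC"]

scores = per_class_scores(gold, predicted, classes)
average_f1 = sum(s["f1"] for s in scores.values()) / len(classes)
print(scores)
print("Average F1:", average_f1)
```

A class that is never predicted correctly, like RPC in this toy example, gets 0 for both values, which matches the 0. entries seen in the sample output.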

5. Generating the pre-trained embeddings

We used two different pre-trained models in this project:

BERT ('bert-base-uncased')
LegalBERT ('nlpaueb/legal-bert-base-uncased')

Training these models on this data is handled in the file 'emb_label_generation.py'. To run it, execute the file:

legaleval-subtask-a-main % python emb_label_generation.py

Appendix

Rhetorical roles (RRs):

Preamble (PREAMBLE)
Facts (FAC)
Ruling by Lower Court (RLC)
Issues (ISSUE)
Argument by Petitioner (ARG_PETITIONER)
Argument by Respondent (ARG_RESPONDENT)
Analysis (ANALYSIS)
Statute (STA)
Precedent Relied (PRE_RELIED)
Precedent Not Relied (PRE_NOT_RELIED)
Ratio of the decision (RATIO)
Ruling by Present Court (RPC)
NONE

Kshitij301199/Rhetorical-Role-Prediction-in-Legal-Documents
