Bachelor Thesis

This repository contains the code for my Bachelor Thesis, "Analysis of Contract Lifecycles Based on Multiclass Classification Using Natural Language Processing and Machine Learning". In this NLP project, real-world contract documents are classified into types such as attachment, amendment, and agreement, using several techniques for feature extraction and selection, dimensionality reduction, and machine learning and deep learning. The predictions of the different classifiers are then used in a semi-supervised fashion (self-training) to label the unlabeled documents. Finally, based on the classification, the contract lifecycles, each consisting of documents of different types, are visualized in a Sankey diagram.
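The self-training step described above can be illustrated with a minimal sketch. It is not the code from the thesis: the toy word-count classifier, the confidence measure, the `threshold` value, the function names, and the example documents are all assumptions chosen to keep the example dependency-free. The idea it shows is the general one: train on the labeled documents, label the unlabeled documents the classifier is confident about, add them to the training set, and repeat.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real feature extraction)."""
    return text.lower().split()


def train(labeled):
    """Toy classifier: per-label token counts built from (text, label) pairs."""
    counts = defaultdict(Counter)
    for text, label in labeled:
        counts[label].update(tokenize(text))
    return counts


def predict(model, text):
    """Return (label, confidence); confidence is the best label's share of all scores."""
    tokens = tokenize(text)
    scores = {}
    for label, counter in model.items():
        total = sum(counter.values()) or 1  # guard against empty training texts
        scores[label] = sum(counter[t] for t in tokens) / total
    score_sum = sum(scores.values())
    if score_sum == 0:
        return None, 0.0  # no known token: the classifier cannot decide
    best = max(scores, key=scores.get)
    return best, scores[best] / score_sum


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly move confidently predicted documents into the labeled set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, rest = [], []
        for text in pool:
            label, confidence = predict(model, text)
            if confidence >= threshold:
                confident.append((text, label))
            else:
                rest.append(text)
        if not confident:
            break  # nothing new was labeled, so further rounds change nothing
        labeled.extend(confident)
        pool = rest
    return labeled, pool


# Hypothetical example documents, not taken from the thesis data set.
seed = [
    ("amendment to the master agreement changes", "amendment"),
    ("attachment price list appendix", "attachment"),
]
unlabeled_docs = ["second amendment changes", "appendix price list", "unrelated text"]

labeled_out, remaining = self_train(seed, unlabeled_docs)
```

In this sketch, documents that never reach the confidence threshold stay in `remaining` instead of receiving a forced label. A real pipeline would replace the toy classifier with a trained model; scikit-learn, for instance, provides `sklearn.semi_supervised.SelfTrainingClassifier` as a wrapper around any estimator that exposes `predict_proba`.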