A tool that scans legal/financial contracts and flags high-risk clauses (indemnification, liability caps, warranty terms, etc.), helping businesses quickly assess potential exposure.
This project focuses on detecting the following high-risk contract clauses:
- Indemnification – Obligations to cover losses or damages.
- Limitation of Liability – Caps on damages a party must pay.
- Warranty / Defects – Promises about product/service quality.
- Termination Clauses – Conditions under which the contract ends.
- Confidentiality / Trade Secrets – Restrictions on sharing information.
This project uses NLP (starting with regex, later with spaCy/Legal-BERT) to identify high-risk clauses in contracts, such as indemnification, limitation of liability, warranty, termination, and confidentiality.
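The regex-first approach can be pictured with a minimal sketch. The patterns, the `detect_clauses` function name, and the blank-line paragraph splitting below are assumptions made for illustration. They are not the project's actual rules, which live in its pattern file.

```python
import re

# Hypothetical patterns for illustration only; the real ones live in the pattern file.
PATTERNS = {
    "indemnification": r"\bindemnif(?:y|ies|ied|ication)\b|\bhold harmless\b",
    "limitation of liability": r"\blimitation of liability\b|\bliability shall not exceed\b",
    "warranty": r"\bwarrant(?:y|ies|s)\b",
    "termination": r"\bterminat(?:e|es|ed|ion)\b",
    "confidentiality": r"\bconfidential(?:ity)?\b|\btrade secrets?\b",
}

# Compile once, case-insensitively, so each paragraph scan is cheap.
COMPILED = {name: re.compile(p, re.IGNORECASE) for name, p in PATTERNS.items()}


def detect_clauses(text, snippet_len=80):
    """Return one hit per (paragraph, clause) pair where a pattern matches."""
    hits = []
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for number, paragraph in enumerate(paragraphs, start=1):
        for clause, pattern in COMPILED.items():
            if pattern.search(paragraph):
                hits.append(
                    {
                        "clause": clause,
                        "paragraph": number,
                        "snippet": paragraph[:snippet_len],
                    }
                )
    return hits


sample = (
    "This Agreement is made between the parties.\n\n"
    "The parties agree to keep confidential all information disclosed.\n\n"
    "Liability shall not exceed fees paid in the preceding twelve months."
)

for hit in detect_clauses(sample):
    print(hit)
```

Keyword patterns like these are easy to audit but will miss paraphrased clauses, which is the gap the later spaCy/Legal-BERT stage is meant to close.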
You can run this project directly in Google Colab.
- contracts/ → sample agreements
- scripts/target_clauses.json → list of clauses & regex patterns
- scripts/detect_clauses_regex.py → main detector script
- outputs/ → reports (CSV)
- notebooks/demo.ipynb → interactive demo notebook
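As a rough illustration of how a detector script might consume a pattern file, the sketch below assumes a JSON object that maps each clause name to a list of regex strings. That layout, the example patterns, and the `compile_patterns` and `load_patterns` helpers are hypothetical. The actual schema of `scripts/target_clauses.json` may differ.

```python
import json
import re
from pathlib import Path


def compile_patterns(raw):
    """Compile {clause: [regex, ...]} into {clause: [compiled regex, ...]}."""
    compiled = {}
    for clause, patterns in raw.items():
        compiled[clause] = [re.compile(p, re.IGNORECASE) for p in patterns]
    return compiled


def load_patterns(path):
    """Read a JSON pattern file from disk and compile it."""
    return compile_patterns(json.loads(Path(path).read_text(encoding="utf-8")))


# Assumed layout, not the project's actual file contents.
# Backslashes are doubled because JSON requires escaping them.
EXAMPLE_JSON = r"""
{
  "indemnification": ["\\bindemnif\\w*", "\\bhold harmless\\b"],
  "termination": ["\\bterminat\\w*"]
}
"""

patterns = compile_patterns(json.loads(EXAMPLE_JSON))
print(sorted(patterns))
print(bool(patterns["indemnification"][0].search("Supplier shall indemnify Customer.")))
```

Under this assumed layout, `load_patterns("scripts/target_clauses.json")` would return the same kind of dictionary, so the Python and JavaScript detectors could share one list of clause names.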
| file | clause | paragraph | snippet |
|---|---|---|---|
| NDA.docx | confidentiality | 12 | “The parties agree to keep confidential…” |
| MSA.pdf | limitation of liability | 18 | “Liability shall not exceed fees paid…” |
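A report with these four columns could be produced with Python's standard `csv` module. The sketch below is an assumption about how that step might look. The `write_report` helper is hypothetical, and the rows simply reuse the sample values from the table above.

```python
import csv
import io

# Column order taken from the sample report table.
FIELDS = ["file", "clause", "paragraph", "snippet"]


def write_report(hits, stream):
    """Write detection hits as CSV rows to any text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(hits)


hits = [
    {
        "file": "NDA.docx",
        "clause": "confidentiality",
        "paragraph": 12,
        "snippet": "The parties agree to keep confidential…",
    },
    {
        "file": "MSA.pdf",
        "clause": "limitation of liability",
        "paragraph": 18,
        "snippet": "Liability shall not exceed fees paid…",
    },
]

# An in-memory buffer keeps the example free of filesystem side effects.
buffer = io.StringIO()
write_report(hits, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `newline=""` and `encoding="utf-8"`, which is what the `csv` module documentation recommends to avoid doubled line endings on Windows.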
The project now includes a full JavaScript + TypeScript implementation for clause detection.
- Browser-based scanner
- Node-compatible script
- Shared pattern file (clauses.json)
- Upload → Scan → Highlight high-risk clauses
/js
├── src
│   ├── detectClauses.js
│   ├── clauses.json
│   └── detectClauses.ts
└── demo
    ├── index.html
    ├── styles.css
    └── app.js
- Download js/demo/index.html
- Upload a .txt, .docx, or .pdf file (must be text-extractable)
- Click Scan
- High-risk clauses will be highlighted automatically
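The highlighting step can be pictured as wrapping each match in a `<mark>` element while escaping the rest of the text. The browser demo itself is JavaScript. The sketch below expresses the same idea in Python for consistency with the other examples, and its combined pattern and `highlight` function are hypothetical, not taken from the demo code.

```python
import html
import re

# Hypothetical combined pattern for illustration only.
RISK_PATTERN = re.compile(
    r"indemnif\w*|liability shall not exceed|confidential\w*",
    re.IGNORECASE,
)


def highlight(text):
    """Return HTML with each match wrapped in <mark>, escaping everything else."""
    parts = []
    last = 0
    for match in RISK_PATTERN.finditer(text):
        # Escape the untouched text between the previous match and this one.
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        last = match.end()
    # Escape whatever follows the final match.
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(highlight("Supplier shall indemnify Customer & keep data confidential."))
```

Escaping before inserting markup matters because uploaded contract text may contain characters such as `<` or `&` that would otherwise be interpreted as HTML.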