Merge branch 'main' into vector-database
milistu authored May 27, 2024
2 parents 2989a18 + e53e3be commit e0cbd6c
Showing 10 changed files with 218 additions and 290 deletions.
34 changes: 24 additions & 10 deletions .github/workflows/tests.yaml
@@ -6,7 +6,7 @@ on:
- main

jobs:
changed_files:
test_router:
runs-on: ubuntu-latest # windows-latest || macos-latest
name: Test Router
steps:
@@ -19,28 +19,42 @@ jobs:
with:
files: router/**
# files_ignore: docs/static.js

- name: Install Dependencies
if: steps.changed-files-router.outputs.any_changed == 'true'
run: pip install -r requirements.txt

- name: Run step if any file(s) in the router folder change
- name: Run Tests if any file(s) in the router folder change
if: steps.changed-files-router.outputs.any_changed == 'true'
env:
ALL_CHANGED_FILES: ${{ steps.changed-files-router.outputs.all_changed_files }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
run: |
pip install -r requirements.txt
python -m unittest tests/test_router.py
LANGFUSE_SECRET_KEY: ${{ secrets.LANGFUSE_SECRET_KEY }}
LANGFUSE_PUBLIC_KEY: ${{ secrets.LANGFUSE_PUBLIC_KEY }}
LANGFUSE_HOST: ${{ secrets.LANGFUSE_HOST }}
run: python -m unittest tests/test_router.py

test_database:
runs-on: ubuntu-latest # windows-latest || macos-latest
name: Test Database
steps:
- uses: actions/checkout@v4
# Test Database
- name: Get changed files in the database folder
id: changed-files-database
uses: tj-actions/changed-files@v44
with:
files: database/**

- name: Install Dependencies
if: steps.changed-files-database.outputs.any_changed == 'true'
run: pip install -r requirements.txt

- name: Run step if any file(s) in the database folder change
- name: Run Tests if any file(s) in the database folder change
if: steps.changed-files-database.outputs.any_changed == 'true'
env:
ALL_CHANGED_FILES: ${{ steps.changed-files-database.outputs.all_changed_files }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
run: |
pip install -r requirements.txt
python -m unittest tests/test_database.py
QDRANT_API_KEY: ${{ secrets.QDRANT_API_KEY }}
QDRANT_CLUSTER_URL: ${{ secrets.QDRANT_CLUSTER_URL }}
run: python -m unittest tests/test_database.py
8 changes: 7 additions & 1 deletion README.md
@@ -1,4 +1,10 @@
# Legal ChatBot Documentation
# Legal ChatBot 👩‍⚖️

Legal ChatBot helps users navigate legal documents.

It combines RAG (Retrieval-Augmented Generation) with a knowledge base of law articles, so it can reference relevant legal texts during a conversation. Users query legal information interactively, which makes it useful for professionals, students, and anyone who needs quick insight into legal matters.

Setup uses **Poetry** for dependency management, **Qdrant** as the vector database, and **Langfuse** for tracing and monitoring the chatbot's LLM calls.

## Setting Up the Project

3 changes: 2 additions & 1 deletion app.py
@@ -56,6 +56,7 @@ def response_generator(query: str):
# Route query
collections = semantic_query_router(
client=openai_client,
model=config["openai"]["gpt_model"]["router"],
query=query,
prompt=ROUTER_PROMPT,
temperature=config["openai"]["gpt_model"]["temperature"],
@@ -82,7 +83,7 @@ def response_generator(query: str):

stream = get_answer(
client=openai_client,
model=config["openai"]["gpt_model"]["name"],
model=config["openai"]["gpt_model"]["llm"],
temperature=config["openai"]["gpt_model"]["temperature"],
messages=get_messages(
context=context, query=query, conversation=st.session_state.messages
2 changes: 1 addition & 1 deletion chat-dev.ipynb
@@ -225,7 +225,7 @@
"outputs": [],
"source": [
"response = openai_client.chat.completions.create(\n",
" model=config[\"openai\"][\"gpt_model\"][\"name_light\"],\n",
" model=config[\"openai\"][\"gpt_model\"][\"router\"],\n",
" temperature=config[\"openai\"][\"gpt_model\"][\"temperature\"],\n",
" messages=messages,\n",
")"
1 change: 1 addition & 0 deletions llm/prompts.py
@@ -17,6 +17,7 @@
Your task is to identify the client's needs and, based on them, provide the most relevant information.
When you provide answers or advice, state exactly which legal article the information comes from, and always provide a link to that article so the client can read further.
The goal is for communication to be efficient and for the client to feel they are in good hands.
The user may ask a question in any language, and your task is to answer in the same language as the user's question.
Response format:
- Under the heading **Summary**, first answer the client's question briefly and directly, using lay terms without complex legal terminology.
6 changes: 4 additions & 2 deletions router/router_prompt.py
@@ -11,15 +11,17 @@
- The Consumer Protection Law (Zakon o zaštiti potrošača) ensures that consumers in Serbia have the right to safe, good-quality products and services. It sets out traders' obligations to properly inform consumers about products, services, prices, and the right to file a complaint. It also covers consumers' right to withdraw from a purchase within a set period and their rights when a product is defective.
- porodicni_zakon
- The Family Law (Porodični zakon) regulates legal relations within the family, including marriage, parenthood, guardianship, foster care, and adoption. It defines the rights and obligations of spouses, as well as children's rights and parental responsibilities. It also covers inheritance and alimony.
- nema_zakona
- The user's question does not match any law.
**RESPONSE FORMAT:**
- Return the response in JSON format.
- The response must contain only the JSON output, with nothing added.
- The response must be a string that can be loaded with json.loads().
- Law names may only be the following: zakon_o_radu, zakon_o_porezu_na_dohodak_gradjana, zakon_o_zastiti_podataka_o_licnosti, zakon_o_zastiti_potrosaca, porodicni_zakon.
- Law names may only be the following: zakon_o_radu, zakon_o_porezu_na_dohodak_gradjana, zakon_o_zastiti_podataka_o_licnosti, zakon_o_zastiti_potrosaca, porodicni_zakon, nema_zakona.
- A single user question may relate to more than one law.
- If you think a law matches the user's question but are not 100% sure, include it in the response anyway.
- If the user's question does not match any law, return the generic string: "nema_zakona".
- If the user's question does not match any law, return a list containing the generic string: ["nema_zakona"].
- Always return laws as a list of strings, whether there is one or several.
- Example JSON response:
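
The response-format rules in the router prompt above (a `json.loads()`-compatible string, a fixed set of law names, and a `["nema_zakona"]` fallback) can also be enforced on the caller side. The following is a minimal sketch, not code from this commit: `parse_router_response` is a hypothetical helper, and because the prompt's JSON example is not shown in this hunk, the sketch assumes the reply is a bare JSON array of strings.

```python
import json

# Collection names copied from the allowed list in the router prompt.
ALLOWED_LAWS = {
    "zakon_o_radu",
    "zakon_o_porezu_na_dohodak_gradjana",
    "zakon_o_zastiti_podataka_o_licnosti",
    "zakon_o_zastiti_potrosaca",
    "porodicni_zakon",
    "nema_zakona",
}
FALLBACK = ["nema_zakona"]


def parse_router_response(raw: str) -> list[str]:
    """Hypothetical helper: extract valid law names from a router reply.

    Assumes the reply is a bare JSON array of strings; anything else
    (invalid JSON, a non-list value, unknown names) yields the fallback.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return list(FALLBACK)
    if not isinstance(data, list):
        return list(FALLBACK)
    # Keep only names the prompt explicitly allows, preserving order.
    laws = [name for name in data if isinstance(name, str) and name in ALLOWED_LAWS]
    return laws or list(FALLBACK)
```

Falling back to `["nema_zakona"]` instead of raising keeps the calling code on a single path: it always receives a non-empty list of known names, which matches the list-only contract this commit adds to the prompt.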
31 changes: 31 additions & 0 deletions scraper/README.md
@@ -0,0 +1,31 @@
# Scraper

This script scrapes law articles from a single URL or a list of URLs and saves them as JSON files.

## Usage

To run the script, use the following command:

```bash
python scraper/scraper.py --file scraper/urls.txt --output-dir laws_test
```

## Arguments
- `--url`: A single URL to scrape.
- `--file`: Path to a text file containing URLs separated by newlines.
- `--output-dir`: Directory to save the JSON files (default: `scraper/laws`).
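
The three arguments above could be declared with `argparse` roughly as follows. This is a sketch, not the actual contents of `scraper/scraper.py`: the `build_parser` name is hypothetical, and treating `--url` and `--file` as mutually exclusive (with one of them required) is an assumption that this README does not state.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments documented above."""
    parser = argparse.ArgumentParser(
        description="Scrape law articles and save them as JSON files."
    )
    # Assumption: exactly one URL source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--url", help="A single URL to scrape.")
    source.add_argument(
        "--file", help="Path to a text file containing URLs separated by newlines."
    )
    parser.add_argument(
        "--output-dir",
        default="scraper/laws",
        help="Directory to save the JSON files.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.url, args.file, args.output_dir)
```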

## Example
To scrape law articles from a single URL (example: Serbian Labor Law) and save the output in the `scraper/laws` directory:
```bash
python scraper/scraper.py --url "https://www.paragraf.rs/propisi/zakon_o_radu.html" --output-dir scraper/laws
```

To scrape law articles from the list of URLs in `scraper/urls.txt` and save the output in the `scraper/laws` directory:
```bash
python scraper/scraper.py --file scraper/urls.txt --output-dir scraper/laws
```
> ⚠️ _**Note**: Ensure you are in the root directory of the project before running the script._

## Output
The output JSON files will be saved in the specified output directory, with each file named after the corresponding URL's stem.
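
As an illustration of the naming rule, the sketch below derives an output path from a URL's stem. The `output_path_for` function is hypothetical and the `.json` extension is an assumption; the script may build the name differently.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def output_path_for(url: str, output_dir: str = "scraper/laws") -> Path:
    """Hypothetical helper: map a URL to <output_dir>/<url stem>.json."""
    # The stem is the last path segment without its extension,
    # e.g. "zakon_o_radu" for ".../zakon_o_radu.html".
    stem = PurePosixPath(urlparse(url).path).stem
    return Path(output_dir) / f"{stem}.json"


path = output_path_for("https://www.paragraf.rs/propisi/zakon_o_radu.html")
print(path.as_posix())  # scraper/laws/zakon_o_radu.json
```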