Commit 1e13cf3

🤖 Bump v2024.10.28
1 parent 25449ef commit 1e13cf3

3 files changed: 97 additions and 7 deletions

README.md

Lines changed: 6 additions & 6 deletions
@@ -2,7 +2,7 @@
 
 [![Dataset on HF](https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-md-dark.svg)](https://huggingface.co/datasets/antoinejeannot/jurisprudence) [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/antoinejeannot/jurisprudence)
 
-# ✨ Jurisprudence, release v2024.10.25 🏛️
+# ✨ Jurisprudence, release v2024.10.28 🏛️
 
 Jurisprudence is an open-source project that automates the collection and distribution of French legal decisions. It leverages the Judilibre API provided by the Cour de Cassation to:

@@ -17,12 +17,12 @@ Whether you're conducting legal research, developing AI models, or simply intere
 
 | Jurisdiction | Jurisprudences | Oldest | Latest | Tokens | JSONL (gzipped) | Parquet |
 |--------------|----------------|--------|--------|--------|-----------------|---------|
-| Cour d'Appel | 395,224 | 1996-03-25 | 2024-10-18 | 1,977,104,348 | [Download (1.73 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_d_appel.jsonl.gz?download=true) | [Download (2.89 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_d_appel.parquet?download=true) |
-| Tribunal Judiciaire | 79,721 | 2023-12-14 | 2024-10-17 | 283,788,133 | [Download (256.39 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/tribunal_judiciaire.jsonl.gz?download=true) | [Download (425.64 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/tribunal_judiciaire.parquet?download=true) |
-| Cour de Cassation | 537,065 | 1860-08-01 | 2024-10-24 | 1,107,898,877 | [Download (932.22 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_de_cassation.jsonl.gz?download=true) | [Download (1.58 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_de_cassation.parquet?download=true) |
-| **Total** | **1,012,010** | **1860-08-01** | **2024-10-24** | **3,368,791,358** | **2.89 GB** | **4.88 GB** |
+| Cour d'Appel | 396,317 | 1996-03-25 | 2024-10-22 | 1,981,675,335 | [Download (1.74 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_d_appel.jsonl.gz?download=true) | [Download (2.90 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_d_appel.parquet?download=true) |
+| Tribunal Judiciaire | 82,085 | 2023-12-14 | 2024-10-22 | 291,028,506 | [Download (263.20 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/tribunal_judiciaire.jsonl.gz?download=true) | [Download (436.65 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/tribunal_judiciaire.parquet?download=true) |
+| Cour de Cassation | 537,252 | 1860-08-01 | 2024-10-24 | 1,107,801,271 | [Download (932.25 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_de_cassation.jsonl.gz?download=true) | [Download (1.58 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_de_cassation.parquet?download=true) |
+| **Total** | **1,015,654** | **1860-08-01** | **2024-10-24** | **3,380,505,112** | **2.90 GB** | **4.90 GB** |
 
-<i>Latest update date: 2024-10-25</i>
+<i>Latest update date: 2024-10-28</i>
 
 <i># Tokens are computed using GPT-4 tiktoken and the `text` column.</i>

jurisprudence/settings.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-JURISPRUDENCE_LAST_EXPORT_DATETIME = "2024-10-25 01:05:25"
+JURISPRUDENCE_LAST_EXPORT_DATETIME = "2024-10-28 01:05:49"

release_notes/v2024.10.28.md

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
<p align="center"><img src="https://raw.githubusercontent.com/antoinejeannot/jurisprudence/artefacts/jurisprudence.svg" width=650></p>

[![Dataset on HF](https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-md-dark.svg)](https://huggingface.co/datasets/antoinejeannot/jurisprudence) [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/antoinejeannot/jurisprudence)

# ✨ Jurisprudence, release v2024.10.28 🏛️

Jurisprudence is an open-source project that automates the collection and distribution of French legal decisions. It leverages the Judilibre API provided by the Cour de Cassation to:

- Fetch rulings from major French courts (Cour de Cassation, Cour d'Appel, Tribunal Judiciaire)
- Process and convert the data into easily accessible formats
- Publish & version updated datasets on Hugging Face every few days.

It aims to democratize access to legal information, enabling researchers, legal professionals and the public to easily access and analyze French court decisions.
Whether you're conducting legal research, developing AI models, or simply interested in French jurisprudence, this project might provide a valuable, open resource for exploring the French legal landscape.

## 📊 Exported Data

| Jurisdiction | Jurisprudences | Oldest | Latest | Tokens | JSONL (gzipped) | Parquet |
|--------------|----------------|--------|--------|--------|-----------------|---------|
| Cour d'Appel | 396,317 | 1996-03-25 | 2024-10-22 | 1,981,675,335 | [Download (1.74 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_d_appel.jsonl.gz?download=true) | [Download (2.90 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_d_appel.parquet?download=true) |
| Tribunal Judiciaire | 82,085 | 2023-12-14 | 2024-10-22 | 291,028,506 | [Download (263.20 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/tribunal_judiciaire.jsonl.gz?download=true) | [Download (436.65 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/tribunal_judiciaire.parquet?download=true) |
| Cour de Cassation | 537,252 | 1860-08-01 | 2024-10-24 | 1,107,801,271 | [Download (932.25 MB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_de_cassation.jsonl.gz?download=true) | [Download (1.58 GB)](https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_de_cassation.parquet?download=true) |
| **Total** | **1,015,654** | **1860-08-01** | **2024-10-24** | **3,380,505,112** | **2.90 GB** | **4.90 GB** |

<i>Latest update date: 2024-10-28</i>

<i># Tokens are computed using GPT-4 tiktoken and the `text` column.</i>

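The token totals in the table can be approximated with a small helper like the one below. This is a minimal sketch, not the project's actual counting code: the function name `count_tokens` and its `encode` parameter are hypothetical, and only the `text` column name comes from the note above. The tokenizer is passed in as a callable so the helper itself has no third-party dependency.

```python
from typing import Callable, Iterable, Mapping, Sequence


def count_tokens(
    rows: Iterable[Mapping[str, object]],
    encode: Callable[[str], Sequence[object]],
    column: str = "text",
) -> int:
    """Sum the tokenized length of `column` over all rows.

    Rows where the column is missing or not a string are skipped.
    `encode` is any tokenizer function mapping a string to a sequence of tokens.
    """
    total = 0
    for row in rows:
        value = row.get(column)
        if isinstance(value, str):
            total += len(encode(value))
    return total
```

With `tiktoken` installed, `tiktoken.encoding_for_model("gpt-4").encode` can be passed as `encode`. Whether that reproduces the published figures exactly is an assumption, since the release notes do not show the counting code.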
## 🤗 Hugging Face Dataset

The up-to-date jurisprudences dataset is available at: https://huggingface.co/datasets/antoinejeannot/jurisprudence in JSONL (gzipped) and Parquet formats.

This allows you to easily fetch, query, process and index all jurisprudences in the blink of an eye!

### Usage Examples
#### Hugging Face Datasets
```python
# pip install datasets
from datasets import load_dataset

dataset = load_dataset("antoinejeannot/jurisprudence")
dataset.shape
# example output (row counts grow with each release):
# {'tribunal_judiciaire': (58986, 33),
#  'cour_d_appel': (378392, 33),
#  'cour_de_cassation': (534258, 33)}

# alternatively, you can load each jurisdiction separately
cour_d_appel = load_dataset("antoinejeannot/jurisprudence", "cour_d_appel")
tribunal_judiciaire = load_dataset("antoinejeannot/jurisprudence", "tribunal_judiciaire")
cour_de_cassation = load_dataset("antoinejeannot/jurisprudence", "cour_de_cassation")
```

Using `datasets` makes it easy to feed the data into [PyTorch](https://huggingface.co/docs/datasets/use_with_pytorch), [TensorFlow](https://huggingface.co/docs/datasets/use_with_tensorflow), [JAX](https://huggingface.co/docs/datasets/use_with_jax), etc.

#### BYOL: Bring Your Own Lib
For analysis, you can also read the Parquet files directly with polars, pandas or duckdb:
```python
url = "https://huggingface.co/datasets/antoinejeannot/jurisprudence/resolve/main/cour_de_cassation.parquet"  # or tribunal_judiciaire.parquet, cour_d_appel.parquet

# pip install polars
import polars as pl
lf = pl.scan_parquet(url)  # lazy: nothing is read until you call lf.collect()

# pip install pandas pyarrow
import pandas as pd
df = pd.read_parquet(url)  # downloads the whole file and loads it into memory

# pip install duckdb
import duckdb
table = duckdb.read_parquet(url)
```
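
The gzipped JSONL export can also be read with the Python standard library alone, which helps when none of the libraries above is available. The sketch below assumes the file has already been downloaded locally and that each non-empty line holds one JSON object; the helper name `iter_jsonl_gz` is hypothetical.

```python
import gzip
import json
from typing import Iterator


def iter_jsonl_gz(path: str) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line of a gzipped JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

Because the function is a generator, a loop such as `for ruling in iter_jsonl_gz("cour_de_cassation.jsonl.gz")` streams the file record by record instead of loading it all into memory.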

## 🪪 Citing & Authors

If you use this code in your research, please cite it using the following BibTeX entry:
```bibtex
@misc{antoinejeannot2024,
  author = {Jeannot, Antoine and {Cour de Cassation}},
  title = {Jurisprudence},
  year = {2024},
  howpublished = {\url{https://github.com/antoinejeannot/jurisprudence}},
  note = {Data source: API Judilibre, \url{https://www.data.gouv.fr/en/datasets/api-judilibre/}}
}
```

This project relies on the [Judilibre API by the Cour de Cassation](https://www.data.gouv.fr/en/datasets/api-judilibre/), which is made available under the Open License 2.0 (Licence Ouverte 2.0).

It scans the API every 3 days at midnight UTC and exports the data to Hugging Face in several formats, with no transformation other than format conversion.
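
The 3-day cadence can be checked against the two `JURISPRUDENCE_LAST_EXPORT_DATETIME` values changed in this commit. The snippet below is only an illustration: the helper name `export_gap` is hypothetical, and the timestamps are parsed as naive datetimes because the settings file does not state a timezone.

```python
from datetime import datetime, timedelta

# Format of the JURISPRUDENCE_LAST_EXPORT_DATETIME strings in settings.py.
EXPORT_FORMAT = "%Y-%m-%d %H:%M:%S"


def export_gap(previous: str, current: str) -> timedelta:
    """Return the time elapsed between two export timestamps."""
    return datetime.strptime(current, EXPORT_FORMAT) - datetime.strptime(
        previous, EXPORT_FORMAT
    )


gap = export_gap("2024-10-25 01:05:25", "2024-10-28 01:05:49")
print(gap.days, gap.seconds)  # 3 days and 24 seconds between the two exports
```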

<p align="center"><a href="https://www.etalab.gouv.fr/licence-ouverte-open-licence/"><img src="https://raw.githubusercontent.com/antoinejeannot/jurisprudence/artefacts/license.png" width=50 alt="license ouverte / open license"></a></p>
