---
title: "LeTourDataSet"
---
# LeTourDataSet
![Distance and winner average pace](https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/TDF_Distance_And_Pace.png)
## TL;DR
If you use `pandas`, load the rider data directly from GitHub:
```python
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/TDF_Riders_History.csv")
```
If you use R instead of Python, you can run:
```R
library(readr)
df <- read_csv("https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/TDF_Riders_History.csv")
```
## Disclaimer
Known problems with this data set are tracked in the [Issues tab](https://github.com/camminady/LeTourDataSet/issues). Some entries are incorrect. So far, the errors appear to originate from wrong data on the letour.fr website itself. In hindsight, I should probably have scraped a different website.
## Data
Every cyclist of the Tour de France is listed in a single CSV file, `data/TDF_Riders_History.csv`.
Data on every stage is in `data/TDF_Stages_History.csv`.
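Both files can be loaded the same way as in the TL;DR. The following is a minimal sketch, not part of the repository: `raw_url` and `load_tables` are hypothetical helpers, and the stages URL is an assumption that simply mirrors the raw-file path used for the riders file above.

```python
# Base of the raw-file URL shown in the TL;DR section.
RAW_BASE = "https://raw.githubusercontent.com/camminady/LeTourDataSet/master/"


def raw_url(relative_path: str) -> str:
    """Build the raw GitHub URL for a path inside the repository (hypothetical helper)."""
    return RAW_BASE + relative_path.lstrip("/")


def load_tables():
    """Download the riders and stages tables and return them as two DataFrames.

    Assumes the stages file is served from the same raw-file base as the riders file.
    """
    # Imported here so that raw_url() can be used without pandas installed.
    import pandas as pd

    riders = pd.read_csv(raw_url("data/TDF_Riders_History.csv"))
    stages = pd.read_csv(raw_url("data/TDF_Stages_History.csv"))
    return riders, stages
```

After `riders, stages = load_tables()`, inspecting `riders.columns` and `stages.columns` is a reasonable first step, since the column names are not listed here.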
## How to run
In your shell, run these commands:
```bash
poetry install # install the environment
poetry run python letourdataset/Downloader.py # download the data
```
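To confirm that the run produced data, a small check like the one below can help. It is only a sketch: `missing_outputs` is a hypothetical helper, and it assumes the downloader writes the two CSV files named in the Data section into `data/` under the repository root.

```python
from pathlib import Path

# Files the Data section names; that the downloader writes them here is an assumption.
EXPECTED_FILES = ("data/TDF_Riders_History.csv", "data/TDF_Stages_History.csv")


def missing_outputs(repo_root="."):
    """Return the expected CSV paths that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_FILES:
        path = root / rel
        # Treat a zero-byte file as missing, since it holds no data.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing
```

Calling `missing_outputs()` from the repository root returns an empty list when both files are present and non-empty.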
## Legacy code
This code has been completely rewritten. The previous code, including its output, is in the [legacy repository](https://github.com/camminady/LeTourDataSetLegacy). In particular, read `legacy/README.txt`.