Stockpiles all Tagesschau articles using the Tagesschau API. The data is meant to be analyzed later.

emilianscheel/Tagesschau-data-fetching

Tagesschau-data-fetching

Useful snippets

# gets json length from input file
jq length database.json
# gets number of files in current dir (recursively)
find . -type f | wc -l
# gets size of current dir
du -hs
# gets last modified of file
stat database.json
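
The same figures can also be collected from Python, which may be convenient inside an analysis notebook. The following is a minimal sketch, assuming database.json holds a JSON array or object at its top level; the helper name `describe_database` is hypothetical and not part of the repository.

```python
import json
import os
from datetime import datetime


def describe_database(path="database.json"):
    """Return entry count, size in bytes, and last-modified time of a JSON file."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    info = os.stat(path)
    return {
        "entries": len(data),  # matches `jq length` for arrays and objects
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime),
    }


if __name__ == "__main__":
    # Only report when the file exists in the current directory.
    if os.path.exists("database.json"):
        print(describe_database())
```

Unlike `du -hs`, this reports the size of the single file only, not of the whole directory.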

Depends on

  • python3
  • urllib3
  • json
  • os
  • datetime
  • glob
  • BeautifulSoup
  • re

for analytics

  • numpy
  • pandas
  • matplotlib
  • seaborn
  • networkx
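
Of the modules listed above, json, os, datetime, glob and re ship with Python itself, so only the remaining ones need to be installed. A small sketch for checking what is missing on the current interpreter follows. The import names are assumptions (BeautifulSoup is normally imported as `bs4`), and `missing_modules` is a hypothetical helper, not part of the repository.

```python
import importlib.util

# Import names are assumptions: BeautifulSoup is normally imported as `bs4`.
REQUIRED = ["urllib3", "bs4", "json", "os", "datetime", "glob", "re"]
ANALYTICS = ["numpy", "pandas", "matplotlib", "seaborn", "networkx"]


def missing_modules(names):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for label, names in (("fetching", REQUIRED), ("analytics", ANALYTICS)):
        missing = missing_modules(names)
        status = "all found" if not missing else "missing " + ", ".join(missing)
        print(f"{label}: {status}")
```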

Setup (Linux)

mkdir ~/apps
cd ~/apps
git clone https://github.com/emilianscheel/Tagesschau-data-fetching
# Create system service
sudo nano /etc/systemd/system/tagesschau-data-fetching.service
  1. Replace <user> with your username
  2. Paste the configuration below into the file ending in .service
[Unit]
Description=Tagesschau data fetching
After=multi-user.target
Wants=tagesschau-data-fetching.timer

[Service]
Type=oneshot
User=<user>
WorkingDirectory=/home/<user>/apps/Tagesschau-data-fetching/
ExecStart=/usr/bin/python3 main.py

[Install]
WantedBy=multi-user.target
# Create system timer
sudo nano /etc/systemd/system/tagesschau-data-fetching.timer
  1. Paste the configuration below into the file ending in .timer
[Unit]
Description=Fetches Tagesschau.de for data and saves it
Requires=tagesschau-data-fetching.service

[Timer]
Unit=tagesschau-data-fetching.service
OnCalendar=*:0/11

[Install]
WantedBy=timers.target
# enables and starts the service, then shows its status
sudo systemctl enable tagesschau-data-fetching.service
sudo systemctl start tagesschau-data-fetching.service
sudo systemctl status tagesschau-data-fetching.service

# enables and starts the timer, then shows its status
sudo systemctl enable tagesschau-data-fetching.timer
sudo systemctl start tagesschau-data-fetching.timer
sudo systemctl status tagesschau-data-fetching.timer

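This configuration starts the system service at minutes 0, 11, 22, 33, 44 and 55 of every hour, so roughly every eleven minutes (the gap between :55 and the next full hour is only five minutes). The service runs the main.py script, which fetches the Tagesschau API.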
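
To see how the `OnCalendar=*:0/11` expression from the timer unit plays out, the minute field can be expanded by hand. This is an illustrative sketch of the `start/step` rule for the minute field only, not a full parser for systemd calendar expressions; the function names are hypothetical.

```python
def trigger_minutes(start=0, step=11):
    """Minutes of the hour matched by a systemd minute field `start/step`."""
    return list(range(start, 60, step))


def gaps(minutes):
    """Minutes between consecutive triggers, wrapping into the next hour."""
    following = minutes[1:] + [minutes[0] + 60]
    return [later - earlier for earlier, later in zip(minutes, following)]


if __name__ == "__main__":
    minutes = trigger_minutes()
    print(minutes)        # [0, 11, 22, 33, 44, 55]
    print(gaps(minutes))  # [11, 11, 11, 11, 11, 5]
```

The last gap is shorter because the pattern restarts at minute 0 of each hour. A step that divides 60 evenly (for example `*:0/10` or `*:0/12`) would give uniform spacing.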
