A web scraping tool that extracts soccer match schedules from specified websites to update calendars or create .ics files.

caiofrota/web-soccer-match-crawler

Web Soccer Match Crawler

Overview

A web scraping tool that extracts soccer match schedules from placardefutebol.com.br and either creates a .ics calendar file or syncs the matches with your Google Calendar.

Google pre-configuration

  1. Create a Google Cloud Platform account.

  2. Enable the Google Calendar API on Google Cloud Platform.

    Go to “APIs & Services” > “Dashboard”.

    Click “ENABLE APIS AND SERVICES”.

    Type “Google Calendar API” in the search box, select “Google Calendar API”, and then click the “ENABLE” button.

  3. Create a Service Account on Google Cloud Platform. A Service Account is an account for non-human users.

    Go to “APIs & Services” > “Service Accounts”.

    Click “CREATE SERVICE ACCOUNT”.

    Enter a service account name and click the “CREATE” button.

    The remaining fields are optional and can be skipped for a test setup. Click the “CONTINUE” and “DONE” buttons.

  4. Generate a Service Account key.

    On the Service Accounts page, select “Actions” > “Manage keys”.

    Click “ADD KEY” > “Create new key”.

    Select the “JSON” key type and click the “CREATE” button. A save dialog appears; save the key file and keep it in a safe place. The Python script uses this key.

  5. Share your Google Calendar with the Service Account.

    Copy the Service Account email address, then open Google Calendar and go to “Settings and sharing”.

    Click the “Add people” button under “Share with specific people”.

    Enter the Service Account email address and click the “Send” button.

Installation

This example uses Python 3.10.12 and pip 22.0.2.

  1. Install the required libraries:

    pip install bs4 lxml google-api-python-client google-auth

  2. Change the constants SOURCE, PRODID, CALNAME, CALDESC and TIMEZONE.

    SOURCE is the web page you want to scrape (search for a team or a league on placardefutebol.com.br)

    PRODID is the ics calendar id (free text)

    CALNAME is the ics calendar name (free text)

    CALDESC is the ics calendar description (free text)

    TIMEZONE is the time zone in which the matches should appear in your calendar

  3. Run:

    python3 crawler.py
    python3 crawler.py ics <website>
    python3 crawler.py gcalendar <website> <google-calendar-id>
    
  4. Import the generated ICS file into your web calendar!
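
For reference, the constants from step 2 map onto the header of an iCalendar file roughly as shown here. This is a minimal sketch using only the standard library, not the code in crawler.py: the constant values, the helper names `ics_event` and `ics_calendar`, the UID scheme and the two-hour match duration are all assumptions. `X-WR-CALNAME`, `X-WR-CALDESC` and `X-WR-TIMEZONE` are non-standard properties that many calendar applications recognize.

```python
from datetime import datetime, timedelta, timezone

# Example values only; replace them with your own.
PRODID = "-//Example//Soccer Matches//EN"
CALNAME = "My Team"
CALDESC = "Match schedule for my team"
TIMEZONE = "America/Sao_Paulo"

FMT = "%Y%m%dT%H%M%S"  # iCalendar date-time format


def ics_event(home, away, kickoff, stamp=None, duration_hours=2):
    """Return the lines of one VEVENT (hypothetical helper)."""
    stamp = stamp or datetime.now(timezone.utc)
    end = kickoff + timedelta(hours=duration_hours)
    start_text = kickoff.strftime(FMT)
    return [
        "BEGIN:VEVENT",
        f"UID:{start_text}-{home}-{away}@example.invalid",
        f"DTSTAMP:{stamp.strftime(FMT)}Z",
        # Floating local times; X-WR-TIMEZONE hints which zone applies
        # (assumption: the importing application honors it).
        f"DTSTART:{start_text}",
        f"DTEND:{end.strftime(FMT)}",
        f"SUMMARY:{home} x {away}",
        "END:VEVENT",
    ]


def ics_calendar(events):
    """Wrap VEVENT line lists in a VCALENDAR and return the file content."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        f"PRODID:{PRODID}",
        f"X-WR-CALNAME:{CALNAME}",
        f"X-WR-CALDESC:{CALDESC}",
        f"X-WR-TIMEZONE:{TIMEZONE}",
    ]
    for event in events:
        lines.extend(event)
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

To save the result, open the file with `newline=""` so the CRLF line endings are written unchanged, for example `open("matches.ics", "w", newline="").write(ics_calendar([...]))`.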

Support or contact

Contact me at caiofrota@gmail.com with questions and I'll help you sort it out.

Issues

Found a bug or want to request a new feature? Please let us know by submitting an issue.

Contributing

Contributions are welcome! If you have ideas for improvements, bug fixes, or new features, please feel free to submit an issue or pull request.

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

License

Web Soccer Match Crawler is released under the MIT License. Feel free to use, modify, and distribute the application as per the license terms.

Disclaimer

This tool is intended for personal use. Users are responsible for adhering to the terms of service of the websites they scrape.
