caiofrota/pdf-calendar-scraper
PDF Calendar Scraper

Overview

A tool that scrapes a table from a PDF with a specific format and updates Google Calendar with its contents.

Google pre-configuration

  1. Create a Google Cloud Platform account.

  2. Enable the Google Calendar API on Google Cloud Platform.

    Go to “APIs & Services” > “Dashboard”.

    Click “ENABLE APIS AND SERVICES”.

    Type “Google Calendar API” in the search box, select “Google Calendar API”, and enable it by clicking the “ENABLE” button.

  3. Create a Service Account on Google Cloud Platform. A Service Account is an account for non-human users.

    Go to “APIs & Services” > “Service Accounts”.

    Click “CREATE SERVICE ACCOUNT”.

    Enter a service account name and click the “CREATE” button.

    The remaining fields are optional and can be skipped for a simple test setup. Click “CONTINUE” and then “DONE”.

  4. Generate a Service Account key.

    On the Service Account page, select “Actions” > “Manage keys”.

    Click “ADD KEY” > “Create new key”.

    Select the “JSON” key type and click the “CREATE” button. A save dialog appears; save the key file and keep it safe. The Python script uses this key.

  5. Share the Google Calendar with the Service Account.

    Copy the Service Account email address. Then open Google Calendar and go to “Settings and sharing”.

    Click the “Add people” button under “Share with specific people”.

    Enter the Service Account email address and click the “Send” button.
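Once the five steps above are done, it can help to confirm that the key file and the calendar sharing actually work before running the scraper. The following is a minimal sketch, not part of the repository: the function names and the file name `credentials.json` are assumptions made for illustration, and the Google libraries are imported lazily so the email helper works on its own.

```python
import json
from pathlib import Path

# Scope that allows reading and writing calendar events.
SCOPES = ["https://www.googleapis.com/auth/calendar"]


def service_account_email(credentials_file):
    """Return the email address stored in a service account JSON key.

    This is the address to paste into "Share with specific people" in step 5.
    """
    data = json.loads(Path(credentials_file).read_text(encoding="utf-8"))
    return data["client_email"]


def build_calendar_service(credentials_file):
    """Build a Calendar API client from the JSON key saved in step 4."""
    # Imported here so the helper above runs without the Google libraries.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        credentials_file, scopes=SCOPES
    )
    return build("calendar", "v3", credentials=creds)


def check_access(credentials_file, cal_id):
    """Request one event from the calendar and return the calendar title.

    An HTTP error here usually means the API is not enabled (step 2)
    or the calendar is not shared with the service account (step 5).
    """
    service = build_calendar_service(credentials_file)
    response = service.events().list(calendarId=cal_id, maxResults=1).execute()
    return response.get("summary")
```

For example, `service_account_email("credentials.json")` returns the address to share the calendar with, and `check_access("credentials.json", cal_id)` makes a single read request against the calendar whose ID you pass in.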

Installation

This example uses Python 3.10.12 and pip 22.0.2.

  1. Install the required libraries:

    pip install google-api-python-client google-auth pdfplumber

  2. Change the constants PDF_FILE, CREDENTIALS_FILE, CAL_ID and TIMEZONE.

    PDF_FILE is the PDF to be scraped.

    CREDENTIALS_FILE is the JSON key file downloaded in step 4 of the Google pre-configuration.

    CAL_ID is the Google Calendar ID, which you can find in the calendar's “Settings and sharing”.

    TIMEZONE is the time zone in which the events should appear in your calendar.

  3. Run the script:

    python3 scraper.py

  4. See the magic!
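To make the four constants more concrete, here is a hedged sketch of how they might be set and how TIMEZONE could feed into an event body for the Calendar API. All values are placeholders, and `build_event` is a hypothetical helper; the actual scraper.py may structure this differently.

```python
from datetime import datetime

# Placeholder values: replace each one with your own.
PDF_FILE = "schedule.pdf"                              # PDF to be scraped
CREDENTIALS_FILE = "credentials.json"                  # JSON key from step 4
CAL_ID = "your_calendar_id@group.calendar.google.com"  # from "Settings and sharing"
TIMEZONE = "America/Sao_Paulo"                         # IANA time zone name


def build_event(summary, start, end, timezone=TIMEZONE):
    """Build an event body in the shape the Calendar API expects.

    `start` and `end` are naive datetimes; the API interprets them
    in the given time zone because "timeZone" is set explicitly.
    """
    return {
        "summary": summary,
        "start": {"dateTime": start.isoformat(), "timeZone": timezone},
        "end": {"dateTime": end.isoformat(), "timeZone": timezone},
    }


event = build_event(
    "Example entry",
    datetime(2024, 1, 5, 9, 0),
    datetime(2024, 1, 5, 17, 0),
)
```

A body like this can be passed to `service.events().insert(calendarId=CAL_ID, body=event).execute()` on a Calendar API client built with the service account key.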

Example of PDF

image
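As a rough illustration of the scraping step, the sketch below reads the largest table on each page with pdfplumber and turns its rows into dictionaries keyed by the header row. The helper names are hypothetical and the layout is assumed (first row is the header), so this may not match how scraper.py handles the PDF format it targets.

```python
def rows_to_records(table):
    """Convert a table (list of rows) into a list of dicts.

    Assumes the first row holds the column names. Empty cells come
    back from pdfplumber as None, so they are normalised to "".
    """
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        cells = [(cell or "").strip() for cell in row]
        if not any(cells):
            continue  # skip rows that are completely empty
        records.append(dict(zip(header, cells)))
    return records


def extract_records(pdf_file):
    """Collect the records from the largest table on every page of a PDF."""
    # Imported here so rows_to_records can be used without pdfplumber.
    import pdfplumber

    records = []
    with pdfplumber.open(pdf_file) as pdf:
        for page in pdf.pages:
            # extract_table() returns None when the page has no table.
            records.extend(rows_to_records(page.extract_table()))
    return records
```

Each resulting dictionary can then be mapped to a calendar event, with the column names depending on the PDF being scraped.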

Support or contact

Contact me at caiofrota@gmail.com with any questions and I'll help you sort them out.

Issues

Found a bug or want to request a new feature? Please let us know by submitting an issue.

Contributing

Contributions are welcome! If you have ideas for improvements, bug fixes, or new features, please feel free to submit an issue or pull request.

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

License

PDF Calendar Scraper is released under the MIT License. Feel free to use, modify, and distribute the application as per the license terms.

Disclaimer

This tool is intended for personal use. Users are responsible for adhering to the terms of service of the websites they scrape.
