CAU E3 WebCrawler w/ Telegram

Background

  • During school I often missed notices that could have helped me build a stronger career. I thought it would be helpful to have a bot that notifies me when new articles are posted.
  • To meet that need, I built one with Python (bs4) and python-telegram-bot.

Ability

  • It crawls e3home, Polaris CAU, and CAU NOTICE every minute on a server.
  • It compares the crawled titles against the latest saved files.
  • latest.txt contains the title of the article that was the latest as of the previous run.
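
A minimal sketch of the latest.txt check described above. It assumes the file holds a single title and that the crawler returns titles newest first; the function name `find_new_titles` and this file layout are assumptions for illustration, not the repository's actual code.

```python
from pathlib import Path


def find_new_titles(titles, latest_path):
    """Return the titles that are newer than the one stored in latest_path.

    `titles` is assumed to be ordered newest first. After the comparison,
    the newest title is written back so the next run starts from it.
    """
    path = Path(latest_path)
    last_seen = path.read_text(encoding="utf-8").strip() if path.exists() else ""

    new_titles = []
    for title in titles:
        if title == last_seen:
            break  # everything from here on was already seen in a previous run
        new_titles.append(title)

    if titles:
        # Remember the newest title for the next run.
        path.write_text(titles[0], encoding="utf-8")
    return new_titles
```

On the first run (no latest.txt yet) every crawled title counts as new, so a real bot might want to cap how many notifications it sends at once.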

ToDo

  • Add other websites that are helpful for career development.
  • Link with MongoDB to replace the latest.txt files.
    • Code Structure:
      • Save the date and title if new_title is not found in the DB
        • How: save all titles at a fixed reference time.
        • Managing data: delete an entry from the DB 2 days after the article was uploaded.
  • Change MongoDB to PostgreSQL
  • Convert the programming language from Python to C
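
The "save if new, delete after 2 days" plan in the ToDo list could look roughly like the sketch below. It uses the standard-library `sqlite3` module as a stand-in so the example runs without a server; the project itself targets MongoDB and later PostgreSQL. The table name `articles` and the function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def init_db(conn):
    """Create the (hypothetical) articles table, keyed by title."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "title TEXT PRIMARY KEY, uploaded_at TEXT NOT NULL)"
    )


def save_if_new(conn, title, uploaded_at):
    """Insert the article if its title is unknown; return True when it was new."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO articles (title, uploaded_at) VALUES (?, ?)",
        (title, uploaded_at.isoformat(timespec="seconds")),
    )
    conn.commit()
    return cur.rowcount == 1


def purge_old(conn, now, max_age=timedelta(days=2)):
    """Delete articles uploaded more than max_age before `now`; return the count."""
    cutoff = (now - max_age).isoformat(timespec="seconds")
    cur = conn.execute("DELETE FROM articles WHERE uploaded_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

A crawler run would call `save_if_new` for each crawled title, send a Telegram message only when it returns True, and call `purge_old` once per run to keep the table small.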

edited 2023-03-08

  • removed the flag parameter, which was delaying the code
  • changed the parse_mode of telegram-bot to 'MarkdownV2'
  • the bot now automatically fetches an article's body by visiting the article's address on POLARIS CAU Notice
  • defined a function named 'md2' that reformats text taken from HTML so that it fits Markdown grammar
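
One thing a function like md2 has to deal with is that Telegram's MarkdownV2 parse mode rejects messages containing unescaped special characters. The sketch below shows only that escaping step. It is not the repository's md2, and the name `escape_markdown_v2` is hypothetical.

```python
import re

# Characters that Telegram's MarkdownV2 parse mode requires to be escaped
# with a preceding backslash when they appear in ordinary text.
_MDV2_SPECIAL = r"_*[]()~`>#+-=|{}.!"
_MDV2_PATTERN = re.compile("([" + re.escape(_MDV2_SPECIAL) + "])")


def escape_markdown_v2(text):
    """Prefix every MarkdownV2 special character in `text` with a backslash."""
    return _MDV2_PATTERN.sub(r"\\\1", text)
```

For example, `escape_markdown_v2("2023-03-08!")` yields `2023\-03\-08\!`. Text that should be rendered as bold or as a link would need to be escaped piece by piece before the Markdown markers are added around it.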

edited 2023-03-10

  • added MongoDB synchronization.
  • modified md2 function.

edited 2023-03-17

  • Changed MongoDB to PostgreSQL

edited 2023-08-07

  • Separated the try-catch (try/except) blocks so that each website segment has its own
  • Changed req.urlopen → requests.get(url)
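
A rough sketch of what per-website error isolation with `requests.get` might look like. The function names, the `sites` mapping, and the 10-second timeout are assumptions for illustration; the repository's actual structure may differ.

```python
def fetch_with_requests(url):
    """Download a page with requests.get; raises on HTTP errors or timeouts."""
    import requests  # third-party; imported lazily so the sketch loads without it

    response = requests.get(url, timeout=10)  # timeout value is an assumption
    response.raise_for_status()
    return response.text


def crawl_all(sites, fetch=fetch_with_requests):
    """Fetch every site in its own try/except so one failure does not stop the rest.

    `sites` maps a site name to its URL. Returns (pages, errors), where
    `pages` holds the HTML of the sites that worked and `errors` holds
    the error message for each site that failed.
    """
    pages, errors = {}, {}
    for name, url in sites.items():
        try:
            pages[name] = fetch(url)
        except Exception as exc:  # broad on purpose: isolate any per-site failure
            errors[name] = str(exc)
    return pages, errors
```

With this layout, an outage on one site only adds an entry to `errors`, and the other sites are still crawled in the same run.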