J1wanSeo/CAU-E3-WebCrawller--w-telegram

CAU E3 WebCrawller /w telegram

Background

  • During school I often missed notices that could have helped my career. I thought it would be helpful to have a bot that notifies me when new articles are posted.
  • To meet this need, I built one with Python (bs4) and python-telegram-bot.

Ability

  • A server crawls e3home, Polaris CAU, and CAU NOTICE every minute.
  • It compares what it finds against the latest files.
  • latest.txt contains the title of the article that was the latest as of the previous run.
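
The latest.txt check described above could look roughly like the sketch below. The function name `is_new_article` and the one-title-per-file layout are assumptions for illustration, not necessarily how the repository does it.

```python
from pathlib import Path


def is_new_article(title: str, latest_file: Path) -> bool:
    """Return True (and update the file) if `title` differs from the stored one."""
    title = title.strip()
    # A missing file means no article has been recorded yet.
    if latest_file.exists():
        previous = latest_file.read_text(encoding="utf-8").strip()
    else:
        previous = ""
    if title == previous:
        return False
    # Remember the new title so the next run compares against it.
    latest_file.write_text(title, encoding="utf-8")
    return True
```

A crawler loop would call this once per site with the newest title it scraped, and send a Telegram message only when it returns True.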

ToDo

  • Add other websites that are helpful for career development.
  • Link with MongoDB to replace the latest.txt files.
    • Code structure:
      • Save the date and title if new_title is not found in the DB
        • How?: Save all titles at a set time.
        • Managing data: delete an entry from the DB 2 days after the article was uploaded.
  • Replace MongoDB with PostgreSQL
  • Port the code from Python to C
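
The planned DB structure (save the date and title only when the title is new, delete entries after 2 days) can be sketched as follows. This uses the standard-library sqlite3 module purely as a stand-in for MongoDB or PostgreSQL; the table name `articles` and the function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION = timedelta(days=2)  # from the ToDo item: delete 2 days after upload


def init_db(conn: sqlite3.Connection) -> None:
    # The title is the primary key, so a duplicate title cannot be stored twice.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles "
        "(title TEXT PRIMARY KEY, uploaded_at TEXT NOT NULL)"
    )


def save_if_new(conn: sqlite3.Connection, title: str, uploaded_at: datetime) -> bool:
    """Store the title and date; return True only if the title was not in the DB."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO articles (title, uploaded_at) VALUES (?, ?)",
        (title, uploaded_at.isoformat(timespec="seconds")),
    )
    conn.commit()
    return cur.rowcount == 1


def purge_old(conn: sqlite3.Connection, now: datetime) -> int:
    """Delete articles uploaded more than RETENTION ago; return how many were removed."""
    cutoff = (now - RETENTION).isoformat(timespec="seconds")
    cur = conn.execute("DELETE FROM articles WHERE uploaded_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

Storing timestamps as ISO 8601 strings keeps them sortable as text, which is why the cutoff comparison works without a date type.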

edited 2023-03-08

  • deleted the flag parameter, which delayed the code
  • changed the parse_mode of telegram-bot to 'MarkdownV2'
  • the bot automatically fetches an article's body by visiting the article's address on POLARIS CAU Notice
  • defined a function named 'md2' that reformats text from HTML to fit 'Markdown Grammar'
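
Part of what a function like 'md2' has to do is escape the characters that Telegram's MarkdownV2 treats as special. The sketch below covers only that escaping step, not the HTML conversion; `escape_markdown_v2` is a hypothetical name and is not the repository's actual md2 code.

```python
import re

# Characters that Telegram's MarkdownV2 requires to be escaped in ordinary text.
_MDV2_SPECIAL = r"_*[]()~`>#+-=|{}.!"
_MDV2_PATTERN = re.compile("([" + re.escape(_MDV2_SPECIAL) + "])")


def escape_markdown_v2(text: str) -> str:
    """Prefix every MarkdownV2 special character with a backslash."""
    return _MDV2_PATTERN.sub(r"\\\1", text)
```

Without this step, a title containing a character such as '.' or '-' is rejected by Telegram when the message is sent with 'MarkdownV2'.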

edited 2023-03-10

  • added MongoDB synchronization.
  • modified md2 function.

edited 2023-03-17

  • Changed MongoDB to PostgreSQL

edited 2023-08-07

  • Separated the try-except block so that each website segment has its own
  • Changed req.urlopen → requests.get(url)
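
A minimal sketch of these two changes, assuming each site is fetched in a loop: every site gets its own try-except, so a failure on one does not stop the others. The names `fetch` and `crawl_all` and the 10-second timeout are assumptions for illustration.

```python
from typing import Callable, Dict


def fetch(url: str) -> str:
    """Download a page with requests.get and return its HTML."""
    import requests  # third-party; imported here so crawl_all also works with other fetchers

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # treat HTTP error codes as failures
    return response.text


def crawl_all(sites: Dict[str, str], fetcher: Callable[[str], str] = fetch) -> Dict[str, str]:
    """Fetch every site inside its own try-except and return the pages that succeeded."""
    pages = {}
    for name, url in sites.items():
        try:
            pages[name] = fetcher(url)
        except Exception as exc:  # broad on purpose: report the error and move on
            print(f"[{name}] skipped: {exc}")
    return pages
```

Passing the fetcher as a parameter also makes the loop easy to try out with a fake function instead of real network calls.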
