Workflow Type: RSS Feed Based Crawl #697

Open
tuehlarsen opened this issue Mar 12, 2023 · 2 comments
Assignees
Labels
enhancement (New feature or request), workflow settings (Issues related to adding or changing settings to instruct the crawler)

Comments

@tuehlarsen

We need RSS-feed-based crawls, similar to https://github.com/Landsbokasafn/crawlrss/tree/master/src/main/java/is/landsbokasafn/crawler/rss

@Shrinks99 added the enhancement label Mar 15, 2023
@Shrinks99 added the workflow settings label Mar 28, 2023
@Shrinks99 self-assigned this Nov 7, 2023
@ikreymer moved this from Triage to Todo in Webrecorder Projects Nov 7, 2023
@Shrinks99 changed the title from "add rssfeed based crawls" to "Workflow Type: RSS Feed Based Crawl" Nov 7, 2023
@Shrinks99 added the blocked label (This issue is blocked by something else, please specify and remove the label once unblocked) Feb 11, 2024
@Shrinks99
Member

Shrinks99 commented Feb 11, 2024

Blocked by #1372. EDIT: Removing this, since now that we have a pages list I'm not sure it's still the case.

@tw4l
Member

tw4l commented May 30, 2024

Based on conversations with our collaborators at Ouinet, this issue may be an easier way to implement de-duplication than #1372 (though we eventually want to support both), and it could work for the majority of news sites.
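
As a rough illustration of why a feed-driven crawl could double as de-duplication: the feed already lists each article once, so the crawler only needs to queue items it has not seen before. The sketch below is not taken from the crawler or from crawlrss; the function name `new_feed_items`, the `seen` set, and the sample feed are all hypothetical, and it assumes a plain RSS 2.0 feed parsed with Python's standard library.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET


def new_feed_items(rss_xml: str, seen: set[str]) -> list[str]:
    """Return links of RSS 2.0 <item> entries not already recorded in `seen`.

    Hypothetical helper: items are keyed by <guid> when present, otherwise
    by <link>. `seen` is updated in place so repeated polls skip old items.
    """
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        key = (item.findtext("guid") or link).strip()
        if not link or key in seen:
            continue  # nothing to crawl, or already queued on an earlier poll
        seen.add(key)
        fresh.append(link)
    return fresh


# Made-up feed used only for demonstration.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>A</title><link>https://example.com/a</link><guid>a-1</guid></item>
<item><title>B</title><link>https://example.com/b</link></item>
</channel></rss>"""

seen_keys: set[str] = set()
first_poll = new_feed_items(SAMPLE_FEED, seen_keys)   # both links are new
second_poll = new_feed_items(SAMPLE_FEED, seen_keys)  # nothing new to crawl
```

In a real workflow the `seen` set would have to persist between scheduled runs, and the returned links would be handed to the crawler as its page list; both of those pieces are left out here.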

@Shrinks99 removed the blocked label May 30, 2024
Projects
Status: Todo
Development

No branches or pull requests

3 participants