Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion of flexibility #6

Open
whyynoot opened this issue Feb 4, 2025 · 5 comments
Open

Suggestion of flexibility #6

whyynoot opened this issue Feb 4, 2025 · 5 comments

Comments

@whyynoot
Copy link

whyynoot commented Feb 4, 2025

I think it would be good to use LMStudio API for LLM request, so we can use local models.
SerpAPI migrate to Google Search API, so it will be limited to 100 search quires per day not for 100 / months on Serp.
If anyone have Jira alternatives or localhosted write here please.
I would do it myself and commit if find any time, but leave for now my ideas here. Great project anyway, thanks!

@benhaotang
Copy link

benhaotang commented Feb 5, 2025

Haven't made Jina local yet, but I have now made

  • SerpAPI to local Searxng
  • Openrouter to Ollama

checkout here: https://github.com/benhaotang/OpenDeepResearcher-via-searxng

I would guess making Jina also local can be challenging due to so many websites have DDoS protection and rate limits, maybe need to set some cool down interval.

@benhaotang
Copy link

benhaotang commented Feb 5, 2025

checkout here: https://github.com/benhaotang/OpenDeepResearcher-via-searxng

Note, with playwright, reader-lm and docling, web parsing is also completely local, you can definitely have a try if you have a good enough rig.

If Matt is interested I can definitely make a PR back

@alexgusevski
Copy link

Whats the difference between using Jina and a traditional scraper-as-a-service?I'm using a scraper called webunlocker in one of my projects, its pay as you go with 1.5 or 3$ per 1k successful scrapes. Then you can just do BS4 on it for free and you got the website content?

I mean Jina got pretty low RPM limits (40rpm) and I dont know what the scraper-aas has but I cant imagine it would be as low as Jina and there are lots of identical scraper services like this

Cant be bothered to calculate the price of Jina but as I can tell they charge per scraped tokens so probably it will cost more than just scraping traditionally even with a service?

Antything I'm missing?

@juzarantri
Copy link

I have one suggestion. If we can use bowseruse tool that is opensource library to make any decision based task. Let me know if that fits in or not.

@whyynoot
Copy link
Author

Whats the difference between using Jina and a traditional scraper-as-a-service?I'm using a scraper called webunlocker in one of my projects, its pay as you go with 1.5 or 3$ per 1k successful scrapes. Then you can just do BS4 on it for free and you got the website content?

I mean Jina got pretty low RPM limits (40rpm) and I dont know what the scraper-aas has but I cant imagine it would be as low as Jina and there are lots of identical scraper services like this

Cant be bothered to calculate the price of Jina but as I can tell they charge per scraped tokens so probably it will cost more than just scraping traditionally even with a service?

Antything I'm missing?

I think the reason for using multimodal compibitabilty, parsing not only text, rather including the photo OCR, pdf document analysis etc...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants