Create a Vimium fork to support multimodal models #5

Open
ishan0102 opened this issue Nov 9, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@ishan0102
Owner

It might make sense to create a fork of Vimium designed specifically for making it easier for multimodal LLMs to choose relevant elements on a page. This might involve messing around with annotation colors, sizes, fonts, etc.
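As a rough illustration of what tuning annotation colors, sizes, and fonts could look like, here is a minimal sketch that renders a CSS override for hint markers from a small settings object. `HintStyle` and `hint_css` are hypothetical names made up for this example, and the default selector `div > .vimiumHintMarker` is an assumption that should be checked against the stylesheet of whatever fork is used.

```python
from dataclasses import dataclass


@dataclass
class HintStyle:
    """Hypothetical appearance settings for link-hint markers (not a Vimium API)."""
    background: str = "#ffff00"
    text_color: str = "#000000"
    font_family: str = "monospace"
    font_size_px: int = 14
    border: str = "1px solid #000000"


def hint_css(style: HintStyle, marker_selector: str = "div > .vimiumHintMarker") -> str:
    """Render a CSS override for hint markers.

    marker_selector is an assumption; verify the real class name in the fork.
    """
    return (
        f"{marker_selector} {{\n"
        f"  background: {style.background};\n"
        f"  border: {style.border};\n"
        f"}}\n"
        f"{marker_selector} span {{\n"
        f"  color: {style.text_color};\n"
        f"  font-family: {style.font_family};\n"
        f"  font-size: {style.font_size_px}px;\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # High-contrast, larger markers that may be easier for a vision model to read.
    print(hint_css(HintStyle(background="#ff00ff", font_size_px=18)))
```

Keeping the settings in one object would make it easy to try several styles and compare how reliably a model picks the right element.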

ishan0102 added the enhancement label Nov 9, 2023
@philc

philc commented Nov 9, 2023

Vimium author here. I have no opinion about whether to fork. I just heard about this project today and wanted to say, this is cool! Good luck!

@ishan0102
Owner Author

ishan0102 commented Nov 10, 2023

@philc Wow, thank you so much, that means a lot coming from you! I love your work!

@asim-shrestha

We just open-sourced a utility library that can tagify web pages for you: https://github.com/reworkd/tarsier

It could be a drop-in replacement for Vimium. We plan to support customizing tag appearance and positioning, if that's of interest.

@aincube

aincube commented Nov 13, 2023

Just an idea, but it might be possible to use qutebrowser:

  • written in Python (using QtWebEngine)
  • has vim mode by default
  • has userscripts, a really neat feature for automating things
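To make the userscript idea concrete, here is a minimal sketch of a qutebrowser userscript in Python. It relies on the `QUTE_FIFO` environment variable that qutebrowser sets for userscripts, and writes one command per line to that path. The helper name `send_commands` and the particular commands sent are illustrative choices, not part of any existing project.

```python
import os
import sys


def send_commands(commands, fifo_path):
    """Write one qutebrowser command per line to the given path (no leading colon)."""
    with open(fifo_path, "w", encoding="utf-8") as fifo:
        for command in commands:
            fifo.write(command + "\n")


def main():
    # qutebrowser sets QUTE_FIFO when the script is started as a userscript.
    fifo_path = os.environ.get("QUTE_FIFO")
    if fifo_path is None:
        print("QUTE_FIFO is not set; start this via :spawn --userscript", file=sys.stderr)
        return
    # Illustrative commands: show a status message, then display hints for all elements.
    send_commands(["message-info userscript-ran", "hint all"], fifo_path)


if __name__ == "__main__":
    main()
```

Saved as an executable file, this could be started from inside qutebrowser with `:spawn --userscript` followed by the script path. An agent loop could then take a screenshot once the hints are visible.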


4 participants