Replies: 16 comments 18 replies
-
I forget, do you use org-mode or markdown with gptel?
…On Sun, Mar 10, 2024, 7:27 AM daedsidog ***@***.***> wrote:
A lot of times I need to pass information to ChatGPT that I can't copy,
such as a snippet from an old, scanned document or formatted mathematics.
Right now I have a manual process where I query a visual model (though a
website) so that he tells me what he sees. E.g., when I paste him a snippet
of mathematics, he will give me LaTeX code, which I then pass down to
ChatGPT.
Would be very nice to have something like this.
—
Reply to this email directly, view it on GitHub
<#244>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACBVOLFYMVX46CWSCGOBZ43YXRUW7AVCNFSM6AAAAABEPASGXKVHI2DSMVQWIX3LMV43ASLTON2WKOZSGE3TONZUGUYTKNY>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Beta Was this translation helpful? Give feedback.
-
I used to use Org mode but I switched to Markdown because I was tired of gptel sometimes doing weird things (like removing underscores). I think it's fixed in the latest version, but I haven't switched back. Why is that relevant, though? |
Beta Was this translation helpful? Give feedback.
-
I used to use Org mode but I switched to Markdown because I was tired of gptel sometimes doing weird things (like removing underscores). I think it's fixed in the latest version, but I haven't switched back.
It should be fixed now, yeah.
Why is that relevant, though?
It's easier to support vision models in Org mode. That said, please see the discussion in #231.
|
Beta Was this translation helpful? Give feedback.
-
Interesting. I honestly think the most power from gptel comes with just the abstraction layer it provides when interacting with various models. I, for one, have completely eliminated the process of manually typing code by implementing context generation. Below is a demonstration of me constructing a context buffer, and with a keypress I use gptel's replace-in-place with it. Works exceedingly well. This is kind of its own separate thing from gptel, but I was wondering if the scope of gptel should include this sort of thing. What I want now is just a way, totally unrelated to Org mode or MD, which will allow me to "send" ChatGPT queries with images (i.e., send current image saved on clip) and get an input in place. |
Beta Was this translation helpful? Give feedback.
-
I see.
Sorry, I had trouble following your demo. My best guess is that the buffer on the right is sent as the context (or system message), and you're asking it to do something with those functions.
I'm not sure gptel is set up to do that -- it's a very buffer-oriented system. At minimum it will need to distinguish between text as text and text that represents a file path and act on the file instead. A common way to do this would be to define a Basically, handling images is not ruled out, but right now I don't know the best way of doing so that conforms to a simple mental model like the chat usage does. |
Beta Was this translation helpful? Give feedback.
-
I'm interested to understand what you mean here -- I just had trouble following the demo. |
Beta Was this translation helpful? Give feedback.
-
My apologies, my explanation was terrible. The demo showcases a way to mark areas in different buffers, and aggregate them in their own dedicated buffer. That buffer can then be copied and handed to gptel as context. This is much easier than manually copy pasting sections of context into the dedicated chat buffer/external ChatGPT website, and also has the added bonus of minimizing the context by collapsing code that doesn't contribute to the context. You can manually remove context snippets from the context buffer. In a nutshell, it's a glorified yanker, but I found it incredibly useful.
I am wondering if you would be open for this to be integrated into gptel, or should this remain its own separate package. It's pretty useless outside of gptel, though. |
Beta Was this translation helpful? Give feedback.
-
I like the idea! I'll have to think about how to integrate it into gptel though. Right now the best idea I have is "Add an option to the transient menu to append a selected region to the system prompt". This won't work well across buffers since each buffer has its own system prompt. You've developed a more sophisticated UI for this style of usage, it's interesting. |
Beta Was this translation helpful? Give feedback.
-
Converting to a discussion since there's nothing to fix in gptel right now. |
Beta Was this translation helpful? Give feedback.
-
Just mind you, it's not a system prompt. I lexically set a system prompt that tells it how to treat the text it's supposed to replace, then I insert the user prompt in place with what is supposed to be replaced, and then I use the gptel refactor to handle everything. |
Beta Was this translation helpful? Give feedback.
-
@daedsidog your demo is very interesting If you don't mind, can share your gptel add-on so we can try it out? Thanks |
Beta Was this translation helpful? Give feedback.
-
@doctorguile I'll add all the things to my fork sometime soon. |
Beta Was this translation helpful? Give feedback.
-
I have added vision support to gptel in the It's actually a little more general than vision support -- a lot of the changes are about specifying per-model capabilities, to pave the way to add function calling, JSON output and image output (DALL-E etc) uniformly to To set it up correctly,
There are two ways to use it.
|
Beta Was this translation helpful? Give feedback.
-
Hello Karthik what wonderful news! I was hoping for a long time you would implement vision capabilites. Method 1.Using this method works perfectly.
Method 2:i marked the region and pressed my
it conforms. to the org mode image link defaults. But this doesnt work. |
Beta Was this translation helpful? Give feedback.
-
Image support has been merged, and is available in gptel 0.9.5. |
Beta Was this translation helpful? Give feedback.
-
There is no support for Markdown mode yet, right? I tried multiple link variations and none of them were detected. Looking in the code, only org mode has link parsing. |
Beta Was this translation helpful? Give feedback.
-
A lot of times I need to pass information to ChatGPT that I can't copy, such as a snippet from an old, scanned document or formatted mathematics.
Right now I have a manual process where I query a visual model (though a website) so that he tells me what he sees. E.g., when I paste him a snippet of mathematics, he will give me LaTeX code, which I then pass down to ChatGPT.
Would be very nice to have something like this.
Beta Was this translation helpful? Give feedback.
All reactions