Generating the conversation automatically. #26
Closed
arthurwolf
started this conversation in Data
Replies: 1 comment
-
I demonstrated that Claude can do a full check of the data and can serve as the user agent. Do not get ahead of the conversation; I have done multi-step data synthesis plenty of times.
-
In the video, you have a human (you) checking whether the solution is correct or not, and asking the model to think through its mistakes.
I think we can (in a lot of cases, maybe not all) do without that (costly) human requirement, using prompt engineering.
I tried some things; this works pretty well (and can surely be improved with experience):
(note the question comes from https://arxiv.org/html/2405.19616)
(note this would be much more powerful if, along with the instructions, came multiple examples of applying them to real-world questions; I just have not had time to do that for this demo).
The important thing here is really step 7.
Now, in my testing, this finds the solution to a lot of difficult questions, with reasoning, and from that output you can "extract" chain-of-thought reasoning by giving the model an example of what chain-of-thought reasoning looks like (for example using the o1 demo examples), and asking it to "transform" the loop into chain of thought.
I don't think this will work for all problems, but in my testing it works for a lot of those I tried out.
The main issue I'm running into right now is that, with Claude at least, I often hit the answer length limit before getting to the end of the loop. But in most cases, even if that happens, the data generated is enough to convert to CoT (and I believe there are now models with longer answer lengths).
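One possible workaround when that happens, sketched below on the assumption that you are calling the Anthropic Messages API from Python (the model name and the continuation wording are placeholders, not tested values): check `stop_reason` on each response and, when it comes back as `max_tokens`, feed the partial answer back and ask the model to keep going.

```python
# Hypothetical sketch: resume generation when the answer is cut off at the
# output-length limit. Assumes the Anthropic Python SDK; the model name and
# the continuation wording are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def generate_with_continuation(prompt: str, max_rounds: int = 3) -> str:
    messages = [{"role": "user", "content": prompt}]
    chunks = []
    for _ in range(max_rounds):
        response = client.messages.create(
            model="claude-3-5-sonnet-latest",  # placeholder model name
            max_tokens=4096,
            messages=messages,
        )
        text = response.content[0].text
        chunks.append(text)
        if response.stop_reason != "max_tokens":
            break  # the model finished on its own
        # The answer was truncated: feed it back and ask the model to go on.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Continue exactly where you left off."})
    return "".join(chunks)
```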
Here's an example of using the prompt; note how similar the actual reasoning is to what o1 does, without requiring a human to tell it whether or not it made a mistake:
And poof, hit the limit.
Now from there, we need to convert this into a CoT conversation.
The prompt for this is super long, because it contains an example of what a "CoT" conversation should look like, so I'm not pasting it here, but you can find it at
https://gist.github.com/arthurwolf/7011e5f5ba06867f558c2da0dbf4f6df#file-conversion-md
It gives us an answer in CoT format like this:
As you can see, this process, using two LLM calls, converted a question/puzzle into a "chain of thought" answer that can be used in a dataset, without requiring a human to participate in the process.
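For concreteness, here is a minimal sketch of that two-call pipeline in Python. `complete(prompt)` is a stand-in for whatever LLM API wrapper you use, and `SOLVE_PROMPT` / `CONVERSION_PROMPT` are placeholders for the looping prompt described above and the conversion prompt from the gist, not their actual text:

```python
# Minimal sketch of the two-call pipeline: one call solves the question with
# the self-checking loop prompt, a second call converts the raw loop output
# into a CoT-style conversation. complete() stands in for any LLM API wrapper.

def complete(prompt: str) -> str:
    """Placeholder: send `prompt` to the model and return its text answer."""
    raise NotImplementedError

SOLVE_PROMPT = "..."       # the looping / self-checking prompt described above
CONVERSION_PROMPT = "..."  # the conversion prompt from the linked gist

def question_to_cot(question: str) -> str:
    # Call 1: let the model work through the question with the looping prompt.
    raw_reasoning = complete(f"{SOLVE_PROMPT}\n\nQuestion:\n{question}")
    # Call 2: rewrite the loop output as a chain-of-thought conversation,
    # following the CoT example embedded in the conversion prompt.
    return complete(f"{CONVERSION_PROMPT}\n\nLoop output:\n{raw_reasoning}")
```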
(note: this tries to do everything with two LLM calls, but I believe there is a way to do this much better / with a higher success rate, by extracting the logic (looping, going step by step) out of the prompts and doing it with code. This would result in many more LLM calls, but make it much less likely the LLM gets "confused" by context. I can write down an example of doing that if somebody is interested.)
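In that spirit, here is a rough, hypothetical sketch of what the code-driven variant could look like (the step/verification structure is illustrative, not a tested recipe): the script owns the loop, and each reasoning step and each check is a separate short LLM call over a small context.

```python
# Hypothetical sketch of moving the looping logic out of the prompt and into
# code: the script drives the step-by-step loop, so each model call only sees
# the question plus the accepted steps so far, keeping the context small.

def complete(prompt: str) -> str:
    """Placeholder: send `prompt` to the model of your choice, return its text."""
    raise NotImplementedError

def solve_stepwise(question: str, max_steps: int = 20) -> list[str]:
    steps: list[str] = []
    for _ in range(max_steps):
        context = f"Question:\n{question}\n\nAccepted steps so far:\n" + "\n".join(steps)
        # One call: ask only for the single next reasoning step (or DONE).
        step = complete(context + "\n\nWrite the single next reasoning step, or DONE if the question is solved.")
        if step.strip() == "DONE":
            break
        # A second, separate call: verify just that step (retry logic omitted here).
        verdict = complete(context + f"\n\nProposed step:\n{step}\n\nIs this step correct? Answer YES or NO.")
        if verdict.strip().upper().startswith("YES"):
            steps.append(step)
    return steps
```

Keeping each call narrow trades more API calls for a lower chance of the model losing track of the loop, which is the tradeoff described above.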