Question about the post-training motivation #13

doubleZ0108 · 2025-02-20T10:00:00Z

Hi, thanks for your work. The demo on Hugging Face is charming.

I have been using the Qwen2.5-VL family for object detection recently, and the 3B, 7B, and 72B models all demonstrate strong visual localization capabilities. I ask for the bounding box coordinates of the target object and have the model return them in JSON format. After several rounds of prompt engineering, I can get promising results from the model. This is not a typical VL task, but Qwen handles it really well.
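As an illustration of the workflow described above, here is a minimal sketch of parsing a JSON bounding-box reply. The reply format (a list of objects with `bbox_2d` and `label` keys, possibly wrapped in a code fence) and the helper name `parse_bboxes` are assumptions for this example, not something stated in the thread.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable


def parse_bboxes(reply: str) -> list[dict]:
    """Extract boxes from a model reply (hypothetical helper and reply format)."""
    # The model may wrap its JSON in a fenced block; unwrap it if so.
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, reply, flags=re.DOTALL)
    payload = match.group(1) if match else reply

    data = json.loads(payload)
    if isinstance(data, dict):  # a single object instead of a list
        data = [data]

    boxes = []
    for item in data:
        # Assumed key name and (x1, y1, x2, y2) pixel ordering.
        x1, y1, x2, y2 = (float(v) for v in item["bbox_2d"])
        if x2 > x1 and y2 > y1:  # drop degenerate boxes
            boxes.append({"label": item.get("label", ""), "bbox": (x1, y1, x2, y2)})
    return boxes
```

Validating the coordinates at parse time keeps malformed replies from silently reaching downstream evaluation.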

So what I'm curious about is whether the SFT and R1 training strategies you mention essentially teach the model "how to automatically organize prompts to make object recognition tasks easier". In other words, is this a form of automatic prompt engineering, with no new knowledge emerging for the bbox-style detection task?

Looking forward to hearing your answer.

snakeztc · 2025-02-21T00:15:59Z

The results we currently show come from training the model for only a few hundred steps on the REC task, so we believe the GRPO loss is mostly aligning the model better to the task (i.e., not much new knowledge is introduced to the model).

That said, the parameters are updated via GRPO training, and when we train the model longer on more tasks, we do believe new knowledge can be injected.
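For readers unfamiliar with GRPO, the following is a minimal sketch of the group-relative advantage that GRPO computes over several sampled answers to the same prompt. Using box IoU as the reward for an REC-style task is an assumption made for this example, as are the function names and the use of the population standard deviation; none of this is taken from the project's code.

```python
from statistics import mean, pstdev


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: one ground-truth box, three sampled predictions.
target = (0, 0, 10, 10)
samples = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
rewards = [iou(s, target) for s in samples]
advantages = group_advantages(rewards)
```

Because advantages are relative within the group, samples that localize better than their siblings are reinforced even when no sample is perfect, which is consistent with the view that short training mainly sharpens behavior the model can already produce.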

doubleZ0108 · 2025-02-21T01:32:28Z

It's an interesting finding. I'll keep an eye on the project.

doubleZ0108 closed this as completed Feb 21, 2025
