After completing the beauty_align stage, I merged the model with checkpoint-5591 and launched the beauty-sid-rec training.
When I evaluated checkpoint-67092 directly (without CoT), I obtained:
hit@1: 0.0181
hit@5: 0.0459
hit@10: 0.0471
ndcg@5: 0.0322
ndcg@10: 0.0380
Are these numbers far below the reported results simply because I skipped CoT, or could there also be a problem with my experimental settings (e.g., checkpoint selection)?
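
To rule out a metric mismatch, here is a minimal sketch of how I understand hit@k and ndcg@k, assuming one ground-truth item per test user and a ranked list of predicted items for each user. The function names (`hit_at_k`, `ndcg_at_k`, `evaluate`) are hypothetical and not taken from the repository; the official evaluation script may differ.

```python
import math

def hit_at_k(ranked, target, k):
    """1.0 if the ground-truth item is among the top-k predictions, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k):
    """NDCG with a single relevant item: 1 / log2(rank + 1), using 1-based rank."""
    for idx, item in enumerate(ranked[:k]):
        if item == target:
            return 1.0 / math.log2(idx + 2)  # idx is 0-based, so rank = idx + 1
    return 0.0

def evaluate(predictions, targets, ks=(1, 5, 10)):
    """Average hit@k and ndcg@k over all test users (assumed setup, see above)."""
    n = len(targets)
    metrics = {}
    for k in ks:
        pairs = list(zip(predictions, targets))
        metrics[f"hit@{k}"] = sum(hit_at_k(p, t, k) for p, t in pairs) / n
        metrics[f"ndcg@{k}"] = sum(ndcg_at_k(p, t, k) for p, t in pairs) / n
    return metrics

# Toy example: the first user's target is at rank 2, the second user's is missed.
toy_predictions = [["a", "b", "c"], ["x", "y", "z"]]
toy_targets = ["b", "q"]
toy_metrics = evaluate(toy_predictions, toy_targets)
```

Under this single-target definition, ndcg@10 minus ndcg@5 can be at most (hit@10 minus hit@5) / log2(7), since an item found at ranks 6 to 10 contributes at most 1/log2(7). My numbers give a gap of 0.0058 in ndcg against 0.0012 in hit, which does not fit that bound, so the evaluation I ran may define or average the metrics differently from this sketch.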