Evaluation details of the RSE experiments #4

@xy-660

Dear Authors,

Thank you for your insightful paper, "Quantifying Distillation in Large Language Models." We found the proposed RSE (Response Similarity Evaluation) method particularly interesting and have been studying it in detail.

Sections 3.2 and 4.1.2 state that RSE employs an "LLM-as-a-judge" approach to assess response similarity, but they do not name the specific model used as the judge. The ICE experiment (Section 4.1.1) clearly uses GPT-4o-mini for evaluation, so it is unclear whether RSE used the same model or a different one (e.g., GPT-4o).

Could you kindly clarify which specific model served as the judge in the RSE experiments? This information would greatly assist us in better understanding the methodology and replicating the experiments.

Thank you once again for your valuable work. We look forward to your response.

Best regards,
xy
