Pointwise vs. Pairwise paradigms from the information retrieval literature #7

@jd-coderepos

In the traditional Information Retrieval (IR) literature [1,2], two paradigms are distinguished:

  1. Pointwise paradigm (score each item independently)
    Idea: Learn a function that assigns an absolute relevance score to a single query–document pair.

  2. Pairwise paradigm (learn preferences between two items)
    Idea: Learn from comparisons: for a given query, which of two documents should rank higher?
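To make the contrast concrete, here is a minimal sketch of one loss per paradigm. The specific losses (squared error for pointwise, a logistic loss on the score difference for pairwise) are illustrative assumptions, not the only options covered in [1,2], and the function names and scores are hypothetical.

```python
import math


def pointwise_loss(score: float, label: float) -> float:
    """Squared error between a predicted score and an absolute relevance label.

    Only one query-document pair is involved.
    """
    return (score - label) ** 2


def pairwise_loss(score_preferred: float, score_other: float) -> float:
    """Logistic loss on the score difference of two documents for the same query.

    The loss is small when the preferred document is scored higher.
    """
    return math.log(1.0 + math.exp(-(score_preferred - score_other)))


# Hypothetical model scores for two documents under one query.
s_a, s_b = 2.0, 0.5

print(pointwise_loss(s_a, 3.0))  # 1.0
print(pairwise_loss(s_a, s_b))   # low: the preferred document already ranks higher
print(pairwise_loss(s_b, s_a))   # higher: the preference is violated
```

Note that the pointwise loss needs an absolute label per document, while the pairwise loss only needs to know which of the two documents is preferred.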

Mapping these notions to LLM-as-a-judge:

  1. Pointwise judge: Score answer A for question Q on rubric R using a Likert scale, defined either from 1–5 (less granular) or 1–10 (more granular). Whichever granularity is chosen, the user should provide a concrete definition of what each rating number means in the context of the rubric. The LLM is expected to output a number together with a rationale for its decision.
  2. Pairwise judge: Between answers A and B (optionally for question Q), which better satisfies rubric R (or is it a tie)? The answer format can be either 1) a choice of A or B, or 2) a Likert rating for both A and B, reflecting the relative degree to which each satisfies rubric R compared to the other. If option 1 is chosen, the LLM has to output either A or B and give a rationale for its decision.
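The two judge types could be sketched as prompt builders plus output parsers, as below. This is only an illustration under assumptions: the prompt wording, the `Score:`/`Verdict:`/`Rationale:` reply format, and all function names are hypothetical, and no particular LLM API is called. The pairwise sketch covers answer option 1 with an optional tie.

```python
import re
from typing import Dict, Optional


def pointwise_prompt(question: str, answer: str, rubric: str,
                     scale_definitions: Dict[int, str]) -> str:
    """Build a prompt asking for an absolute Likert score for one answer."""
    scale = "\n".join(f"{score}: {meaning}"
                      for score, meaning in sorted(scale_definitions.items()))
    return (
        f"Rubric: {rubric}\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        f"Rating scale:\n{scale}\n"
        "Reply as 'Score: <number>' followed by 'Rationale: <text>'."
    )


def pairwise_prompt(answer_a: str, answer_b: str, rubric: str,
                    question: Optional[str] = None) -> str:
    """Build a prompt asking which of two answers better satisfies the rubric."""
    parts = [f"Rubric: {rubric}"]
    if question is not None:  # the question is optional in the pairwise setting
        parts.append(f"Question: {question}")
    parts += [
        f"Answer A: {answer_a}",
        f"Answer B: {answer_b}",
        "Reply as 'Verdict: A', 'Verdict: B', or 'Verdict: tie', "
        "followed by 'Rationale: <text>'.",
    ]
    return "\n".join(parts)


def parse_pointwise(reply: str, low: int, high: int) -> Optional[int]:
    """Extract the score; return None if it is missing or outside the scale."""
    match = re.search(r"Score:\s*(\d+)", reply)
    if match is None:
        return None
    score = int(match.group(1))
    return score if low <= score <= high else None


def parse_pairwise(reply: str) -> Optional[str]:
    """Extract the verdict as 'A', 'B', or 'tie'; return None if missing."""
    match = re.search(r"Verdict:\s*(A|B|tie)\b", reply, flags=re.IGNORECASE)
    if match is None:
        return None
    verdict = match.group(1)
    return "tie" if verdict.lower() == "tie" else verdict.upper()
```

For example, `parse_pointwise("Score: 4\nRationale: mostly complete", 1, 5)` returns `4`, and `parse_pairwise("Verdict: B\nRationale: more precise")` returns `"B"`. Rejecting out-of-range scores keeps the output tied to the rating definitions the user supplied.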

The traditional IR notions and the LLM-as-a-judge approach share the exact same structural shift: absolute scoring of one item vs. relative comparison of two items.

References

  1. Liu, T.-Y. (2011). The Pointwise Approach. In: Learning to Rank for Information Retrieval. Springer, Berlin, Heidelberg. https://link.springer.com/chapter/10.1007/978-3-642-14267-3_2
  2. Liu, T.-Y. (2011). The Pairwise Approach. In: Learning to Rank for Information Retrieval. Springer, Berlin, Heidelberg. https://link.springer.com/chapter/10.1007/978-3-642-14267-3_3
