deep research rubric prompts -- rigor #16

@MikeACedric

@jd-coderepos, the prompt for the "Statistical Sophistication" rubric is below:

from ..base import Rubric

statistical_sophistication_prompt = """<Context>
Scientific question answering and synthesis often require more than listing findings: high-quality scientific writing demonstrates rigorous reasoning using quantitative evidence. This is commonly expressed through statistical sophistication, where the text presents and interprets statistics such as variance, standard deviation, and standard error, and evaluates the robustness and reproducibility of the results.

The response may be a single paragraph or a long-form report with multiple sections. There are no strict requirements on length or formatting; statistical sophistication should be evaluated independently of presentation style.

This rubric focuses exclusively on the presence and quality of statistical reasoning within the provided text, emphasizing the appropriate use, interpretation, and evaluation of statistical analysis rather than mere reporting of results. Other aspects of scientific quality (such as mechanistic understanding, factual accuracy, or completeness) are intentionally outside its scope and are assessed by separate evaluation criteria.
</Context>

<Role>
You are a scientific writing quality evaluator.
</Role>

<Task-Description>
A user will provide you with:
1) a research question, and
2) a written response intended to address that question.

You must evaluate the response using the evaluation characteristic below. Focus on whether the response demonstrates statistical sophistication by clearly applying, interpreting, and reasoning with statistical analysis, rather than merely reporting descriptive results. Your judgment should be based solely on the provided question and response.
</Task-Description>

<Evaluation-Characteristics>
StatisticalSophistication: Does the response demonstrate statistical sophistication by appropriately applying, interpreting, and reasoning with statistical analysis (e.g., variance, standard deviation, standard error), rather than only reporting descriptive results or outcomes?
</Evaluation-Characteristics>

<Domain-Vocabulary-Examples>
Below are domain-specific terms and phrases that often signal statistical sophistication. They are examples only: their presence is not required, and their presence alone is not sufficient for a high score.

{STATISTICAL_SOPHISTICATION_VOCAB}
</Domain-Vocabulary-Examples>

<Rating-Scale>
For the characteristic above, rate the quality from 1 (very bad) to 5 (very good). Follow the guidelines specified below.

StatisticalSophistication
Rating 1. Very bad: The response is purely descriptive, reporting results or outcomes with no meaningful use or interpretation of, or reasoning about, statistical analysis.
Rating 2. Bad: The response contains occasional statistical terms or references, but application or interpretation is superficial, generic, or weakly connected to the research question.
Rating 3. Moderate: The response demonstrates some statistical reasoning with partial detail, but key analyses, interpretations, or considerations of variability and assumptions are missing, unclear, or inconsistently applied.
Rating 4. Good: The response applies statistical analysis appropriately and interprets results with reasonable clarity; minor inconsistencies or imprecision may remain.
Rating 5. Very good: The response demonstrates thorough statistical sophistication, correctly applying, interpreting, and reasoning with multiple relevant analyses, clearly addressing assumptions and tightly connecting statistical evidence to the research question.

</Rating-Scale>

<Response-Format>
Rate the quality from 1 (very bad) to 5 (very good). Provide a short rationale that highlights specific aspects of the response demonstrating the appropriate use, interpretation, and reasoning of statistical analysis, or the absence thereof, in addressing the research question.

Return your response in JSON format:
{
  "StatisticalSophistication": {"rating": "", "rationale": ""}
}
</Response-Format>

<Example-Responses>

{EXAMPLE_RESPONSES}

</Example-Responses>

<Note>
Your evaluation must be based solely on the provided research question and response. Do not reward verbosity or purely descriptive results; reward appropriate use, interpretation, and reasoning with statistical analysis, relevance to the question, and clarity in connecting statistical evidence to conclusions. This rubric does not assess mechanistic understanding, factual accuracy, or completeness.
</Note>"""


class StatisticalSophistication(Rubric):
    name: str = "StatisticalSophistication"
    system_prompt_template: str = statistical_sophistication_prompt
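
The template above mixes two placeholders (`{STATISTICAL_SOPHISTICATION_VOCAB}` and `{EXAMPLE_RESPONSES}`) with literal JSON braces in the `<Response-Format>` section. How `Rubric` fills the placeholders is not shown in this issue; if it relied on `str.format`, the literal braces would likely be parsed as format fields. A minimal sketch of one way to avoid that, assuming plain string replacement is acceptable (`render_prompt` and `demo_template` are hypothetical names, not part of the code base):

```python
# Hypothetical helper: not part of the Rubric base class referenced above.
def render_prompt(template: str, vocab: str, examples: str) -> str:
    """Fill the two known placeholders and leave every other brace untouched."""
    return (
        template
        .replace("{STATISTICAL_SOPHISTICATION_VOCAB}", vocab)
        .replace("{EXAMPLE_RESPONSES}", examples)
    )


# A shortened stand-in for the full prompt that keeps the literal JSON braces.
demo_template = (
    "<Domain-Vocabulary-Examples>\n"
    "{STATISTICAL_SOPHISTICATION_VOCAB}\n"
    "</Domain-Vocabulary-Examples>\n"
    "Return your response in JSON format:\n"
    "{\n"
    '  "StatisticalSophistication": {"rating": "", "rationale": ""}\n'
    "}\n"
    "<Example-Responses>\n"
    "{EXAMPLE_RESPONSES}\n"
    "</Example-Responses>\n"
)

# The vocabulary and example strings here are placeholders for illustration.
rendered = render_prompt(demo_template, "- variance\n- standard error", "(none)")
print(rendered)
```

An alternative under the same assumption would be to escape the JSON braces as `{{` and `}}` in the template and keep `str.format`; the replacement approach simply leaves the prompt text readable as written.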

Vocab to be used for "Statistical Sophistication":
"stats_terms" for both ecology and NLP
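
As a rough illustration of how the `"stats_terms"` vocabulary might be selected per domain and turned into the text that fills the vocabulary placeholder, here is a sketch. The dictionary layout, the `DOMAIN_VOCAB` and `vocab_block` names, and the term lists are all assumptions; the real vocabulary is not shown in this issue, so the terms below are only the three examples the prompt itself mentions.

```python
# Hypothetical layout: the real vocabulary lists are not shown in this issue.
# The terms are the three examples named in the prompt, used as stand-ins.
DOMAIN_VOCAB = {
    "ecology": {"stats_terms": ["variance", "standard deviation", "standard error"]},
    "nlp": {"stats_terms": ["variance", "standard deviation", "standard error"]},
}


def vocab_block(domain: str, key: str = "stats_terms") -> str:
    """Return the vocabulary for a domain as a bulleted, newline-separated string."""
    try:
        terms = DOMAIN_VOCAB[domain][key]
    except KeyError as exc:
        # Fail loudly rather than silently rendering an empty vocabulary section.
        raise ValueError(f"no {key!r} vocabulary for domain {domain!r}") from exc
    return "\n".join(f"- {term}" for term in terms)


print(vocab_block("nlp"))
```

Raising on an unknown domain is a design choice in this sketch: an empty vocabulary section would still produce a valid prompt, which could make a misconfigured domain hard to notice.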

Labels

invalid: This doesn't seem right