Skip to content

Conversation

enitrat
Copy link
Contributor

@enitrat enitrat commented Sep 30, 2025

Summary

Issue with the XMLAdapter

  • XMLAdapter inherited ChatAdapter's bracketed completion marker ([[ ## completed ## ]]) behavior, leading to prompts that mixed XML tags with ChatAdapter markers. This confused models and produced [[ ## completed ## ]] instead of the expected XML-only output (Closes [Bug] XMLAdapter inserting wrong markers #8875)

Cause of the issue

  • XMLAdapter was designed to emit XML tags (e.g., <answer>...</answer>), but it only overrode parts of the inherited ChatAdapter class:
    • Overrode: format_field_with_value (to XML), and parse.
    • Did not override all ChatAdapter-provided formatting paths that inject the bracketed style:
      • format_field_structure (system structure example) added [[ ## completed ## ]] in ChatAdapter.
      • format_assistant_message_content appended [[ ## completed ## ]] in ChatAdapter
    • Incorrectly included <completed> tag instruction in user_message_output_requirements.

As a result, the final prompt contained the ChatAdapter marker [[ ## completed ## ]] in both the system message structure example and the assistant message output.

How I fixed it

  • Ensure XMLAdapter formats output fields and structure examples with XML tags only (no ChatAdapter markers).

  • Changes:

    • Override format_field_structure (dspy/adapters/xml_adapter.py:24) to format the system message structure example using XML tags only.
    • Override format_assistant_message_content (dspy/adapters/xml_adapter.py:44) to format assistant messages with XML tags only (no [[ ## completed ## ]] marker).
    • Remove <completed> tag instruction from user_message_output_requirements (dspy/adapters/xml_adapter.py:57).
    • Note: Input fields continue to use ChatAdapter's [[ ## field ## ]] format for now, as per discussions with maintainers.
  • Tests added:

    • test_xml_adapter_full_prompt asserts on the full prompt generated, ensuring there's no unwanted [[ ## completed ## ]] markers in system or assistant messages, and no <completed> tag instruction.
    • Updated streaming tests to remove expected <completed> tag from mock responses.

@TomeHirata TomeHirata requested a review from Copilot October 1, 2025 02:29
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes issues with the XML adapter where it was mixing XML tags with ChatAdapter's bracketed completion markers, resulting in confusing prompts and incorrect model outputs.

  • Overrides XMLAdapter methods to ensure consistent XML-only formatting throughout the prompt
  • Removes the <completed> tag from completion requirements
  • Adds comprehensive tests to verify XML-only formatting behavior

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
dspy/adapters/xml_adapter.py Overrides format methods to use XML tags exclusively and removes completion markers
tests/adapters/test_xml_adapter.py Adds tests to verify XML input formatting and complete prompt structure

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@TomeHirata
Copy link
Collaborator

TomeHirata commented Oct 1, 2025

Thanks @enitrat. I left some comments, can you take a look?

@enitrat
Copy link
Contributor Author

enitrat commented Oct 1, 2025

@TomeHirata Do you agree that the completed marker shuold be <completed/>, not <completed>? as it's a "self closing" tag

edit: after further reading about the purpose of a completion marker, it seems that it's only useful for the ChatAdapter but the XMLAdapter should not include it.

The StreamListener class (dspy/streaming/streaming_listener.py) uses a token buffering mechanism to detect end markers in an LLM response. Notably, it looks for the end marker that matches the regex defined for the given adapter

  • ChatAdapter: r"\[\[ ## (\w+) ## \]\]" (matches next field or [[ ## completed ## ]])
  • JSONAdatper: r"\w*\"(,|\s*})"
  • XMLAdapter: rf"</{self.signature_field_name}>"
  • Only when this pattern is found, it sets stream_end = True

As such, a <completed> marker is useless for XMLAdapter and should then be removed

@enitrat
Copy link
Contributor Author

enitrat commented Oct 3, 2025

Hey @TomeHirata, thanks for the answer. I've properly applied the suggestions and restricted the scope of the changes, and updated the PR summary to match the current changes implemented in this PR. Thanks for the review!

@enitrat enitrat requested a review from TomeHirata October 3, 2025 08:22
Copy link
Collaborator

@TomeHirata TomeHirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, can you take a look at my comment?

@enitrat
Copy link
Contributor Author

enitrat commented Oct 4, 2025

@TomeHirata should be all good now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] XMLAdapter inserting wrong markers
2 participants