Remove XML comments from generated prompts with proper code block marker matching (fixes #822) #827

Merged
pelikhan merged 5 commits into main from copilot/fix-37209513-5f03-4a5e-af74-e261cb66b76d on Sep 15, 2025
Copilot AI commented Sep 15, 2025

This PR implements the feature requested in #822 to remove XML comments from the final prompts generated by the Go compiler.

Problem

XML comments (<!-- -->) are "builtin" comments in markdown and should be removed when rendered to prompts, similar to how HTML rendering engines handle them. This allows developers to keep commented text in workflow markdown files without polluting the final prompt sent to AI engines.

Solution

Added XML comment removal functionality to the prompt generation process:

  • removeXMLComments function: Removes XML comments while preserving comments that appear within code blocks (``` or ~~~ delimited)
  • Smart handling: Supports single-line, multiline, and multiple comments per line
  • Spacing preservation: Replaces multiline comments with appropriate empty lines to maintain document structure
  • Code block protection: Comments within code blocks are preserved since they may be part of examples or documentation
  • Proper marker matching: Code block opening and closing markers must match exactly (same type and count)
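
The comment handling listed above (single-line, multiline, and several comments on one line) can be sketched as a small line scanner. This is a minimal illustration, not the code merged in this PR: `stripLine` and `stripComments` are hypothetical names, code block protection is left out on purpose, and keeping the line count unchanged is only one possible reading of "spacing preservation".

```go
package main

import (
	"fmt"
	"strings"
)

// stripLine removes XML comments from a single line. inComment says whether a
// comment opened on an earlier line is still open; the returned bool is that
// state after this line. Hypothetical helper, not taken from the PR.
func stripLine(line string, inComment bool) (string, bool) {
	var out strings.Builder
	for len(line) > 0 {
		if inComment {
			end := strings.Index(line, "-->")
			if end < 0 {
				return out.String(), true // comment continues on the next line
			}
			line = line[end+len("-->"):]
			inComment = false
			continue
		}
		start := strings.Index(line, "<!--")
		if start < 0 {
			out.WriteString(line) // no further comment on this line
			break
		}
		out.WriteString(line[:start])
		line = line[start+len("<!--"):]
		inComment = true
	}
	return out.String(), inComment
}

// stripComments applies stripLine to every line. The number of lines is kept,
// so a multiline comment leaves empty lines behind (an assumption about how
// spacing is preserved).
func stripComments(text string) string {
	lines := strings.Split(text, "\n")
	inComment := false
	for i, line := range lines {
		lines[i], inComment = stripLine(line, inComment)
	}
	return strings.Join(lines, "\n")
}

func main() {
	in := "keep <!-- one --> and <!-- two --> this\n<!-- start\nmiddle\nend --> tail"
	fmt.Printf("%q\n", stripComments(in))
}
```

The scanner only removes the comment itself, so whitespace on either side of a removed comment stays in the output.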

Recent Updates

  • ✅ Fixed code block detection: Updated to properly handle code blocks with 3 or more backticks/tildes (not just exactly 3)
  • ✅ Fixed marker matching logic: Opening and closing markers must now match properly (e.g., ````python can only be closed by ````, not by ~~~ or a different backtick count)
  • ✅ Simplified marker API: Merged marker rune and count into single string return value for cleaner API
  • Enhanced helper functions: Added extractCodeBlockMarker, isValidCodeBlockMarker, and isMatchingCodeBlockMarker for robust code block boundary detection
  • Enhanced test coverage: Added tests for mixed marker types, nested blocks, and proper marker matching scenarios
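
The marker rules above can be sketched as follows. The names `extractCodeBlockMarker` and `isMatchingCodeBlockMarker` come from this PR, but the signatures and bodies below are assumptions made for illustration, and `codeBlockLines` is a hypothetical helper that does not appear in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// extractCodeBlockMarker returns the fence marker (e.g. "```" or "~~~~") and
// the language specifier of a line. Both are empty if the line does not start
// with a run of at least three backticks or tildes. Assumed signature.
func extractCodeBlockMarker(line string) (marker, lang string) {
	trimmed := strings.TrimSpace(line)
	if trimmed == "" || (trimmed[0] != '`' && trimmed[0] != '~') {
		return "", ""
	}
	n := 0
	for n < len(trimmed) && trimmed[n] == trimmed[0] {
		n++
	}
	if n < 3 {
		return "", ""
	}
	return trimmed[:n], strings.TrimSpace(trimmed[n:])
}

// isMatchingCodeBlockMarker reports whether line closes a block opened with
// openMarker: same character, same count, and nothing after the marker.
func isMatchingCodeBlockMarker(line, openMarker string) bool {
	marker, rest := extractCodeBlockMarker(line)
	return marker != "" && marker == openMarker && rest == ""
}

// codeBlockLines reports, for each line, whether it belongs to a fenced code
// block (the fence lines themselves included). Hypothetical helper.
func codeBlockLines(lines []string) []bool {
	inside := make([]bool, len(lines))
	open := "" // marker of the currently open block, "" when outside
	for i, line := range lines {
		switch {
		case open == "":
			if m, _ := extractCodeBlockMarker(line); m != "" {
				open = m
				inside[i] = true
			}
		case isMatchingCodeBlockMarker(line, open):
			inside[i] = true
			open = ""
		default:
			inside[i] = true
		}
	}
	return inside
}

func main() {
	lines := []string{
		"````python",
		"<!-- inside the block -->",
		"~~~",
		"```",
		"````",
		"<!-- outside the block -->",
	}
	for i, inside := range codeBlockLines(lines) {
		fmt.Printf("%v\t%q\n", inside, lines[i])
	}
}
```

This sketch follows the exact-count rule stated in this PR. CommonMark itself is more lenient and lets a closing fence be longer than the opening one.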

Example

Before (in markdown file):

`````markdown
# My Workflow

This is the main content.

````python
echo "Starting"
<!-- this comment should be preserved in code -->
~~~
still in python block
````

More content here.
`````

After (in generated prompt):

`````markdown
# My Workflow

This is the main content.

````python
echo "Starting"
<!-- this comment should be preserved in code -->
~~~
still in python block
````

More content here.
`````

Implementation Details

  • extractCodeBlockMarker: Returns the marker string (e.g., "```", "~~~~") and the language specifier, for a cleaner API
  • isValidCodeBlockMarker: Checks whether a line is a valid code block opening marker (3+ backticks/tildes)
  • isMatchingCodeBlockMarker: Ensures closing markers match the opening marker type and count
  • Modified the generatePrompt() function in pkg/workflow/compiler.go to process markdown content through XML comment removal
  • Added a comprehensive test suite covering edge cases, including proper code block marker matching
  • No changes to JavaScript code, as requested
  • Maintains full backward compatibility

Testing

  • Added comprehensive test cases covering all edge scenarios, including proper code block marker matching
  • Integration test ensures the feature works end-to-end in workflow compilation
  • All existing tests continue to pass (no regressions)
  • Manual testing confirms XML comments are properly removed while preserving code block integrity

This change allows workflow authors to use XML comments for internal documentation and notes without affecting the final prompts sent to AI engines, while ensuring proper code block boundary detection.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Implement https://github.com/githubnext/gh-aw/issues/822 in the Go compiler that renders the prompt. - Do not modify the JavaScript." to "Remove XML comments from generated prompts (fixes #822)" on Sep 15, 2025
Copilot AI requested a review from pelikhan September 15, 2025 15:10
for _, line := range lines {
// Check for code block markers (``` or ~~~)
trimmedLine := strings.TrimSpace(line)
if strings.HasPrefix(trimmedLine, "```") || strings.HasPrefix(trimmedLine, "~~~") {
code blocks may have 3 or more backticks @copilot

Fixed! Updated the code block detection logic to properly handle code blocks with 3 or more backticks/tildes instead of just exactly 3. Added a new isCodeBlockMarker helper function and comprehensive test coverage for various combinations (4 backticks, 5 tildes, etc.). Commit: 7aee975

… comment)

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI commented Sep 15, 2025

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.contoso.com
    • Triggering command: /tmp/go-build3959406886/b278/cli.test -test.testlogfile=/tmp/go-build3959406886/b278/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@pelikhan

@copilot match the opening marker of a code region with its closing marker, i.e. the closing marker must be the same as the opening marker.

…erent marker types

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI commented Sep 15, 2025

@copilot match the opening marker of a code region with its closing marker, i.e. the closing marker must be the same as the opening marker.

Fixed! Updated the code block matching logic to ensure opening and closing markers match properly. Now ````python can only be closed by ```` (same marker type and count), not by ~~~ or a different backtick count. Added comprehensive test coverage for mixed marker scenarios. Commit: bfab6fb

Copilot AI changed the title from "Remove XML comments from generated prompts (fixes #822)" to "Remove XML comments from generated prompts with proper code block marker matching (fixes #822)" on Sep 15, 2025
}
}
// All characters are backticks
return marker, count, ""
@copilot merge marker,count into returning a string instead

Updated the extractCodeBlockMarker function to return a single string containing the complete marker (e.g., "```", "~~~~") instead of separate marker character and count values. This simplifies the API and makes the code cleaner. Commit: 53c2139

…BlockMarker

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
pelikhan marked this pull request as ready for review on September 15, 2025, 18:17
pelikhan merged commit ac9efda into main on Sep 15, 2025
15 of 17 checks passed
pelikhan deleted the copilot/fix-37209513-5f03-4a5e-af74-e261cb66b76d branch on September 15, 2025, 18:59