-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
feat(task-logger): strip ANSI escape codes from logs and extend coverage #1411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…with_detail() and TaskLogger.tool_end() Changes: - Import strip_ansi_codes from utils in logger.py - Apply strip_ansi_codes to detail parameter in log_with_detail() - Apply strip_ansi_codes to stored_detail in tool_end() - Fix circular import in utils.py using TYPE_CHECKING Co-Authored-By: Claude <noreply@anthropic.com>
…ture.pro - Import strip_ansi_codes from utils module - Apply sanitization to process_text() method before logging - Ensures ANSI escape codes are removed from streaming text output Co-Authored-By: Claude <noreply@anthropic.com>
Backend (Python): - Extend CSI pattern to support private mode parameters (?<>=) - Add None handling to strip_ansi_codes() function - Add sanitization to TaskLogger.log() method (critical gap fix) - Export strip_ansi_codes in task_logger public API - Add 25 comprehensive unit tests for strip_ansi_codes() Frontend (TypeScript): - Extend CSI pattern to support private mode parameters (?<>=) - Apply ANSI sanitization to merge preview error messages in Kanban This ensures clean display of task logs and error messages in the UI by removing terminal color/formatting escape sequences.
Summary of ChangesHello @StillKnotKnown, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses the issue of unreadable task logs and UI messages caused by the presence of ANSI escape codes. It introduces a robust mechanism to strip these terminal formatting codes from log content in the backend Python services before storage, and enhances the existing frontend sanitizer to handle more complex ANSI sequences, applying it to relevant UI elements. The overall impact is cleaner, more readable logs and a better user experience in the UI. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughANSI escape code sanitization was added: a new strip_ansi_codes utility and lifecycle helpers were introduced in the backend; logging and streaming capture now sanitize text before storing; frontend error paths use the sanitizer; comprehensive tests added. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
tests/test_task_logger.py
Outdated
| import os | ||
| import sys | ||
| import tempfile | ||
| from pathlib import Path |
Check notice
Code scanning / CodeQL
Unused import Note test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature by stripping ANSI escape codes from logs, which significantly improves their readability in the UI. The implementation is well-structured, with a new utility function in Python and an update to the existing TypeScript sanitizer. The addition of comprehensive unit and integration tests for the new functionality is excellent and ensures its correctness and robustness.
My review includes a couple of suggestions to improve the performance of the ANSI stripping functions in both Python and TypeScript by combining the multiple regular expressions into a single one. This would make the sanitization process more efficient.
Overall, this is a solid contribution that enhances the user experience and maintainability of the logging system.
| if not text: | ||
| return "" | ||
|
|
||
| # Remove all ANSI escape sequences | ||
| result = ANSI_CSI_PATTERN.sub("", text) | ||
| result = ANSI_OSC_BEL_PATTERN.sub("", result) | ||
| result = ANSI_OSC_ST_PATTERN.sub("", result) | ||
|
|
||
| return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better performance and conciseness, you can combine the three regex patterns (ANSI_CSI_PATTERN, ANSI_OSC_BEL_PATTERN, ANSI_OSC_ST_PATTERN) into a single pattern. This allows performing the substitution in one pass over the string, which is more efficient than three separate passes.
First, you can define a single combined pattern at the top of the file:
# Combined ANSI escape code pattern for CSI and OSC sequences.
ANSI_PATTERN = re.compile(
r"\x1b\[[?><=0-9;]*[A-Za-z]" # CSI sequences
r"|\x1b\][^�]*�" # OSC with BEL terminator
r"|\x1b\][^�]*�\\" # OSC with ST terminator
)Then, this function can be simplified to use it:
| if not text: | |
| return "" | |
| # Remove all ANSI escape sequences | |
| result = ANSI_CSI_PATTERN.sub("", text) | |
| result = ANSI_OSC_BEL_PATTERN.sub("", result) | |
| result = ANSI_OSC_ST_PATTERN.sub("", result) | |
| return result | |
| if not text: | |
| return "" | |
| # Remove all ANSI escape sequences using a single combined pattern | |
| return ANSI_PATTERN.sub("", text) |
Move strip_ansi_codes import from module-level to local imports in logger.py to avoid cyclic import when __init__.py imports both modules. Also use lazy import in get_task_logger() to avoid cyclic import at module level. Fixes NameError in test_planner_session_does_not_trigger_post_session_processing_on_retry
| """ | ||
| # Sanitize content to remove ANSI escape codes before storage | ||
| if content: | ||
| from .utils import strip_ansi_codes |
Check notice
Code scanning / CodeQL
Cyclic import Note
backend.task_logger.utils
|
|
||
| # Sanitize detail content before storage | ||
| if detail: | ||
| from .utils import strip_ansi_codes |
Check notice
Code scanning / CodeQL
Cyclic import Note
backend.task_logger.utils
|
|
||
| # Sanitize detail content before storage | ||
| if stored_detail: | ||
| from .utils import strip_ansi_codes |
Check notice
Code scanning / CodeQL
Cyclic import Note
backend.task_logger.utils
| from typing import TYPE_CHECKING | ||
|
|
||
| if TYPE_CHECKING: | ||
| from .logger import TaskLogger |
Check notice
Code scanning / CodeQL
Cyclic import Note
backend.task_logger.logger
|
|
||
| if _current_logger is None or _current_logger.spec_dir != spec_dir: | ||
| # Lazy import to avoid cyclic import | ||
| from .logger import TaskLogger |
Check notice
Code scanning / CodeQL
Cyclic import Note
backend.task_logger.logger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/backend/task_logger/logger.py (1)
474-495: Tool end summaries can still contain ANSI codes.
display_resultand the composedcontentare stored/emitted without sanitization, so ANSI sequences can still show up in the UI even after detail stripping. Sanitize the summary path as well.✅ Proposed fix
- status = "Done" if success else "Error" - content = f"[{tool_name}] {status}" - if display_result: - content += f": {display_result}" + status = "Done" if success else "Error" + if display_result: + display_result = strip_ansi_codes(display_result) + content = f"[{tool_name}] {status}" + if display_result: + content += f": {display_result}" + content = strip_ansi_codes(content)apps/backend/task_logger/utils.py (1)
61-109: TaskLogger is undefined at runtime due to TYPE_CHECKING-only import.
Line 79 instantiatesTaskLogger, and Line 108 referencesTaskLogger.LOG_FILE, butTaskLoggeris only imported underTYPE_CHECKING, so these will raiseNameErrorat runtime. Use a lazy import inget_task_logger()and avoid direct class reference when updating the log file path.🐛 Proposed fix
def get_task_logger( spec_dir: Path | None = None, emit_markers: bool = True ) -> "TaskLogger | None": @@ if spec_dir is None: return _current_logger + from .logger import TaskLogger if _current_logger is None or _current_logger.spec_dir != spec_dir: _current_logger = TaskLogger(spec_dir, emit_markers) @@ def update_task_logger_path(new_spec_dir: Path) -> None: @@ - _current_logger.log_file = _current_logger.spec_dir / TaskLogger.LOG_FILE + _current_logger.log_file = _current_logger.spec_dir / _current_logger.LOG_FILE
🤖 Fix all issues with AI agents
In `@apps/backend/task_logger/capture.py`:
- Around line 40-43: The current logic checks text.strip() before removing ANSI
codes which allows ANSI-only input to pass and then log empty entries; update
the flow in the method that calls strip_ansi_codes so you call sanitized_text =
strip_ansi_codes(text) first and then only call self.logger.log(sanitized_text,
phase=self.phase) if sanitized_text.strip() is non-empty, replacing the original
text.strip() guard to avoid blank log entries.
In `@apps/backend/task_logger/logger.py`:
- Around line 315-318: The summary string "content" in log_with_detail() is not
sanitized for ANSI codes before storage/emit; update log_with_detail() to call
strip_ansi_codes(content) (just like detail is sanitized) and use the stripped
value everywhere you store/emit (e.g., the same variables passed to DB save/emit
logic and any subsequent functions), referencing the function log_with_detail(),
the variables content and detail, and the helper strip_ansi_codes() so both
summary and detail are consistently cleaned prior to persistence or
broadcasting.
In `@tests/test_task_logger.py`:
- Around line 7-11: Remove the unused imports "tempfile" and "Path" from the
import block (they are currently imported but not used); edit the top-of-file
import statement that includes json, os, sys, tempfile, and Path so it only
imports json, os, and sys to satisfy static analysis and keep the tests clean.
| if text.strip(): | ||
| self.logger.log(text, phase=self.phase) | ||
| # Remove ANSI escape codes before logging | ||
| sanitized_text = strip_ansi_codes(text) | ||
| self.logger.log(sanitized_text, phase=self.phase) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid logging empty entries after sanitization.
Line 40 checks text.strip() before stripping ANSI codes; ANSI-only chunks pass the check but sanitize to empty, producing blank log entries. Sanitize first and then check.
✅ Proposed fix
- if text.strip():
- # Remove ANSI escape codes before logging
- sanitized_text = strip_ansi_codes(text)
- self.logger.log(sanitized_text, phase=self.phase)
+ # Remove ANSI escape codes before logging
+ sanitized_text = strip_ansi_codes(text)
+ if sanitized_text.strip():
+ self.logger.log(sanitized_text, phase=self.phase)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if text.strip(): | |
| self.logger.log(text, phase=self.phase) | |
| # Remove ANSI escape codes before logging | |
| sanitized_text = strip_ansi_codes(text) | |
| self.logger.log(sanitized_text, phase=self.phase) | |
| # Remove ANSI escape codes before logging | |
| sanitized_text = strip_ansi_codes(text) | |
| if sanitized_text.strip(): | |
| self.logger.log(sanitized_text, phase=self.phase) |
🤖 Prompt for AI Agents
In `@apps/backend/task_logger/capture.py` around lines 40 - 43, The current logic
checks text.strip() before removing ANSI codes which allows ANSI-only input to
pass and then log empty entries; update the flow in the method that calls
strip_ansi_codes so you call sanitized_text = strip_ansi_codes(text) first and
then only call self.logger.log(sanitized_text, phase=self.phase) if
sanitized_text.strip() is non-empty, replacing the original text.strip() guard
to avoid blank log entries.
tests/test_task_logger.py
Outdated
| import json | ||
| import os | ||
| import sys | ||
| import tempfile | ||
| from pathlib import Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused imports to keep tests clean.
tempfile and Path are unused and flagged by static analysis.
🧹 Proposed fix
-import tempfile
-from pathlib import Path📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import json | |
| import os | |
| import sys | |
| import tempfile | |
| from pathlib import Path | |
| import json | |
| import os | |
| import sys |
🧰 Tools
🪛 GitHub Check: CodeQL
[notice] 10-10: Unused import
Import of 'tempfile' is not used.
[notice] 11-11: Unused import
Import of 'Path' is not used.
🤖 Prompt for AI Agents
In `@tests/test_task_logger.py` around lines 7 - 11, Remove the unused imports
"tempfile" and "Path" from the import block (they are currently imported but not
used); edit the top-of-file import statement that includes json, os, sys,
tempfile, and Path so it only imports json, os, and sys to satisfy static
analysis and keep the tests clean.
Signed-off-by: StillKnotKnown <stillknotknown@users.noreply.github.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/backend/task_logger/utils.py (1)
109-111: Consider creating anupdate_spec_dir()method onTaskLoggerfor encapsulation.
update_task_logger_pathdirectly mutates_current_logger.spec_dirand_current_logger.log_file. While no validation currently exists, ifTaskLoggeradds property setters or validation logic in the future, this direct mutation would bypass them. A dedicated method likeupdate_spec_dir()would provide a cleaner, more maintainable interface.
🤖 Fix all issues with AI agents
In `@tests/test_task_logger.py`:
- Around line 173-190: The test points out that TaskLogger.log_with_detail
sanitizes the detail field but not the content; update the log_with_detail
implementation (TaskLogger.log_with_detail) to apply the same
ANSI-stripping/sanitization used for detail to the content parameter before
writing to the log and before any console output; reuse the existing sanitize
function or helper (e.g., the function that removes "\x1b" sequences or
strip_ansi) so both content and detail are processed identically and ensure the
stored JSON entry uses the sanitized content.
♻️ Duplicate comments (2)
apps/backend/task_logger/logger.py (1)
316-321: Thecontentparameter is still not sanitized.The past review correctly identified that
content(the summary) inlog_with_detail()is stored on line 325 without ANSI stripping. Onlydetailis being sanitized. Summaries derived from tool output can still leak escape sequences into storage/UI.Proposed fix
phase_key = (phase or self.current_phase or LogPhase.CODING).value - # Sanitize detail content before storage + # Sanitize content and detail before storage + if content: + from .utils import strip_ansi_codes + + content = strip_ansi_codes(content) + if detail: from .utils import strip_ansi_codes detail = strip_ansi_codes(detail)tests/test_task_logger.py (1)
10-10: Remove unusedPathimport.The
Pathimport is not used in this file; all path operations usetmp_path(pytest fixture) oros.path. Static analysis correctly flagged this.Proposed fix
import json import os import sys -from pathlib import Path
- Apply ANSI stripping to both content and detail in log_with_detail() - Remove unused Path import from test file - Add test for content sanitization in log_with_detail() Fixes issue identified by CodeRabbit review where log_with_detail() only sanitized detail but not content, allowing ANSI codes to leak into stored logs and UI components. Signed-off-by: StillKnotKnown <stillknotknown@users.noreply.github.com>
- Apply ANSI stripping to display_result and content in tool_end() - Add test for result and content sanitization in tool_end() Fixes CodeRabbit review comment where tool_end() only sanitized detail but not result/content, allowing ANSI codes to leak into the stored log content and UI components. Signed-off-by: StillKnotKnown <stillknotknown@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/backend/task_logger/logger.py (1)
482-505: Sanitize tool_end() summary/result too.Line 489 builds
contentfromdisplay_resultwithout stripping ANSI codes. If tool summaries include ANSI (common with CLI output), the log entry still stores escape sequences, undermining the “sanitize before storage” requirement. Stripresult(orcontent) before building the entry.🐛 Proposed fix
# Truncate long results for display (increased limit to avoid hiding critical info) - display_result = result - if display_result and len(display_result) > 300: - display_result = display_result[:297] + "..." + from .utils import strip_ansi_codes + + display_result = strip_ansi_codes(result) if result else result + if display_result and len(display_result) > 300: + display_result = display_result[:297] + "..." @@ # Sanitize detail content before storage if stored_detail: - from .utils import strip_ansi_codes - stored_detail = strip_ansi_codes(stored_detail)
…rom-task-logs-in-python-b
Base Branch
Description
Implements ANSI escape code stripping in the Python task logger to remove terminal color/formatting codes from task logs before storage, making logs readable in the UI. Also extends the TypeScript ANSI sanitizer to handle private mode parameters and applies sanitization to merge preview error messages in the Kanban board.
Key Changes:
Related Issue
Closes #47
Type of Change
Area
Commit Message Format
Follow conventional commits: `: `
Types: feat, fix, docs, style, refactor, test, chore
Example: `feat: add user authentication system`
Checklist
Platform Testing Checklist
CRITICAL: This project supports Windows, macOS, and Linux. Platform-specific bugs are a common source of breakage.
If you only have access to one OS: CI now tests on all platforms. Ensure all checks pass before submitting.
CI/Testing Requirements
Screenshots
Feature Toggle
Breaking Changes
Breaking: No
Details:
None - this is a new utility function and sanitization is applied transparently during logging.
Summary by CodeRabbit
Bug Fixes
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.