Sanitize bare angle brackets and quotes in markup stripping#34
Open
assisted-by-ai wants to merge 1 commit intoKicksecure:masterfrom
Open
Sanitize bare angle brackets and quotes in markup stripping#34assisted-by-ai wants to merge 1 commit intoKicksecure:masterfrom
assisted-by-ai wants to merge 1 commit intoKicksecure:masterfrom
Conversation
…utput strip_markup had two security issues: 1. The malicious-input sanitization branch only replaced <, >, & with underscores but not " and ', which could allow HTML attribute injection if output is placed in an attribute context. 2. Bare < and > characters that HTMLParser doesn't treat as tags (e.g., "2 < 3") passed through both strip passes unchanged and were returned unsanitized, potentially dangerous in HTML contexts. Now quotes are sanitized in the malicious branch, and remaining < / > are always sanitized in the final output. https://claude.ai/code/session_01Wjn2KfitiA5iTADcLLfdbR
Contributor
|
Rejected, these changes are out-of-scope. strip-markup is intended to ensure that a string won't be interpreted as HTML markup in isolation. It isn't intended to prevent multiple adjacent strings from forming something weird, and it isn't intended to prepare text for use in an HTML attribute. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Enhanced the markup stripping and sanitization logic to handle additional security edge cases by sanitizing bare angle brackets (
<and>) that are not part of valid HTML tags, and quote characters in malicious input patterns.Key Changes
Expanded character sanitization in
strip_markup_lib.py:"and') in addition to<,>, and&to prevent attribute injection attacks<and>characters that survived the HTML parser (e.g., comparison operators like2 < 3)Added comprehensive test coverage:
_test_bare_angle_bracket_strings()test helper to verify sanitization of bare angle brackets in mathematical/comparison expressions_test_malicious_markup_quote_strings()test helper to verify quote character sanitization in malicious input patternsTestStripMarkupandTestSanitizeStringtest classessanitize_stringtests to reflect the new quote sanitization behaviorImplementation Details
The fix addresses two distinct security concerns:
<and>in expressions such as2 < 3are not recognized as HTML tags by Python's HTMLParser, but could be misinterpreted if the output is later placed in an HTML context<<b>b "onmouseover="alert(1)<</b>/b>) are now fully sanitized to prevent attribute injection attacksBoth sanitization strategies work together: the malicious input branch handles quotes, while the final pass handles any remaining bare angle brackets.
https://claude.ai/code/session_01Wjn2KfitiA5iTADcLLfdbR