Potential fix for code scanning alert no. 3: Inefficient regular expression #13

Open
gruble wants to merge 1 commit into master from alert-autofix-3

@gruble gruble commented Feb 11, 2026

Potential fix for https://github.com/NVE/regobs-ocr/security/code-scanning/3

In general terms, the issue stems from an ambiguous repeated construct: (?:[a-z\u00a1-\uffff0-9]-*)*. Here, the engine can match a run of characters and hyphens in multiple overlapping ways (e.g., it can decide whether to include or exclude a - from one repetition or the next), which leads to exponential backtracking on some failing inputs. To fix this, we need to rewrite the repeated portion into an equivalent expression that does not contain nested, overlapping quantifiers and does not allow multiple ways to match the same substring.

The best minimally invasive fix is to rewrite each (?:[a-z\u00a1-\uffff0-9]-*)* as (?:[a-z\u00a1-\uffff0-9]+-*)*. Adding the + inside lets each repetition consume a whole run of non‑hyphen characters, which is intended to remove the ambiguity where -* could “float” and be assigned to various iterations when backtracking. This keeps the intended semantics (labels consisting of alphanumerics/Unicode letters with optional internal hyphens) while preserving the outer structure of the URL regex. We only touch the flagged domain‑label component; the rest of the pattern remains unchanged.

Concretely, in SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js around line 1394, there are two occurrences of (?:[a-z\u00a1-\uffff0-9]-*)* inside the URL regex (one for the main hostname labels and one for subsequent labels after dots). Both should be updated to (?:[a-z\u00a1-\uffff0-9]+-*)*. No new methods, helpers, or imports are required, as we only adjust the literal regex.
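
The substitution described above can be sketched programmatically. This is a minimal sketch and not part of the PR: it works on the hostname portion of the regex only, and the names `hostSource`, `before`, `after`, and `sample` are hypothetical. It counts the occurrences of the flagged subpattern, applies the replacement, and checks that one ordinary hostname is still accepted.

```javascript
// Hostname portion of the URL regex as it appears before this PR
// (copied from the regex literal; everything else is omitted here).
const hostSource = String.raw`(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))`;

// Subpattern flagged by the alert, and the replacement proposed in this PR.
const before = String.raw`(?:[a-z\u00a1-\uffff0-9]-*)*`;
const after = String.raw`(?:[a-z\u00a1-\uffff0-9]+-*)*`;

// Count literal occurrences, then substitute every one of them.
const occurrences = hostSource.split(before).length - 1;
const patchedSource = hostSource.split(before).join(after);

// Anchor both variants so they must match a whole hostname.
const originalHost = new RegExp("^" + hostSource + "$", "i");
const patchedHost = new RegExp("^" + patchedSource + "$", "i");

const sample = "my-host.example.org";
console.log(occurrences, originalHost.test(sample), patchedHost.test(sample));
```

A check like this only confirms that the edit was applied where expected and that a short, valid hostname still passes; it says nothing about backtracking behavior on long, non-matching input.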

Suggested fixes powered by Copilot Autofix. Review carefully before merging.

…ession

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
// see also https://mathiasbynens.be/demo/url-regex
// modified to allow protocol-relative URLs
return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );

Check failure

Code scanning / CodeQL

Inefficient regular expression High

This part of the regular expression may cause exponential backtracking on strings starting with '//' and containing many repetitions of '0'.
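
One way to see why the alert fires is to count, rather than time, the ways a hyphen-free run of label characters can be divided between the repeated group and the trailing `[a-z\u00a1-\uffff0-9]+`. The sketch below is illustrative only: `countMatchings` is a hypothetical helper, and a real engine may explore fewer paths than this upper bound suggests.

```javascript
// Count the distinct ways a run of n label characters (no hyphens) can be
// matched by (?:X+-*)*X+  (innerPlus = true)  or by
//            (?:X-*)*X+   (innerPlus = false),
// where X stands for [a-z\u00a1-\uffff0-9].
function countMatchings(n, innerPlus) {
  // ways[i] = number of ways the repeated group can consume exactly i chars.
  const ways = new Array(n + 1).fill(0);
  ways[0] = 1; // zero iterations of the group
  for (let i = 1; i <= n; i++) {
    if (innerPlus) {
      // The last iteration may take any non-empty suffix of the i chars.
      for (let len = 1; len <= i; len++) {
        ways[i] += ways[i - len];
      }
    } else {
      // Each iteration takes exactly one character.
      ways[i] = ways[i - 1];
    }
  }
  // The trailing X+ must take at least one character, so the group
  // consumes anywhere from 0 to n - 1 characters.
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += ways[i];
  }
  return total;
}

for (const n of [5, 10, 20]) {
  console.log(n, countMatchings(n, false), countMatchings(n, true));
}
```

Under these assumptions the count grows linearly with the run length when the group takes one character per iteration, and doubles with every extra character once `+` is placed inside the group, which matches the "many repetitions of '0'" wording in the alert.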

Copilot Autofix

AI 13 days ago

In general, to fix inefficient regular expressions you remove ambiguity inside quantified constructs by (1) tightening character classes, (2) avoiding overlapping alternatives inside (...|...) that are both under * / +, and/or (3) making the engine’s choices more deterministic using atomic groups or possessive quantifiers (where available). The aim is that each input prefix has essentially one way to be matched, so the engine does not have to backtrack exponentially.

Here, the problematic subpattern [a-z\u00a1-\uffff0-9]+ is used to match “labels” in the hostname. It appears in two places in the URL regex:

(?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)
...
(?:\.(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)*

The ambiguity arises because - is allowed separately in -* and digits/letters are allowed in the + piece, and those constructs are nested and repeated inside a bigger quantified expression, giving the backtracker a lot of overlapping ways to partition a long run of allowed characters. A classic way to reduce this is to force each label to have a more constrained internal structure and, especially, to avoid having a quantified group like (...)* where the inner part ends with * or + over a class that heavily overlaps with what the “outer” repetition can also consume.

We can keep existing functionality but make the pattern more deterministic by (a) factoring the label definition so that each label is a non‑empty sequence that may contain internal hyphens but doesn’t end with a hyphen, and (b) removing the nested (...)* around label-optional-hyphens that invites backtracking. A minimal, compatible improvement—used in other hardened variants of this regex—is to rewrite the “hostname label” part along these lines:

  • Replace (?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+) with (?:[a-z\u00a1-\uffff0-9](?:[a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9])?).

This still allows labels that:

  • start and end with [a-z\u00a1-\uffff0-9] (no leading/trailing -),
  • and can contain any mix of those characters plus hyphens in between.

It removes the nested (...)* over a subpattern that itself ends with *, which is where exponential backtracking can occur. We apply this replacement everywhere that specific label pattern appears inside the hostname part of the URL regex (twice, as shown). No imports or new helpers are needed; we simply replace the literal regex in the url method inside SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js.
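
The claim that this replacement keeps existing functionality can be spot-checked by brute force. The following is a sketch under stated assumptions: the label fragments are tested in isolation (anchored with `^` and `$`), only the three characters `a`, `0`, and `-` are enumerated, and the helper `strings` and the constant names are hypothetical.

```javascript
const X = String.raw`[a-z\u00a1-\uffff0-9]`;   // label character
const XH = String.raw`[a-z\u00a1-\uffff0-9-]`; // label character or hyphen

// Label as written in the PR commit, and as proposed in this suggestion.
const previousLabel = new RegExp(`^(?:(?:${X}+-*)*${X}+)$`, "i");
const proposedLabel = new RegExp(`^(?:${X}(?:${XH}*${X})?)$`, "i");

// Yield every string over `alphabet` with length 0..maxLen.
function* strings(alphabet, maxLen) {
  let layer = [""];
  yield "";
  for (let len = 1; len <= maxLen; len++) {
    const next = [];
    for (const prefix of layer) {
      for (const ch of alphabet) {
        next.push(prefix + ch);
      }
    }
    yield* next;
    layer = next;
  }
}

// Collect every short string on which the two patterns disagree.
const disagreements = [];
for (const s of strings(["a", "0", "-"], 6)) {
  if (previousLabel.test(s) !== proposedLabel.test(s)) {
    disagreements.push(s);
  }
}
console.log(disagreements.length);
```

The enumeration is kept to length 6 so that the older, backtracking-prone pattern stays cheap to evaluate. An empty `disagreements` list is evidence of equivalence on this small alphabet, not a proof for all input.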

Suggested changeset 1
SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js b/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js
--- a/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js
+++ b/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js
@@ -1391,7 +1391,7 @@
 			// https://gist.github.com/dperini/729294
 			// see also https://mathiasbynens.be/demo/url-regex
 			// modified to allow protocol-relative URLs
-			return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
+			return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:[a-z\u00a1-\uffff0-9](?:[a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9])?)(?:\.(?:[a-z\u00a1-\uffff0-9](?:[a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9])?))*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
 		},
 
 		// https://jqueryvalidation.org/date-method/
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
// see also https://mathiasbynens.be/demo/url-regex
// modified to allow protocol-relative URLs
return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );

Check failure

Code scanning / CodeQL

Inefficient regular expression High

This part of the regular expression may cause exponential backtracking on strings starting with '//0.' and containing many repetitions of '0'.

Copilot Autofix

AI 13 days ago

In general, to fix inefficient regular expressions that can exhibit exponential backtracking, you remove ambiguity between alternatives inside quantified groups and avoid nested quantifiers that can match the same text in multiple ways. This is often done by tightening character classes, refactoring nested groups into simpler non‑ambiguous pieces, or using atomic groups/possessive quantifiers where available (not in standard JavaScript).

For this specific case, the problematic part appears in the host name portion of the URL regex:

... (?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)
    (?:\.(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)*
    (?:\.(?:[a-z\u00a1-\uffff]{2,})) ...

The nested pattern (?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+ is ambiguous because both the outer * and the inner + can consume the same alphanumeric characters in multiple ways. A simpler, non‑ambiguous structure is to treat each label as any run of alphanumerics and hyphens that ends with an alphanumeric, that is, “zero or more of [alnum-] followed by one alnum”, which we can write as [a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9]. Note that this is slightly more permissive than the nested form: it also accepts labels that begin with a hyphen, so it preserves behavior only for labels that start with an alphanumeric. Using this for each label removes the nested, overlapping quantifiers.

So the best low‑impact fix is:

  • Replace each (?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+ with [a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9] in the domain portions of the URL regex.
  • Keep the rest of the pattern unchanged.

Concretely, inside SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js, in the url method’s regex literal on line 1394, there are two such occurrences (one for the initial hostname label, one in the repeated \. group). We replace both with the simplified, non‑ambiguous form, leaving all flags, anchors, and other alternatives intact. No new imports or helper functions are needed; we only adjust the regex literal.
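
Before adopting the simplified form it is worth checking whether it accepts exactly the same labels. The sketch below is a quick check, assuming the label fragments are tested in isolation with `^` and `$` anchors; the sample list and constant names are hypothetical.

```javascript
// Label as written in the PR commit, and the simplified form suggested here.
const previousLabel = /^(?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)$/i;
const simplifiedLabel = /^(?:[a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9])$/i;

// A few short labels, including edge cases around hyphen placement.
const samples = ["example", "my-host", "a--b", "abc-", "-abc", "--"];

const rows = samples.map((label) => ({
  label,
  previous: previousLabel.test(label),
  simplified: simplifiedLabel.test(label),
}));

// Labels on which the two patterns give different answers.
const differing = rows
  .filter((row) => row.previous !== row.simplified)
  .map((row) => row.label);

console.log(rows);
console.log(differing);
```

On these samples the only difference is the label with a leading hyphen, which the simplified pattern accepts and the nested pattern rejects. Whether that matters depends on how strictly the validator is expected to treat hostnames.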

Suggested changeset 1
SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js b/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js
--- a/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js
+++ b/SnowProfileScanner/wwwroot/lib/jquery-validation/dist/jquery.validate.js
@@ -1391,7 +1391,7 @@
 			// https://gist.github.com/dperini/729294
 			// see also https://mathiasbynens.be/demo/url-regex
 			// modified to allow protocol-relative URLs
-			return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
+			return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:[a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9])(?:\.(?:[a-z\u00a1-\uffff0-9-]*[a-z\u00a1-\uffff0-9]))*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
 		},
 
 		// https://jqueryvalidation.org/date-method/
EOF
@gruble gruble marked this pull request as ready for review February 11, 2026 07:57