Fix bug with UTF-8 pattern group name; and document the rules #23876

khwilliamson · 2025-10-25T15:35:47Z

For an illegal group name in UTF-8, the pointer to the problematic character could be pointing to an incomplete character, because the code neglected to consider whether the name was UTF-8 or not.

The description of what a legal name is had not been updated to include UTF-8 names.

This set of changes does not require a perldelta entry. I believe the bug fix is too minor to warrant one, and the documentation changes will shortly be subsumed by further ones that will provide a perldelta entry that would override this one.

jkeenan · 2025-10-25T21:53:17Z

2 tests failing in t/re/reg_mesg.t.


I found this by reading the code. Prior to this commit, the parse pointer was advanced by one byte; it should be advanced by one character. As long as the character was ASCII, things worked. I looked through the regcomp.c source for other misuse of the macro changed by this commit; none were obvious.
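The byte-versus-character distinction described above can be illustrated with a minimal standalone sketch. This is not the code from regcomp.c: `char_len` and `advance_one_char` are hypothetical helpers invented for illustration, and the `is_utf8` flag stands in for however the regex compiler tracks whether the pattern is UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper (not a Perl core macro): number of bytes in the
 * UTF-8 sequence that begins with this lead byte. Returns 1 for ASCII
 * and for invalid lead bytes, so a caller always makes progress. */
static size_t char_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;  /* ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;                             /* continuation or invalid byte */
}

/* Hypothetical helper: advance the parse pointer past one character.
 * For a non-UTF-8 pattern a character is one byte; for a UTF-8 pattern
 * it is the whole multi-byte sequence. Never steps past 'end'. */
static const char *advance_one_char(const char *p, const char *end, int is_utf8)
{
    size_t len;

    if (p >= end)
        return end;

    len = is_utf8 ? char_len((unsigned char)*p) : 1;
    if (len > (size_t)(end - p))
        len = (size_t)(end - p);  /* truncated sequence: stop at the end */

    return p + len;
}
```

With the two-byte sequence for U+00E9 at the start of a buffer, calling `advance_one_char` with `is_utf8` set moves the pointer two bytes, to the start of the next character. Advancing by a single byte instead would leave the pointer on a continuation byte, which is the kind of incomplete character the PR description mentions.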

khwilliamson added 3 commits October 25, 2025 16:49

reg_mesg.t: Only one error per test

6cc07d1

This just fills out a couple of tests so that they don't prematurely end. That makes it clear that the error that does get shown isn't also due to other mistakes in the test.

perldiag: Update description for regex group names

ba00806

This was written before Unicode, and its wording does not accurately extend beyond ASCII. This commit clarifies the description.

khwilliamson force-pushed the regcomp_inc_by_1 branch from 5cd7e05 to bdade0c Compare October 25, 2025 23:36
