Commit cd6ae9e

tulir, turt2live, and richvdh authored
Clarify that arbitrary unicode is allowed in user/room IDs and room aliases (#1506)
Signed-off-by: Tulir Asokan <tulir@maunium.net>
Co-authored-by: Travis Ralston <travisr@matrix.org>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
1 parent a1bdfaa commit cd6ae9e

2 files changed, +20 -3 lines changed
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Clarify that arbitrary unicode is allowed in user/room IDs and room aliases.

content/appendices.md

Lines changed: 19 additions & 3 deletions
@@ -611,10 +611,18 @@ characters permitted in user ID localparts. There are currently active
 users whose user IDs do not conform to the permitted character set, and
 a number of rooms whose history includes events with a `sender` which
 does not conform. In order to handle these rooms successfully, clients
-and servers MUST accept user IDs with localparts from the expanded
-character set:
+and servers MUST accept user IDs with localparts consisting of any legal
+non-surrogate Unicode code points except for `:` and `NUL` (U+0000), including other control
+characters and the empty string.

-extended_user_id_char = %x21-39 / %x3B-7E ; all ASCII printing chars except :
+User IDs with localparts containing characters outside the range U+0021 to U+007E, or with
+an empty localpart, are considered non-compliant. For current room versions, servers must
+still accept events using such user IDs over federation; however they SHOULD NOT forward
+such user IDs to clients when referenced outside the context of an event. For example,
+device list updates from non-compliant user IDs would be dropped by the receiving server.
+
+A future room version may prevent users using a historical character set
+from participating. Use of the historical character set is *deprecated*.

 ##### Mapping from other character sets

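The two tiers introduced in this hunk (what must be accepted versus what counts as non-compliant) can be sketched roughly as follows. This is only an illustration of the wording above, not code from any homeserver; the function names are hypothetical, and the check covers the localpart only, not the full user ID grammar.

```python
def is_acceptable_localpart(localpart: str) -> bool:
    """Historical rule: any non-surrogate code point except ':' and NUL.

    The empty string is accepted, as are other control characters.
    """
    for ch in localpart:
        cp = ord(ch)
        if ch == ":" or cp == 0x0000 or 0xD800 <= cp <= 0xDFFF:
            return False
    return True


def is_non_compliant_localpart(localpart: str) -> bool:
    """Non-compliant: empty, or any character outside U+0021..U+007E."""
    return localpart == "" or any(
        not (0x21 <= ord(ch) <= 0x7E) for ch in localpart
    )


def may_forward_outside_event(localpart: str) -> bool:
    """Hypothetical helper for the SHOULD NOT forward recommendation.

    Returns True only for localparts that are acceptable and not
    flagged as non-compliant by the rule above.
    """
    return is_acceptable_localpart(localpart) and not is_non_compliant_localpart(
        localpart
    )
```

Under this sketch, a localpart such as `alice` passes both checks, while one containing non-ASCII characters is still accepted in events but would not be forwarded outside the context of an event.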
@@ -663,6 +671,11 @@ Room IDs are case-sensitive. They are not meant to be
 human-readable. They are intended to be treated as fully opaque strings
 by clients.

+The localpart of a room ID (`opaque_id` above) may contain any valid
+non-surrogate Unicode code points, including control characters, except `:` and `NUL`
+(U+0000), but it is recommended to only include ASCII letters and
+digits (`A-Z`, `a-z`, `0-9`) when generating them.
+
 The length of a room ID, including the `!` sigil and the domain, MUST
 NOT exceed 255 bytes.

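A generator that follows the recommendation in this hunk might look like the sketch below. The function name, the default of 18 characters, and the use of UTF-8 to count bytes are assumptions made for illustration; the hunk itself only fixes the recommended alphabet and the 255-byte limit.

```python
import secrets
import string

# Recommended alphabet from the text above: A-Z, a-z, 0-9.
ALPHABET = string.ascii_letters + string.digits
MAX_ID_BYTES = 255


def generate_room_id(domain: str, length: int = 18) -> str:
    """Build '!opaque_id:domain' using only ASCII letters and digits.

    `length` is an arbitrary illustrative default, not a spec value.
    """
    opaque_id = "".join(secrets.choice(ALPHABET) for _ in range(length))
    room_id = f"!{opaque_id}:{domain}"
    # The limit covers the sigil, the opaque ID, the colon and the domain.
    # Byte length is measured here as UTF-8 (an assumption).
    if len(room_id.encode("utf-8")) > MAX_ID_BYTES:
        raise ValueError("room ID exceeds 255 bytes")
    return room_id
```

Because clients are expected to treat room IDs as opaque, nothing should rely on the alphabet or length chosen here.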
@@ -676,6 +689,9 @@ The `domain` of a room alias is the [server name](#server-name) of the
 homeserver which created the alias. Other servers may contact this
 homeserver to look up the alias.

+The localpart of a room alias may contain any valid non-surrogate Unicode codepoints
+except `:` and `NUL`.
+
 The length of a room alias, including the `#` sigil and the domain, MUST
 NOT exceed 255 bytes.

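As a rough illustration of the alias rules in this hunk, the check below splits an alias at the first `:` (which cannot appear in the localpart) and applies the character and length constraints. The function name is hypothetical, the byte count assumes UTF-8, and the domain is not validated against the server name grammar.

```python
def is_valid_room_alias(alias: str) -> bool:
    """Check the '#' sigil, localpart characters and the 255-byte limit."""
    if not alias.startswith("#"):
        return False
    # The localpart cannot contain ':', so the first ':' ends it. The
    # domain may itself contain ':' (for example, before a port).
    localpart, sep, domain = alias[1:].partition(":")
    if not sep or not domain:
        return False
    # Reject NUL and surrogate code points in the localpart.
    if any(ord(ch) == 0 or 0xD800 <= ord(ch) <= 0xDFFF for ch in localpart):
        return False
    try:
        encoded = alias.encode("utf-8")  # byte length as UTF-8 (assumption)
    except UnicodeEncodeError:
        # A lone surrogate elsewhere in the string cannot be encoded.
        return False
    return len(encoded) <= 255
```

Note that the limit is in bytes, so an alias with non-ASCII characters reaches it sooner than its character count suggests.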