This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21 Core Issues List revision 116a. See http://www.open-std.org/jtc1/sc22/wg21/ for the official list.
2024-12-19
According to 5.3.1 [lex.charset] paragraph 2,
The character designated by the universal-character-name \UNNNNNNNN is that character whose character short name in ISO/IEC 10646 is NNNNNNNN; the character designated by the universal-character-name \uNNNN is that character whose character short name in ISO/IEC 10646 is 0000NNNN. If the hexadecimal value for a universal-character-name corresponds to a surrogate code point (in the range 0xD800-0xDFFF, inclusive), the program is ill-formed. Additionally, if the hexadecimal value for a universal-character-name outside the c-char-sequence, s-char-sequence, or r-char-sequence of a character or string literal corresponds to a control character (in either of the ranges 0x00-0x1F or 0x7F-0x9F, both inclusive) or to a character in the basic source character set, the program is ill-formed.
It is not specified what should happen if the hexadecimal value does not designate a Unicode code point: is that undefined behavior or does it make the program ill-formed?
As an aside, a note should be added explaining why these requirements apply to to an r-char-sequence when, as the footnote at the end of the paragraph explains,
A sequence of characters resembling a universal-character-name in an r-char-sequence (5.13.5 [lex.string]) does not form a universal-character-name.
Additional note, February, 2021:
This issue was resolved editorially in N4842.