CWG Issue 2333

This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21 Core Issues List revision 117a. See http://www.open-std.org/jtc1/sc22/wg21/ for the official list.

2025-04-13

2333. Escape sequences in UTF-8 character literals

Section: 5.13.3 [lex.ccon]
Status: CD6
Submitter: Mike Miller
Date: 2017-01-05

[Accepted at the November, 2020 meeting as part of paper P2029R4.]

The meaning of a numeric escape appearing in a UTF-8 character literal is not clear. 5.13.3 [lex.ccon] paragraph 3 assumes that the content of the quoted string is a character with an ISO 10646 code point value, which is not necessarily the case with a numeric escape. Paragraph 8 could be read to indicate that a numeric escape specifies the actual runtime value of the object rather than a Unicode code point. In addition, paragraph 8 specifies the result only for unprefixed and wide-character literals, not for UTF-8 literals, so it could be read as indicating that a numeric escape in a UTF-8 character literal has undefined behavior (i.e., is not defined by the Standard).
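As an illustration only, the following sketch contrasts a UTF-8 character literal that names a character directly with one written using a numeric escape. The names plain, escaped, and code_unit_value are hypothetical and do not come from the issue text. The value 0x41 is chosen deliberately: for values below 0x80 the "code point" reading and the "object value" reading give the same result, so the sketch should compile under either interpretation.

```cpp
// Compile as C++17 or later; u8 character literals were introduced in C++17.
// The literal's type is char in C++17 and char8_t in C++20, hence `auto`.

// Names the character with code point U+0041 directly.
constexpr auto plain = u8'A';

// Uses a hexadecimal numeric escape, the construct the issue asks about:
// is 0x41 a code point, or the value stored in the resulting object?
constexpr auto escaped = u8'\x41';

// Hypothetical helper: widen the literal's value for comparison,
// independent of whether the literal type is char or char8_t.
constexpr unsigned code_unit_value(decltype(plain) c) {
    return static_cast<unsigned char>(c);
}

// Below 0x80 a code point and its single UTF-8 code unit coincide,
// so both readings of the escape yield 0x41 here.
static_assert(code_unit_value(plain) == 0x41, "U+0041 encodes as 0x41");
static_assert(code_unit_value(escaped) == code_unit_value(plain),
              "numeric escape matches the directly named character");
```

The two readings would diverge for a value such as 0xFF: taken as an object value it fits in one code unit, whereas taken as the code point U+00FF it would need two UTF-8 code units and could not be represented in a single-code-unit literal. That case is left out of the sketch because its treatment is exactly what the issue leaves open.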

Notes from the August, 2017 teleconference:

An escape sequence in a UTF-8 character literal should be ill-formed.