This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21 Core Issues List revision 115e. See http://www.open-std.org/jtc1/sc22/wg21/ for the official list.
2024-11-11
According to 5.4 [lex.pptoken] paragraph 3,
If the input stream has been parsed into preprocessing tokens up to a given character:
If the next character begins a sequence of characters that could be the prefix and initial double quote of a raw string literal, such as R", the next preprocessing token shall be a raw string literal. Between the initial and final double quote characters of the raw string, any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing) are reverted; this reversion shall apply before any d-char, r-char, or delimiting parenthesis is identified.
However, phase 1 is defined as:
Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. The set of physical source file characters accepted is implementation-defined. Trigraph sequences (_N4140_.2.4 [lex.trigraph]) are replaced by corresponding single-character internal representations. Any source file character not in the basic source character set (5.3 [lex.charset]) is replaced by the universal-character-name that designates that character.
The reversion described in 5.4 [lex.pptoken] paragraph 3 specifically does not mention the replacement of physical end-of-line indicators with new-line characters. Is it intended that, for example, a CRLF in the source of a raw string literal is to be represented as a newline character or as the original characters?