This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21 Core Issues List revision 110b. See http://www.open-std.org/jtc1/sc22/wg21/ for the official list.
[Voted into the WP at the November, 2010 meeting.]N3092 comment US 13
“Raw” strings are still only Pittsburgh-rare strings: the reversion in phase 3 only applies to an r-char-sequence. It should apply to the entire raw string literal.
Proposed resolution (August, 2010):
Change 5.2 [lex.phases] paragraph 1 phase 1 as follows:
...(An implementation may use any internal encoding, so long as an actual extended character encountered in the source file, and the same extended character expressed in the source file as a universal-character-name (i.e., using the \uXXXX notation), are handled equivalently .)
Change 5.2 [lex.phases] paragraph 1 phase 3 as follows:
...[Example: see the handling of < within a #include preprocessing directive. —end example]
Within the r-char-sequence of a raw string literal, any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing) are reverted.
Change 5.2 [lex.phases] paragraph 1 phase 5 as follows:
Each source character set member
and universal-character-namein a character literal or a string literal, as well as each escape sequence in a character literal or a non-raw string literal, is converted to the corresponding member of the execution character set (5.13.3 [lex.ccon], 5.13.5 [lex.string]); if there is no corresponding member, it is converted to an implementation-defined member other than the null (wide) character.
Change 5.3 [lex.charset] paragraph 2 as follows:
...Additionally, if the hexadecimal value for a universal-character-name outside the c-char-sequence, s-char-sequence, or r-char-sequence of a character or string literal corresponds to a control character (in either of the ranges 0x000x1F or 0x7F0x9F, both inclusive) or to a character in the basic source character set, the program is ill-formed.
Change 5.4 [lex.pptoken] paragraph 3 as follows:
If the input stream has been parsed into preprocessing tokens up to a given character:
ifthe next character begins a sequence of characters that could be the prefix and initial double quote of a raw string literal, such as R", the next preprocessing token shall be a raw string literal ;
encoding-prefixopt R raw-string
otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail.
Delete footnote 24 in 5.13.5 [lex.string] paragraph 2:
Use of characters with trigraph equivalents in a d-char-sequence may produce unintended results.
Insert the following examples after 5.13.5 [lex.string] paragraph 4: