Section: 31.13 [re.grammar] Status: New Submitter: Hubert Tong Opened: 2017-06-25 Last modified: 2017-07-12
In N4660 subclause 31.13 [re.grammar] paragraph 5:
The productions ClassAtomExClass, ClassAtomCollatingElement and ClassAtomEquivalence provide functionality equivalent to that of the same features in regular expressions in POSIX.
The broadness of the above statement makes it sound like it is merely a statement of intent; however, this appears to be a necessary normative statement insofar as identifying the general semantics to be associated with the syntactic forms identified. In any case, if it is meant for ClassAtomCollatingElement to provide functionality equivalent to a collating symbol in a POSIX bracket expression, multi-character collating elements need to be considered.
In [re.grammar] paragraph 14:
The behavior of the internal finite state machine representation when used to match a sequence of characters is as described in ECMA-262. The behavior is modified according to any match_flag_type flags specified when using the regular expression object in one of the regular expression algorithms. The behavior is also localized by interaction with the traits class template parameter as follows: [bullets 14.1 to 14.4]
In none of the bullets does the wording handle multi-character collating elements in a clear manner:
14.1 deals in characters.
14.2 deals in characters (traits_inst.translate accepts only a single character).
14.3 might handle a multi-character collating element; however, there is no specification of how such a collating element is to be identified from the sequence of characters. Additionally, the definition of primary equivalence class specifies that it is a set of characters (not of collating elements).
14.4 deals in characters.
The ECMA-262 specification for ClassRanges also deals in characters.
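The interface mismatch described above can be seen directly in the signatures of std::regex_traits<char>. The following is a minimal sketch, not a statement of how any implementation performs matching: translate and translate_nocase map one character to one character, whereas lookup_collatename and transform_primary operate on character ranges and return strings. The helper names collating_element, primary_key and demo are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <type_traits>
#include <utility>

using Traits = std::regex_traits<char>;

// Bullets 14.1 and 14.2: the translation hooks take exactly one character
// and return exactly one character, so they cannot express a
// multi-character collating element.
static_assert(std::is_same<decltype(std::declval<const Traits&>().translate('a')),
                           char>::value,
              "translate maps a single character to a single character");
static_assert(std::is_same<decltype(std::declval<const Traits&>().translate_nocase('a')),
                           char>::value,
              "translate_nocase maps a single character to a single character");

// Hypothetical helper: resolves a collating element name as written inside
// [. .]. The result is a string, so it can in principle hold several
// characters; an unrecognized name yields an empty string.
inline std::string collating_element(const std::string& name) {
    Traits traits;
    return traits.lookup_collatename(name.begin(), name.end());
}

// Hypothetical helper for bullet 14.3: transform_primary accepts a range,
// but nothing here says how the matcher should carve such a range out of
// the input sequence.
inline std::string primary_key(const std::string& text) {
    Traits traits;
    return traits.transform_primary(text.begin(), text.end());
}

// Hypothetical check: a name that is not a collating element resolves to
// an empty string.
inline void demo() {
    assert(collating_element("no_such_element").empty());
}
```

Even where lookup_collatename yields a multi-character string, the sketch gives no way to decide how many input characters should be consumed when comparing against it, which is the gap this issue identifies.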
[2017-07 Toronto Monday issue prioritization]