524. regex named character classes and case-insensitivity don't mix

Section: 31 [re] Status: CD1 Submitter: Eric Niebler Opened: 2005-07-01 Last modified: 2016-02-10

Priority: Not Prioritized

View other active issues in [re].

View all other issues in [re].

View all issues with CD1 status.

Discussion:

This defect is also being discussed on the Boost developers list. The full discussion can be found here: http://lists.boost.org/boost/2005/07/29546.php

-- Begin original message --

Also, I may have found another issue, closely related to the one under discussion. It regards case-insensitive matching of named character classes. The regex_traits<> provides two functions for working with named char classes: lookup_classname and isctype. To match a char class such as [[:alpha:]], you pass "alpha" to lookup_classname and get a bitmask. Later, you pass a char and the bitmask to isctype and get a bool yes/no answer.

But how does case-insensitivity work in this scenario? Suppose we're doing a case-insensitive match on [[:lower:]]. It should behave as if it were [[:lower:][:upper:]], right? But there doesn't seem to be enough smarts in the regex_traits interface to do this.

Imagine I write a traits class which recognizes [[:fubar:]], and the "fubar" char class happens to be case-sensitive. How is the regex engine to know that? And how should it do a case-insensitive match of a character against the [[:fubar:]] char class? John, can you confirm this is a legitimate problem?

I see two options:

1) Add a bool icase parameter to lookup_classname. Then, lookup_classname( "upper", true ) will know to return lower|upper instead of just upper.

2) Add a isctype_nocase function

I prefer (1) because the extra computation happens at the time the pattern is compiled rather than when it is executed.

-- End original message --

For what it's worth, John has also expressed his preference for option (1) above.

Proposed resolution:

Adopt the proposed resolution in N2409.

[ Kona (2007): The LWG adopted the proposed resolution of N2409 for this issue. The LWG voted to accelerate this issue to Ready status to be voted into the WP at Kona. ]