Issue 438: Ambiguity in the "do the right thing" clause

This page is a snapshot from the LWG issues list, see the Library Active Issues List for more information and the meaning of CD1 status.

438. Ambiguity in the "do the right thing" clause

Section: 23.2.4 [sequence.reqmts]
Status: CD1
Submitter: Howard Hinnant
Opened: 2003-10-20
Last modified: 2016-01-28

Priority: Not Prioritized

Discussion:

Section 23.2.4 [sequence.reqmts], paragraphs 9-11, fixed up the problem noticed with statements like:

vector<int> v(10, 1);

The intent of the above statement was to construct with:

vector(size_type, const value_type&);

but early implementations failed to compile as they bound to:

template <class InputIterator>
vector(InputIterator f, InputIterator l);

instead.
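
The mis-binding follows from ordinary overload resolution: for two int arguments the member template deduces InputIterator = int and is an exact match, while the intended constructor needs an int to size_type conversion. A minimal sketch of that effect, assuming a hypothetical toy_vector (not std::vector) and a hypothetical Picked enum whose only job is to record which constructor was selected:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy container (not std::vector): it only records which
// constructor overload resolution selected.
enum class Picked { SizeValue, IteratorRange };

template <class T>
struct toy_vector
{
    typedef std::size_t size_type;

    // The constructor that vector<int> v(10, 1) was meant to call.
    toy_vector(size_type, const T&) : picked(Picked::SizeValue) {}

    // Unconstrained member template. For (int, int) it deduces I = int,
    // an exact match, so it beats the overload above, which needs an
    // int -> size_type conversion for its first parameter.
    template <class I>
    toy_vector(I, I) : picked(Picked::IteratorRange) {}

    Picked picked;
};

// The asserts document what overload resolution does in each case.
inline void demo_binding()
{
    toy_vector<int> v(10, 1);                            // both arguments are int
    assert(v.picked == Picked::IteratorRange);

    toy_vector<int> w(static_cast<std::size_t>(10), 1);  // argument types differ
    assert(w.picked == Picked::SizeValue);               // deduction of I fails
}
```

When the two argument types differ, deduction of I fails and the (size_type, const T&) overload is used, which is why the problem only shows up when both arguments have the same integral type.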

Paragraphs 9-11 say that if InputIterator is an integral type, then the member template constructor will have the same effect as:

vector(static_cast<size_type>(f), static_cast<value_type>(l));

(and similarly for the other member template functions of sequences).

There is also a note that describes one implementation technique:

One way that sequence implementors can satisfy this requirement is to specialize the member template for every integral type.

This might look something like:

template <class T>
struct vector
{
     typedef unsigned size_type;

     explicit vector(size_type) {}
     vector(size_type, const T&) {}

     template <class I>
     vector(I, I);

     // ...
};

template <class T>
template <class I>
vector<T>::vector(I, I) { ... }

template <>
template <>
vector<int>::vector(int, int) { ... }

template <>
template <>
vector<int>::vector(unsigned, unsigned) { ... }

//  ...

Label this solution 'A'.

The standard also says:

Less cumbersome implementation techniques also exist.

A popular technique is to not specialize as above, but instead catch every call with the member template, detect the type of InputIterator, and then redirect to the correct logic. Something like:

template <class T>
template <class I>
vector<T>::vector(I f, I l)
{
     choose_init(f, l, int2type<is_integral<I>::value>());
}

template <class T>
template <class I>
void vector<T>::choose_init(I f, I l, int2type<false>)
{
    // construct with iterators
}

template <class T>
template <class I>
void vector<T>::choose_init(I f, I l, int2type<true>)
{
    size_type sz = static_cast<size_type>(f);
    value_type v = static_cast<value_type>(l);
    // construct with sz,v
}

Label this solution 'B'.
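
A self-contained sketch of the dispatch in solution B may make the later discussion easier to follow. It assumes a hypothetical dispatch_seq class (not std::vector) that stores only a count and one fill value, and it uses std::is_integral with std::true_type/std::false_type in place of the int2type helper, which is a substitution made here for the sake of a compilable example. The expected sizes in the last two asserts are written in terms of the character values themselves:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>

// Hypothetical toy sequence (not std::vector). It keeps only an element
// count and one fill value, enough to observe which path was taken.
template <class T>
class dispatch_seq
{
public:
    typedef std::size_t size_type;
    typedef T value_type;

    explicit dispatch_seq(size_type n = 0)
        : size_(n), fill_(), from_range_(false) {}
    dispatch_seq(size_type n, const T& v)
        : size_(n), fill_(v), from_range_(false) {}

    // Catches every two-argument call with matching argument types, then
    // dispatches on whether I is integral.
    template <class I>
    dispatch_seq(I f, I l) : size_(0), fill_(), from_range_(false)
    {
        choose_init(f, l, std::is_integral<I>());
    }

    size_type size() const { return size_; }
    const T& fill() const { return fill_; }
    bool from_range() const { return from_range_; }

private:
    template <class I>
    void choose_init(I f, I l, std::false_type)   // iterator path
    {
        size_ = static_cast<size_type>(std::distance(f, l));
        from_range_ = true;
    }

    template <class I>
    void choose_init(I f, I l, std::true_type)    // integral path
    {
        size_ = static_cast<size_type>(f);
        fill_ = static_cast<value_type>(l);       // explicit conversions are allowed here
    }

    size_type size_;
    T fill_;
    bool from_range_;
};

inline void demo_dispatch()
{
    dispatch_seq<int> v(10, 1);
    assert(v.size() == 10 && v.fill() == 1 && !v.from_range());

    int a[] = {1, 2, 3};
    dispatch_seq<int> r(a, a + 3);
    assert(r.from_range() && r.size() == 3);

    // Compiles under solution B: 'b' reaches the inner explicit constructor.
    dispatch_seq<dispatch_seq<std::pair<char, char> > > d('a', 'b');
    assert(d.size() == static_cast<std::size_t>('a'));
    assert(d.fill().size() == static_cast<std::size_t>('b'));
}
```

Because the integral path uses static_cast, any explicit conversion from the second argument to value_type is accepted, including an explicit constructor of a nested sequence.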

Both of these solutions solve the case the standard specifically mentions:

vector<int> v(10, 1);  // ok, vector size 10, initialized to 1

However (and here is the problem), the two solutions behave differently in some cases where the value_type of the sequence is not an integral type. For example, consider:

     pair<char, char>                     p('a', 'b');
     vector<vector<pair<char, char> > >   d('a', 'b');

The second line of this snippet is likely an error. Solution A catches the error and refuses to compile. The reason is that there is no specialization of the member template constructor that looks like:

template <>
template <>
vector<vector<pair<char, char> > >::vector(char, char) { ... }

So the expression binds to the unspecialized member template constructor, and then fails at compile time because char is not an InputIterator.

Solution B compiles the above example, though. 'a' is cast to an unsigned integral type and used to size the outer vector. 'b' is static_cast to the inner vector type using its explicit constructor:

explicit vector(size_type n);

and so you end up with a static_cast<size_type>('a') by static_cast<size_type>('b') matrix.

It is certainly possible that this is what the coder intended. But the explicit qualifier on the inner vector has been thwarted at any rate.

The standard is not clear whether the expression:

     vector<vector<pair<char, char> > >   d('a', 'b');

(and similar expressions) is:

1. undefined behavior.
2. illegal and must be rejected.
3. legal and must be accepted.

My preference is listed in the order presented.

There are still other techniques for implementing the requirements of paragraphs 9-11, namely the "restricted template technique" (e.g. enable_if). This technique is the most compact and easiest way of coding the requirements, and has the behavior of #2 (rejects the above expression).
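
One possible shape of the restricted template technique is sketched here, assuming a hypothetical restricted_seq class (not std::vector) and using std::enable_if with std::is_integral from <type_traits>. The restriction shown is only the minimum one (integral types are excluded); a real implementation could restrict further. The static_asserts state the accept/reject behavior as compile-time facts:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>

// Hypothetical toy sequence (not std::vector), reduced to a count and a
// fill value so the chosen constructor can be observed.
template <class T>
class restricted_seq
{
public:
    typedef std::size_t size_type;

    explicit restricted_seq(size_type n = 0)
        : size_(n), fill_(), from_range_(false) {}
    restricted_seq(size_type n, const T& v)
        : size_(n), fill_(v), from_range_(false) {}

    // Restricted template: removed from overload resolution when I is
    // integral, so integral arguments fall through to the
    // (size_type, const T&) overload and get implicit conversions only.
    template <class I,
              class = typename std::enable_if<!std::is_integral<I>::value>::type>
    restricted_seq(I f, I l)
        : size_(static_cast<size_type>(std::distance(f, l))),
          fill_(), from_range_(true) {}

    size_type size() const { return size_; }
    const T& fill() const { return fill_; }
    bool from_range() const { return from_range_; }

private:
    size_type size_;
    T fill_;
    bool from_range_;
};

typedef restricted_seq<restricted_seq<std::pair<char, char> > > nested_seq;

// v(10, 1) is accepted through the (size_type, const T&) constructor.
static_assert(std::is_constructible<restricted_seq<int>, int, int>::value,
              "integral arguments select the size/value constructor");

// d('a', 'b') is rejected: char is not implicitly convertible to the
// inner sequence because its one-argument constructor is explicit.
static_assert(!std::is_constructible<nested_seq, char, char>::value,
              "explicit inner constructor is not bypassed");

inline void demo_restricted()
{
    restricted_seq<int> v(10, 1);
    assert(v.size() == 10 && v.fill() == 1 && !v.from_range());

    int a[] = {1, 2, 3};
    restricted_seq<int> r(a, a + 3);
    assert(r.from_range() && r.size() == 3);
}
```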

Choosing 1 would allow all implementation techniques I'm aware of. Choosing 2 would allow only solution 'A' and the enable_if technique. Choosing 3 would allow only solution 'B'.

Possible wording for a future standard if we wanted to actively reject the expression above would be to change "static_cast" in paragraphs 9-11 to "implicit_cast" where that is defined by:

template <class T, class U>
inline
T implicit_cast(const U& u)
{
     return u;
}
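
The difference between static_cast and implicit_cast can be illustrated with std::vector<int>, whose size constructor is explicit. This is a sketch only: outer_size_from is a hypothetical helper introduced here, and the rejection is expressed through std::is_convertible rather than by writing the ill-formed call, since that call would stop the example from compiling.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// implicit_cast as defined in the issue text.
template <class T, class U>
inline T implicit_cast(const U& u)
{
    return u;
}

// Hypothetical helper: char -> size_t is an implicit conversion, so this
// use of implicit_cast compiles.
inline std::size_t outer_size_from(char c)
{
    return implicit_cast<std::size_t>(c);
}

// static_cast<std::vector<int> >('b') is well-formed because direct
// initialization may use the explicit size constructor...
static_assert(std::is_constructible<std::vector<int>, char>::value,
              "explicit construction from char is possible");

// ...but char is not implicitly convertible to std::vector<int>, so
// implicit_cast<std::vector<int> >('b') would fail to compile.
static_assert(!std::is_convertible<char, std::vector<int> >::value,
              "no implicit conversion from char to vector<int>");
```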

Proposed resolution:

Replace 23.2.4 [sequence.reqmts] paragraphs 9 - 11 with:

For every sequence defined in this clause and in clause lib.strings:

If the constructor

       template <class InputIterator>
       X(InputIterator f, InputIterator l,
         const allocator_type& a = allocator_type())

is called with a type InputIterator that does not qualify as an input iterator, then the constructor will behave as if the overloaded constructor:

       X(size_type, const value_type& = value_type(),
         const allocator_type& = allocator_type())

were called instead, with the arguments static_cast<size_type>(f), l and a, respectively.

If the member functions of the forms:

       template <class InputIterator>          //  such as  insert()
       rt fx1(iterator p, InputIterator f, InputIterator l);

       template <class InputIterator>          //  such as  append(), assign()
       rt fx2(InputIterator f, InputIterator l);

       template <class InputIterator>          //  such as  replace()
       rt fx3(iterator i1, iterator i2, InputIterator f, InputIterator l);

are called with a type InputIterator that does not qualify as an input iterator, then these functions will behave as if the overloaded member functions:

       rt fx1(iterator, size_type, const value_type&);

       rt fx2(size_type, const value_type&);

       rt fx3(iterator, iterator, size_type, const value_type&);

were called instead, with the same arguments.

In the previous paragraph the alternative binding will fail if f is not implicitly convertible to X::size_type or if l is not implicitly convertible to X::value_type.

The extent to which an implementation determines that a type cannot be an input iterator is unspecified, except that as a minimum integral types shall not qualify as input iterators.

[ Kona: agreed that the current standard requires v('a', 'b') to be accepted, and also agreed that this is surprising behavior. The LWG considered several options, including something like implicit_cast, which doesn't appear to be quite what we want. We considered Howard's three options: allow acceptance or rejection, require rejection as a compile-time error, and require acceptance. By straw poll (1-6-1), we chose to require a compile-time error. Post-Kona: Howard provided wording. ]

[ Sydney: The LWG agreed with this general direction, but there was some discomfort with the wording in the original proposed resolution. Howard submitted new wording, and we will review this again in Redmond. ]

[Redmond: one very small change in wording: the first argument is cast to size_t. This fixes the problem of something like vector<vector<int> >(5, 5), where int is not implicitly convertible to the value type.]

Rationale:

The proposed resolution fixes:

  vector<int> v(10, 1);

since as integral types 10 and 1 must be disqualified as input iterators and therefore the (size,value) constructor is called (as if).

The proposed resolution breaks:

  vector<vector<T> > v(10, 1);

because the integral type 1 is not *implicitly* convertible to vector<T>. The wording above requires a diagnostic.

The proposed resolution leaves the behavior of the following code unspecified.

  struct A
  {
    operator int () const {return 10;}
  };

  struct B
  {
    B(A) {}
  };

  vector<B> v((A()), A());  // extra parentheses keep this from parsing as a function declaration

The implementation may or may not detect that A is not an input iterator and employ the (size,value) constructor. Note, though, that in the above example if the B(A) constructor is qualified explicit, then the implementation must reject the constructor as A is no longer implicitly convertible to B.
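
The conversions this last example depends on can be checked in isolation. The sketch below repeats A and B from the rationale and adds ExplicitB, a hypothetical variant whose constructor is qualified explicit; size_from_A is likewise a hypothetical helper. It deliberately does not instantiate a real vector, since whether that construction is accepted is exactly what is left unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct A
{
    operator int() const { return 10; }
};

struct B
{
    B(A) {}
};

// Hypothetical variant of B with an explicit constructor.
struct ExplicitB
{
    explicit ExplicitB(A) {}
};

// A is not an integral type, so an implementation is not required to
// disqualify it as an input iterator.
static_assert(!std::is_integral<A>::value, "A is a class type");

// If the (size,value) constructor is used, both of its arguments convert
// implicitly: A -> size_t (via operator int) and A -> B.
static_assert(std::is_convertible<A, std::size_t>::value, "A converts to a size");
static_assert(std::is_convertible<A, B>::value, "A converts implicitly to B");

// With an explicit constructor the value conversion is no longer implicit,
// so the (size,value) path must be rejected.
static_assert(std::is_constructible<ExplicitB, A>::value &&
              !std::is_convertible<A, ExplicitB>::value,
              "explicit construction only");

// Hypothetical helper showing the size an A argument would yield.
inline std::size_t size_from_A()
{
    return A();   // implicit A -> int -> size_t
}
```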