2985. std::reverse should be permitted to be vectorized

Section: 28.6.10 [alg.reverse] Status: LEWG Submitter: Billy O'Neal III Opened: 2017-06-24 Last modified: 2017-07-12

Priority: Not Prioritized

View all other issues in [alg.reverse].

View all issues with LEWG status.

Discussion:

The fine folks on our backend team suggested that we special case std::reverse of 1/2/4/8 to take advantage of vector units. Unfortunately, at present std::reverse says it does N/2 iter_swaps, which doesn't permit our vector implementation even if the iterator inputs are pointers to trivially copyable Ts.

The vectorized version for pointers to shorts is ~8x faster on Skylake than the serial version, and about 7x faster for unsigned long longs; and users don't actually care whether or not we call swap here.

[2017-07 Toronto Monday issue prioritization]

Status to LEWG; this is similar to 2973

Proposed resolution:

This wording is relative to N4659.

  1. Edit 28.6.10 [alg.reverse] as indicated:

    template<class BidirectionalIterator>
      void reverse(BidirectionalIterator first, BidirectionalIterator last);
    template<class ExecutionPolicy, class BidirectionalIterator>
      void reverse(ExecutionPolicy&& exec,
                   BidirectionalIterator first, BidirectionalIterator last);
    

    -1- Requires: *first shall be swappable (20.5.3.2 [swappable.requirements]).

    -2- Effects: For each non-negative integer i < (last - first) / 2, applies iter_swap to all pairs of iterators first + i, (last - i) - 1. If is_trivially_copyable_v<typename iterator_traits<BidirectionalIterator>::value_type> is true, an implementation may permute the elements by making temporary copies, rather than by calling iter_swap. [Note: this allows the implementation to be vectorized. — end note]