This page is a snapshot from the LWG issues list, see the Library Active Issues List for more information and the meaning of Resolved status.
views::split drops trailing empty range
Section: 25.7.17 [range.split] Status: Resolved Submitter: Barry Revzin Opened: 2020-08-20 Last modified: 2021-06-14
Priority: 2
Discussion:
From StackOverflow, the program:
#include <iostream>
#include <string>
#include <ranges>

int main() {
    std::string s = " text ";
    auto sv = std::ranges::views::split(s, ' ');
    std::cout << std::ranges::distance(sv.begin(), sv.end());
}
prints 2 (as specified), but it really should print 3. If a range has N delimiters in it, splitting should produce N+1 pieces. If the Nth delimiter is the last element in the input range, views::split produces only N pieces: it doesn't emit a trailing empty range.
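As a point of comparison, the N+1 rule can be sketched with a small eager splitter. This is only an illustration of the expected piece count, not the wording or implementation of views::split; the name split_keep_all is hypothetical, and it splits on a single character rather than on an arbitrary pattern.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not a standard library facility): splits `input` on
// `delim` and keeps every piece, so N delimiters always yield N+1 pieces.
std::vector<std::string> split_keep_all(std::string_view input, char delim) {
    std::vector<std::string> pieces;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = input.find(delim, start);
        if (pos == std::string_view::npos) {
            // No delimiter left: the remainder, possibly empty, is the last piece.
            pieces.emplace_back(input.substr(start));
            break;
        }
        pieces.emplace_back(input.substr(start, pos - start));
        start = pos + 1;  // continue just past the delimiter
    }
    return pieces;
}
```

With this sketch, split_keep_all(" text ", ' ') yields three pieces: an empty one, "text", and a trailing empty one.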
Rust, Python, JavaScript, Go, Kotlin, and Haskell's "splitOn" all provide N+1 parts if there were N delimiters.
APL, D, Elixir, Haskell's "words", Ruby, and Clojure all compress all empty words. Splitting " x " on " " would give ["x"] here, whereas the languages in the above group would give ["", "x", ""].
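The difference between the two groups can be sketched as a single splitter with a policy switch. The names EmptyPieces and split_with_policy are hypothetical and exist only for this illustration; the sketch does not reproduce any of the listed languages' actual APIs.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical policy: `keep` models the first group (N+1 pieces),
// `drop` models the second group (empty pieces are discarded).
enum class EmptyPieces { keep, drop };

std::vector<std::string> split_with_policy(std::string_view input, char delim,
                                           EmptyPieces policy) {
    std::vector<std::string> pieces;
    std::size_t start = 0;
    // `start == input.size()` is still visited once so that a trailing
    // delimiter produces a final empty piece under the `keep` policy.
    while (start <= input.size()) {
        std::size_t pos = input.find(delim, start);
        if (pos == std::string_view::npos) {
            pos = input.size();
        }
        const std::string_view piece = input.substr(start, pos - start);
        if (policy == EmptyPieces::keep || !piece.empty()) {
            pieces.emplace_back(piece);
        }
        start = pos + 1;
    }
    return pieces;
}
```

For the input " x " split on ' ', EmptyPieces::keep gives ["", "x", ""] and EmptyPieces::drop gives ["x"], matching the two groups described above.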
Java is distinct from both groups: it is mostly a first-category language, except that by default it removes all trailing empty strings (while keeping all leading and intermediate empty strings, unlike the second-category languages). It does, however, have a parameter that lets you keep the trailing ones too.
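The Java default described above can be modeled as a post-processing step on first-category output. This is a sketch under stated assumptions: trim_trailing_like_java is a hypothetical name, and the keep_trailing flag merely stands in for Java's parameter rather than reproducing its actual interface.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: takes the pieces a first-category split would produce
// and removes every trailing empty piece, leaving leading and intermediate
// empty pieces untouched. `keep_trailing` is an assumed stand-in for the
// parameter that disables this trimming.
std::vector<std::string> trim_trailing_like_java(std::vector<std::string> pieces,
                                                 bool keep_trailing = false) {
    if (!keep_trailing) {
        while (!pieces.empty() && pieces.back().empty()) {
            pieces.pop_back();
        }
    }
    return pieces;
}
```

Applied to ["", "x", "", ""], the default gives ["", "x"], while passing keep_trailing = true returns the pieces unchanged.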
C++20's behavior is closest to Java's default, except that it only removes one trailing empty string instead of every trailing empty string, and this behavior is not parameterizable. But I think the intent is to be squarely in the first category, so I think the current behavior is just a specification error. Many of these languages also provide an additional parameter to limit how many splits happen (e.g. Java, Kotlin, Python, Rust, JavaScript), but that's a separate design question.
[2020-09-02; Reflector prioritization]
Set priority to 2 as result of reflector discussions.
[2021-06-13 Resolved by the adoption of P2210R2 at the June 2021 plenary. Status changed: New → Resolved.]
Proposed resolution: