1175. unordered complexity

Section: 26.2.7 [unord.req] Status: Open Submitter: Pablo Halpern Opened: 2009-07-17 Last modified: 2016-02-10

Priority: 3

View other active issues in [unord.req].

View all other issues in [unord.req].

View all issues with Open status.

Discussion:

When I look at the unordered_* constructors, I think the complexity is poorly described and does not follow the style of the rest of the standard.

The complexity for the default constructor is specified as constant. Actually, it is proportional to n, but there are no invocations of value_type constructors or other value_type operations.

For the iterator-based constructor the complexity should be:

Complexity: exactly n calls to construct value_type from InputIterator::value_type (where n = distance(f,l)). The number of calls to key_equal::operator() is proportional to n in the average case and n*n in the worst case.

[ 2010 Rapperswil: ]

Concern that the current wording may require O(1) where that cannot be delivered. We need to look at both the clause 23 requirements tables and the constructor description of each unordered container to be sure.

Howard suggests NAD Editorial as we updated the container requirement tables since this issue was written.

Daniel offers to look deeper, and hopefully produce wording addressing any outstanding concerns at the next meeting.

Move to Open.

[2011-02-26: Daniel provides wording]

I strongly suggest to clean-up the differences between requirement tables and individual specifications. In the usual way, the most specific specifications wins, which is in this case the wrong one. In regard to the concern expressed about missing DefaultConstructible requirements of the value type I disagree: The function argument n is no size-control parameter, but only some effective capacity parameter: No elements will be value-initialized by these constructors. The necessary requirement for the value type, EmplaceConstructible into *this, is already listed in Table 103 — Unordered associative container requirements. Another part of the proposed resolution is the fact that there is an inconsistency of the complexity counting when both a range and a bucket count is involved compared to constructions where only bucket counts are provided: E.g. the construction X a(n); has a complexity of n bucket allocations, but this part of the work is omitted for X a(i, j, n);, even though it is considerable larger (in the average case) for n ≫ distance(i, j).

[2011-03-24 Madrid meeting]

Move to deferred

[ 2011 Bloomington ]

The proposed wording looks good. Move to Review.

[2012, Kona]

Fix up some presentation issues with the wording, combining the big-O expressions into single expressions rather than the sum of two separate big-Os.

Strike "constant or linear", prefer "linear in the number of buckets". This allows for number of buckets being larger than requested n as well.

Default n to "unspecified" rather than "implementation-defined". It seems an un-necessary burden asking vendors to document a quantity that is easily determined through the public API of these classes.

Replace distance(f,l) with "number of elements in the range [f,l)"

Retain in Review with the updated wording

[2012, Portland: Move to Open]

The wording still does not call out Pablo's original concern, that the element constructor is called no more than N times, and that the N squared term applies to moves during rehash.

Inconsistent use of O(n)+O(N) vs. O(n+N), with a preference for the former.

AJM to update wording with a reference to "no more than N element constructor calls".

Matt concerned that calling out the O(n) requirements is noise, and dangerous noise in suggesting a precision we do not mean. The cost of constructing a bucket is very different to constructing an element of user-supplied type.

AJM notes that if there are multiple rehashes, the 'n' complexity is probably not linear.

Matt suggests back to Open, Pablo suggests potentially NAD if we keep revisitting without achieving a resolution.

Matt suggests complexity we are concerned with is the number of operations, such as constructing elements, moving nodes, and comparing/hashing keys. We are less concerned with constructing buckets, which are generally noise in this bigger picture.

[2015-01-29 Telecon]

AM: essentially correct, but do we want to complicate the spec?

HH: Pablo has given us permission to NAD it

JM: when I look at the first change in the P/R I find it mildly disturbing that the existing wording says you have a constant time constructor with a single element even if your n is 10^6, so I think adding this change makes people aware there might be a large cost in initializing the hash table, even though it doesn't show up in user-visible constructions.

HH: one way to avoid that problem is make the default ctor noexcept. Then the container isn't allowed to create an arbitrarily large hash table

AM: but this is the constructor where the user provides n

MC: happy with the changes, except I agree with the editorial recommendation to keep the two 𝒪s separate.

JW: yes, the constant 'k' is different in 𝒪(n) and 𝒪(N)

GR: do we want to talk about buckets at all

JM: yes, good to highlight that bucket construction might be a significant cost

HH: suggest we take the suggestion to split 𝒪(n+N) to 𝒪(n)+𝒪(N) and move to Tentatively Ready

GR: 23.2.1p2 says all complexity requirements are stated solely in terms of the number of operations on the contained object, so we shouldn't be stating complexity in terms of the hash table initialization

HH: channeling Pete, there's an implicit "unless otherwise specified" everywhere.

VV: seem to be requesting modifications that render this not Tentatively Ready

GR: I think it can't be T/R

AM: make the editorial recommendation, consider fixing 23.2.1/3 to give us permission to state complexity in terms of bucket initialization

HH: only set it to Review after we get new wording to review

[2015-02 Cologne]

Update wording, revisit later.

Proposed resolution:

  1. Modify the following rows in Table 103 — Unordered associative container requirements to add the explicit bucket allocation overhead of some constructions. As editorial recommendation it is suggested not to shorten the sum 𝒪(n) + 𝒪(N) to 𝒪(n + N), because two different work units are involved.

    Table 103 — Unordered associative container requirements (in addition to container)
    Expression Return type Assertion/note pre-/post-condition Complexity
    X(i, j, n, hf, eq)
    X a(i, j, n, hf, eq)
    X
    Effects: Constructs an empty container with at least n
    buckets, using hf as the hash function and eq as the key
    equality predicate, and inserts elements from [i, j) into it.
    Average case 𝒪(n + N) (N is distance(i, j)),
    worst case 𝒪(n) + 𝒪(N2)
    X(i, j, n, hf)
    X a(i, j, n, hf)
    X
    Effects: Constructs an empty container with at least n
    buckets, using hf as the hash function and key_equal() as the key
    equality predicate, and inserts elements from [i, j) into it.
    Average case 𝒪(n + N) (N is distance(i, j)),
    worst case 𝒪(n + N2)
    X(i, j, n)
    X a(i, j, n)
    X
    Effects: Constructs an empty container with at least n
    buckets, using hasher() as the hash function and key_equal() as the key
    equality predicate, and inserts elements from [i, j) into it.
    Average case 𝒪(n + N) (N is distance(i, j)),
    worst case 𝒪(n + N2)
  2. Modify 26.5.4.2 [unord.map.cnstr] p. 1-4 as indicated (The edits of p. 1 and p. 3 attempt to fix some editorial oversight.):

    explicit unordered_map(size_type n = see below,
                           const hasher& hf = hasher(),
                           const key_equal& eql = key_equal(),
                           const allocator_type& a = allocator_type());
    

    1 Effects: Constructs an empty unordered_map using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_map. max_load_factor() returns 1.0.

    2 Complexity: ConstantLinear in the number of buckets.

    template <class InputIterator>
    unordered_map(InputIterator f, InputIterator l,
                  size_type n = see below,
                  const hasher& hf = hasher(),
                  const key_equal& eql = key_equal(),
                  const allocator_type& a = allocator_type());
    

    3 Effects: Constructs an empty unordered_map using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_map. Then inserts elements from the range [f, l). max_load_factor() returns 1.0.

    4 Complexity: Average case linear, worst case quadraticLinear in the number of buckets. In the average case linear in N and in the worst case quadratic in N to insert the elements, where N is equal to number of elements in the range [f,l).

  3. Modify 26.5.5.2 [unord.multimap.cnstr] p. 1-4 as indicated (The edits of p. 1 and p. 3 attempt to fix some editorial oversight.):

    explicit unordered_multimap(size_type n = see below,
                                const hasher& hf = hasher(),
                                const key_equal& eql = key_equal(),
                                const allocator_type& a = allocator_type());
    

    1 Effects: Constructs an empty unordered_multimap using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_multimap. max_load_factor() returns 1.0.

    2 Complexity: ConstantLinear in the number of buckets.

    template <class InputIterator>
    unordered_multimap(InputIterator f, InputIterator l,
                       size_type n = see below,
                       const hasher& hf = hasher(),
                       const key_equal& eql = key_equal(),
                       const allocator_type& a = allocator_type());
    

    3 Effects: Constructs an empty unordered_multimap using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_multimap. Then inserts elements from the range [f, l). max_load_factor() returns 1.0.

    4 Complexity: Average case linear, worst case quadraticLinear in the number of buckets. In the average case linear in N and in the worst case quadratic in N to insert the elements, where N is equal to number of elements in the range [f,l).

  4. Modify 26.5.6.2 [unord.set.cnstr] p. 1-4 as indicated (The edits of p. 1 and p. 3 attempt to fix some editorial oversight.):

    explicit unordered_set(size_type n = see below,
                           const hasher& hf = hasher(),
                           const key_equal& eql = key_equal(),
                           const allocator_type& a = allocator_type());
    

    1 Effects: Constructs an empty unordered_set using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_set. max_load_factor() returns 1.0.

    2 Complexity: ConstantLinear in the number of buckets.

    template <class InputIterator>
    unordered_set(InputIterator f, InputIterator l,
                  size_type n = see below,
                  const hasher& hf = hasher(),
                  const key_equal& eql = key_equal(),
                  const allocator_type& a = allocator_type());
    

    3 Effects: Constructs an empty unordered_set using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_set. Then inserts elements from the range [f, l). max_load_factor() returns 1.0.

    4 Complexity: Average case linear, worst case quadraticLinear in the number of buckets. In the average case linear in N and in the worst case quadratic in N to insert the elements, where N is equal to number of elements in the range [f,l).

  5. Modify 26.5.7.2 [unord.multiset.cnstr] p. 1-4 as indicated (The edits of p. 1 and p. 3 attempt to fix some editorial oversight.):

    explicit unordered_multiset(size_type n = see below,
                                const hasher& hf = hasher(),
                                const key_equal& eql = key_equal(),
                                const allocator_type& a = allocator_type());
    

    1 Effects: Constructs an empty unordered_multiset using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_multiset. max_load_factor() returns 1.0.

    2 Complexity: ConstantLinear in the number of buckets.

    template <class InputIterator>
    unordered_multiset(InputIterator f, InputIterator l,
                       size_type n = see below,
                       const hasher& hf = hasher(),
                       const key_equal& eql = key_equal(),
                       const allocator_type& a = allocator_type());
    

    3 Effects: Constructs an empty unordered_multiset using the specified hash function, key equality function, and allocator, and using at least n buckets. If n is not provided, the number of buckets is unspecifiedimpldefdefault number of buckets in unordered_multiset. Then inserts elements from the range [f, l). max_load_factor() returns 1.0.

    4 Complexity: Average case linear, worst case quadraticLinear in the number of buckets. In the average case linear in N and in the worst case quadratic in N to insert the elements, where N is equal to number of elements in the range [f,l).