This page is a snapshot from the LWG issues list, see the Library Active Issues List for more information and the meaning of New status.

4336. bulk vs. task_scheduler

Section: 33.13.5 [exec.task.scheduler] Status: New Submitter: Dietmar Kühl Opened: 2025-08-31 Last modified: 2025-09-01

Priority: Not Prioritized


Discussion:

Normally, the scheduler type used by an operation can be deduced from the receiver's environment when a sender is connected to a receiver. The body of a coroutine, however, cannot know about the receiver the task sender eventually gets connected to. The implication is that the type of the scheduler used by the coroutine needs to be known when the task is created. To still allow custom schedulers to be used when connecting, the type-erased scheduler task_scheduler is used. However, that leads to surprises when algorithms are customised for a scheduler, as is, e.g., the case for bulk when used with a parallel_scheduler: if bulk is co_awaited within a coroutine using task_scheduler, it will use the default implementation of bulk, which executes the work sequentially, even if the task_scheduler was initialised with a parallel_scheduler (the exact invocation may actually be slightly different or may need to use bulk_chunked or bulk_unchunked, but that isn't the point being made):

struct env {
    auto query(ex::get_scheduler_t) const noexcept { return ex::parallel_scheduler(); }
};
struct work {
    auto operator()(std::size_t s) { /*...*/ }
};

ex::sync_wait(ex::write_env(
    ex::bulk(ex::just(), 16u, work{}),
    env{}
));
ex::sync_wait(ex::write_env(
    []() -> ex::task<void, ex::env<>> { co_await ex::bulk(ex::just(), 16u, work{}); }(),
    env{}
));

The two invocations should probably both execute the work in parallel, but the coroutine version doesn't: it uses the task_scheduler, which doesn't have a specialised version of bulk that could delegate, in a type-erased form, to the underlying scheduler. It is straightforward to move the write_env wrapper inside the coroutine, which fixes the problem in this case, but this need introduces the potential for a subtle performance bug. The problem is sadly not limited to a particular scheduler or a particular algorithm: any scheduler/algorithm combination which may get specialised can suffer from the specialised algorithm not being picked up.

There are a few ways this problem can be addressed (this list of options is almost certainly incomplete):

A user who knows that the main purpose of a coroutine is to execute an algorithm customised for a certain scheduler can use task<T, E> with an environment E specifying exactly that scheduler type. However, this use may be nested within some sender being co_awaited, and users need to be aware that the customisation wouldn't be picked up. Any approach I'm currently aware of has the problem that customised versions of an algorithm are not used for algorithms we are currently unaware of.

Proposed resolution: