This page is a snapshot from the LWG issues list, see the Library Active Issues List for more information and the meaning of New status.

4336. bulk vs. task_scheduler

Section: 33.13.5 [exec.task.scheduler] Status: New Submitter: Dietmar Kühl Opened: 2025-08-31 Last modified: 2025-09-01

Priority: Not Prioritized


Discussion:

Normally, the scheduler type used by an operation can be deduced from the receiver's environment when a sender is connected to a receiver. The body of a coroutine, however, cannot know about the receiver the task sender eventually gets connected to. The implication is that the type of the scheduler used by the coroutine needs to be known when the task is created. To still allow custom schedulers to be used when connecting, the type-erased scheduler task_scheduler is used. However, that leads to surprises when algorithms are customised for a scheduler, as is, e.g., the case for bulk when used with a parallel_scheduler: if bulk is co_awaited within a coroutine using task_scheduler, it will use the default implementation of bulk, which executes the work sequentially, even if the task_scheduler was initialised with a parallel_scheduler (the exact invocation may actually be slightly different or may need to use bulk_chunked or bulk_unchunked, but that isn't the point being made):

struct env {
    auto query(ex::get_scheduler_t) const noexcept { return ex::parallel_scheduler(); }
};
struct work {
    auto operator()(std::size_t s) { /*...*/ }
};

ex::sync_wait(ex::write_env(
    ex::bulk(ex::just(), 16u, work{}),
    env{}
));
ex::sync_wait(ex::write_env(
    []() -> ex::task<void, ex::env<>> { co_await ex::bulk(ex::just(), 16u, work{}); }(),
    env{}
));

The two invocations should probably both execute the work in parallel, but the coroutine version doesn't: it uses the task_scheduler, which doesn't have a specialised version of bulk that could delegate, in a type-erased form, to the underlying scheduler. It is straightforward to move the write_env wrapper inside the coroutine, which fixes the problem in this case, but this need introduces the potential for a subtle performance bug. The problem is sadly not limited to a particular scheduler or a particular algorithm: any scheduler/algorithm combination which may get specialised can suffer from the specialised algorithm not being picked up.

There are a few ways this problem can be addressed (this list of options is almost certainly incomplete):

A user who knows that the main purpose of a coroutine is to execute an algorithm customised for a certain scheduler can use task<T, E> with an environment E specifying exactly that scheduler type. However, this use may be nested within some sender being co_awaited, and users need to be aware that the customisation wouldn't be picked up. Any approach I'm currently aware of has the problem that customised versions of an algorithm are not used for algorithms we are currently unaware of.

Proposed resolution: