This page is a snapshot from the LWG issues list, see the Library Active Issues List for more information and the meaning of LEWG status.
bulk vs. task_scheduler

Section: 33.13.5 [exec.task.scheduler] Status: LEWG Submitter: Dietmar Kühl Opened: 2025-08-31 Last modified: 2025-10-23
Priority: 2
View other active issues in [exec.task.scheduler].
View all other issues in [exec.task.scheduler].
View all issues with LEWG status.
Discussion:
Normally, the scheduler type used by an operation can be deduced
from the receiver's environment when a sender is connected to a
receiver. The body of a coroutine, however, cannot know about the
receiver that the task sender eventually gets connected to. The
implication is that the type of the scheduler used by the coroutine
needs to be known when the task is created.
To still allow custom schedulers to be used when connecting, the
type-erased scheduler task_scheduler is used. However, that leads
to surprises when algorithms are customised for a scheduler, as is,
e.g., the case for bulk when used with a
parallel_scheduler: if bulk is
co_awaited within a coroutine using
task_scheduler, it will use the default implementation
of bulk, which executes the work sequentially, even if
the task_scheduler was initialised with a
parallel_scheduler (the exact invocation may actually
be slightly different or may need to use bulk_chunked or
bulk_unchunked, but that isn't the point being made):
#include <cstddef>
#include <execution>

namespace ex = std::execution;
using std::this_thread::sync_wait; // sync_wait lives in std::this_thread

struct env {
    auto query(ex::get_scheduler_t) const noexcept { return ex::get_parallel_scheduler(); }
};
struct work {
    auto operator()(std::size_t) { /*...*/ }
};

int main() {
    // Plain sender: write_env lets bulk observe the parallel scheduler.
    sync_wait(ex::write_env(
        ex::bulk(ex::just(), 16u, work{}),
        env{}
    ));

    // Coroutine: bulk only observes the coroutine's type-erased task_scheduler.
    sync_wait(ex::write_env(
        []() -> ex::task<void, ex::env<>> { co_await ex::bulk(ex::just(), 16u, work{}); }(),
        env{}
    ));
}
The two invocations should probably both execute the work in parallel,
but the coroutine version doesn't: it uses the task_scheduler,
which doesn't have a specialised version of bulk that could
delegate, in a type-erased form, to the underlying scheduler. It is
straightforward to move the write_env wrapper inside the
coroutine, which fixes the problem in this case (see the sketch
below), but the need to do so introduces the potential for a subtle
performance bug. The problem is sadly not limited to a particular
scheduler or a particular algorithm: any scheduler/algorithm
combination which may get specialised can suffer from the
specialised algorithm not being picked up.
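A hedged sketch of that fix, reusing env and work from the example above (whether the bulk customisation is actually found this way depends on the scheduler's domain machinery):

// Workaround: apply write_env to the awaited sender itself, so bulk's
// receiver environment reports the parallel scheduler instead of the
// coroutine's type-erased task_scheduler.
sync_wait(
    []() -> ex::task<void, ex::env<>> {
        co_await ex::write_env(ex::bulk(ex::just(), 16u, work{}), env{});
    }()
);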
There are a few ways this problem can be addressed (this list of options is almost certainly incomplete):

- … task_scheduler.
- Extend task_scheduler to deal with a set of algorithms for which it provides a type-erased interface. The interface would likely be more constrained and it would use virtual dispatch at run-time. However, the set of covered algorithms would necessarily be limited in some form.
- Disallow the use of algorithms which may be customised with task_scheduler, i.e., customise these algorithms for task_scheduler such that a compile-time error is produced (a hypothetical sketch of this option follows the list).
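A purely hypothetical sketch of the last option; reject_bulk_domain and its wiring into task_scheduler are invented here, though transform_sender, default_domain, and tag_of_t are the standard customisation hooks:

#include <concepts>
#include <execution>
#include <type_traits>
namespace ex = std::execution;

// Hypothetical domain: fires a compile-time error whenever a bulk sender
// would be transformed through the type-erased scheduler.
struct reject_bulk_domain : ex::default_domain {
    template <class Sender, class... Env>
        requires std::same_as<ex::tag_of_t<std::remove_cvref_t<Sender>>, ex::bulk_t>
    static auto transform_sender(Sender&&, const Env&...) {
        static_assert(false, "bulk is not usable through task_scheduler");
    }
};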
A user who knows that the main purpose of a coroutine is to execute
an algorithm customised for a certain scheduler can use task<T,
E> with an environment E specifying exactly
that scheduler type (see the sketch below). However, this use may be
nested within some sender being co_awaited, and users
need to be aware that the customisation wouldn't be picked up in
that case. Every approach I'm currently aware of has the problem
that customised versions of algorithms we are not currently aware
of will not be used.
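A hedged sketch of that approach, again reusing env and work from the earlier example; par_env is invented here and assumes the [exec.task] convention that Environment::scheduler_type selects the task's scheduler type statically:

// Hypothetical environment binding the task to parallel_scheduler.
struct par_env {
    using scheduler_type = ex::parallel_scheduler;
};

sync_wait(ex::write_env(
    []() -> ex::task<void, par_env> {
        // The scheduler type is known statically here, so a bulk
        // customisation for parallel_scheduler can be found.
        co_await ex::bulk(ex::just(), 16u, work{});
    }(),
    env{} // supplies the parallel scheduler required by par_env
));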
[2025-10-23; Reflector poll. Status changed: New → LEWG]
Set priority to 2 after reflector poll. Send to LEWG.
"It seems like there are several kinds of problems with sender algorithm customization that have related causes and might have a single solution."
Proposed resolution: