*This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21
Core Issues List revision 115c.
See http://www.open-std.org/jtc1/sc22/wg21/ for the official
list.*

2024-09-25

Consider:

    int main() {
      constexpr auto x = 3.14f;
      assert( x == 3.14f );          // can fail?
      static_assert( x == 3.14f );   // can fail?
    }

Can a conforming implementation represent a floating-point literal with excess precision, causing the comparisons to fail?

Subclause 5.13.4 [lex.fcon] paragraph 3 specifies:

If the scaled value is not in the range of representable values for its type, the program is ill-formed. Otherwise, the value of a *floating-point-literal* is the scaled value if representable, else the larger or smaller representable value nearest the scaled value, chosen in an implementation-defined manner.
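As an illustration of the "larger or smaller representable value nearest the scaled value" wording, the following sketch checks that a `float` value lies within one representable step of a given scaled value. The helper name `within_one_step` is hypothetical, and a `double` argument stands in for the exact scaled value, which is only an approximation.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper, not taken from the standard.
// Returns true if `scaled` lies strictly between the two float values
// adjacent to `lit`, which holds when `lit` equals `scaled` or is the
// next larger or next smaller float relative to it.
// Assumption: `scaled` is a double approximating the exact scaled value,
// and float has an infinity (true for IEC 60559 implementations).
inline bool within_one_step(float lit, double scaled) {
    const float inf = std::numeric_limits<float>::infinity();
    const float below = std::nextafter(lit, -inf);  // next float toward -inf
    const float above = std::nextafter(lit, inf);   // next float toward +inf
    return below < scaled && scaled < above;
}
```

For example, `within_one_step(3.14f, 3.14)` is expected to hold whichever of the two neighbouring values the implementation chooses for `3.14f`.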

This phrasing leaves little leeway for excess precision. In contrast, C23 (WG14 N3096) specifies in section 6.4.4.2 paragraph 6:

The values of floating constants may be represented in greater range and precision than that required by the type (determined by the suffix); the types are not changed thereby. ...

Subclause 7.1 [expr.pre] paragraph 6 allows excess precision for floating-point computations (including their operands):

The values of the floating-point operands and the results of floating-point expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby. [ Footnote: The cast and assignment operators must still perform their specific conversions as described in 7.6.1.4 [expr.type.conv], 7.6.3 [expr.cast], 7.6.1.9 [expr.static.cast] and 7.6.19 [expr.ass]. -- end footnote ]

Taken together, this means that `314.f / 100.f` may be
computed and represented more precisely than `3.14f`, which is
hard to justify. The footnote appears to imply
that `(float)3.14f` is required to yield a value
with `float` precision, but that conversion (eventually) ends up at 9.4.1 [dcl.init.general] bullet 16.9:

- ...
- Otherwise, the initial value of the object being initialized is the (possibly converted) value of the initializer expression. ...

This phrasing leaves no permission to discard excess precision when
converting from a `float` value to type `float` ("... is
the value...").
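The expressions discussed above can be put side by side in a small sketch. The function names are hypothetical. The sketch assumes that the implementation rounds literals to the nearest representable value; under that assumption, with IEC 60559 arithmetic evaluated strictly in `float`, the quotient and the literal compare equal, whereas an implementation using excess precision might give a different answer.

```cpp
#include <cfloat>
#include <limits>

// 314.f and 100.f are exactly representable, so a correctly rounded float
// division yields the float nearest to 3.14. With excess precision, the
// quotient may instead be closer to 3.14 than any float is.
inline bool quotient_equals_literal() {
    return 314.f / 100.f == 3.14f;
}

// The cast discussed in the footnote, compared against the bare literal.
inline bool cast_equals_literal() {
    return static_cast<float>(3.14f) == 3.14f;
}

// Assumption: FLT_EVAL_METHOD == 0 together with is_iec559 is taken to mean
// that float operations are evaluated in float range and precision.
inline bool evaluates_in_declared_type() {
    return FLT_EVAL_METHOD == 0 && std::numeric_limits<float>::is_iec559;
}
```

On an implementation where `evaluates_in_declared_type()` is true, both comparisons are expected to succeed; the open question in this issue is what may happen otherwise.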

However, if initialization is intended to drop excess precision,
then an overloaded operator returning `float` can never behave
like a built-in operation with excess precision, because returning a value
means initializing the return value.
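A minimal sketch of the overloaded-operator concern follows. The wrapper type `Ratio` and the helper names are hypothetical. The `return` statement initializes a `float` return value from the built-in quotient, so whether excess precision survives the call depends on how initialization is specified.

```cpp
#include <cfloat>

// Hypothetical wrapper type used only for illustration.
struct Ratio {
    float value;
};

// Overloaded operator returning float. The built-in division inside may be
// evaluated with excess precision; the return value is then initialized
// from that result.
inline float operator/(Ratio a, Ratio b) {
    return a.value / b.value;
}

// Compares the overloaded operator against the built-in operation.
inline bool overload_matches_builtin(float n, float d) {
    return Ratio{n} / Ratio{d} == n / d;
}

// Assumption: FLT_EVAL_METHOD == 0 means no excess precision is in use.
inline bool no_excess_precision() {
    return FLT_EVAL_METHOD == 0;
}
```

Without excess precision, the two sides agree for ordinary (non-NaN) operands. With excess precision, they can differ if returning a value discards it while the built-in expression keeps it.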

The C++ standard library inherits the `FLT_EVAL_METHOD`
macro from the C standard library. C23 (WG14 N3096) specifies it as
follows in section 5.2.4.2.2:

- 0: evaluate all operations and constants just to the range and precision of the type;
- 1: evaluate operations and constants of type `float` and `double` to the range and precision of the `double` type, evaluate `long double` operations and constants to the range and precision of the `long double` type;
- 2: evaluate all operations and constants to the range and precision of the `long double` type.
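The three values quoted above can be summarized in a small lookup, sketched here for illustration. The function names and the description strings are hypothetical paraphrases, not standard wording; any value other than 0, 1, or 2 is simply reported as outside the quoted list.

```cpp
#include <cfloat>
#include <string>

// Maps a FLT_EVAL_METHOD value to a short paraphrase of the quoted text.
inline std::string describe_eval_method(int method) {
    switch (method) {
    case 0:
        return "operations and constants use the range and precision of their type";
    case 1:
        return "float and double use double; long double uses long double";
    case 2:
        return "all operations and constants use long double";
    default:
        return "not one of the three values quoted above";
    }
}

// Describes the value reported by the current implementation.
inline std::string describe_this_implementation() {
    return describe_eval_method(FLT_EVAL_METHOD);
}
```

Calling `describe_this_implementation()` shows which evaluation method a given compiler and target report, which is the value at stake in the conclusion that follows.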

Taken together, a conforming C++ implementation cannot
define `FLT_EVAL_METHOD` to 1 or 2, because literals
("constants" in C terminology) cannot be represented with excess precision in C++.

**Additional notes (June, 2023)**

Forwarded to EWG via cplusplus/papers#1584, by decision of the CWG chair.