2381. Inconsistency in parsing floating point numbers

Section: 25.4.2.1.2 [facet.num.get.virtuals] Status: Open Submitter: Marshall Clow Opened: 2014-04-30 Last modified: 2016-09-08

Priority: 2

Discussion:

In 25.4.2.1.2 [facet.num.get.virtuals] we have:

Stage 3: The sequence of chars accumulated in stage 2 (the field) is converted to a numeric value by the rules of one of the functions declared in the header <cstdlib>:

This implies that for many cases, this routine should return true:

#include <cmath>    // std::isinf, std::isnan
#include <cstdlib>  // std::strtod
#include <sstream>  // std::stringstream
#include <string>   // std::string

bool is_same(const char* p)
{
  std::string str{p};
  double val1 = std::strtod(str.c_str(), nullptr);
  std::stringstream ss(str);
  double val2;
  ss >> val2;
  return std::isinf(val1) == std::isinf(val2) &&                 // either they're both infinity
         std::isnan(val1) == std::isnan(val2) &&                 // or they're both NaN
         (std::isinf(val1) || std::isnan(val1) || val1 == val2); // or they're equal
}

and this is indeed true, for many strings:

assert(is_same("0"));
assert(is_same("1.0"));
assert(is_same("-1.0"));
assert(is_same("100.123"));
assert(is_same("1234.456e89"));

but not for others:

assert(is_same("0xABp-4")); // hex float
assert(is_same("inf"));
assert(is_same("+inf"));
assert(is_same("-inf"));
assert(is_same("nan"));
assert(is_same("+nan"));
assert(is_same("-nan"));

assert(is_same("infinity"));
assert(is_same("+infinity"));
assert(is_same("-infinity"));

These are all strings that are correctly parsed by std::strtod, but not by the stream extraction operators. They contain characters that are deemed invalid in stage 2 of parsing.

If we're going to say that we're converting by the rules of strtold, then we should accept all the things that strtold accepts.
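The strtod half of this claim can be checked on its own. The following is a minimal sketch, not part of the issue: strtod_consumes_all is a hypothetical helper that uses strtod's end pointer to report whether the C library treats the whole string as a single floating-point field.

```cpp
#include <cstdlib>

// Hypothetical helper (an assumption for illustration): true if std::strtod
// performs a conversion and stops only at the terminating null character.
bool strtod_consumes_all(const char* s)
{
  char* end = nullptr;
  std::strtod(s, &end);
  return end != s && *end == '\0';
}
```

With a C99-conforming strtod, this should return true for "0xABp-4", "inf", "-infinity" and "+nan", and false for a string with trailing characters such as "1.0abc".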

[2016-04, Issues Telecon]

People are much more interested in round-tripping hex floats than handling inf and nan. Priority changed to P2.

Marshall says he'll try to write some wording, noting that this is a very closely specified part of the standard, and has remained unchanged for a long time. Also, there will need to be a sample implementation.

[2016-08, Chicago]

Zhihao provides wording

The src array in Stage 2 does narrowing only. The actual input validation is delegated to strtold (independent from the parsing in Stage 3 which is again being delegated to strtold) by saying:

[...] If it is not discarded, then a check is made to determine if c is allowed as the next character of an input field of the conversion specifier returned by Stage 1.

So a conforming C++11 num_get is supposed to magically accept a hexfloat without an exponent

0x3.AB

because we refer to C99, and the fix to this issue should be just expanding the src array.

Support for Infs and NaNs is not proposed because of the complexity of nan(n-chars).
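For readers unfamiliar with the nan(n-chars) form, here is a small sketch of what strtod accepts. parses_as_nan is a hypothetical helper, and the sketch assumes a C99-conforming C library.

```cpp
#include <cmath>
#include <cstdlib>

// Hypothetical helper (an assumption for illustration): true if std::strtod
// converts the entire string and the result is a NaN.
bool parses_as_nan(const char* s)
{
  char* end = nullptr;
  double v = std::strtod(s, &end);
  return std::isnan(v) && end != s && *end == '\0';
}
```

Both "nan" and "nan(123)" should satisfy this check. Accepting the second form in Stage 2 would require tracking parentheses and the characters allowed between them, which a flat src array cannot express.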

[2016-08, Chicago]

Tues PM: Move to Open

[2016-09-08, Zhihao Yuan comments and updates proposed wording]

Examples added.

Proposed resolution:

This wording is relative to N4606.

  1. Change 25.4.2.1.2 [facet.num.get.virtuals]/3 Stage 2 as indicated:

    static const char src[] = "0123456789abcdefpxABCDEFPX+-";
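To show what the expanded array changes, the narrowing step of Stage 2 can be sketched as below. This is an illustration modeled on the standard's description, not proposed wording: narrow_stage2 is a hypothetical name, and the decimal-point and thousands-separator handling of Stage 2 is omitted.

```cpp
#include <algorithm>
#include <locale>

// The src array as proposed in this issue.
static const char src[] = "0123456789abcdefpxABCDEFPX+-";

// Hypothetical helper (an assumption for illustration): maps a character of
// the stream's char_type back to the basic character it stands for, or to
// '\0' if it is not one of the characters listed in src.
template <class charT>
char narrow_stage2(charT ct, const std::locale& loc = std::locale::classic())
{
  charT atoms[sizeof(src)];
  // Widen every character of src (including the terminator) into atoms.
  std::use_facet<std::ctype<charT>>(loc).widen(src, src + sizeof(src), atoms);
  // If ct is not found, the index is sizeof(src) - 1, which selects '\0'.
  return src[std::find(atoms, atoms + sizeof(src) - 1, ct) - atoms];
}
```

With this array, narrow_stage2('p') yields 'p', so a binary exponent marker survives narrowing and can be validated against the conversion specifier. A character outside the array, such as 'z', still maps to '\0'.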

  2. Append the following examples to 25.4.2.1.2 [facet.num.get.virtuals]/3 Stage 2 as indicated:

    [Example:

    Given an input sequence of "0x1a.bp+07p",

    • if Stage 1 returns %d, "0" is accumulated;

    • if Stage 1 returns %i, "0x1a" is accumulated;

    • if Stage 1 returns %g, "0x1a.bp+07" is accumulated.

    In all cases, the rest is left in the input.

    — end example]
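The accumulated fields in this example match the prefixes that the corresponding C functions consume. The sketch below is only an analogy through strtol and strtod, not an implementation of num_get; the consumed_d, consumed_i and consumed_g helpers are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helpers (assumptions for illustration): each returns the
// prefix of s consumed by the C function matching a conversion specifier.

std::string consumed_d(const char* s)  // %d: decimal integer
{
  char* end = nullptr;
  std::strtol(s, &end, 10);
  return std::string(s, static_cast<std::size_t>(end - s));
}

std::string consumed_i(const char* s)  // %i: base detected from the prefix
{
  char* end = nullptr;
  std::strtol(s, &end, 0);
  return std::string(s, static_cast<std::size_t>(end - s));
}

std::string consumed_g(const char* s)  // %g: floating point
{
  char* end = nullptr;
  std::strtod(s, &end);
  return std::string(s, static_cast<std::size_t>(end - s));
}
```

For the input "0x1a.bp+07p", a C99-conforming library should give "0", "0x1a" and "0x1a.bp+07" respectively, leaving the trailing characters unconsumed in each case.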