2524. generate_canonical can occasionally return 1.0

Section: 29.6.8.4.2 [rand.dist.pois.exp] Status: Open Submitter: Michael Prähofer Opened: 2015-08-20 Last modified: 2017-11-10

Priority: 2

View all issues with Open status.

Discussion:

Original title was: exponential_distribution<float> sometimes returns inf.

The random number distribution class template exponential_distribution<float> may return "inf" as can be seen from the following example program:

// compiled with
// g++ -std=c++11 Error_exp_distr.cpp

#include <iostream>
#include <random>
#include <bitset>

int main(){
  unsigned long long h;
  std::mt19937_64 mt1(1);
  std::mt19937_64 mt2(1);
  mt1.discard(517517);
  mt2.discard(517517);
  std::exponential_distribution<float> dis(1.0);
  h = mt2();
  std::cout << std::bitset<64>(h) << " " << (float) -log(1 - h/pow(2, 64)) << " " 
            << -log(1 - (float) h/pow(2, 64)) << " " << dis(mt1) << std::endl;
  h = mt2();
  std::cout << std::bitset<64>(h) << " " << (float) -log(1 - h/pow(2, 64)) << " " 
            << -log(1 - (float) h/pow(2, 64)) << " " << dis(mt1) << std::endl;
}

output:

0110010110001001010011000111000101001100111110100001110011100001 0.505218 0.505218 0.505218
1111111111111111111111111101010011000110011110011000110101100110 18.4143 inf inf

The reason seems to be that converting a double x in the range [0, 1) to float may result in 1.0f if x is close enough to 1. I see two possibilities to fix that:

  1. use internally double (or long double?) and then convert the result at the very end to float.

  2. take only 24 random bits and convert them to a float x in the range [0, 1) and then return -log(1 - x).

I have not checked if std::exponential_distribution<double> has the same problem: For float on the average 1 out of 224 (~107) draws returns "inf", which is easily confirmed. For double on the average 1 out of 253 (~1016) draws might return "inf", which I have not tested.

Marshall:
I don't think the problem is in std::exponential_distribution; but rather in generate_canonical.

Consider:

std::mt19937_64 mt2(1);
mt2.discard(517517);
std::cout << std::hexfloat << std::generate_canonical<float, std::numeric_limits<float>::digits>(mt2) << std::endl;
std::cout << std::hexfloat << std::generate_canonical<float, std::numeric_limits<float>::digits>(mt2) << std::endl;
std::cout << std::hexfloat << std::generate_canonical<float, std::numeric_limits<float>::digits>(mt2) << std::endl;

which outputs:

0x1.962532p-2
0x1p+0
0x1.20d0cap-3
but generate_canonical is defined to return a result in the range [0, 1).

[2015-10, Kona Saturday afternoon]

Options:

WEB: The one thing we cannot tolerate is any output range other than [0, 1).

WEB: I believe there may be a documented algorithm for the generator, and perhaps it's possible to discover en-route that the algorithm produces the wrong result and fix it.

MC: No. I analyzed this once, and here it is: the algorithm is in [rand.util.canonical], and it's all fine until p5. The expression S/R^k is mathematically less than one, but it may round to one.

GR: Could we change the rounding mode for the computation?

HH: No, because the rounding mode is global, not thread-local.

AM: SG1 wants to get rid of the floating point environment.

STL: The problem is that the standard specifies the implementation, and the implementation doesn't work.

MC: I'm not sure if nudging it down will introduce a subtle bias.

EF: I worry about how the user's choice of floating point environment affects the behaviour.

MS offers to run the topic past colleagues.

MC: Will set the status to open. STL wants to rename the issue. WEB wants to be able to find the issue by its original name still.

Mike Spertus to run the options past his mathematical colleagues, and report back.

[2017-11 Albuquerque Wednesday issue processing]

Choice: Rerun the algorithm if it gets 1.0.

Thomas K to provide wording; Marshall and STL to review.

Proposed resolution: