
Conversation

@umar456
Member

@umar456 umar456 commented Jul 30, 2020

The randn function was generating low-quality floating-point random numbers. This PR fixes an issue with the Box-Muller parameters for floating-point numbers.

Description

This bug fix addresses duplicate values when generating random floating-point numbers. The issue was caused by the boxMullerTransform function being passed incorrect values for the second pair of floating-point numbers.

This PR also enables the chi-squared tests for the random number generators. These are heavier tests, but we should start testing the quality of the output to avoid issues like this. We are unfortunately not yet testing the randn function, but it's a step in the right direction. @rstub How difficult would it be to implement a similar test for the normal distribution?

Fixes: #2972

Changes to Users

N/A

Checklist

  • Rebased on latest master
  • Code compiles
  • Tests pass
  • [ ] Functions added to unified API
  • [ ] Functions documented

umar456 added 2 commits July 30, 2020 01:43
* boxMuller for float randn was being passed incorrect portions
of the counter/seed
The Chi2 tests measure the quality of the random number generator. This test is too expensive to run on the CPU.
@umar456 umar456 force-pushed the randn_fix branch 4 times, most recently from 09bbcdf to 71aff29 Compare July 31, 2020 21:21
@rstub
Contributor

rstub commented Jul 31, 2020

Concerning the chi-squared test: That is also possible for the normal distribution:

Another test variant is the χ² test, which was also used by Leong et al. (2005). The basic idea is as follows:

  1. The real line is divided into B bins, equally spaced between (symmetric) values distant enough from zero so that no N(0, 1) draw should exceed them.
  2. Here, we follow Leong et al. (2005) and use a range from -7 to 7 with a total of 200 bins.
  3. A large number of N(0, 1) random variates is drawn, and for each of these numbers a counter in the bin corresponding to the draw is increased.
  4. After the N draws, the empirical distribution is compared to the theoretical one (given by the corresponding values of the normal density function) using a standard χ² test.

(source: https://cran.r-project.org/web/packages/RcppZiggurat/vignettes/RcppZiggurat.pdf)

I have tried to code this as follows:

#include <cassert>
#include <cmath>    // std::sqrt for the scalar argument in cnd()
#include <cstdlib>  // atoi
#include <iomanip>
#include <iostream>

#include <arrayfire.h>
#include <af/traits.hpp>

using af::array;
using af::randomEngine;
using af::randomEngineType;

template <typename T>
T chi2_statistic(array input, array expected) {
    expected *= af::sum<T>(input) / af::sum<T>(expected);
    array diff = input - expected;
    return af::sum<T>((diff * diff) / expected);
}

// should be used only for x <= 5 (roughly)
array cnd(array x)
{
    return 0.5 * erfc(- x * sqrt(0.5));
}

template <typename T>
bool testRandomEngineNormalChi2(randomEngineType type)
{
    af::dtype ty = (af::dtype)af::dtype_traits<T>::af_type;

    int elem = 1024*1024;
    int steps = 256*32;
    int bins = 100;
    T lower_edge = -7.0;
    T upper_edge = 7.0;

    array total_hist = af::constant(0.0, 2 * bins, ty);
    array edges = af::seq(bins + 1)/bins * lower_edge;
    array expected = -af::diff1(cnd(edges));
    expected = af::join(0, expected(af::seq(bins-1, 0, -1)), expected);

    af::randomEngine r(type, 0);

    // R> qchisq(c(5e-6, 1 - 5e-6), 197)
    // [1] 121.3197 297.2989
    T lower = 121.3197;
    T upper = 297.2989;

    bool prev_step = true;
    bool prev_total = true;
    af::setSeed(0x76fa214467690e3c);
    std::cout << std::setw(4) << "step"
              << std::setw(7) << "chi2_i"
              << std::setw(7) << "chi2_t"
              << std::setprecision(2) << std::fixed << std::endl;
    for (int i = 0; i < steps; ++i) {
        array step_hist = af::histogram(af::randn(elem, ty, r), 2 * bins, lower_edge, upper_edge);
        T step_chi2 = chi2_statistic<T>(step_hist, expected);
        std::cout << std::setw(4) << i
                  << std::setw(7) << step_chi2;
        bool step = step_chi2 > lower && step_chi2 < upper;
        if (!(step || prev_step)) return false;
        prev_step = step;

        total_hist += step_hist;
        T total_chi2 = chi2_statistic<T>(total_hist, expected);
        std::cout << std::setw(7) << total_chi2
                  << std::endl;
        bool total = total_chi2 > lower && total_chi2 < upper;
        if (!(total || prev_total)) return false;
        prev_total = total;
    }
    return true;
}



int main(int argc, char ** argv) {
    int backend = argc > 1 ? atoi(argv[1]) : 0;
    af::setBackend(static_cast<af::Backend>(backend));
    int device = argc > 2 ? atoi(argv[2]) : 0;
    af::setDevice(device);
    af::info();

    std::cout << "PHILOX" << std::endl;
    if (!testRandomEngineNormalChi2<float>(AF_RANDOM_ENGINE_PHILOX))
        std::cout << "FAILURE" << std::endl;
    std::cout << "THREEFRY" << std::endl;
    if (!testRandomEngineNormalChi2<float>(AF_RANDOM_ENGINE_THREEFRY))
        std::cout << "FAILURE" << std::endl;
    std::cout << "MERSENNE" << std::endl;
    if (!testRandomEngineNormalChi2<float>(AF_RANDOM_ENGINE_MERSENNE))
        std::cout << "FAILURE" << std::endl;

    if (af::isDoubleAvailable(device)) {
        std::cout << "PHILOX" << std::endl;
        if (!testRandomEngineNormalChi2<double>(AF_RANDOM_ENGINE_PHILOX))
            std::cout << "FAILURE" << std::endl;
        std::cout << "THREEFRY" << std::endl;
        if (!testRandomEngineNormalChi2<double>(AF_RANDOM_ENGINE_THREEFRY))
            std::cout << "FAILURE" << std::endl;
        std::cout << "MERSENNE" << std::endl;
        if (!testRandomEngineNormalChi2<double>(AF_RANDOM_ENGINE_MERSENNE))
            std::cout << "FAILURE" << std::endl;
    }
}

Getting the expected counts correct is a bit tricky. One needs to go far from the mean, where the density of the normal distribution is almost zero. I am computing the probability mass within these ranges via differences of the cumulative distribution function (c.f. #2078), but that only works well for values close to zero, not close to one. I am sure there are more elegant ways to do this ...

Anyway, for me this fails very quickly

 ArrayFire v3.8.0 (OpenCL, 64-bit Linux, build 64855cbd)
[0] INTEL: Intel(R) Gen8 HD Graphics NEO, 9399 MB
-1- BEIGNET: Intel(R) HD Graphics 5500 BroadWell U-Processor GT2, 4096 MB
-2- POCL: pthread-Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, 9701 MB
PHILOX
step chi2_i chi2_t
   0 287.13 287.13
   1 454.45 371.92
   2 371.33FAILURE
THREEFRY
step chi2_i chi2_t
   0 355.77 355.77
   1 464.06FAILURE
MERSENNE
step chi2_i chi2_t
   0 367.78 367.78
   1 350.76FAILURE

I am not sure if this indicates an error in my code or is an example of the problem you have identified here. I have not tried to build a patched version.

@umar456
Member Author

umar456 commented Aug 1, 2020

@rstub This is great. I will work with this and see if I can improve the results. I am able to go further (it fails after 13 steps) with the latest changes, but not by much. Thanks for your input!

@umar456 umar456 force-pushed the randn_fix branch 2 times, most recently from bec40dc to e8a1e62 Compare August 7, 2020 04:41
@umar456
Member Author

umar456 commented Aug 7, 2020

@rstub I pushed my latest changes to this branch and the normal distributions look better. The issue was with the Box-Muller transform function we had been using to generate the normal distribution. For floats we were passing in the wrong arguments, and there was a subtle bug in one of the conversion methods. I made a couple of changes to the test you posted above, and I would appreciate it if you could run and review the tests.

@willat343 Could you also give this branch a try and give me any feedback on the results?

@9prady9 9prady9 added this to the 3.7.3 milestone Aug 7, 2020
@rstub
Contributor

rstub commented Aug 7, 2020

I have checked out the branch. Works very well! I have only tested with float, though. I have added some nitpicky comments above.

BTW, good that you switched back to int elem = 256 * 1024 * 1024;. The lower number of elements was an artifact of my testing, and with fewer elements it becomes more likely to get "odd" results in one of the tries.

rstub
rstub previously approved these changes Aug 7, 2020
@umar456
Member Author

umar456 commented Aug 7, 2020

Thanks @rstub. I was curious about your choice for the dof as well; your explanation makes sense. Can you give me your reasoning behind the probability values you chose for that function? I am trying to figure out whether those values apply to other types like f16 or need to be modified for the limited precision of that type. The actual distribution for f16 looks reasonable when you print out the histogram of the random number array, but it quickly fails the chi-squared test. I suspect it is because of the reduced resolution of the type.

@umar456 umar456 force-pushed the randn_fix branch 2 times, most recently from 9f96a9f to 036470f Compare August 7, 2020 18:36
@rstub
Contributor

rstub commented Aug 9, 2020

@umar: The range of χ² is just a heuristic: large enough not to trigger spurious failures in the cases I looked at, but small enough to catch the errors identified at that time (c.f. #2007).

Have you tried running the test for f16 for a longer period without the failure conditions? What do the local and global test statistics look like when plotted as a function of the iteration? See https://cran.r-project.org/web/packages/RcppZiggurat/vignettes/RcppZiggurat.pdf for an example of such graphs.

@umar456 umar456 force-pushed the randn_fix branch 3 times, most recently from 3900496 to 6aab251 Compare August 12, 2020 13:41
Move the RNG quality checks to a separate file to avoid memory allocation
failures caused by the random tests running in parallel with other
tests. These tests are marked as SERIAL so they run serially.
* Fix rounding of some operations by using fused operations like sincospi
  and fma instead of a separate multiply and add.
* Convert half constants to hex values and use __ushort_as_half
  to avoid redundant conversions from float
* Pass integers instead of pointers in the OpenCL backend of
  the rng functions
@9prady9 9prady9 merged commit fe0c8d5 into arrayfire:master Aug 12, 2020

Development

Successfully merging this pull request may close these issues.

af::randn creates duplicates with CUDA backend [BUG]
