
Conversation

@umar456
Member

@umar456 umar456 commented Jul 30, 2020

The randn function was generating low-quality floating-point random numbers. This PR fixes an issue with the Box-Muller parameters for floating-point numbers.

Description

This bug fix addresses duplicate values when generating random floating-point numbers. The issue was caused by the boxMullerTransform function being passed incorrect values for the second pair of floating-point numbers.

This PR also enables the chi-squared tests for the random number generators. These are heavier tests, but we should start testing the quality of the output to avoid issues like this. We are unfortunately not yet testing the randn function, but it's a step in the right direction. @rstub How difficult would it be to implement a similar test for the normal distribution?

Fixes: #2972

Changes to Users

N/A

Checklist

  • Rebased on latest master
  • Code compiles
  • Tests pass
  • [ ] Functions added to unified API
  • [ ] Functions documented

umar456 added 2 commits July 30, 2020 01:43
* boxMuller for float randn was being passed incorrect portions
of the counter/seed
The Chi2 tests measure the quality of the random number generator. This test is too expensive to run on the CPU.
@umar456 umar456 force-pushed the randn_fix branch 4 times, most recently from 09bbcdf to 71aff29 Compare July 31, 2020 21:21
@rstub
Contributor

rstub commented Jul 31, 2020

Concerning the chi-squared test: That is also possible for the normal distribution:

Another test variant is the χ² test, which was also used by Leong et al. (2005). The basic idea is as follows:

  1. The real line is divided into B bins, equally spaced between (symmetric) values distant enough from zero so that no N(0, 1) draw should exceed them.
  2. Here, we follow Leong et al. (2005) and use a range from -7 to 7 with a total of 200 bins.
  3. A large number of N(0, 1) random variates is drawn, and for each of these numbers a counter in the bin corresponding to the draw is increased.
  4. After the N draws, the empirical distribution is compared to the theoretical one (given by the corresponding values of the normal density function) using a standard χ² test.

(source: https://cran.r-project.org/web/packages/RcppZiggurat/vignettes/RcppZiggurat.pdf)

I have tried to code this as follows:

#include <cassert>
#include <cmath>    // std::sqrt for the scalar argument in cnd()
#include <cstdlib>  // atoi
#include <iomanip>
#include <iostream>

#include <arrayfire.h>
#include <af/traits.hpp>

using af::array;
using af::randomEngine;
using af::randomEngineType;

template <typename T>
T chi2_statistic(array input, array expected) {
    expected *= af::sum<T>(input) / af::sum<T>(expected);
    array diff = input - expected;
    return af::sum<T>((diff * diff) / expected);
}

// should be used only for x <= 5 (roughly)
array cnd(array x)
{
    return 0.5 * erfc(- x * sqrt(0.5));
}

template <typename T>
bool testRandomEngineNormalChi2(randomEngineType type)
{
    af::dtype ty = (af::dtype)af::dtype_traits<T>::af_type;

    int elem = 1024*1024;
    int steps = 256*32;
    int bins = 100;
    T lower_edge = -7.0;
    T upper_edge = 7.0;

    array total_hist = af::constant(0.0, 2 * bins, ty);
    array edges = af::seq(bins + 1)/bins * lower_edge;
    array expected = -af::diff1(cnd(edges));
    expected = af::join(0, expected(af::seq(bins-1, 0, -1)), expected);

    af::randomEngine r(type, 0);

    // R> qchisq(c(5e-6, 1 - 5e-6), 197)
    // [1] 121.3197 297.2989
    T lower = 121.3197;
    T upper = 297.2989;

    bool prev_step = true;
    bool prev_total = true;
    af::setSeed(0x76fa214467690e3c);
    std::cout << std::setw(4) << "step"
              << std::setw(7) << "chi2_i"
              << std::setw(7) << "chi2_t"
              << std::setprecision(2) << std::fixed << std::endl;
    for (int i = 0; i < steps; ++i) {
        array step_hist = af::histogram(af::randn(elem, ty, r), 2 * bins, lower_edge, upper_edge);
        T step_chi2 = chi2_statistic<T>(step_hist, expected);
        std::cout << std::setw(4) << i
                  << std::setw(7) << step_chi2;
        bool step = step_chi2 > lower && step_chi2 < upper;
        if (!(step || prev_step)) return false;
        prev_step = step;

        total_hist += step_hist;
        T total_chi2 = chi2_statistic<T>(total_hist, expected);
        std::cout << std::setw(7) << total_chi2
                  << std::endl;
        bool total = total_chi2 > lower && total_chi2 < upper;
        if (!(total || prev_total)) return false;
        prev_total = total;
    }
    return true;
}



int main(int argc, char ** argv) {
    int backend = argc > 1 ? atoi(argv[1]) : 0;
    af::setBackend(static_cast<af::Backend>(backend));
    int device = argc > 2 ? atoi(argv[2]) : 0;
    af::setDevice(device);
    af::info();

    std::cout << "PHILOX" << std::endl;
    if (!testRandomEngineNormalChi2<float>(AF_RANDOM_ENGINE_PHILOX))
        std::cout << "FAILURE" << std::endl;
    std::cout << "THREEFRY" << std::endl;
    if (!testRandomEngineNormalChi2<float>(AF_RANDOM_ENGINE_THREEFRY))
        std::cout << "FAILURE" << std::endl;
    std::cout << "MERSENNE" << std::endl;
    if (!testRandomEngineNormalChi2<float>(AF_RANDOM_ENGINE_MERSENNE))
        std::cout << "FAILURE" << std::endl;

    if (af::isDoubleAvailable(device)) {
        std::cout << "PHILOX" << std::endl;
        if (!testRandomEngineNormalChi2<double>(AF_RANDOM_ENGINE_PHILOX))
            std::cout << "FAILURE" << std::endl;
        std::cout << "THREEFRY" << std::endl;
        if (!testRandomEngineNormalChi2<double>(AF_RANDOM_ENGINE_THREEFRY))
            std::cout << "FAILURE" << std::endl;
        std::cout << "MERSENNE" << std::endl;
        if (!testRandomEngineNormalChi2<double>(AF_RANDOM_ENGINE_MERSENNE))
            std::cout << "FAILURE" << std::endl;
    }
}

Getting the expected counts correct is a bit tricky. One needs to go far from the mean, where the density of the normal distribution is almost zero. I am computing the probability mass within these ranges via differences of the cumulative distribution function (c.f. #2078), but that only works well for values close to zero, not close to one. I am sure there are more elegant ways to do this ...

Anyway, for me this fails very quickly

 ArrayFire v3.8.0 (OpenCL, 64-bit Linux, build 64855cbd)
[0] INTEL: Intel(R) Gen8 HD Graphics NEO, 9399 MB
-1- BEIGNET: Intel(R) HD Graphics 5500 BroadWell U-Processor GT2, 4096 MB
-2- POCL: pthread-Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, 9701 MB
PHILOX
step chi2_i chi2_t
   0 287.13 287.13
   1 454.45 371.92
   2 371.33FAILURE
THREEFRY
step chi2_i chi2_t
   0 355.77 355.77
   1 464.06FAILURE
MERSENNE
step chi2_i chi2_t
   0 367.78 367.78
   1 350.76FAILURE

I am not sure if this indicates an error in my code or is an example of the problem you have identified here. I have not tried to build a patched version.

@umar456
Member Author

umar456 commented Aug 1, 2020

@rstub This is great. I will work with this and see if I can improve the results. I am able to go further (it fails after 13 steps) with the latest changes, but not by much. Thanks for your input!

@umar456 umar456 force-pushed the randn_fix branch 2 times, most recently from bec40dc to e8a1e62 Compare August 7, 2020 04:41
@umar456
Member Author

umar456 commented Aug 7, 2020

@rstub I pushed my latest changes to this branch and the normal distributions look better. The issue was with the Box-Muller transform function we had been using to generate the normal distribution. For floats we were passing in the wrong arguments, and there was a subtle bug in one of the conversion methods. I made a couple of changes to the test you posted above, and I would appreciate it if you could run and review the tests.

@willat343 Could you also give this branch a try and give me any feedback on the results?

@9prady9 9prady9 added this to the 3.7.3 milestone Aug 7, 2020
@rstub
Contributor

rstub commented Aug 7, 2020

I have checked out the branch. Works very well! I have only tested with float, though. I have added some nitpicky comments above.

BTW, good that you switched back to int elem = 256 * 1024 * 1024;. The lower number of elements was an artifact of my testing, and with fewer elements it becomes more likely to get "odd" results in one of the tries.

rstub
rstub previously approved these changes Aug 7, 2020
@umar456
Member Author

umar456 commented Aug 7, 2020

Thanks @rstub. I was curious about your choice for the dof as well; your explanation makes sense. Can you give me your reasoning behind the probability values you chose for that function? I am trying to figure out whether those values apply to other types like f16 or need to be modified for the limited precision of that type. The actual distribution for f16 looks reasonable when you print out the histogram of the random number array, but it quickly fails the chi-squared test. I suspect it is because of the reduced resolution of the type.

@umar456 umar456 force-pushed the randn_fix branch 2 times, most recently from 9f96a9f to 036470f Compare August 7, 2020 18:36
@rstub
Contributor

rstub commented Aug 9, 2020

@umar: The range of χ² is just a heuristic: large enough not to trigger spurious failures in the cases I looked at, but small enough to catch the errors identified at that time (c.f. #2007).

Have you tried running the test for f16 for a longer period without the failure conditions? What do the local and global test statistics look like when plotted as a function of the iteration? See https://cran.r-project.org/web/packages/RcppZiggurat/vignettes/RcppZiggurat.pdf for an example of such graphs.

@umar456 umar456 force-pushed the randn_fix branch 3 times, most recently from 3900496 to 6aab251 Compare August 12, 2020 13:41
Move the RNG quality checks to a separate file to avoid memory allocation
failures caused by the random tests running in parallel with other
tests. These tests are marked as SERIAL so they run serially.
* Fix rounding of some operations by using fused operations like sincospi
  and fma instead of a separate multiply and add.
* Convert half constants to hex values and use __ushort_as_half
  to avoid redundant conversions from float
* Pass integers instead of pointers in the OpenCL backend of
  the rng functions
@9prady9 9prady9 merged commit fe0c8d5 into arrayfire:master Aug 12, 2020

Development

Successfully merging this pull request may close these issues.

af::randn creates duplicates with CUDA backend [BUG]
