4

I'm trying to get numpy to use my own implementation of an RNG in for consistency reasons. My understanding, based on the little documentation I could find, from the numpy docs here and here is that I need to provide a custom BitGenerator class that implements the random_raw method and then initialise using np.random.Generator, so I tried this:

import numpy as np

class TestBitGenerator(np.random.BitGenerator):
  def __init__(self):
    super().__init__(0)
    self.counter = 0
  def random_raw(self, size=None):
    self.counter += 1
    if size is None:
      return self.counter
    return np.full(n, self.counter)  

mc_adapter = TestBitGenerator() 
npgen = np.random.Generator(mc_adapter)

print(npgen.random())

which results in a segfault:

$ python bitgen.py 
Segmentation fault (core dumped)

I assume I'm missing something (from TestBitGenerator?) here, can anyone point me in the right direction?

I tried not subclassing np.random.BitGenerator, and got a different error: object has no attribute 'capsule'

I using numpy 1.19.2 and python 3.8.2

2
  • If it is for consistency reasons, why don't you just set the seed to a consistent value? Commented Sep 30, 2020 at 14:50
  • It didn't seem relevant to the question to explain my exact needs as its complicated. I have some C++ code (including my RNG) taking to python via pybind11, and I'm running on MPI and I need control over how each process is seeded, and each process should (ideally) use just a single random stream accessible from both the C++ and python layers. Commented Sep 30, 2020 at 15:26

3 Answers 3

2

Actually you can do it in pure python if you wrap it using the library RandomGen (this was the incubator for the current np.random.Generator). So there is a UserBitGenerator that allow you to use only python: https://bashtage.github.io/randomgen/bit_generators/userbitgenerator.html

Sad that this did not make it in numpy (or if it is I did not find it yet...).

Sign up to request clarification or add additional context in comments.

Comments

2

If you are using pybind11, it's not too hard, although there's no good documentation explaining any of this, so it took me a while to figure it out. Here is what I came up with, so hopefully this will save a few people some consternation.

  1. Define your custom bit generator in C++. You'll need to define three functions: next_unint32, next_uint64, and next_double, each returning the next random value of the given type. (Typically one of these will be primary, and you'll define the other two based on it.) They all need to take one argument, given as void*, which you'll want to recast to whatever you actually have for a state object. Say a pointer to an instance of some class CustomRNG or whatever.
  2. Write a function that takes a pointer to your state object and a py::capsule (I use namespace py = pybind11, so spell that out if you don't do this.) This capsule wraps a bitgen_t struct, and writes the appropriate values to its elements. Something like the following:
#include <numpy/random/bitgen.h>  // For bitgen_t struct

// For reference:
/*
typedef struct bitgen {
  void *state;
  uint64_t (*next_uint64)(void *st);
  uint32_t (*next_uint32)(void *st);
  double (*next_double)(void *st);
  uint64_t (*next_raw)(void *st);
} bitgen_t;
*/

void SetupBitGen(CustomRNG* rng, py::capsule capsule)
{
    bitgen_t* bg(capsule);
    bg->state = rng;
    bg->next_uint64 = next_uint64;
    bg->next_uint32 = next_uint32;
    bg->next_double = next_double;
    bg->next_raw = next_uint64;
};
  1. Have pybind11 wrap this for you so you can call it from python. I'm also assuming that your CustomRNG type is already wrapped and accessible in python.
my_module.def("setup_bitgen", &SetupBitGen)

Or you could make this function a method of your CustomRNG class:

py::class_<CustomRNG>
    [...]
    .def("setup_bitgen", &SetupBitGen);
  1. In Python make a class derived from numpy.random.BitGenerator, which takes an instance of your CustomRNG as an argument. For the __init__, first call super().__init__() to make the things numpy expects to be there. Then call lib.setup_bitgen(self.rng, self.capsule) (or self.rng.setup_bitgen(self.capsule) if you went the method route) to update the elements of the capsule.

That's it. Now you can make your own BitGenerator object using this class, and pass that as an argument of numpy.random.Generator.

This is the method I used in GalSim, so you can see a worked example of it here.

4 Comments

Many thanks for this, I've now implemented something close you you example myself and it works. My only concern is that we have to roll our own bitgen_t struct to match numpy's; and if they change it at some point in the future it will break badly (e.g. segfault) and it almost certainly will be far from obvious what the problem is.
@virgesmith FWIW, it's located in numpy/core/include/numpy/random/bitgen.h in the installed numpy repo. So if you want to be a bit safer, you could add this directory to the C_INCLUDE_PATH programmatically in setup.py, and #include that.
I decided that was a good idea, so I did that in GalSim. cf. github.com/GalSim-developers/GalSim/pull/1179/commits/…
Definitely a good idea, have done likewise. Thanks again
0

I did some digging here and Generator's constructor looks like this:

    def __init__(self, bit_generator):
        self._bit_generator = bit_generator

        capsule = bit_generator.capsule
        cdef const char *name = "BitGenerator"
        if not PyCapsule_IsValid(capsule, name):
            raise ValueError("Invalid bit generator. The bit generator must "
                             "be instantiated.")
        self._bitgen = (<bitgen_t *> PyCapsule_GetPointer(capsule, name))[0]
        self.lock = bit_generator.lock

also here explains the requirements for the bitgen_t struct:

struct bitgen:
    void *state
    npy_uint64 (*next_uint64)(void *st) nogil
    uint32_t (*next_uint32)(void *st) nogil
    double (*next_double)(void *st) nogil
    npy_uint64 (*next_raw)(void *st) nogil

ctypedef bitgen bitgen_t

so from what I can make out, it looks like any implementation of BitGenerator must be done in cython, any attempt to implement one in pure python (and probably pybind11 too) just won't work(?)

As I'm not familiar with cython and if/how it could coexist with pybind11, for now I'm just going to ensure each of my (parallel) processes explicitly calls np.random.Generator with a numpy RNG seeded deterministically so that it's independent from my own RNG steam.

2 Comments

If you found a limitation in numpy.random.Generator that you care about, you may want to file an issue on that: github.com/numpy/numpy/issues . Also, an example showing how multithreaded random generation works is in the NumPy documentation, in case you didn't see it: numpy.org/doc/stable/reference/random/multithreading.html
Hello, I think I found an issue in the documentation example, and I have raised an issue on GitHub. @PeterO. Would it be possibly related to this?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.