Most efficient random integer generator by random bits consumed

Discovered in 2013 by then-UPMC-student Jérémie Lumbroso, and independently re-discovered by an Apple programmer last Fall (with much ado on Twitter and HN), the basic idea is as follows:

Take your (infinite) random bitstream as the binary digits ("fractional part") of random real $r\in\left[0,1\right)$
Specifically: $r=\sum_{i=1}^{\infty}{\text{bit}_i 2^{-i}}$
Read out only as many digits of $r$ as are required to determine the value of $\left\lfloor rn\right\rfloor$
You now (clearly) have fairly chosen $\leftarrow_{$}\left\{0,1,\cdots,n-1\right\}$
For the next number, resume reading bits from the stream where this number left off

While it's easily-shown that no single-draw algorithm can be perfectly optimal, this algorithm is noticeably more efficient than rejection sampling in the single-draw case (and Lumbroso reminds us that this it is, obviously, also perfectly compatible with arithmetic-coded batching of the draws (left as an exercise for the reader)).

from fractions import Fraction
from random import getrandbits
def educational_randbelow(n):
	# performs VERY poorly CPU-wise --
	# for illustration only!
	if n <= 1:
			# A random integer "below" 1 must be 0
			return 0
	k = (n-1).bit_length()
	if n == 2 ** k:
		# Ironically, this interpretation
		# of the algorithm
		# doesn't handle the
		# trivial case well...
		# handle it specially:
		return getrandbits(k)
	# i represents the current level of "detail" we have on r,
	# telling us how MANY bits of it we have uncovered so far:
	i = 1
	# next is the value r (from the equation):
	# (it starts out VERY coarse, just i=1 bit of detail!)
	# (*technically, it is a lower bound on r --
	#  it tells us AT LEAST how big r is known
	#  to be based on what we've seen of it.)
	r = Fraction(getrandbits(1), 2**i)
	# this next number is how big we KNOW
	# that r can NEVER QUITE reach; if
	# 100% of our "random" bits from here on out
	# were high, it would APPROACH that value:
	r_upperbound = r + Fraction(1, 2**i)
	# if int(n * r) != int(n * r_upperbound)
	# then that means that we don't yet have
	# enough detail about our random number
	# to see exactly what the true integer
	# value of n*r should be
	while int(n * r) != int(n * r_upperbound):
		# Just keep resolving more
		# bits of 0 <= r < 1
		# until a number is
		# homed-in-on unambiguously
		i += 1
		r += Fraction(getrandbits(1), 2**i)
		r_upperbound = r + Fraction(1, 2**i)
	# Once int(n * r) == int(n * r_upperbound),
	# the while-loop will break, since resolving
	# more bits of r would be pointless
	# -- time to return the random number!
	return int(n * r)

With one helping of mathematical intuition, two helpings of high-school algebra, and an invested afternoon, you can eliminate all the Fraction objects in the above example to MASSIVELY improve performance:

def randbelow(n):
	if n <= 1:
		return 0
	k = (n - 1).bit_length()
	a = getrandbits(k)
	b = 2 ** k
	if n == b:
		return a
	while (n * a // b) != (n * (a + 1) // b):
		a = (a * 2) | getrandbits(1)
		b = b * 2
	return n * a // b

Below here is an adaptation from the paper itself; it runs far faster even than the above optimization (and has integrated handling of powers-of-2), but I couldn't possibly explain to you at this time just how or why it is equivalent to the above code. Perhaps I'll understand it after more pondering, someday:

Python

def randbelow(n):
	"""Fast Dice Roller, Lumbroso, 2013"""
	a = 0
	b = 1
	while True:
		a = (a * 2) | getrandbits(1)
		b = b * 2
		if b >= n:
			if a < n:
				return a
			a = a - n
			b = b - n

uintmax_t randbelow(uintmax_t n, bool (*flip)())
// Fast Dice Roller
// Lumbroso, 2013
{
	uintmax_t a = 0;
	uintmax_t b = 1;
	while (1) {
		a = (a << 1) | flip();
		b = b << 1;
		if ( b >= n ) {
			if ( a < n ) return a;
			else
			{
				a = a - n;
				b = b - n;
			}
		}
	}
}

bool flip()
// example fair coin-flip function
{
	return (rand() > RAND_MAX/2);
}

function randbelow(n, getrandbit=()=>(Math.random() > 1/2)) {
	// Fast Dice Roller
	// Lumbroso, 2013
	const _int = m => n.__proto__.constructor(m);
	const [_0, _1] = [_int(0), _int(1)];
	let a = _0;
	let b = _1;
	while( true ) {
		a = (a << _1) + _int(getrandbit());
		b = b << _1;
		if ( b >= n ) {
			if ( a < n ) return a;
			else
			{
				a = a - n;
				b = b - n;
			}
		}
	}
}

Most efficient random integer generator by random bits consumed

Leave a Reply Cancel reply