In my last post, I described how I started from beautiful images generated by "turmites" and ended up with one that looked, and in many aspects felt, random.
I knew my approach was amateurish. After all, I am an amateur when it comes to information entropy, predictability, statistical analysis, and random number generators (RNGs). But I wanted to learn more and explore the idea further.
So my first attempt was to write a pseudo-random number generator (PRNG) based on my LLLR turmite: if I let the turmite run enough iterations on an NxN grid using 4 colors, the result is NxNx2 bits.
With a 64x64 grid, I’d get 8,192 bits, or 1,024 bytes. Are they random enough? How do I measure that?
I learned about tests used to identify structural weaknesses in PRNGs, like SmallCrush and BigCrush.
Yes (as you may have expected), my "turmite RNG" failed miserably, even on the simplest statistical tests like Chi-Square. It was also orders of magnitude slower than anything else out there.
Still, my intuition told me there must be some way to build a decent RNG using cellular automata.
I found Stephen Wolfram’s "Rule 30"1. It’s a very simple, 1D cellular automaton, "displaying aperiodic, chaotic behaviour". And Wolfram used it as a PRNG in Mathematica.
Now we're talking.
Arrogantly, I built my own implementation using my own assumptions and preferences, without looking into how others used Rule 30 for PRNG purposes —which turned out to be a good thing, I think.
In most cases, cellular automata "live" on infinite spaces, like plains and lines, but in my previous work with turmites, I liked how they behaved on toroidal spaces. So I used a circular 256-bit stripe, and instead of using just the middle bits of consecutive steps as the RNG output, I used the whole 256-bit space. (That’s a 256× speed improvement over the typical use of Rule 30 as an RNG!)
And it looked good. Performance was excellent because I could fit the entire state in 4 uint64 words and compute the next state using basic binary operations on each word.
I even claimed 4x speed improvements over math/rand!
Slow down, cowboy.
People were kind to point out2 that the tests I used did not compare apples to apples. And as I searched more, I learned there's a lot to fix and improve.
Long story short, two days later I had a decent PRNG that is comparable to, and in some cases better than, the standard pseudo-random number generators used by Golang.
I even modified the algorithm further, to improve statistical distribution: after each step, I apply XOR rotation mixing to each word, which is not part of the original Rule 30 algorithm.
Rule30RND is a drop-in replacement for math/rand and math/rand/v2, with a very small footprint (state and manipulation require just 40 bytes of memory), comparable statistical qualities, and is 30%–60% faster.
Rule30RND generates reproducible sequences (depending on the seed), which may sound like a drawback for an RNG, but it’s actually useful in many cases where a PRNG is preferred over a CSPRNG (Cryptographically Secure Pseudo-Random Number Generator). Those cases usually involve generating large, statistically unpredictable data sets used for simulations or testing.
Throughput Comparison, relative to math/rand
| Algorithm | Read() 32KB | Read() 1KB | Uint64() |
|---|---|---|---|
| MathRand | 1.00x | 1.00x | 1.00x |
| MathRandV2 | 1.69x | 1.63x | 0.58x |
| Rule30 | 5.47x | 5.13x | 1.39x |
| CryptoRand | 3.11x | 1.75x | 0.03x |
Rule30RND Basic Tests (ent)
| Test | Result | Expected | Status |
|---|---|---|---|
| Entropy | 7.999982 bits/byte | 8.0000 | ✓ Perfect (99.9998%) |
| Chi-Square | 253.9 | ~255 (200-310) | ✓ Excellent |
| Arithmetic Mean | 127.4987 | 127.5 | ✓ Perfect |
| Monte Carlo π | 3.140031 | 3.141593 | ✓ 0.05% error |
| Serial Correlation | 0.000171 | 0.0000 | ✓ Uncorrelated |
SmallCrush results
| RNG | Passed | Failed | Pass Rate |
|---|---|---|---|
| Rule30 | 15/15 | 0/15 | 100% |
| math/rand | 15/15 | 0/15 | 100% |
| math/rand/v2 (PCG) | 5/15 | 10/15 | 100% |
(Update: The original version of this post incorrectly reported that all the above tests failed 5/15 tests. This was due to an error in the test scripts.)
That’s no small feat for an amateur! Many people smarter and with much deeper knowledge than me spend years working on these things, me building something that can be compared to what they build makes me proud.
I’ll probably keep tinkering on Rule30RND. Running a BigCrunch test now, that will probably take a whole day or more.
Happy New Year!