Random collisions
Stan James
(instanceof Sidekick)
Ranch Hand
Ranch Hand
Posts: 8791
posted 9 years ago
My dad taught statistics and always did the thing about the odds of two people in class having the same birthday. He's gone now so I turn to the math majors here. How do we compute how often to expect a duplicate or repeat in an rng for n digits or a range? I'm looking at 5 digits, 0..99999.
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Stan James
(instanceof Sidekick)
Ranch Hand
Ranch Hand
Posts: 8791
posted 9 years ago
While working with this I read the code for our custom RNG and got scared.
I started looking for ways to compare randomness and settled on even distribution. I made a unit test that for a parameter n generates n**2 values between 1..n and asserts that 99% of the possible numbers are generated within 10% of the ideal distribution. Java Random flies through this for n over a few hundred with no problem.
The custom generator didn't trust the seed algorithm for some reason and resets the seed to the output value every time. I guess because the output value is many orders of magnitude smaller than the original seed, the value converges on repeating cycles pretty quickly. For 1..10000 38 possible values had counts within 1 of each other, the other 9962 occurred zero or one times.
Moral: You are almost certainly not smarter than the library, or, in this case, Donald Knuth.
I started looking for ways to compare randomness and settled on even distribution. I made a unit test that for a parameter n generates n**2 values between 1..n and asserts that 99% of the possible numbers are generated within 10% of the ideal distribution. Java Random flies through this for n over a few hundred with no problem.
The custom generator didn't trust the seed algorithm for some reason and resets the seed to the output value every time. I guess because the output value is many orders of magnitude smaller than the original seed, the value converges on repeating cycles pretty quickly. For 1..10000 38 possible values had counts within 1 of each other, the other 9962 occurred zero or one times.
Moral: You are almost certainly not smarter than the library, or, in this case, Donald Knuth.
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Hang a left on main. Then read this tiny ad:
the new thread boost feature brings a LOT of attention to your favorite threads
https://coderanch.com/t/674455/ThreadBoostfeature
