This week's book giveaway is in the Performance forum.
We're giving away four copies of The Java Performance Companion and have Charlie Hunt, Monica Beckwith, Poonam Parhar, & Bengt Rutisson on-line!
See this thread for details.
Win a copy of The Java Performance Companion this week in the Performance forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Random collisions

 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
My dad taught statistics and always did the thing about the odds of two people in class having the same birthday. He's gone now so I turn to the math majors here. How do we compute how often to expect a duplicate or repeat in an rng for n digits or a range? I'm looking at 5 digits, 0..99999.
 
Pat Farrell
Rancher
Posts: 4678
7
Linux Mac OS X VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
google for 'birthday paradox'
It works out that the odds hit 50% quickly, I think 22 people or so, where your naive expectation is that it would take 180.

I worked with a guy who could calculate it in his head.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
While working with this I read the code for our custom RNG and got scared.

I started looking for ways to compare randomness and settled on even distribution. I made a unit test that for a parameter n generates n**2 values between 1..n and asserts that 99% of the possible numbers are generated within 10% of the ideal distribution. Java Random flies through this for n over a few hundred with no problem.

The custom generator didn't trust the seed algorithm for some reason and resets the seed to the output value every time. I guess because the output value is many orders of magnitude smaller than the original seed, the value converges on repeating cycles pretty quickly. For 1..10000 38 possible values had counts within 1 of each other, the other 9962 occurred zero or one times.

Moral: You are almost certainly not smarter than the library, or, in this case, Donald Knuth.
 
Paul Clapham
Sheriff
Posts: 21133
32
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Stan, I saw the first part of your post in the Today's Posts section and thought to myself "Better remind him what Knuth said about random-number generators". But... I see that's covered off already. Good. Carry on then.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic