• Post Reply Bookmark Topic Watch Topic
  • New Topic

The birthday paradox Problem  RSS feed

 
dhrubo bhattacharjee
Greenhorn
Posts: 23
Eclipse IDE Java Oracle
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi I have recently started learning Java and following Absolute Java by Walter Savitch and I got stuck with one problem and need help/suggestions from you guys:

The birthday paradox is that there is a surprisingly high probability that two or more people in the same room happen to share the same birthday. By birthday,we mean the same day of the year (ignoring leap years), but not the exact birthday that includes the birth year or time of day. Write a program that approximates the probability that 2 or more people in the same room have the same birthday, for 2 to 50 people in the room.The program should use simulation to approximate the answer. Over many trials (say, 5,000), randomly assign birthdays (i.e., a number from 1–365) to everyone
in the room. Use a HashSet to store the birthdays. As the birthdays are randomly generated, use the contains method of a HashSet to see if someone with the same birthday is already in the room. If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial. After the trials are over, divide the counter by the number of trials to get an estimated probability that two or more people share the same birthday for a given room size.Your output should look something like the following. It will not be exactly the same due to the random numbers:For 2 people, the probability of two birthdays is about 0.002 ,For 3 people, the probability of two birthdays is about 0.0082


This is what my code looks like



I cannot understand how to do the simulation and run 5000 trials.Please help

 
fred rosenberger
lowercase baba
Bartender
Posts: 12562
49
Chrome Java Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I would suggest you re-engineer this slightly. A main() method shouldn't have much code in it. You need to think about writing more methods that do pieces.

If you had a method that you could call:

doOneTrial()

This method would do basically what you have. then you simply write a loop:


you may have to add a little more here and there, to catch/accumulate your results....but by writing methods that do simple things, you can easily build up very complicated programs.
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I might suggest breaking this up into multiple methods. You could have, as an example, something like this:

that returns true if 2 people have the same birth date.
and something like

that returns the number of trials with matches out of the numberOfTrials.
 
Junilu Lacar
Sheriff
Posts: 11476
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I don't think the algorithm/math suggested by the book is correct, specifically, by incrementing a counter every time you get a collision and then dividing the counter by the number of trials. Theoretically, there is a 50% probability of two same birthdays with 23 people. The algorithm suggested by the book consistently gives the 50% probability mark at somewhere between 19 and 20 people, even when I try it with 1_000_000 trials; 100% probability is consistently with 28 people. This is not correct.

When I adjust the calculation, the simulation numbers approach the theoretical numbers, with 50% probability with 23 people. To get this, I counted the number of trials where there is least one collision, then divide that number by the number of trials.

Edit:
I guess it all depends on how you interpret the bolded part of this sentence:

If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial.

Interpreted one way, I get the incorrect results. Interpreted another way, I get the correct algorithm.
 
Junilu Lacar
Sheriff
Posts: 11476
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
It appears that interpreting the algorithm as stated in the book is not straightforward since both solutions presented so far are incorrect. I think it would be more clearly stated this way: "If there are at least two people who have the same birthday, increment the counter and move on to the next trial."
 
Junilu Lacar
Sheriff
Posts: 11476
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Also, none of the methods presented so far for generating a random number from 1-365 are correct. Random.nextInt(bound) has a range of 0 to (bound-1).
 
Campbell Ritchie
Marshal
Posts: 56518
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Welcome to the Ranch

I think complicated algorithms are not a good way to learn programming of any kind. But take note of the warning about 364 and 365.
 
Piet Souris
Master Rancher
Posts: 2041
75
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Junilu Lacar wrote:I don't think the algorithm/math suggested by the book is correct, specifically, by incrementing a counter every time you get a collision and then dividing the counter by the number of trials. Theoretically, there is a 50% probability of two same birthdays with 23 people. The algorithm suggested by the book consistently gives the 50% probability mark at somewhere between 19 and 20 people, even when I try it with 1_000_000 trials; 100% probability is consistently with 28 people. This is not correct.

When I adjust the calculation, the simulation numbers approach the theoretical numbers, with 50% probability with 23 people. To get this, I counted the number of trials where there is least one collision, then divide that number by the number of trials.

Edit:
I guess it all depends on how you interpret the bolded part of this sentence:

If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial.

Interpreted one way, I get the incorrect results. Interpreted another way, I get the correct algorithm.

The book:
"If two or more people share the same birthday, increase the counter and go to
the next trial (of the 5000). After the trials are over, divide the counter by the number of trials."

I see no ambiguity in the phrasing of the book, it is correct. What part makes you doubt?

And you are correct about the 364 and 365, but that is not the fault of the book. For the outcomes,
it does not much matter if you use 364 or 365.

Gteetz,
Piet
 
Junilu Lacar
Sheriff
Posts: 11476
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet Souris wrote:
I see no ambiguity in the phrasing of the book, it is correct. What part makes you doubt?

The fact that at least three two people, including myself, interpreted the book incorrectly is a good indication that there is, in fact, some ambiguity in interpreting it.

Edit: I guess I shouldn't have counted Carey -- the comment he added indicates he had the correct interpretation. Must be just me and the OP then. ¯\_(ツ)_/¯
 
Junilu Lacar
Sheriff
Posts: 11476
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet Souris wrote:
The book:
"If two or more people share the same birthday, increase the counter and go to
the next trial (of the 5000). After the trials are over, divide the counter by the number of trials."

That is NOT what the OP posted:

As the birthdays are randomly generated, use the contains method of a HashSet to see if someone with the same birthday is already in the room. If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial.

What you wrote is your interpretation, which is correct, and is pretty much the same as what I suggested would be a clearer statement of the algorithm: "If there are at least two people who have the same birthday, increment the counter and move on to the next trial."
 
dhrubo bhattacharjee
Greenhorn
Posts: 23
Eclipse IDE Java Oracle
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thank you everyone for the wonderful suggestions.This is how my code looks like now.Please let me know if it is ok.

 
fred rosenberger
lowercase baba
Bartender
Posts: 12562
49
Chrome Java Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
dhrubo bhattacharjee wrote:Please let me know if it is ok.

well...do YOU think it is OK? Have you tested it? Have you gotten the results you'd expect?
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
 
Tyson Lindner
Ranch Hand
Posts: 211
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
+1 to using several different methods, each one to:

1. randomly generate a list of birthdays
2. check if a duplicate exists from a given list (true or false)
3. combination of 1 and 2 above to estimate a probability for a number of trials
4. use all 3 above to estimate a probability for a given number of people

So all your main method should be doing is calling 4 and printing out the results.

-------

Actually reading the assignment again it looks like you're supposed to kinda combine 1 and 2 into the same action. As in, you're randomly generating birthdays and then you immediately check for a duplicate after each generation. Not sure how much leeway you either have or want to give yourself, but I still feel its much easier conceptually to break the program into different parts even if it might be less efficient.
 
Junilu Lacar
Sheriff
Posts: 11476
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Your results should be something like this, with only a few hundredths/thousandths difference:

I got these by running 400,000 trials - they are very close to the theoretical probabilities: http://en.wikipedia.org/wiki/Birthday_problem
 
Don't get me started about those stupid light bulbs.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!