Win a copy of OCP Java SE 8 Programmer II Exam Study Guide this week in the OCP forum!
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
Sheriffs:
Saloon Keepers:
Bartenders:

Greenhorn
Posts: 23
Hi I have recently started learning Java and following Absolute Java by Walter Savitch and I got stuck with one problem and need help/suggestions from you guys:

The birthday paradox is that there is a surprisingly high probability that two or more people in the same room happen to share the same birthday. By birthday,we mean the same day of the year (ignoring leap years), but not the exact birthday that includes the birth year or time of day. Write a program that approximates the probability that 2 or more people in the same room have the same birthday, for 2 to 50 people in the room.The program should use simulation to approximate the answer. Over many trials (say, 5,000), randomly assign birthdays (i.e., a number from 1–365) to everyone
in the room. Use a HashSet to store the birthdays. As the birthdays are randomly generated, use the contains method of a HashSet to see if someone with the same birthday is already in the room. If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial. After the trials are over, divide the counter by the number of trials to get an estimated probability that two or more people share the same birthday for a given room size.Your output should look something like the following. It will not be exactly the same due to the random numbers:For 2 people, the probability of two birthdays is about 0.002 ,For 3 people, the probability of two birthdays is about 0.0082

This is what my code looks like

lowercase baba
Bartender
Posts: 12613
50
I would suggest you re-engineer this slightly. A main() method shouldn't have much code in it. You need to think about writing more methods that do pieces.

If you had a method that you could call:

doOneTrial()

This method would do basically what you have. then you simply write a loop:

you may have to add a little more here and there, to catch/accumulate your results....but by writing methods that do simple things, you can easily build up very complicated programs.

Saloon Keeper
Posts: 4065
48

Carey Brown
Saloon Keeper
Posts: 4065
48
I might suggest breaking this up into multiple methods. You could have, as an example, something like this:

that returns true if 2 people have the same birth date.
and something like

that returns the number of trials with matches out of the numberOfTrials.

Sheriff
Posts: 11747
191
I don't think the algorithm/math suggested by the book is correct, specifically, by incrementing a counter every time you get a collision and then dividing the counter by the number of trials. Theoretically, there is a 50% probability of two same birthdays with 23 people. The algorithm suggested by the book consistently gives the 50% probability mark at somewhere between 19 and 20 people, even when I try it with 1_000_000 trials; 100% probability is consistently with 28 people. This is not correct.

When I adjust the calculation, the simulation numbers approach the theoretical numbers, with 50% probability with 23 people. To get this, I counted the number of trials where there is least one collision, then divide that number by the number of trials.

Edit:
I guess it all depends on how you interpret the bolded part of this sentence:

If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial.

Interpreted one way, I get the incorrect results. Interpreted another way, I get the correct algorithm.

Junilu Lacar
Sheriff
Posts: 11747
191
It appears that interpreting the algorithm as stated in the book is not straightforward since both solutions presented so far are incorrect. I think it would be more clearly stated this way: "If there are at least two people who have the same birthday, increment the counter and move on to the next trial."

Junilu Lacar
Sheriff
Posts: 11747
191
Also, none of the methods presented so far for generating a random number from 1-365 are correct. Random.nextInt(bound) has a range of 0 to (bound-1).

Marshal
Posts: 58421
178
Welcome to the Ranch

I think complicated algorithms are not a good way to learn programming of any kind. But take note of the warning about 364 and 365.

Rancher
Posts: 2459
80

Junilu Lacar wrote:I don't think the algorithm/math suggested by the book is correct, specifically, by incrementing a counter every time you get a collision and then dividing the counter by the number of trials. Theoretically, there is a 50% probability of two same birthdays with 23 people. The algorithm suggested by the book consistently gives the 50% probability mark at somewhere between 19 and 20 people, even when I try it with 1_000_000 trials; 100% probability is consistently with 28 people. This is not correct.

When I adjust the calculation, the simulation numbers approach the theoretical numbers, with 50% probability with 23 people. To get this, I counted the number of trials where there is least one collision, then divide that number by the number of trials.

Edit:
I guess it all depends on how you interpret the bolded part of this sentence:

If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial.

Interpreted one way, I get the incorrect results. Interpreted another way, I get the correct algorithm.

The book:
"If two or more people share the same birthday, increase the counter and go to
the next trial (of the 5000). After the trials are over, divide the counter by the number of trials."

I see no ambiguity in the phrasing of the book, it is correct. What part makes you doubt?

And you are correct about the 364 and 365, but that is not the fault of the book. For the outcomes,
it does not much matter if you use 364 or 365.

Gteetz,
Piet

Junilu Lacar
Sheriff
Posts: 11747
191

Piet Souris wrote:
I see no ambiguity in the phrasing of the book, it is correct. What part makes you doubt?

The fact that at least three two people, including myself, interpreted the book incorrectly is a good indication that there is, in fact, some ambiguity in interpreting it.

Edit: I guess I shouldn't have counted Carey -- the comment he added indicates he had the correct interpretation. Must be just me and the OP then. ¯\_(ツ)_/¯

Junilu Lacar
Sheriff
Posts: 11747
191

Piet Souris wrote:
The book:
"If two or more people share the same birthday, increase the counter and go to
the next trial (of the 5000). After the trials are over, divide the counter by the number of trials."

That is NOT what the OP posted:

As the birthdays are randomly generated, use the contains method of a HashSet to see if someone with the same birthday is already in the room. If so, increment a counter that tracks how many times at least two people have the same birthday and then move on to the next trial.

What you wrote is your interpretation, which is correct, and is pretty much the same as what I suggested would be a clearer statement of the algorithm: "If there are at least two people who have the same birthday, increment the counter and move on to the next trial."

dhrubo bhattacharjee
Greenhorn
Posts: 23
Thank you everyone for the wonderful suggestions.This is how my code looks like now.Please let me know if it is ok.

fred rosenberger
lowercase baba
Bartender
Posts: 12613
50

dhrubo bhattacharjee wrote:Please let me know if it is ok.

well...do YOU think it is OK? Have you tested it? Have you gotten the results you'd expect?

Carey Brown
Saloon Keeper
Posts: 4065
48
• 1

Ranch Hand
Posts: 211
+1 to using several different methods, each one to:

1. randomly generate a list of birthdays
2. check if a duplicate exists from a given list (true or false)
3. combination of 1 and 2 above to estimate a probability for a number of trials
4. use all 3 above to estimate a probability for a given number of people

So all your main method should be doing is calling 4 and printing out the results.

-------

Actually reading the assignment again it looks like you're supposed to kinda combine 1 and 2 into the same action. As in, you're randomly generating birthdays and then you immediately check for a duplicate after each generation. Not sure how much leeway you either have or want to give yourself, but I still feel its much easier conceptually to break the program into different parts even if it might be less efficient.

Junilu Lacar
Sheriff
Posts: 11747
191
Your results should be something like this, with only a few hundredths/thousandths difference:

I got these by running 400,000 trials - they are very close to the theoretical probabilities: http://en.wikipedia.org/wiki/Birthday_problem