Win a copy of The Way of the Web Tester: A Beginner's Guide to Automating Tests this week in the Testing forum!

# generating unique Random Numbers

mike mimmis
Ranch Hand
Posts: 57
Hi Ranchers,

I am developing a project in which I need to generate unique random numbers between two given input numbers. I have implemented the following but not getting as expected.

It is generating unique numbers but most of them are serial numbers like 1 2 3 4 5 6 8 7 9 10 11 13 14 12...... but i need random numbers which are not serial order...

Thanks and Regards,

Mike....

Riaan Nel
Ranch Hand
Posts: 160
Hi Mike

Look at the line below.

Math.random() returns a double value between 0 and 1. If you try to cast 0.xxx to an int, you'll always get zero. There isn't really a random number in that line - essentially, it looks just like this:
Also, the very first cast to int looks to be redundant.

Consider mulitplying the double value with a larger int (e.g. 100, for a range of values between 0 and 100) before casting to int. Alternatively, look at the java.util.Random class.

Ranch Hand
Posts: 384

check java.util.Random

int nextInt(int n) :
Returns a pseudorandom, uniformly distributed int value between 0 (inclusive) and the specified value (exclusive), drawn from this random number generator's sequence.

Martin Vajsar
Sheriff
Posts: 3752
62
• 1
If the size of the range you need to cover with random numbers is not prohibitively large, I'd simply create a list of all integers in the given range and shuffled the list (there is a method in Java to do so). In my opinion, this approach should be viable even for million of numbers (with ArrayList of Integers, the memory requirement will be around 10-20 MB - relatively tiny compared to today's typical memory sizes).

Depending on your exact needs, other approaches could be viable. But keep in mind that generating random numbers is a difficult subject and getting it right can be surprisingly hard. As far as I know, there is not a function in JDK that would generate unique random numbers in given range.

Winston Gutkowski
Bartender
Posts: 10527
64
mike mimmis wrote:I am developing a project in which I need to generate unique random numbers between two given input numbers.

Do you mean that the numbers need to be different every time (as, for example, for a Lotto draw)?

If that's the case, and the range isn't too big, one method is to create a List of all the possible numbers and then use Collections.shuffle(). Another alternative, if the range may be very large, is to keep a HashSet or BitSet of 'previously drawn' numbers (the choice will depend on the spread and how many numbers you need to draw).

HIH

Winston

Rob Spoor
Sheriff
Posts: 20709
68
Riaan Nel wrote:Hi Mike

Look at the line below.

Math.random() returns a double value between 0 and 1. If you try to cast 0.xxx to an int, you'll always get zero. There isn't really a random number in that line - essentially, it looks just like this:
Also, the very first cast to int looks to be redundant.

I think you are mistaken. The calculation will be htnostarts (int) + (diff+1) (int) * (Math.random()) (double). The result of this calculation (int + int * double) will be a double, so the rounding / truncating you mentioned will only take place because of the outer cast to int, which is done over the final result of the calculation, not the intermediate random number.

Riaan Nel
Ranch Hand
Posts: 160
Rob Spoor wrote:
Riaan Nel wrote:Hi Mike

Look at the line below.

Math.random() returns a double value between 0 and 1. If you try to cast 0.xxx to an int, you'll always get zero. There isn't really a random number in that line - essentially, it looks just like this:
Also, the very first cast to int looks to be redundant.

I think you are mistaken. The calculation will be htnostarts (int) + (diff+1) (int) * (Math.random()) (double). The result of this calculation (int + int * double) will be a double, so the rounding / truncating you mentioned will only take place because of the outer cast to int, which is done over the final result of the calculation, not the intermediate random number.

Rob, you're spot on. I have no idea what I was thinking when I posted that.

Rob Spoor
Sheriff
Posts: 20709
68
Even with a + instead of *, the result would still be a double. The only difference is that htnostarts and diff + 1 could be combined into roomBcap + 1.

Jayesh A Lalwani
Rancher
Posts: 2762
32
• 1
I think the problem could be that you are using a HashSet. HashSet does not guarantee that the order of the items added in the Set is preserved. I am not sure, but most likely it is reordering your results using the hash code of the results. Since, the hash of an integer is the integer itself, the end result is that your randomly generated numbers are being sorted by the HashSet

Try using a set that preserves order

dennis deems
Ranch Hand
Posts: 808
Jayesh A Lalwani wrote:I think the problem could be that you are using a HashSet. HashSet does not guarantee that the order of the items added in the Set is preserved. I am not sure, but most likely it is reordering your results using the hash code of the results. Since, the hash of an integer is the integer itself, the end result is that your randomly generated numbers are being sorted by the HashSet

Try using a set that preserves order

This. I tried an ArrayList and the output appeared suitably random.

Jayesh A Lalwani
Rancher
Posts: 2762
32
Booyah! The devil is in the details

Martin Vajsar
Sheriff
Posts: 3752
62
Dennis Deems wrote:
Jayesh A Lalwani wrote:I think the problem could be that you are using a HashSet. HashSet does not guarantee that the order of the items added in the Set is preserved. I am not sure, but most likely it is reordering your results using the hash code of the results. Since, the hash of an integer is the integer itself, the end result is that your randomly generated numbers are being sorted by the HashSet

Try using a set that preserves order

This. I tried an ArrayList and the output appeared suitably random.

This was a good catch.

Unfortunately, if you just use ArrayList instead of a HashSet, the output might contain repeated (non-unique) numbers. It violates the original request that the numbers must be unique.

Jayesh A Lalwani
Rancher
Posts: 2762
32

Martin Vajsar
Sheriff
Posts: 3752
62
Jayesh A Lalwani wrote:LinkedHashSet will do the trick

Indeed. But shuffling an ArrayList populated with numbers from required range performs better (in O(n) time, compared to O(n^2) of the algorithm in this thread), takes less memory, and probably provides better distribution of possible permutations. Plus, it would make it more clear what the code is meant to do, which should be the most important decision factor of these.

Mike Simmons
Ranch Hand
Posts: 3090
14
Martin Vajsar wrote:
Jayesh A Lalwani wrote:LinkedHashSet will do the trick

Indeed. But shuffling an ArrayList populated with numbers from required range performs better (in O(n) time, compared to O(n^2) of the algorithm in this thread), takes less memory, and probably provides better distribution of possible permutations. Plus, it would make it more clear what the code is meant to do, which should be the most important decision factor of these.

Hmmm... I like the shuffle approach, but I disagree with several of these claims. The performance and memory usage depend a lot on the ratio of the number of values to be selected (I'll call this N) to the number of values in the range (call it R). As long as N << R, performance of both algorithms is essentially O(N). Obviously this changes as N approaches R, and the LinkedHashSet solution is impossible for N > R. Memory usage can be considerably worse for the shuffle solution, if N << R. And I don't see any reason why the probability distribution would be any different for the two techniques.

I agree that the shuffle code would be clearer, though, and that that's probably most important.

I would add that the shuffle code is more consistent in its performance. The LinkedHashSet method just gets slower as you go, and especially as N -> R it can take an arbitrarily long time, which is troubling.

The shuffle algorithm seems to take more time to set up initially, since you need to initialize the whole array or list, and shuffle it entirely before getting the first result. However it's possible to perform this work lazily, shuffling just one element at a time, and initializing elements on an as-needed basis.

Martin Vajsar
Sheriff
Posts: 3752
62
Mike, I was talking about the algorithm in the original post, for which N = R. Unfortunately, during editing the explicit reference to the original algorithm fell out - my bad. Your general analysis holds, of course.

Regarding probability distribution, I was probably wrong. I wanted to state that Collections.shuffle creates as good permutations as it gets, so any "do it yourself" algorithm will be either the same in this regard, or worse.

Pat Farrell
Rancher
Posts: 4678
7
mike mimmis wrote:I am developing a project in which I need to generate unique random numbers between two given input numbers. I have implemented the following but not getting as expected.

As written, what you are asking for is mathematically impossible. If the result is a function of the two input numbers, its not random.

If you want a 'pseudo-random' wad of bits that are a function of the two input numbers, simple use a cryptographically strong hash function on the input. Sha256 is part of the standard Java JDK/JRE, and its fairly easy to use.

You will still be subject to the birthday paradox relative to your two numbers, but that's a simple consequence of our world of mathematics and probability.

Winston Gutkowski
Bartender
Posts: 10527
64
Pat Farrell wrote:As written, what you are asking for is mathematically impossible.

It seems to me that you're splitting hairs. If it comes to it, it is impossible for a computer algorithm to produce a random number, period - the range has nothing to do with it.

If you want a 'pseudo-random' wad of bits that are a function of the two input numbers, simple use a cryptographically strong hash function on the input. Sha256 is part of the standard Java JDK/JRE, and its fairly easy to use.

Which still doesn't solve OP's problem of wanting unique random (or, if you must, unpredictable) numbers. As Mike said, if the number required is the same as the range spread, pretty much the only reasonable way of doing that is to perform a shuffle.

Winston

Jayesh A Lalwani
Rancher
Posts: 2762
32
One thing to note is that the algorithm posted above has a very good chance that it will run for more than n iterations and a very slim chance that the number of iterations will be much larger than n. Depending on the range, there is a good chance that you will get collisions, and you are basically retrying the random number generation on collision.

So, for example, let't say you are generating numbers between 1 and 100, and you have already generated 99 numbers, and are trying to generate that 100th number. There is a 99% chance that you will get a collision and you will need to try again. My probability fu is not strong as it used to be(correct me if I'm wrong here), but I seem to remember than there is a 20% chance that you will not get the number in 80-81 iterations

The shuffle algorithm will be much more predictable.

Mike Simmons
Ranch Hand
Posts: 3090
14
Pat Farrell wrote:
mike mimmis wrote:I am developing a project in which I need to generate unique random numbers between two given input numbers. I have implemented the following but not getting as expected.

As written, what you are asking for is mathematically impossible. If the result is a function of the two input numbers, its not random.

But he didn't say it was a function of the two input numbers - at least, not only those two numbers. He just sait the result must be between those two numbers. There could be other things determining the result, such as a true random number generator based on measurement of some random physical phenomenon. Or a pseudorandom generator based on 'hidden' inputs such as current time in nanoseconds, or other more secure hidden inputs.

Pat Farrell
Rancher
Posts: 4678
7
Mike Simmons wrote:But he didn't say it was a function of the two input numbers - at least, not only those two numbers. He just sait the result must be between those two numbers.

That too is impossible in general. Now only is it impossible to have a guarantee of unique values, but its also true that for some input values, there are no float/double values between the two input values.

When you get into the details of comparing floating point numbers, you run across the concept of a 'ulp'

maxUlps the maximum error in terms of Units in the Last Place
or "ulp" stands for "unit of least precision". A difference of one ulp between two floats
indicates that they're "adjacent" floats; that there's no value in between them.

While real numbers are infinite, floating points on computers can be adjacent.

Winston Gutkowski
Bartender
Posts: 10527
64
Pat Farrell wrote:
Mike Simmons wrote:But he didn't say it was a function of the two input numbers - at least, not only those two numbers. He just sait the result must be between those two numbers.
That too is impossible in general.

Actually, it isn't. It's perfectly possible to have a random interval between two values.

While real numbers are infinite, floating points on computers can be adjacent...

Again. Not really sure what your point is, except that computers can't generate true random numbers. We seem to have managed to write a lot of software that relies on the pseudo-random ones they do produce though. And if you're really that worried about it, you can always make sure that your range limits are separated by at least two ULPs.

Winston

Pat Farrell
Rancher
Posts: 4678
7
Winston Gutkowski wrote:
While real numbers are infinite, floating points on computers can be adjacent...

Again. Not really sure what your point is, except that computers can't generate true random numbers. We seem to have managed to write a lot of software that relies on the pseudo-random ones they do produce though. And if you're really that worried about it, you can always make sure that your range limits are separated by at least two ULPs.

How can you do that? If you accept arbitrary inputs, you have no control over how far apart the numbers are.

We seem to be talking past each other.

My point is that as stated, the OP's desires can't be met. Change the criteria or assumptions and sure, the OP can do something. But as stated it is either too vague or impossible.

dennis deems
Ranch Hand
Posts: 808
It's not the numbers themselves that are random, but the selections from among them. Not that the distinction matters terribly.

Mike Simmons
Ranch Hand
Posts: 3090
14
So, Pat, I guess you're dropping the original reason you gave? "If the result is a function of the two input numbers, its not random." Which seems to be based on a misunderstanding of what the poster said. That is what I was originally responding to. Are we done with that one, then?

Now you seem to be assuming that the problem is about real numbers. I agree that it's highly problematic to talk about uniqueness of real numbers using floating-point approximations. However the poster never said he's talking about real numbers, and a glance at the code seems to confirm he is in fact talking about integers. Ulps are not relevant here.

Paul Clapham
Sheriff
Posts: 21443
33
Dennis Deems wrote:It's not the numbers themselves that are random, but the selections from among them. Not that the distinction matters terribly.

I think that sounds right. When you say you want "unique random" numbers you're already using contradictory terms; if you pick numbers randomly from a finite set then you have to expect duplicates eventually. (Imagine the finite set having only two members, then picking random numbers from it is equivalent to tossing a coin repeatedly.)

But if after picking a random number from the set you remove it from the set, then you aren't going to get random numbers any more. You are going to get unique numbers, though, which is the more important half of the requirements. And you could certainly say they were randomly chosen. And I agree that the distinction doesn't matter, it's just pickiness over terminology.

Mike Simmons
Ranch Hand
Posts: 3090
14
Usually when this sort of problem comes up, we also get people saying that it's fundamentally impossible to have the numbers be both random and guaranteed to be unique - that "random" implies there may be repetition. Personally I've always considered it fairly obvious that when someone says "random unique numbers" they mean as random as possible, subject to the additional constraint that they be unique as well. At each step, choose randomly from all the available unique numbers that have not been used yet. Some people will not be happy unless we rephrase this as "random selection without replacement", rather than "random unique numbers". Whatever - I figure the objective was clear enough, based on the original phrasing and the code shown.

Mike Simmons
Ranch Hand
Posts: 3090
14
Hah. I wrote that before reading Paul C's post. Fortunately Paul made the observation in a much more flexible, pragmatic way than I've seen others do it in the past.

Paul Clapham
Sheriff
Posts: 21443
33
Yes, people fundamentally don't understand randomness. If you ask somebody to write down a sequence of, say, 100 random digits, they are going to generate a sequence which fails to be random because it doesn't contain enough runs of duplicate digits. And when I say "somebody" I'm including you and me.

So when your manager requests "random" numbers you don't have to use mathematical definitions of randomness. Even in the unlikely event that your manager is a mathematician, that still isn't what he or she meant. You have a great deal of leeway in practice.

Pat Farrell
Rancher
Posts: 4678
7
Paul Clapham wrote:So when your manager requests "random" numbers you don't have to use mathematical definitions of randomness. Even in the unlikely event that your manager is a mathematician, that still isn't what he or she meant. You have a great deal of leeway in practice.

Yes a typical manager has no clue what a random number is.

Any number generated on a computer, random and unique contradictory.

Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin.
John von Neumann, 1951, quoted by Knuth

If you are trying to use random numbers for a nonce or other security function, you really need to use a cryptographic set of functions, for random, hash, etc.

Saurabh Pillai
Ranch Hand
Posts: 524
Is it safe to say that random number generator is actually non-random. It is just that the parameters/inputs used to generate the number is unknown to us. If we know the parameters, we know the next number to be generated?

Mike Simmons
Ranch Hand
Posts: 3090
14
That's true for the pseudo-random generators typically used on computers - like java.util.Random, for example. Or for the more cryptographically secure alternatives available elsewhere in the libraries. However it's also possible to generate true random numbers by observing certain random physical processes (e.g. in quantum mechanics). Most of us don't have such equipment hooked up to our computers, however.

Jayesh A Lalwani
Rancher
Posts: 2762
32
Actually pseudo random number generation works better than true random number generation in some applications. For example, if you are doing a Monte-Carlo simulation, and you need to be able to repeat your results at a later date. So, rather than storing information about the random sample points that you took during a simulation, you can use a pseudo random number generator and store the seed. Next time you want to repeat your results, you regenerate your sample points by using the same seed

At the very least it makes debugging easier.

Saurabh Pillai
Ranch Hand
Posts: 524
Mike Simmons wrote:However it's also possible to generate true random numbers by observing certain random physical processes (e.g. in quantum mechanics). Most of us don't have such equipment hooked up to our computers, however.

I do not understand this. Can you give me more information(link or something) explaining from using 'random physical processes' all the way to generated number. I am not able to establish a link here.

Pop-up Quiz,

Let's say in 10 seconds, you need to pick 20 numbers. No range. How does our mind pick those exact numbers?

Mike Simmons
Ranch Hand
Posts: 3090
14
Jayesh: I agree; good points.

Saurabh: For your quiz: 1,2 ,3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20. Because the problem is underspecified, and I have no reason to spend time digging for better requirements.

http://en.wikipedia.org/wiki/Random_number_generation#Physical_methods
http://en.wikipedia.org/wiki/Hardware_random_number_generator

Jayesh A Lalwani
Rancher
Posts: 2762
32
Actually, humans make very bad random number generators

I am taking a jujitsu class, and as part of the class, we had to "attack" our partner with a hand or a leg strike. Before striking, we had to call out "felony" or "misdemeanor" and the partner had to change their defense based on whether the attack is a felony or misdemeanor. I started getting very self aware of the patterns in my attacks. I was like felony-felony-misdemeanor-felony-felony-misdemeanor-oh I'm following a pattern-misdemeanor-felony-felony-misdemeanor-misdemeanor-of crap another pattern.

Concious human brains seek order. It;s very hard for a person to stay truly random. Fencers, Boxers and Poker players actually take advantage of this "weakness". Everyone has a pattern. The difference between good players and bad ones is that the better ones have longer patterns and bigger set of patterns to draw from.

Mike Simmons
Ranch Hand
Posts: 3090
14
And this is also why it's possible to have a World Championship of rock, paper, scissors.

Pat Farrell
Rancher
Posts: 4678
7
Saurabh Pillai wrote:Is it safe to say that random number generator is actually non-random. It is just that the parameters/inputs used to generate the number is unknown to us. If we know the parameters, we know the next number to be generated?

All computer "random number" generates are more properly called pseudo-random number generators. Thus the quote from Von Neumann. Not only is it 100% true that
If we know the parameters, we know the next number to be generated?

but it is even true that with a bit of work, you can figure out the parameters from the output.

Pat Farrell
Rancher
Posts: 4678
7
Saurabh Pillai wrote:
Mike Simmons wrote:However it's also possible to generate true random numbers by observing certain random physical processes (e.g. in quantum mechanics). Most of us don't have such equipment hooked up to our computers, however.

I do not understand this. Can you give me more information(link or something) explaining from using 'random physical processes' all the way to generated number. I am not able to establish a link here.

Google is your friend. And you are correct in that real random numbers are impossible on a computer without an external sensor. The best sensors use something like the decay of an atomic particle.

Weaker ones use things like the timing of internet packets on the computer's NIC. That can be influenced by a Bad Guy (tm) if they can send packets on the same subnet that your computer is on.

This is a complex topic, well covered in the cryptographic studies.

Winston Gutkowski
Bartender
Posts: 10527
64
Pat Farrell wrote:but it is even true that with a bit of work, you can figure out the parameters from the output.

A bit of work? For linear congruential, maybe, but there are a whole raft of algorithms out there for which it is, for all practical purposes, impossible to work out the parameters from the output. Or maybe this is a mathematical 'bit'...

Paul Clapham wrote:I think that sounds right. When you say you want "unique random" numbers you're already using contradictory terms...
But if after picking a random number from the set you remove it from the set, then you aren't going to get random numbers any more. You are going to get unique numbers, though, which is the more important half of the requirements. And you could certainly say they were randomly chosen.

There is, however, such a thing as a perfect shuffle: that is, a shuffle of a finite set of values in which each value has an equal probability of appearing at a particular position; and the chances are that Collections (or Arrays) .shuffle() comes pretty close to that. I also wouldn't be at all surprised if it produces a better distribution than choosing random numbers and eliminating collisions, since the collisions themselves are likely to highlight any predictability or skew in the RNG used.

Winston