Here's another polymorphic algorithm that uses the preceding swap method.
This algorithm, which is included in the Java platform's Collections class, randomly permutes the specified list using the specified source of randomness. It's a bit subtle: It runs up the list from the bottom, repeatedly swapping a randomly selected element into the current position. Unlike most naive attempts at shuffling, it's fair (all permutations occur with equal likelihood, assuming an unbiased source of randomness) and fast (requiring exactly list.size()-1 swaps).
Why does every permutation occur with equal likelihood???
If you think about what it's doing: it's picking a random element and putting it at the end of the list. All elements are equally likely. It's then picking a random element from those left over and putting that at the end-but-one of the list. And it carries on like that.
Given that, how could it privilege one permutation over another? The initial permutation of the list can't affect the likelihood of the outcome.
I do agree that every permutation might be a possible outcome, but why in the same likelihood?
Indexes that are closer to zero have more chances to be shifted, since they might be picked up during more iterations.
So if for example the initial permutation has X in index 0, there's a bigger chance that X will be located relatively far from index 0 in the final permutation.
What does it give us that in every iteration we pick a random number from a smaller and smaller range? It would be the same complexity as always picking a random number from list.size(). And then it would be obvious why every final permutation has the same probability.
Right now it does not seem obvious to me. Or it requires some real calculation.
Say you've got a pack of cards, and you want to put them in a random order. You can do the following. Pick a card at random, and put it on the table. Pick another card at random, and put it on top of that. Keep going until you're finished. How can one permutation be more likely than another? The initial ordering of the pack is irrelevant, because each time we're picking a random value. This algorithm is identical in concept to that.
Or look at it another way. On the nth step, the chance of picking a particular remaining slot is 1 in (N - n + 1) (N is the size of the array). We can sort the remaining slots into order, or even shuffle them completely - it doesn't change the probabilities of the next value chosen. The multiple swapping of the earlier slots is irrelevant - it's only when their final value is fixed that matters.
But if you want the brute force approach, let's calculate the probability that a particular value is chosen on the nth step. So it needs to be not chosen on the first, not chosen on the second....not chosen on the (n-1)th, and chosen on the nth.
Nearly all the terms cancel, and you're left with 1/N, which is what you'd expect from a perfect shuffling algorithm.
Ok I got what you said about the pack of cards. It's a good explanation. When I re think about it, I do not understand why it bothered me that the swap keeps sending numbers to the subgroup of the numbers to choose from in the next iteration.
But as for the probability calculation, I think that the probability that a particular value is chosen on the nth step is 1/(N-n+1) ???
Once chosen, it cannot be chosen on the following steps.
So the total probability for a certain permutation is 1/N * 1/(N-1) * ... * 1/1 = 1/N!
Which makes sense.
Joseph Sweet wrote:But as for the probability calculation, I think that the probability that a particular value is chosen on the nth step is 1/(N-n+1) ??
That's a slightly different question. That's the probability that a value is chosen on the nth step given that it wasn't chosen on any of the first (n - 1) steps. Mine is the unconditional probability - the probability at the starting point that the value will be chosen on the nth step. The two are related - that's where the final term in my expression comes from.