Say you've got a pack of cards and you want to put them in a random order. You can do the following: pick a card at random and put it on the table; pick another card at random and put it on top of that; keep going until the pack is empty. How can one permutation be more likely than another? The initial ordering of the pack is irrelevant, because at each step we pick uniformly at random from whatever remains. The shuffle algorithm is identical in concept to that.
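The card-drawing procedure above can be sketched in a few lines of Python (the function name is mine, not from the original):

```python
import random

def shuffle_by_drawing(pack):
    """Draw cards at random from the pack, stacking them in draw order."""
    remaining = list(pack)  # work on a copy; the original pack is untouched
    pile = []
    while remaining:
        # Pick any remaining card uniformly at random...
        i = random.randrange(len(remaining))
        # ...and place it on top of the pile.
        pile.append(remaining.pop(i))
    return pile
```

Calling `shuffle_by_drawing(range(52))` gives a fresh random ordering each time; no ordering of the input changes the distribution of the output.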
Or look at it another way: on the nth step, the chance of picking any particular remaining slot is 1 in (N - n + 1), where N is the size of the array. We can sort the remaining slots into order, or even shuffle them completely; it doesn't change the probabilities for the next value chosen. The multiple swapping of the earlier slots is irrelevant: only their final values matter.
But if you want the brute-force approach, calculate the probability that a particular value is chosen on the nth step. It needs to be not chosen on the first step, not chosen on the second, ..., not chosen on the (n-1)th, and then chosen on the nth:

((N-1)/N) x ((N-2)/(N-1)) x ... x ((N-n+1)/(N-n+2)) x (1/(N-n+1))
Nearly all the terms cancel, and you're left with 1/N, which is exactly what you'd expect from a perfect shuffling algorithm.
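You can also convince yourself of the 1/N result empirically. This quick simulation (the setup is mine) counts how often each value lands in each position over many shuffles; every (value, position) pair should come up with frequency close to 1/N:

```python
import random
from collections import Counter

def shuffle_by_drawing(pack):
    """Repeatedly draw a uniformly random card from those remaining."""
    remaining = list(pack)
    pile = []
    while remaining:
        pile.append(remaining.pop(random.randrange(len(remaining))))
    return pile

N = 4
trials = 100_000
counts = Counter()
for _ in range(trials):
    for position, value in enumerate(shuffle_by_drawing(range(N))):
        counts[(value, position)] += 1

# Each of the N*N (value, position) pairs should have frequency near 1/N.
for pair, c in sorted(counts.items()):
    print(pair, round(c / trials, 3))  # each close to 0.25 for N = 4
```

With 100,000 trials the observed frequencies sit well within sampling noise of 0.25, matching the telescoping-product argument above.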