Francois Meyer

Greenhorn

Posts: 2

posted 2 years ago

I'm working on a program that compares the performance of different sorting algorithms. The results that I am getting are surprising. My Shellsort is constantly beating my Quicksort in terms of performance. I assume that it is something sub-optimal in my Quicksort algorithm, but it seems like the standard algorithm. My code is as follows.

Shellsort:

Quicksort:

What could be the reason for this?

Shellsort:

Quicksort:

What could be the reason for this?

posted 2 years ago

It could be that your benchmarking technique is biased. It's surprisingly hard to write Java benchmarks which produce valid comparisons. Even something as ordinary as always running one option before the other option can lead you into one of the benchmark traps.

Chan Ag

Rancher

Posts: 1090

14

posted 2 years ago

And sorting algorithms' performance could vary greatly because of your input. Remember that shell sort's performance tends to become linear as you approach the best case.

Also if your input has several equal keys, the standard quick sort ( like yours ) has too much overhead just exchanging equal elements. Quick sort tends to perform with quadratic performance in the presence of too many duplicate keys. Also you could see a performance gain if you randomly shuffled the input to your quick sort algorithm but I suspect you already know that.

[ I haven't looked in detail at your quick sort cases. ]

Also if your input has several equal keys, the standard quick sort ( like yours ) has too much overhead just exchanging equal elements. Quick sort tends to perform with quadratic performance in the presence of too many duplicate keys. Also you could see a performance gain if you randomly shuffled the input to your quick sort algorithm but I suspect you already know that.

[ I haven't looked in detail at your quick sort cases. ]

William Brogden

Author and all-around good cowpoke

Rancher

Rancher

Posts: 13078

6

posted 2 years ago

There are many test cases for input to sorting benchmarks. Try:

1. already sorted

2. already sorted but one element is out of place

3. inverse sorted

4. lots of duplicates

Personally I ended up with a hybrid of a bucket sort followed by shell sort for large numbers of words - of course having a limited number of possible starting letters helps a bunch.

Like Paul said, benchmarking java is loaded with traps.

Bill

1. already sorted

2. already sorted but one element is out of place

3. inverse sorted

4. lots of duplicates

Personally I ended up with a hybrid of a bucket sort followed by shell sort for large numbers of words - of course having a limited number of possible starting letters helps a bunch.

Like Paul said, benchmarking java is loaded with traps.

Bill

Chan Ag

Rancher

Posts: 1090

14

posted 2 years ago

Sorry this is after many days. But I think it's worth mentioning.

Your quick sort is not really securing the pivot in its correct position with each invocation of the sort method. As a result, the array ranges you are sorting in each sort call are longer than they should be. So you are not really quick sorting your array. Eventually it works because the pivot still has the scope to be put in its correct position. However that happens much later.

For example, if you had an array of ints as follows.

1, 2, 3, 4, 5, 7, 4, 6, 10.

Your first invocation would lead to the following sort ranges.

1, 2, 3, 4, 4

and 7, 5, 6, 10

and your pivot was 5.

I really think you should fix that.

Your quick sort is not really securing the pivot in its correct position with each invocation of the sort method. As a result, the array ranges you are sorting in each sort call are longer than they should be. So you are not really quick sorting your array. Eventually it works because the pivot still has the scope to be put in its correct position. However that happens much later.

For example, if you had an array of ints as follows.

1, 2, 3, 4, 5, 7, 4, 6, 10.

Your first invocation would lead to the following sort ranges.

1, 2, 3, 4, 4

and 7, 5, 6, 10

and your pivot was 5.

I really think you should fix that.

posted 2 years ago

- 1

One my my classic examples of why the "best" solution isn't always the best solution hinges on just such a case.

Long ago in my mainframe days, I was responsible for a critical system program. It ran over 100 times a day and it had to run correctly or basically the whole shop would shut down.

Part of the data that controlled this program was a 4-character report ID code. The codes basically ran at different levels, defined by the first character in the code. But the codes came in in an order that was ALMOST, but not completely sorted. The "X" level codes occurred somewhere about the middle.

The worst-case scenario for heapsorts and quicksorts is just such a situation. Heap sorting is done by tossing stuff into piles, recursing, sorting, and merging. In the case of ordered data, everything goes into the same pile, so there's a lot of overhead when it comes time to merge a full pile with an empty one (details are a little more subtle than that, but I hope you get the idea).

However, a simple bubble sort also doesn't do well when stuff is nearly ordered except for chunks that are way out of order, since the out-of-order chunks have to slowly bubble over multiple passes.

A Shell-Metzner sort, however, is ideal, since it's a binary bubble operation. On the initial passes, it moves out-of-place data up or down in big relocations, then slowly refines until the last pass or so becomes a straight bubble.

So the best solution for my particular case was not QuickSort or HeapSort, it was Shell.

Long ago in my mainframe days, I was responsible for a critical system program. It ran over 100 times a day and it had to run correctly or basically the whole shop would shut down.

Part of the data that controlled this program was a 4-character report ID code. The codes basically ran at different levels, defined by the first character in the code. But the codes came in in an order that was ALMOST, but not completely sorted. The "X" level codes occurred somewhere about the middle.

The worst-case scenario for heapsorts and quicksorts is just such a situation. Heap sorting is done by tossing stuff into piles, recursing, sorting, and merging. In the case of ordered data, everything goes into the same pile, so there's a lot of overhead when it comes time to merge a full pile with an empty one (details are a little more subtle than that, but I hope you get the idea).

However, a simple bubble sort also doesn't do well when stuff is nearly ordered except for chunks that are way out of order, since the out-of-order chunks have to slowly bubble over multiple passes.

A Shell-Metzner sort, however, is ideal, since it's a binary bubble operation. On the initial passes, it moves out-of-place data up or down in big relocations, then slowly refines until the last pass or so becomes a straight bubble.

So the best solution for my particular case was not QuickSort or HeapSort, it was Shell.

An IDE is no substitute for an Intelligent Developer.