Win a copy of Head First Agile this week in the Agile forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic

Find kth smallest element in an unsorted array  RSS feed

 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I want to find kth smallest element in unsorted array without fully sorting the array. I use Quicksort to partially sort the array and find the kth smallest element. But i get error. Here is the code

 
Knute Snortum
Sheriff
Posts: 4091
112
Chrome Eclipse IDE Java Postgres Database VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What is the error? On what line? Do you have a stack trace?
 
Stefan Evans
Bartender
Posts: 1836
10
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I gave it a run. Error I get is a stack overflow. It gets stuck in an endless recursion because of bugs in the partition method.

Can you explain how your partition method is supposed to work?

I would suggest putting in some debugging statements so you can see what is actually happening.
And printing out the array at each step along the way so that you can see the results of your machinations.
Some example debug statements:


One of the things wrong is the return value of your partition function. Should it really be return a[pivot] ?
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Knute Snortum wrote:What is the error? On what line? Do you have a stack trace?


Line 34 gives stackover flow error
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Stefan Evans wrote:I gave it a run. Error I get is a stack overflow. It gets stuck in an endless recursion because of bugs in the partition method.

Can you explain how your partition method is supposed to work?


select pivot initially 0,

increment "i" until a[i]> a[pivot]

increment "j" until a[j]< a[pivot]

if "i" and "j" doesn't cross each other, swap a[i] & a[j] values

and finally swap the a[j] and a[pivot] and return the new pivot value



One of the things wrong is the return value of your partition function. Should it really be return a[pivot] ?


May be you right. I don't know what should be the new pivot value? I thought a[pivot] is the new pivot value and i returned it.
What you suggest what should be the return value?

here is the video link to the algorithm

webpage
 
Molayo Decker
Ranch Hand
Posts: 48
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi
You are using a recursive Binary search .To sort your arrays in natural order. That is from the smallest to the largest, i suggest you use the Collections framework TreeSet array.
example Set<Integer> mySet = new TreeSet<Integer>();

This will sort your array from the smallest to the largest;

Instead of using a while loop why don't you change it to a for loop. This will make your code much simpler.

 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Molayo Decker wrote:Hi
You are using a recursive Binary search .To sort your arrays in natural order. That is from the smallest to the largest, i suggest you use the Collections framework TreeSet array.
example Set<Integer> mySet = new TreeSet<Integer>();

This will sort your array from the smallest to the largest;

Instead of using a while loop why don't you change it to a for loop. This will make your code much simpler.



I don't want to fully sort the array. That would be brute force approach. I am trying to find the kth element without fully sorting the array. That's why i use quick sort to partially sort the array.
 
Campbell Ritchie
Marshal
Posts: 55793
164
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Molayo Decker wrote:. . . TreeSet array.
example Set<Integer> mySet = new TreeSet<Integer>();

This will sort your array from the smallest to the largest . . .
No, it won't. It will lose any duplicates. If your array look like this
[1, 2, 99, 3, 4, 99, 999, -123, 99, 5, 6, 99]
you will get an incorrect result because the set will only contain 99 once.

I personally think you should sort the array or not sort the array. Don't try half‑sorting the array.. Copy the array into a List and sort that List.… and you will get a nice out of bound exception if you have fewer than 1000 elements in the List
Or copy the array and sort the copy.
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Campbell Ritchie wrote:

I personally think you should sort the array or not sort the array. Don't try half‑sorting the array..


Is there any way to find the smallest kth element without sorting?
 
Winston Gutkowski
Bartender
Posts: 10573
65
Eclipse IDE Hibernate Ubuntu
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Molayo Decker wrote:This will sort your array from the smallest to the largest;

Assuming that values aren't repeated. If they are, then you'd need a TreeMap<Key, Count>-like structure.

Interesting little problem. The median of medians clearly seems to be optimal, but you could also do it in a single pass by "heapifying" the first k items in the array, and then swapping subsequent items in if they're less than heap.max() (which you could make array[0]). That would work in O(n log(k)) worst time, and could be done in-place; but I suspect Riaz is looking for an O(n) solution.

Winston
 
Campbell Ritchie
Marshal
Posts: 55793
164
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Riaz Ud Din wrote:Is there any way to find the smallest kth element without sorting?
One way is to use the Map suggestion which Winston made.
Another would be to iterate the array repeatedly looking for largest, next‑largest etc., values (or smallest...). Beware: you must count repeated values, so that latter suggestion will be awkward to implement and will run in quadratic time.

You would have to allow for repeated values in an array, unless you have been told there are no repeats. If there are no repeats it may be possible to iterate the array once counting large/small values. But that will be awkward to implement.

I don't think you should go calling sorting a brute force approach.
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:I suspect Riaz is looking for an O(n) solution.



No No No ... O(n) solution is not necessary. I am looking for any better solution.
 
Piet Souris
Rancher
Posts: 1984
67
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
To all: Riaz gave a link to a Youtube film, where this algorithm is explained.

I do not think you can get any better than O(n). Look at how you start:
increase i until a[i] > a[0], and do the same more or less for j.
That sounds very O(n) to me.

Interesting though. I had a quick look at that youtube-film, but since
the wife was playing her favorite music, I had to turn off the sound
(headphones are now on my Xmas wishlist). What I saw was
daunting, but I could not quite follow it.

The idea might be:
split the array into 'left' and 'right', where each a[i] in left <=
a[0], and each a[i] in right > a[0].
If lieft.length >= k, then the element sought is in left, So
do a recursion with left.
If left.len < k, then the element in question is in right, and
we need to find the element k - left.len.

Anyway: there is substantial less sorting done than in the real quicksort.

Interesting!
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

@Piet Souris

Any idea how to put that algorithm into code? i tried but failed.
 
Piet Souris
Rancher
Posts: 1984
67
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
hi Riaz,
no I just wrote down what came up in my head. Have you understood that Youtube film?
They might have used a completely different algorithm than what I mentioned.
If I get the chance, I will look at the film tonight.

But if I had to implement what I wrote, first thing I'd do is to use ArrayLists,
to avoid the hussle with arrays. Then split the array, using the first element as a pivot,
and look for the lengths of the two Lists. As soon as one of the lists has length 1,
you'd be done. Bu that may sound easier than it is...
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi again Piet Souris

finger cross, soon you get time to implement it successfully. I am cursing the moment i watched the algorithm video, i can't implement it successfully and now it's bugging me.
 
Winston Gutkowski
Bartender
Posts: 10573
65
Eclipse IDE Hibernate Ubuntu
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Riaz Ud Din wrote:No No No ... O(n) solution is not necessary. I am looking for any better solution.

It seems like a partial Quicksort is close to optimal, since it only sorts what it needs to - and never more than one "half" on any given iteration - so I suspect even the worst case is well less than O(n!); and if you choose a random pivot that's likely to be extremely rare, and I'd care to bet that it's O(n) on average.
The median of medians works by splitting into groups of 5, finding the median of each group, and then storing those as an array and applying the function recursively, so I suspect it's even quicker - but it's quite tough to visualise (I still haven't quite got my head around it yet ).

The nice thing about a heap solution is that heaps are relatively simple, and can be applied to any array (or portion of); and getting the "max" (or "min") value is an O(1) operation. Insertion is O(log (k)), but if k is small relative to n (the size of your array), then you shouldn't need to do it very often.
I'm also pretty sure that "heapifying" itself is an O(n) task, so the worst overall time case should be something like O(k) + O((n-k) log(k)) (which I think is still O(n log(k)) unfortunately).
You can also deal with the problem in stages:
1. Get your heap working.
2. Use it to solve the problem.

Plus (@Riaz), you get to implement a heap; which is a fun exercise anyway.

Winston
 
Liutauras Vilda
Marshal
Posts: 4670
320
BSD
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Riaz Ud Din wrote:Hi again Piet Souris

finger cross, soon you get time to implement it successfully. I am cursing the moment i watched the algorithm video, i can't implement it successfully and now it's bugging me.
Don't expect Piet Souris to do the homework for you. At best what could happen in that way, that Piet Souris will know how to do your homework and likely he would get good mark, but how about you? You'll need to solve similar problems in your daily job, and probably Piet Souris won't be sitting next to you. What's then?

I'd think start yourself, and Piet Souris and everyone else will advice you in case you miss some important parts.
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Liutauras Vilda wrote:
Riaz Ud Din wrote:Hi again Piet Souris

finger cross, soon you get time to implement it successfully. I am cursing the moment i watched the algorithm video, i can't implement it successfully and now it's bugging me.
Don't expect Piet Souris to do the homework for you. At best what could happen in that way, that Piet Souris will know how to do your homework and likely he would get good mark, but how about you? You'll need to solve similar problems in your daily job, and probably Piet Souris won't be sitting next to you. What's then?

I'd think start yourself, and Piet Souris and everyone else will advice you in case you miss some important parts.


Ha ha ha ha why you assumed it's my home work? Well it's not my home. Last time i got my home work was in 2005. Programming is not my field but in my spare time i try to learn Java. In fact, I suck at programming but i still like it and want to learn it, just for hobby. I welcome any suggestion/feedback to help me understand and implement this algorithm.
I did write the program for algorithm but at line 34 it give stackoverflow error and any help will be really to correct that code.
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:

The nice thing about a heap solution is that heaps are relatively simple, and can be applied to any array (or portion of); and getting the "max" (or "min") value is an O(1) operation. Insertion is O(log (k)), but if k is small relative to n (the size of your array), then you shouldn't need to do it very often.
I'm also pretty sure that "heapifying" itself is an O(n) task, so the worst overall time case should be something like O(k) + O((n-k) log(k)) (which I think is still O(n log(k)) unfortunately).
You can also deal with the problem in stages:
1. Get your heap working.
2. Use it to solve the problem.

Plus (@Riaz), you get to implement a heap; which is a fun exercise anyway.

Winston


I don't know much about heap but i will surely first study heap and then try to implement this algorithm.
 
Winston Gutkowski
Bartender
Posts: 10573
65
Eclipse IDE Hibernate Ubuntu
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Riaz Ud Din wrote:I don't know much about heap but i will surely first study heap and then try to implement this algorithm.

My advice: Don't stop/switch in mid-stream.

Try to get what you're trying now to work; and then (if you still feel like it), tackle a heap-based solution.

Like I say, I suspect it isn't the most optimal, but it has a few nice qualities:
1. It can be done iin-place.
2. It can handle sequential (or stream-based) input, where you don't know the value of n in advance.

Good luck.

Winston
 
fred rosenberger
lowercase baba
Bartender
Posts: 12542
48
Chrome Java Linux
  • Likes 3
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Riaz Ud Din wrote:[I]n my spare time i try to learn Java. In fact, I suck at programming but i still like it and want to learn it, just for hobby. I welcome any suggestion/feedback...

Make sure you learn that in programming, simpler is often the better choice. If you are doing this simply for fun, great. But in the "real world", I would much rather see someone write code that's easy to follow and understand than write something complicated and clever that shaves a few microseconds off the execution time.

Something like 90% of software's cost is in maintenance. The easier it is to fix and maintain, the better. In my opinion, a simple sort and then grab the k-th element is trivial to understand.



What you are describing sounds very compilcated. Considering some of the sharpest minds here on the ranch are making comments like

split the array into 'left' and 'right', where each a[i] in left <=
a[0], and each a[i] in right > a[0].
If lieft.length >= k, then the element sought is in left, So
do a recursion with left.
If left.len < k, then the element in question is in right, and
we need to find the element k - left.len.


That does not look very "simple".

just my 2-cents.
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
fred rosenberger wrote:
Make sure you learn that in programming, simpler is often the better choice. If you are doing this simply for fun, great. But in the "real world", I would much rather see someone write code that's easy to follow and understand than write something complicated and clever that shaves a few microseconds off the execution time.


Thanks very much for the feedback. Yes you right. KISS. Keep it simple and short. I am just learning Java for fun. I just randomly watched this algorithm on youtube and then i got interested in it to implement it but i miserably failed. lol
 
Piet Souris
Rancher
Posts: 1984
67
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Gentlemen,

what I really do not understand is:

OP is referring to a Youtube film. Has any of you watched that film?
If so, the central theme was to do NOT a sort, but to apply
some clever algorithm that uses some element switches.

Then, why o why are you telling OP to go for the easy way
of sorting, or for using heaps, or whatever.
At least try to explain what is wrong with OP's attempt
to implement the algorithm.

 
Winston Gutkowski
Bartender
Posts: 10573
65
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet Souris wrote:OP is referring to a Youtube film. Has any of you watched that film?

No, you're quite right there. However, many of us are familiar with the general problem.

Then, why o why are you telling OP to go for the easy way of sorting...

Because, as Fred pointed out, it's a VERY important point to learn: Simple is almost always best; - even if it isn't optimal.

or for using heaps, or whatever.

Because variety is the spice of life. Any algorithm involving a sort (and, I suspect, quickselect; and probably median of medians too) is NOT going to be good for dealing with sequential input.

At least try to explain what is wrong with OP's attempt to implement the algorithm.

But you've already done that so well,

Winston
 
AhFai Chan
Ranch Hand
Posts: 81
Android Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Riaz Ud Din wrote:I want to find kth smallest element in unsorted array without fully sorting the array. I use Quicksort to partially sort the array and find the kth smallest element. But i get error. Here is the code



I would use use the "bubble" logic.

Pick the number in the first index position 0, and that's your smallest, then compare it to the next one 1, and so forth, if j < i, replace i with j.

That way you get the k-th smallest or biggest, there might be several of them, so my logic would need to count occurrences of the same value.
 
Piet Souris
Rancher
Posts: 1984
67
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:
No, you're quite right there. However, many of us are familiar with the general problem.

hi Winston,

glad to see you are back and fully operating again!

And of course, this is a well known problem, and sorting would have been the very
first thing I would have done. However, the solution as used in the film
is certainly not 'general'. I'm planning to implement what I saw, and hopefully
I can tell OP what is wrong with the way he did it. But maybe I'm not clever
enough. I did the course at Coursera about reactive programming, and
in the first lesson we had to implement a Heap. Know what? I had never heard
of such a thing, looked it up at Wikipedia and fount it incredibly complicated!

But then again, I'm still very young, so plenty of time to learn all these things...
 
Stefan Evans
Bartender
Posts: 1836
10
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
TL:DR - You need to understand what you WANT your method to do before you can fix it.


As Piet pointed out, it is quite an interesting approach, and a nice twist on the standard quicksort algorithm.
It relies on the fact that the quicksort "pivot" point is sorted into its actual place in the array, everything greater than it above, and everything less below.
You can then choose which partition to sort, and discard (not sort) the other one as you know the kth value you are looking for is not in there.

@OP
I spent some time and did get the algorithm working (I think). At least for the example you've provided.
You are going along the right track, there are just a couple of things you haven't thought all the way through.
My comments before about putting debug statements in so that you can see what is happening each step along the way still stand.
In fact that is the only way I was able to understand the problem. (I didn't actually see the video link until now)

But this is code ranch, so I'm not just going to post my solution (yet)
I can however hopefully get you moving in the right direction with some questions to blatantly point you in the right direction

You explained the partition algorithm like this.


Ok, you've basically translated the mechanics of your code into english. But it still doesn't explain what the code DOES, and you have missed out a crucial few points.
What are "i" and "j" ? Can you think up better names for these values? What do they represent?
Why is your pivot point 0? Should it always be 0?
Your code included a loop while (i<j) which isn't in your explanation here.
Why do you swap a[j] and a[pivot] ? What do you accomplish by doing this?
What does the return value of this method mean to the caller?

You need to understand what this code DOES.
What do I mean by this>
Example: this code


What does this code do?
If I answered it your way:


sets answer to 0
loops through all the numbers from 0 up to x
adds i to answer
returns answer


Doesn't tell you much does it?
However if I say "calculates the sum of the numbers less than x" you maybe understand it better?

It is THAT sort of explanation you need to be able to give for your partition method.
Until you understand what you want it to do, you won't be able to fix your code.

May be you right. I don't know what should be the new pivot value? I thought a[pivot] is the new pivot value and i returned it.
What you suggest what should be the return value?


I think the return value should represent an index into the array, not the value of an entry in the array. a[pivot] is thus obviously wrong.
The correct answer might actually be "pivot" but where is the "pivot" value right now?

Ok, I've written a wall of text here. But hopefully you can find some answers in here.
>
 
Winston Gutkowski
Bartender
Posts: 10573
65
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Stefan Evans wrote:Ok, I've written a wall of text here. But hopefully you can find some answers in here.

Well, if Riaz hasn't, I have. Have a cow.

Winston
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Stefan Evans wrote:It is THAT sort of explanation you need to be able to give for your partition method.
Until you understand what you want it to do, you won't be able to fix your code.


Yes you are very right, my explanation of "partition " is ridiculous..

I tried it again to explain it but i couldn't come up with better explanation and now i can see many flaws in my partition code as well.
I will try to understand it better and come up with better explanation of partition part.


I think the return value should represent an index into the array, not the value of an entry in the array. a[pivot] is thus obviously wrong.
The correct answer might actually be "pivot" but where is the "pivot" value right now?


Yes, you are again right. It should be pivot value. Thanks for pointing out that.


Ok, I've written a wall of text here. But hopefully you can find some answers in here.
>

Thanks very much taking your time and writing this. Yes it helped me and clarified many things.

[edit=CR]Removed one excess quote tag[/edit]
 
Piet Souris
Rancher
Posts: 1984
67
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Well, I have nothing to add to Stefans remarks, and I presume that
Riaz has enough information to repair his original listing. I conclude
with a List-version, like I wrote about. I also included a stream-Map
version for java 8, because I find that a difficult subject and so I
use every opportunity to practise it.
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Finally solved the problem with the help of a good friend using conventional quick sort algorithm. Couldn't do it with partial sorting algorithm as shown in the video. https://www.youtube.com/watch?v=o1vuCrt7uYc
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet Souris wrote:Well, I have nothing to add to Stefans remarks, and I presume that
Riaz has enough information to repair his original listing. I conclude
with a List-version, like I wrote about. I also included a stream-Map
version for java 8, because I find that a difficult subject and so I
use every opportunity to practise it.


Thanks very much for taking out the time of your weekend and writing the code.
 
Stefan Evans
Bartender
Posts: 1836
10
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi Riaz.

Sorry to say it, but you still have problems.
In fact I think your first effort was actually closer to being correct than this one. You have removed some crucial code.

Try with k = 3 and you're still going to get a stack overflow.

Couple more hints then:
In your partition method, you always set pivot point to index 0.
What happens if I want to call partition(a, 4, 8) ?
Your pivot point should be part of the range you are sorting.
The easiest approach is to use the first or last item in the range as the pivot point.


In your first example you had the following:

in your second

I do like the new names of the variables.
But you have removed the bit of code that stopped you overshooting your target (and potentially causing an ArrayIndexOutOfBoundsException)
I think it should be


similarly with the code looping on end--

In your first example you finished your partition method with:


In your second you just have:
return end;

Nice to see you picked an index to return.
But that swap in your original code is important! Its what puts the pivot element into its right place in the array
 
Riaz Ud Din
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
@Stefan Evans

Thanks very much for checking my code and pointing out the mistakes.... I couldn't fix the first code that's why i decided to try another method but that failed as well.
One of my friends fixed the 2nd program and now it works fine.

 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!