• Post Reply Bookmark Topic Watch Topic
  • New Topic

Finding if a number exists in a 2D matrix in the most efficient manner  RSS feed

 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
So the problem is given a matrix of N X N size, and is increasing from left to right and top to bottom, find the most efficient algorithm that checks if the number that you're looking for is in the matrix.

My own solution:



Explanation:

The key things that caught my attention in this problem was the fact that matrix M it always N X N. Meaning it has the same length in any direction(3 * 3, 4 * 4). The second fact was that the column and row is always increasing. Meaning each array in in each row is always sorted. Because of this fact, I figured it would best to use the BinarySearch to cut the amount of indexes I need to visit. What you do think?
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
lewis manuel wrote:... The second fact was that the column and row is always increasing

Edit: never mind.
 
Paul Clapham
Sheriff
Posts: 22819
43
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
lewis manuel wrote:The key things that caught my attention in this problem was the fact that matrix M it always N X N. Meaning it has the same length in any direction(3 * 3, 4 * 4). The second fact was that the column and row is always increasing. Meaning each array in in each row is always sorted. Because of this fact, I figured it would best to use the BinarySearch to cut the amount of indexes I need to visit. What you do think?


I'm not convinced that doing a binary search of only three numbers is any faster than just looking at each of the three numbers. If you were working with matrixes of arbitrary size, then sure, as long as (like Carey said) you knew in advance that the rows would be increasing.

And after rereading your code I see that you wrote your own binary search instead of using Java's built-in binary search method for arrays. What you wrote looks plausible to me but it would need a lot of testing to make sure it's correct. I do have to give you props for coming up with that idea though.
 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks I'll continue to do more tests.
 
Paul Clapham
Sheriff
Posts: 22819
43
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Paul Clapham wrote:I do have to give you props for coming up with that idea though.


Actually I should give you a cow for the idea, plus the well-formed question and neatly indented code. Here ya go!
 
Piet Souris
Master Rancher
Posts: 2041
75
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You can also do a BS on the first elements of the colums, to select
the column, or check the last element of each column before invoking
your BS-method.
 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator


I've edited my code(not the algorithm mind you) so it can accept input from the user(with some exception checking) and so that other people can test it if they want to.
The largest test I've done so far is a 100 * 100 matrix with each element incremented by 10 and the number I was looking for was 780(it found it pretty fast). Feel free to mess with my code if you so desire.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
lewis manuel wrote:I've edited my code(not the algorithm mind you) so it can accept input from the user(with some exception checking) and so that other people can test it if they want to.
OK, but that's likely to limit the size of test arrays. I really like the fact that you've added a createMatrix() method though, because that allows you to generate as many as you like for testing.

The largest test I've done so far is a 100 * 100 matrix with each element incremented by 10 and the number I was looking for was 780(it found it pretty fast). Feel free to mess with my code if you so desire.
But that isn't your "worst case" scenario, which is something you always want to think about when testing.

I suspect it will be something like:
1 2 3 4 ...
2 3 4 5 ...
3 4 5 6 ...
... etc, etc.
because it meets all the required properties of your matrix (at least as stated), but includes the maximum "overlap" in values between rows and columns. There will also be similar "worst case" matrices when it can't contain duplicate values, viz for an n x n matrix:
and the challenge will be to make a search of those matrices as fast as possible.

The problem is that even a binary search of the first and last columns isn't going to narrow your search much, and you'll still need to do searches of all the "limiting" rows.

I think that an initial binary search of the diagonal (top-left to bottom-right) will cut out more candidates than a simple column search, since it will immediately eliminate "sub-squares" at the top-left and bottom-right that can't possibly contain the number, and I suspect that the process can be repeated for the remainder; but exactly how, I'm not quite sure.

See if you can work it out.

And there may well be other solutions. Fun problem, and have another cow.

Winston
 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Well like paul has suggested yesterday, I could do an if, else if and else statement that checks if the first element in the row is the target. If that is false, then check if the last element in the row is the target. If false, then do a binary search. This is similar I think to what you would do to a linked list(check the head, then the tail and finally traverse through the list). What I like I about this method is that simple, eliminates potential elements and I don't think it would cause much of a slow down. I don't know how much of an improvement that will have for the program though which is something to think about.

Here is how it looks:

 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
lewis manuel wrote:Here is how it looks:

It'll probably work, but I doubt it's the most efficient.

It is however, very simple ... and there's nothing wrong with that.

Winston
 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I just thought of a very interesting idea.

Take this 2 * 2 matrix for example.

2 4
10 20

Let's say I want to see if 20 is in the matrix. We can agree that there is no point in visiting 0, 2(4) and based on the binary search algorithm, the middle should 0, 0(2). Now instead of looking or comparing 1 array at a time and then moving on to the next array, how about we compare 2 indexes from different rows? Since we are looking for 20, we can compare both 4 and 10(or two indexes at a time) and check which is closer to our target. The general idea is to follow a path that gets us closer to the target by crossing arrays in a vertical direction(of course we still in a horizontal if our target is in the array that we are in). Now, there is probably things I'm not considering at the moment being that I just created this strategy, but I think it is a good start.
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:There will also be similar "worst case" matrices when it can't contain duplicate values.


I find this puzzle intriguing and feel that the solution lies in being able to narrow down the row AND column, most likely by logic which determines if a subset of a row or a subset of a column is to be processed next.

In order to test this it would be very helpful to have a method to create the worst case scenario matrix of a given size. Any ideas on how one would do this? Edit: maybe this is just as much a puzzle.
 
Piet Souris
Master Rancher
Posts: 2041
75
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What I still don't get:

is

2 10
3 30

allowed or not? I assumed that all elements of row i would be smaller
than those of row i + 1. Is that correct?
If so, then the solution is (logn) ^2, else it's nlogn (unkess I'm mistaken).
 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet, these are the conditions:

1st Condition: you are given a N X N matrix. Since the problem is not saying something like M X N which means that there is a possibility that the matrix might not be some thing like 3 X 3, instead it might be a 2 X 3 matrix(jagged matrix).
2nd Condition: From top to bottom and from left to right, the numbers are always increasing. In other words, every column and every row is always sorted.

2 10
4 20

The columns from top to bottom is increasing, and the rows from left to right is increasing. This satisfies the second condition. it's also a N X N matrix because it has the same number of columns and rows. This satisfies the first condition.
This means that this matrix is allowed.
 
Piet Souris
Master Rancher
Posts: 2041
75
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Tht clears it up then.
That means that every row i with a[i][0] <= number <= a[i][n-1]
is a candidate. Hmmm... can't think of anything better than already
mentioned in this topic for now.
 
Piet Souris
Master Rancher
Posts: 2041
75
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Well, a BS on the first column, find the largest i for which a[i][0] <= number,
and a BS on the last column, with the smallest j with a[j][n-1] >= number.
Then a BS on the rows i...j.
But that sill is nlogn.
 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hey guys, the idea that I proposed a little while back appears to be working. Here is my new solution in which I think the time complexity is linear which is better than what I was getting using the binary search.

Modified Solution



So the way this solution works is that we start from index 0,0(or 0, N-1 or N-1, 0 or N-1, N-1) and then if that value is less than the target, we jump to a new row that and evaluate that index. If that index has a value greater than the target then we decrement the row and move to a new column and repeat the process. I've had success with this solution so far, but I feel we can improve it a little bit more.
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here is some code that can generate worst case matrices of any size. The only thing this doesn't do is leave holes, ie values that are missing.


 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Carey Brown wrote:I find this puzzle intriguing and feel that the solution lies in being able to narrow down the row AND column, most likely by logic which determines if a subset of a row or a subset of a column is to be processed next.
In order to test this it would be very helpful to have a method to create the worst case scenario matrix of a given size.

Well, I gave two versions above which could be replicated for any n.

I also think that the solution lies in diagonals and may well be of order O(logn), and my reasoning is as follows:
Suppose you need to find a number x.
1. If you do a binary chop on the main top-left bottom-right diagonal - which must also be in numeric sequence - you will either find x, in which case the job is done, or (more likely) you will find two diagonal positions containing numbers less than and greater than x. Let's say that those two positions are at p,p and p+1,p+1.

Given the constraints of the matrix, you now know that the number cannot be in the sub-squares 1,1-p,p or p+1,p+1-n,n - indexing from 1 - so if it exists, it must be in the other two. And since the same top/bottom left/right ascending constraint applies to those squares, you can apply the same algorithm recursively to each, until you either find n or end up with only two positions left to check.

HIH

Winston
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:
Carey Brown wrote:I find this puzzle intriguing and feel that the solution lies in being able to narrow down the row AND column, most likely by logic which determines if a subset of a row or a subset of a column is to be processed next.
In order to test this it would be very helpful to have a method to create the worst case scenario matrix of a given size.

Well, I gave two versions above which could be replicated for any n.

I also think that the solution lies in diagonals and may well be of order O(logn), and my reasoning is as follows:
Suppose you need to find a number x.
1. If you do a binary chop on the main top-left bottom-right diagonal - which must also be in numeric sequence - you will either find x, in which case the job is done, or (more likely) you will find two diagonal positions containing numbers less than and greater than x. Let's say that those two squares are at p,p and p+1,p+1.

Given the constraints of the matrix, you now know that the number cannot be in the sub-squares 1,1-p,p or p+1,p+1-n,n - indexing from 1 - so if it exists, it must be in the other two. And since the same top/bottom left/right ascending constraint applies to those squares, you can apply the same algorithm recursively to each, until you either find n or end up with only two positions left to check.

HIH

Winston

I'm not entirely sure I understand your top-left/bottom right approach. If you try my matrix generator you will find that matrices that follow the rules don't necessarily break down along clean top-left/bottom-right lines.
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here's a mod to my createMatrix() method that makes the matrix even more random while still following the rules.

 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Carey Brown wrote:I'm not entirely sure I understand your top-left/bottom right approach. If you try my matrix generator you will find that matrices that follow the rules don't necessarily break down along clean top-left/bottom-right lines.

By which I assume you mean that the last position in one row could be greater than the first position in the next (correct me if I'm wrong), which is what I meant by "overlap" in my previous post. My assumption is also that the worst possible case is that each row overlaps n-1 positions in the previous row, and I provided examples of both duplicating and non-duplicating versions.

Using diagonals (which I assume you agree must be in numerical order) narrows down both the row and column, as you suggested, but instead of searching whole rows, you search sub-squares which must, by definition, have the same constraints as the whole.

Hope I've clarified.

Winston
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:Using diagonals (which I assume you agree must be in numerical order)...

In this matrix

the diagonal: 15,25,31,32,26,23,18 (if this is what you mean) is not necessarily in numeric order.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:By which I assume you mean that the last position in one row could be greater than the first position in the next (correct me if I'm wrong), which is what I meant by "overlap" in my previous post.

I should add that my solution relies strictly on the constraint lewis gave in his original post: "a matrix of N X N size [...] increasing from left to right and top to bottom" (my emphasis). If adjacent rows or columns can be equal, then lewis's solution (full row searches) is probably close to optimal.

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Carey Brown wrote:[the diagonal: 15,25,31,32,26,23,18 (if this is what you mean) is not necessarily in numeric order.

I did say "top-left bottom-right diagonal", but maybe I didn't make it clear enough.

Winston
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Carey Brown wrote:
Winston Gutkowski wrote:Using diagonals (which I assume you agree must be in numerical order)...

In this matrix

the diagonal: 15,25,31,32,26,23,18 (if this is what you mean) is not necessarily in numeric order.

Ah, maybe you mean 1,9,17,32,39,44,49, which is in numerical order.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:I did say "top-left bottom-right diagonal", but maybe I didn't make it clear enough.

Ooops. Just found a huge hole in my "solution": The "remaining" areas may not be squares.

However, I still think the diagonal chop is the correct first move, because it eliminates at least half the positions (and possibly more) in one go. Just have to work out how to continue...

Winston
 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

If I'm searching for 19, it would be greater than 17 and less than 32. So, I could do a binary search on the row to the right of 17, and if not found, try a binary search on the row to the left of 32.
Edit: FAILS if I search for 20.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:Ooops. Just found a huge hole in my "solution": The "remaining" areas may not be squares.

And have a cow for pointing that out to me with your matrix. I should have written one out myself.

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Carey Brown wrote:So, I could do a binary search on the row to the right of 17, and if not found, try a binary search on the row to the left of 32.

Not enough, because it could be anywhere (?) in the rectangle whose bottom-left position is immediately to the right of 17, or the one whose top-right position is immediately to the left of 32. I'm now trying to work out if there's any easy way of narrowing that search down, given that we're no longer dealing with squares.

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:I'm now trying to work out if there's any easy way of narrowing that search down, given that we're no longer dealing with squares.

Aha! I think I may have found it.

Once I find the two diagonal squares that limit my number, I do a "staircase search" starting from the other two squares.

Let's assume I'm looking for x=15. Starting from the square to the right of 17 (21), I go up if it's greater than x, and right if its less than x, until I can't go any further, so I'd go 21→13→22→14→16 and then run out of room (I can't go up any more), so: not found. Starting at the other "start-point" (28) I go left if it's greater than x and down if it's less than x, so I'd go 28→19→4→5→7→15 (found).

Obviously if I run out of room on both searches the number isn't in the square.

And I think that solution (assuming it's correct) is O(n).

Now I just have to prove it.

Winston
 
lewis manuel
Ranch Hand
Posts: 64
2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The solution that I came up with is similar to yours (check out my modified solution). Good job man! Well I have to go to sleep. I'll have a new problem for you guys tomorrow.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
lewis manuel wrote:The solution that I came up with is similar to yours (check out my modified solution). Good job man!

Cheers. I think the initial binary chop on the diagonal makes it slightly quicker, because it saves having to backtrack. It can also be repeated for any "remainder" that is square.

An additional optimization is that any remainder that is only one row or one column wide can be searched with a binary chop.

Well I have to go to sleep. I'll have a new problem for you guys tomorrow.

Fine. It's fun exercising the old braincells (and mine are old).

Winston
 
Piet Souris
Master Rancher
Posts: 2041
75
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:
Aha! I think I may have found it.
(...) O(n)
Now I just have to prove it.

Winston

Hmm... given an N x M rectangle of squares, and if we can go either to the
right one square or down one square, how many paths are there to bottom
right, starting from upper left?
Since that is a standard question, found in many courses and in Project
Euler, I won't tell, but it is not O(N) or O(M).

What I suggested comes down to a logn determination of a minimal enclosing
rectangle. Now, doing a BS on all its columns ensures an nlogn process.

Here is an easy generator, somewhat simpler that Carey's, with holes,
that one can use to test ones theories.

It generated this 10x10 matrix

9 18 20 22 27 29 39 40 43 44
19 29 36 46 48 56 66 75 81 90
22 35 42 56 62 63 74 76 88 92
32 42 46 59 69 70 82 92 100 105
36 48 50 66 72 79 86 101 111 112
45 52 54 69 74 86 89 106 116 123
53 55 59 76 81 93 94 116 119 132
63 71 77 82 83 94 95 123 132 136
64 72 79 86 89 99 106 127 136 145
67 75 84 90 97 102 113 129 145 149


Question: is number 51 present?

Here's the code

 
Carey Brown
Saloon Keeper
Posts: 3309
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Interesting approach. Yours creates holes, which is a good thing. It also creates duplicates, which I guess is not specifically mentioned in the rules.
When testing your matrix, searching for all values between 1 and 149 (inclusive), I get these statistics on the number of comparisons required:

I modified my generator to create 50% holes but no duplicates. I get:

The your duplicates seem to cut down on the average number of compares, which seems about right.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet Souris wrote:Hmm... given an N x M rectangle of squares, and if we can go either to the
right one square or down one square, how many paths are there to bottom
right, starting from upper left?

Hmmm. Not sure that's relevant - but you're the mathematician.

To me, the question is: what is the maximum number of steps possible for a "staircase search" of an N x M rectangle, assuming you know which way to go? N+M-1, I'm pretty sure, assuming a starting point. So, for a 10x10 matrix it will be 19, assuming you start from a corner.

The problem I see with lewis's solution (starting from the top left) is that he doesn't know which way to go, which means he has to follow a "path" until it fails, and then backtrack, which I'm pretty sure renders a worst case that is a triangular number ((n(n-1))/2), which is of order O(n²). But I could be wrong.

With mine (binary chop on the diagonal, and then two searches - one for the "top-right" quadrant, and one (if needed) for the "bottom-left", I already know something about the properties of my quadrants, so I do know which way to move.

I'm pretty sure that my solution is of order O(n) (n being the size of the side of the square) since the "worst case" is O(logn + 2(n-1)) because the sides of the "remaining" rectangles must sum to n. And If one of the sides of my remaining rectangles is 1, I can do a binary search.

However, I'm also fairly sure that lewis's solution can be turned into an O(n) "staircase search" by starting at the bottom-right (ie, position [n-1][0]) and proceeding right if the number is less than x, and up if it isn't, which is simpler, but maybe less "tweakable".

@lewis: I noticed one thing about your "modified solution". You can simplify your conditions quite a bit because:
  x <= sizeOfMatrix - 1
is the same thing as:
  x < sizeOfMatrix
so your loop can be:
  while(row < sizeOfMatrix && column < sizeOfMatrix)

HIH

Winston
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!