This week's book giveaways are in the Cloud and AI/ML forums.
We're giving away four copies each of Cloud Native Patterns and Natural Language Processing and have the authors on-line!
See this thread and this one for details.
Win a copy of Cloud Native PatternsE this week in the Cloud forum
or Natural Language Processing in the AI/ML forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Devaka Cooray
  • Liutauras Vilda
  • Jeanne Boyarsky
  • Bear Bibeault
Sheriffs:
  • Paul Clapham
  • Knute Snortum
  • Rob Spoor
Saloon Keepers:
  • Tim Moores
  • Ron McLeod
  • Piet Souris
  • Stephan van Hulst
  • Carey Brown
Bartenders:
  • Tim Holloway
  • Frits Walraven
  • Ganesh Patekar

Inserting duplicate Keys into a Binary Search Tree?

 
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi.

I am trying to insert duplicatie keys into a BST. And I'm far off what I am trying to achieve.

This is my current code:



How would I achieve to add the duplicates into the List?
 
Rancher
Posts: 828
19
Netbeans IDE Oracle MySQL Database Tomcat Server C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Same way you add any new node into the list, but you have to choose what branch to take for a duplicate.

Have you considered doing a hit count on your nodes?
 
Les Morgan
Rancher
Posts: 828
19
Netbeans IDE Oracle MySQL Database Tomcat Server C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

OK, I have to ask:
You have a Key that you use for comparison for insertion into the Tree, but you also have a Value, and you also have a List. I do not know your use for this list, but if your Value is the data that is usable, and I suspect it is, then it would seem to me that you are defeating the usefulness of the BST--you have no idea what the Value is based on searching the Key. Your size makes no sense to me at all too--as it seems to me that you have a size for the Tree, the number of nodes, and if that is so, then it would seem each and every time you do an insert, then you have to redo your entire Tree's nodes and set the size.

What I have usually seen for BST's is:

and you search off of value.

Another thing I do not see is a reorder: your tree can be an extreme skew tree (linear search all nodes down one side) or unbalanced, and you have not a solution.
 
Les Morgan
Rancher
Posts: 828
19
Netbeans IDE Oracle MySQL Database Tomcat Server C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The practice of returning the root or new root for your "put" also seems rather strange, all of the internal workings of the BST should be transparent to the end user, in other words, if I have a reference to root, then I can see your entire BST contents.
 
Kevin Garcia
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks for your input. The data gets sorted before I call the BST. And for the code part.. This is actually from a book of Algorithms (http://www.amazon.com/gp/product/032157351X/ref=as_li_qf_sp_asin_il_tl?ie=UTF8&tag=algs4-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=032157351X).

Basically every Key has a Value. And I'd like to have duplicate keys with different values. That's why I made a List for every Node if there is a duplicate, but whenever I try to use it; it doesn't work out. Most of the times it gives the same value back for every duplicate Node.
 
Les Morgan
Rancher
Posts: 828
19
Netbeans IDE Oracle MySQL Database Tomcat Server C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Kevin,
That makes no sense. Since you are searching on the key, then if you have duplicate keys, your hit is going to come from the first occurrence of your key, as I suspect it is now from your comments. You may have 1000 duplicates in your implementation, but the value that is going to be returned is off your initial key hit--how can you distinguish what value you want, if you are only searching by key?

You need to go back and rethink your logic. A BST is an order log(n) search and is fast in finding what you are looking for, but that is based upon the premise that you have distinct keys to search. Without distinct keys you are going to have to make some kind of implementation to search your space and you are no longer in the realm of BST's. You could make a Ternary Search Tree, TST, and have the middle as equal and have it be a linked list. You could navigate using the BST method of search, but once you find the node, you have to run the linked list to find what you are actually looking for.

Kevin Garcia wrote:Thanks for your input. The data gets sorted before I call the BST. And for the code part.. This is actually from a book of Algorithms (http://www.amazon.com/gp/product/032157351X/ref=as_li_qf_sp_asin_il_tl?ie=UTF8&tag=algs4-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=032157351X).

Basically every Key has a Value. And I'd like to have duplicate keys with different values. That's why I made a List for every Node if there is a duplicate, but whenever I try to use it; it doesn't work out. Most of the times it gives the same value back for every duplicate Node.

 
Kevin Garcia
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Les Morgan wrote:Kevin,
That makes no sense. Since you are searching on the key, then if you have duplicate keys, your hit is going to come from the first occurrence of your key, as I suspect it is now from your comments. You may have 1000 duplicates in your implementation, but the value that is going to be returned is off your initial key hit--how can you distinguish what value you want, if you are only searching by key?

You need to go back and rethink your logic. A BST is an order log(n) search and is fast in finding what you are looking for, but that is based upon the premise that you have distinct keys to search. Without distinct keys you are going to have to make some kind of implementation to search your space and you are no longer in the realm of BST's. You could make a Ternary Search Tree, TST, and have the middle as equal and have it be a linked list. You could navigate using the BST method of search, but once you find the node, you have to run the linked list to find what you are actually looking for.

Kevin Garcia wrote:Thanks for your input. The data gets sorted before I call the BST. And for the code part.. This is actually from a book of Algorithms (http://www.amazon.com/gp/product/032157351X/ref=as_li_qf_sp_asin_il_tl?ie=UTF8&tag=algs4-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=032157351X).

Basically every Key has a Value. And I'd like to have duplicate keys with different values. That's why I made a List for every Node if there is a duplicate, but whenever I try to use it; it doesn't work out. Most of the times it gives the same value back for every duplicate Node.



Thank you again for your brief input. I have to admit this is an assignment for University, but since I couldn't figure it out I'd ask someone else for insight. I cannot change to a TST, unfortunately. What would be a proper and logic way to distinct the keys?
 
Les Morgan
Rancher
Posts: 828
19
Netbeans IDE Oracle MySQL Database Tomcat Server C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Nope, the TST I described would have duplicate keys, all of the values in the linked list would have the same key, and they can all have different values. You're prof is probably trying to get you to think outside the box, you will be using BST searches to find the node to insert, but you have to add extra to it, to search the linked list to find the value needed.

It seems rather pointless to me, because there really is no way to tell what value you want from the key, so you have 13 identical keys with 13 distinct values, you insert the 13 values into the same position in the TST, but then you have your linked list to search--how do you know which of those 13 values you want in return--Monte Carlo method? There has to be something that your prof is leaving out or you are missing, because to get distinct values, you have to have some way of identifying distinct data.

Kevin Garcia wrote:

Les Morgan wrote:Kevin,
That makes no sense. Since you are searching on the key, then if you have duplicate keys, your hit is going to come from the first occurrence of your key, as I suspect it is now from your comments. You may have 1000 duplicates in your implementation, but the value that is going to be returned is off your initial key hit--how can you distinguish what value you want, if you are only searching by key?

You need to go back and rethink your logic. A BST is an order log(n) search and is fast in finding what you are looking for, but that is based upon the premise that you have distinct keys to search. Without distinct keys you are going to have to make some kind of implementation to search your space and you are no longer in the realm of BST's. You could make a Ternary Search Tree, TST, and have the middle as equal and have it be a linked list. You could navigate using the BST method of search, but once you find the node, you have to run the linked list to find what you are actually looking for.

Kevin Garcia wrote:Thanks for your input. The data gets sorted before I call the BST. And for the code part.. This is actually from a book of Algorithms (http://www.amazon.com/gp/product/032157351X/ref=as_li_qf_sp_asin_il_tl?ie=UTF8&tag=algs4-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=032157351X).

Basically every Key has a Value. And I'd like to have duplicate keys with different values. That's why I made a List for every Node if there is a duplicate, but whenever I try to use it; it doesn't work out. Most of the times it gives the same value back for every duplicate Node.



Thank you again for your brief input. I have to admit this is an assignment for University, but since I couldn't figure it out I'd ask someone else for insight. I cannot change to a TST, unfortunately. What would be a proper and logic way to distinct the keys?

 
Ranch Hand
Posts: 239
12
Scala IntelliJ IDE Eclipse IDE Java Ubuntu
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Les - I think that "root" in the method signature of the "put" method is not the root of the BST. It seems like a case of a misnamed variable. You can see the authors' BST in question at the book's webpage where it calls this variable "x" --

Kevin - As for adding duplicates to the list, seems you have the right idea in line 34. What actually happens when you put a duplicate right now. You might want to step through in a debugger.
 
Les Morgan
Rancher
Posts: 828
19
Netbeans IDE Oracle MySQL Database Tomcat Server C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The website makes a lot more sense.

Scott Shipp wrote:Les - I think that "root" in the method signature of the "put" method is not the root of the BST. It seems like a case of a misnamed variable. You can see the authors' BST in question at the book's webpage where it calls this variable "x" --

Kevin - As for adding duplicates to the list, seems you have the right idea in line 34. What actually happens when you put a duplicate right now. You might want to step through in a debugger.

 
Kevin Garcia
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Scott Shipp wrote:Les - I think that "root" in the method signature of the "put" method is not the root of the BST. It seems like a case of a misnamed variable. You can see the authors' BST in question at the book's webpage where it calls this variable "x" --

Kevin - As for adding duplicates to the list, seems you have the right idea in line 34. What actually happens when you put a duplicate right now. You might want to step through in a debugger.



Right now:



I am putting a Student Grade into the tree with the value as Student Number.



I am having trouble on how to know what index(es) to get.
 
Les Morgan
Rancher
Posts: 828
19
Netbeans IDE Oracle MySQL Database Tomcat Server C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As I look at the code and explanation in the website, it looks like you do not have duplicate insertions at all, you get replacement of value when there is a duplicate, existing, key.
 
Scott Shipp
Ranch Hand
Posts: 239
12
Scala IntelliJ IDE Eclipse IDE Java Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Kevin Garcia wrote:

Scott Shipp wrote:Les - I think that "root" in the method signature of the "put" method is not the root of the BST. It seems like a case of a misnamed variable. You can see the authors' BST in question at the book's webpage where it calls this variable "x" --

Kevin - As for adding duplicates to the list, seems you have the right idea in line 34. What actually happens when you put a duplicate right now. You might want to step through in a debugger.



Right now:



I am putting a Student Grade into the tree with the value as Student Number.



I am having trouble on how to know what index(es) to get.



These are lines 26-35 of what you posted above:



Notice that in the first line here (which is really line 26) it is comparing key. I am assuming that when your output says "Inserting 10.0 belongs to 500710920" - that the value is 500710920 and the key is 10.0. Can you confirm this? Or is it the other way around?

Assuming that's true (key = 10.0, 9.0, etc.) do you think that you should continue having separate "Value" and "List<Value>" fields in your node? Can you walk me through how you see this working? When you add a value to a node the first time, where is it stored? Now let's say a second call with the key is made, how will the new value be stored? If you can explain the steps I think you will be able to fix the code.





 
Scott Shipp
Ranch Hand
Posts: 239
12
Scala IntelliJ IDE Eclipse IDE Java Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here is my further hint. When I say "When you add a value to a node the first time, where is it stored?" I am talking about when the node is created.
 
Kevin Garcia
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Scott Shipp wrote:

Kevin Garcia wrote:

Scott Shipp wrote:Les - I think that "root" in the method signature of the "put" method is not the root of the BST. It seems like a case of a misnamed variable. You can see the authors' BST in question at the book's webpage where it calls this variable "x" --

Kevin - As for adding duplicates to the list, seems you have the right idea in line 34. What actually happens when you put a duplicate right now. You might want to step through in a debugger.



Right now:



I am putting a Student Grade into the tree with the value as Student Number.



I am having trouble on how to know what index(es) to get.



These are lines 26-35 of what you posted above:



Notice that in the first line here (which is really line 26) it is comparing key. I am assuming that when your output says "Inserting 10.0 belongs to 500710920" - that the value is 500710920 and the key is 10.0. Can you confirm this? Or is it the other way around?

Assuming that's true (key = 10.0, 9.0, etc.) do you think that you should continue having separate "Value" and "List<Value>" fields in your node? Can you walk me through how you see this working? When you add a value to a node the first time, where is it stored? Now let's say a second call with the key is made, how will the new value be stored? If you can explain the steps I think you will be able to fix the code.







Yep, 10.0, 9.0, 8.0, etc are indeed the keys. And the numbers (500710920, etc) are the values. At first I thought when there is a duplicate. In my example 9.0 has a duplicate: when the key 9.0 does not exist yet, it adds it to the field Value. But when there are more duplicates it will use the List<Value> to store the duplicates. I thought it would work.

And when there are no keys yet, it will assign the field root to that key.
 
Scott Shipp
Ranch Hand
Posts: 239
12
Scala IntelliJ IDE Eclipse IDE Java Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Kevin Garcia wrote:

Yep, 10.0, 9.0, 8.0, etc are indeed the keys. And the numbers (500710920, etc) are the values. At first I thought when there is a duplicate. In my example 9.0 has a duplicate: when the key 9.0 does not exist yet, it adds it to the field Value. But when there are more duplicates it will use the List<Value> to store the duplicates. I thought it would work.

And when there are no keys yet, it will assign the field root to that key.



I think you're almost there! Compare how values are put vs. how they are gotten right now. It might be good if you take just these two:

Inserting 9.0 belongs to 500710921
Inserting 9.0 belongs to 500710922

It might be good to actually walk through step-by-step on paper or a whiteboard. Figure out where each value lives after each put, and then when you "get" what happens. Follow your code closely compared to what you think should happen.
 
Kevin Garcia
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Scott Shipp wrote:

Kevin Garcia wrote:

Yep, 10.0, 9.0, 8.0, etc are indeed the keys. And the numbers (500710920, etc) are the values. At first I thought when there is a duplicate. In my example 9.0 has a duplicate: when the key 9.0 does not exist yet, it adds it to the field Value. But when there are more duplicates it will use the List<Value> to store the duplicates. I thought it would work.

And when there are no keys yet, it will assign the field root to that key.



I think you're almost there! Compare how values are put vs. how they are gotten right now. It might be good if you take just these two:

Inserting 9.0 belongs to 500710921
Inserting 9.0 belongs to 500710922

It might be good to actually walk through step-by-step on paper or a whiteboard. Figure out where each value lives after each put, and then when you "get" what happens. Follow your code closely compared to what you think should happen.



After some debugging I came to the following solution:



Where current is a field that starts its value with -1. I do realize that this is not the best approach, but I'm not quiet sure how to do it otherwise. Any feedback is appreciated.

Edit: some output

Where I put: 9.0, 9.0, 8.0, 8.0, 8.0, 7.0, 7.0 and 6.0 in the BST.


 
Scott Shipp
Ranch Hand
Posts: 239
12
Scala IntelliJ IDE Eclipse IDE Java Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Good job, you found the issue!

I'll throw a thought in here. What if you abandoned holding value, and instead you just had a List<Value>? Most of this overhead is due to having to manage the fact that you store one value in one place (value field) and other values in another place (the list). You can easily just store the single value nodes in the List<Value> too. Youw ould change the get method to just return a List<Value>. I wonder if that's what the prof might have in mind.
 
Kevin Garcia
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Scott Shipp wrote:Good job, you found the issue!

I'll throw a thought in here. What if you abandoned holding value, and instead you just had a List<Value>? Most of this overhead is due to having to manage the fact that you store one value in one place (value field) and other values in another place (the list). You can easily just store the single value nodes in the List<Value> too. Youw ould change the get method to just return a List<Value>. I wonder if that's what the prof might have in mind.



Not sure. But that might be a good idea instead of this messy code. Would something like this suffer?

 
Consider Paul's rocket mass heater.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!