• Post Reply Bookmark Topic Watch Topic
  • New Topic

Accessing value from a seperate hash table  RSS feed

 
Matt Wright
Greenhorn
Posts: 9
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I am implementing the hash join algorithm for a project with a hard coded hash function. I've hashed the first relation and I've hashed the second relation. The problem is when hashing the second relation I only know how to add the tuple from the second relation into a third relation and not also access the first relation tuple at that time

 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Matt Wright wrote:I am implementing the hash join algorithm for a project with a hard coded hash function.

Not quite sure what you mean here. Do you have a hashCode() method?

Also: You talk about "tuples", which is a database concept. In Java, you're likely to be dealing with collections of some kind.

What you're attempting is something that is usually done with Sets (java.util.Set) in Java (indeed, the method retainAll() specifically mimics it), so if I was trying to do this I would probably create a Relation class that encapsulates the "where" clause of a SELECT statement.

Once you have that, and can calculate its hash, populate two HashSet<Relation>s with the relations for each of your datasets (don't worry about duplicates for now) and run retailAll(). That will give you the Relations common to both, and then with another pass through your datasets you can add something that equates to row ids for them (eg, List indexes). It's possible you could do this during your initial build, but the two-pass approach is likely to be a lot simpler to code and doesn't change the O(n) characteristic of the operation.

It seems to me that you may be concentrating too much on the mechanics of the operation rather than looking at the design.

HIH

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:It's possible you could do this during your initial build...

Actually, thinking about it, you could probably do that quite easily if you initially load into HashMap<Relation, List<Integer>>s (where the Integer is the index of your "tuple"); and you could then do the retainAll() operation on their keySet()s.

I would expect this to have similar speed to a HashSet (but I don't know), and it would save you having to scan one of the datasets a second time (in fact you could just scan one of the Maps).

In order to combine them though, you'll need some way of distinguishing indexes of "table" A from those of table B, and one easy method would be to make the latter negative (perhaps something like: negativeIndex = -(originalIndex + 1), as Arrays.binarySearch() does for its "insertion" indexes). Obviously, this won't work for any joins involving more than two tables though (if you need it).

HIH

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:if I was trying to do this I would probably create a Relation class that encapsulates the "where" clause of a SELECT statement.

Actually, since this is for joins, it would only have to include those elements that are actually part of the join. What you do about any other "filtering" logic would be up to you. One simple way to combine hashes for multiple join fields would be to XOR them.

Winston
 
Matt Wright
Greenhorn
Posts: 9
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I took your advice and got an efficient solution. Thank you.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Matt Wright wrote:I took your advice and got an efficient solution. Thank you.

Great! Just as a matter of interest, what did you come up with? My "advice" was only one of many ways to do it, and there's no "right" or "wrong" about these things...

Winston
 
Matt Wright
Greenhorn
Posts: 9
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
http://pastebin.com/7Jc92QLB
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!