• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Tim Cooke
  • paul wheaton
  • Paul Clapham
  • Ron McLeod
Sheriffs:
  • Jeanne Boyarsky
  • Liutauras Vilda
Saloon Keepers:
  • Tim Holloway
  • Carey Brown
  • Roland Mueller
  • Piet Souris
Bartenders:

SQL -SELECT on multiple tables overhead

 
Greenhorn
Posts: 4
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi everyone,

I got a little problem with some joins I�m trying to do in MySQL 5.0.
The setup is as follows:
3 tables - parent_table(pk, data), child_table1(fk_parent, pk, data), child_table2(fk_parent, pk, data).

What do I want to accomplish? Well in my java-code I have a TransferObject
that reflects this structure - Parent, Child1, Child2. Parent has besides it data variable the two child objects.

The easy way is to do a LEFT JOIN i.e:
SELECT p.*, c1.pk, c1.data, c2.pk, c2.data
FROM parent_table p
LEFT JOIN child_table1 c1
ON p.pk = c1.fk_parent
LEFT JOIN child_table2 c2
ON p.pk = c2.fk_parent

In a sense this works, but the resultset overhead is huge because of the duplicate rows that is generated when there are rows in the child tables pointing to a single pk in the parent_table.

Is there any way around it that I�m missing? My options for the moment is to do multiple queries in the java-code and write the joinlogic in java instead.


Any sugestions are welcome - (except 'switch to Hibernate!' )
Cheers!

/Jonas
 
Ranch Hand
Posts: 1780
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Shouldn't your domain object Parent has a collection of Child1's and another collection of Child2's ?

[ February 17, 2006: Message edited by: Jeff Albertson ]
 
Greenhorn
Posts: 17
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well, the thing is, for every entry in the parent table you get a Cartesian product of all c1 and c2 records associated with it. Here is an example. Suppose we have a parent record p1 that has dependent entries:

in child1:

10
20

in child2

a
b
c

Now, given your tuple format (p c1 c2), how would you expect the result set rows to appear? Do you think it will be:

p1 10 a
p1 20 b
p1 - c

No, this is not what the semantics of your query is. The semantics is to produce:

p1 10 a
p1 10 b
p1 10 c
p1 20 a
p1 20 b
p1 20 c

I understand what you are trying to achieve: initialize all parent objects with their children in one sql query. Well, given your table relationships this might not be the ideal solution. Some of the options you may want to consider would include:

- Process the parent table first instantiating parent objects; then process the bulk of child1, for each of them look up in memory their respective parent and tie them together, then do the same for child2. This will yield three queries altogether.

- Go through records in the parent table; for each parent retrieve its dependent children in child1, then in child2. This will result in (2*N + 1) queries (where N is the number of parent records).

- Same as above, but defer children instantiation till they are actually requested (lazy loading).

These are the options I can think of given your descriptions, but maybe better solutions exist.
 
Jeff Albertson
Ranch Hand
Posts: 1780
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
The SQL in the original post was doing a proper join, not a cartesian product!

But I think the error was in turning every row in the select into one separate object, instead of creating parent and child objects and associating the appropriately.
 
Bruno Boehr
Greenhorn
Posts: 17
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
> The SQL in the original post was doing a proper join, not a cartesian product!

It is a proper join to the effect of connecting c1 and c2 to the parent. Yet, given two mutually independent sets of records in c1 and c2 (within a particular parent realm), we still get a Cartesian of those sets.
 
Jonas Vikis
Greenhorn
Posts: 4
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
In answer to Jeff:

You�re quite right, I use collections exactly as you stated/suggested, but since my english is a bit rusty, I didn�t use the proper terms

As for the question in hand, maybe I should give you some more details to why the query gives me headaches...
I�m trying to use a special pattern in the business tier where I cache the ResultSet in an objectarray(which I pass on to the calling class), thus letting me close the JDBC connection much faster. Then I�m using a TransferObjectAssembler to set the data in TOs upon request from the MVC-framework since it's all a part of a rather complex search, and the results could be fairly large.

So after some brainstorming, I come up with two alternatives. The one stated or do as suggested - first call the parent(s) then call the the childrens and through java-logic bring them together.
But then I couldn�t really use the design since that would leave me with a lot of cached ResultSets instead, or leaving my connector class to do the assembling.

But as I�m only human, I might have missed something obvious...

Any suggestions?

Cheers
/Jonas
 
Jonas Vikis
Greenhorn
Posts: 4
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
In answer to Bruno:

Yup, I�m aware of the cartesian result. That�s what I�m trying to get away from.

Lazy loading would be the most easiest way to get around it, but it�s not possible, because of the fact that the design states that if(children > 0)
then the result must be joined in the view. In other words the collections of childrens are shown together with their respective parents.
This wouldn�t be such a hassle if it weren�t for the fact that a regular client query would generate a result of multiple parents shown together...

A typical query would generate 0 to 10 parents, with 0 to 50 child1 and 0 to 5 child2.
If you do the math for the extreme case - that�s either a lot of queries or a somewhat large ResultSet in comparison to the actual size.


Cheers
/Jonas
 
Jonas Vikis
Greenhorn
Posts: 4
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Answering my own post... Again.

Anyway, thanks for your answers, they helped me in the right direction.

I managed to come up with a solution which is much like the suggested one, calling them separately (sort of). With some help with my trusted StringBuffer, amongst other, i succeeded to cut queries down to exactly 3 for this situation. Thus leaving the assembling part to java-code, unfortunately not using my assembler or cache

So if anyone is interested in the stub-code I can post it here for you to laugh at, criticize or whatever.

Cheers
/Jonas
[ February 20, 2006: Message edited by: Jonas Vikis ]
 
reply
    Bookmark Topic Watch Topic
  • New Topic