
Hibernate: Confusion about lazy/eager fetching and preventing LazyInitializationException

 
Klaas van Gelder
Ranch Hand
Posts: 111
Java Linux PHP
Many posts are available about the dreaded LazyInitializationException and how to prevent it. But the many different answers have not made things very clear, and I am struggling to find the right approach.

My web application is centered on the Member and Activity classes: a logged-in member can organize (indoor or outdoor) activities that other members can subscribe to. So there is an n:m (many-to-many) relationship between a Member and an Activity, which Hibernate maps to a link table:







Because the activities for a member as well as the subscribed members for an activity must be shown in the application, I modeled it as a bidirectional many-to-many relationship, as visible in the Hibernate annotations.

In the code above, the fetch types are set to LAZY because I fear an explosion of fetched entities: after obtaining a Member from the DAO, all of that member's subscribed Activities are fetched, and each of those Activities in turn fetches its own subscribed Members. Because the application needs to scale to hundreds of Members with hundreds of Activities each, I think this kind of "recursive fetching" can quickly generate a very big object graph, with the resulting performance and/or memory problems.

Besides, the logged in Member instance is saved in the session as long as the member is logged in and I am afraid that this can cause synchronization problems.

Setting the fetch types to LAZY, however, makes it impossible to iterate over the activities for a member, or the members subscribed to an activity...

Some solutions are discussed on different forums: some advise always using EAGER fetching (thereby ignoring potential memory problems), and other articles advise against xxx-to-many relationships in cases where hundreds of child records need to be fetched.
See http://www.javacodegeeks.com/2011/10/avoid-lazy-jpa-collections.html. This article discusses the possibility of omitting the whole relationship and making separate Hibernate queries for those collections, which can be safely obtained using separate DAO calls (maybe within one service method and transaction). Eager fetching is advised only for simple lookup relationships with a limited number of records, such as the Category and Region fields in my Activity class.

But then the domain model no longer really mirrors the real-life relationships. Also, queries for adding Activities to Members and vice versa need to be written manually, maybe by using a separate entity class for the link table.
Any thoughts about this?



 
Karthik Shiraly
Bartender
Posts: 1210
25
Android C++ Java Linux PHP Python

Can you elaborate where exactly you are facing this problem? Some code would be helpful.
While I agree lazy init exceptions are a nuisance, I'm not able to think of a scenario where it makes iteration impossible.
Since they occur when hibernate proxies are accessed outside hibernate sessions, they are indicators that something may be lacking in your design.
As I see it, it's the same optimization that we'd do manually when using regular SQL JDBC queries, that is, don't fetch data which is not needed right now.
 
Klaas van Gelder
Ranch Hand
Posts: 111
Java Linux PHP
Thanks for your reaction. I mean iteration is impossible because the call to the collection variable fails with a LazyInitializationException.

For example in the controller:



The call to activity.getParticipants() already fails because the collection is fetched lazily (and the Hibernate session is closed after returning from the DAO method).
The same problem occurs when accessing activity.participants in the view to show the participants of an activity.

So basically, accessing the collection variables is impossible when they are configured as lazily fetched, and setting all fetching to eager amounts to "fetching the whole database", with all the problems that can cause.
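
The failure described above can be pictured with a small plain-Java model. This is only a sketch: `ToySession` and `LazyList` are hypothetical stand-ins invented for illustration, not Hibernate classes, and the real proxy machinery is more involved. It shows that a lazy collection loaded while the session is open stays usable afterwards, while one first touched after the session closes fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy stand-in for a persistence session (hypothetical, not Hibernate's Session).
class ToySession {
    private boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }
}

// Toy stand-in for a lazily fetched collection: it holds a loader instead of
// data and only runs the loader on first access.
class LazyList<T> {
    private final ToySession session;
    private final Supplier<List<T>> loader;
    private List<T> loaded; // stays null until the first successful get()

    LazyList(ToySession session, Supplier<List<T>> loader) {
        this.session = session;
        this.loader = loader;
    }

    List<T> get() {
        if (loaded == null) {
            if (!session.isOpen()) {
                // Comparable in spirit to a LazyInitializationException.
                throw new IllegalStateException("cannot load collection: session is closed");
            }
            loaded = new ArrayList<>(loader.get());
        }
        return loaded;
    }
}
```

In this model, the controller code fails for the same reason as in the post: the first `get()` happens only after the DAO method has already closed the session.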


 
Karthik Shiraly
Bartender
Posts: 1210
25
Android C++ Java Linux PHP Python
The technique I always use to overcome this is the "open session in view" servlet filter. It's a well-described solution on the net; just search for it.
It's a servlet filter that opens a Hibernate session as soon as a request is received, stores it in a thread-local variable, and retains the session until the response goes out via the same filter.
Just add it to your web.xml.
Hibernate sessions will now remain alive for the lifetime of the entire request-to-response chain, including JSP/other view rendering, rather than the lifetime of just the DAO method.

It's not the neatest of solutions because a DAO layer concept is leaking into higher layers, but then hibernate has always been slammed for its leaky abstractions and this is yet another one.
On the plus side, it does keep the OR model clean (without it, one would have to do all the link table trickery you've mentioned), while also optimizing SQL fetching and caching.
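
The shape of such a filter can be sketched in plain Java. This is a simplified model under stated assumptions, not the real servlet filter: `ViewSession`, `SessionPerRequest`, `current()`, and `handle()` are hypothetical names, and a `Supplier` stands in for the rest of the filter chain (controller plus view rendering).

```java
import java.util.function.Supplier;

// Toy stand-in for a persistence session (hypothetical, not Hibernate's Session).
class ViewSession {
    private boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }
}

// Models the "open session in view" idea: one session per request,
// bound to the current thread for the whole request-to-response chain.
class SessionPerRequest {
    private static final ThreadLocal<ViewSession> CURRENT = new ThreadLocal<>();

    // What DAOs and views would call to reach the request's session.
    static ViewSession current() {
        ViewSession session = CURRENT.get();
        if (session == null) {
            throw new IllegalStateException("no session bound to this thread");
        }
        return session;
    }

    // Plays the role of the filter: open before the chain runs, close after
    // the response has been produced, even if the chain throws.
    static <T> T handle(Supplier<T> restOfChain) {
        ViewSession session = new ViewSession();
        CURRENT.set(session);
        try {
            return restOfChain.get();
        } finally {
            session.close();
            CURRENT.remove();
        }
    }
}
```

The `finally` block matters in this sketch: without it, a failed request would leave a session bound to a pooled thread.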

 
Klaas van Gelder
Ranch Hand
Posts: 111
Java Linux PHP
Yes, I had indeed read about this technique, but it seemed like an ugly hack to me because the Hibernate layer should be abstracted away behind the service layer, as you also point out in your answer. But I understand it is probably the only way to be able to fetch those lazily initialized collections.

And that leaves the choice between using this technique and removing the many-to-many associations completely and making a domain object for the link table (which I probably have to do anyway because I need to put some additional fields in the link table, like the datetime of subscribing to an activity).
Still not an easy decision... :-s



 
Karthik Shiraly
Bartender
Posts: 1210
25
Android C++ Java Linux PHP Python
It need not be a mutually exclusive this-or-that choice. Better to look at it as an 80/20 compromise, because it's likely that 80% or more of your DAOs don't need eager fetching or link table hacks, while 20% or less may need them.
I would suggest keeping "open session in view" as default, because it gives you lazy fetching optimization.
Where you do need to do something special like eager fetching, you can always do that; "open session in view" does not obstruct that in any way.
 
Tim Holloway
Saloon Keeper
Posts: 18367
56
Android Eclipse IDE Linux
Open Session in View is indeed a horrible thing. Because the session is open for the duration of the request/response cycle, you can accidentally zap data in the database, to say nothing of potential performance and security problems.

What I do is to work with a 2-tier persistence architecture. Within those 2 tiers, you have full persistence connectivity. Outside those tiers, you are disconnected (detached). The boundary between the persistence tiers and the higher app tiers (business logic) is also the transaction extent.

The upper persistence tier is the "service" layer. The methods in that layer are all transactional. That is, when you invoke one of those methods, you begin a database transaction and when you exit the method, the transaction is committed (or rolled back, in case of failure).

The lower persistence tier is the DAO layer. A DAO class handles the CRUD and Find functions for a single database table. Or in some cases, a parent/child table. The DAOs inherit their transaction context from the service tier methods that invoke them.

The reason for having 2 tiers is to allow my basic table operations to remain simple. They're all done in the DAOs. The service methods, on the other hand, handle complex table interrelationships. For example, if I have an A class that relates to a Y class with children of class Z, the service method can do a fetch of a selected A with its associated Y and Zs. This is returned as a detached graph of entities. The application code can then display and/or modify this graph. If it modifies the graph, then there's typically a "save" method in the service tier that then takes that updated graph and invokes the DAO methods for the objects in that graph to cause it all to get posted to the database.

In some cases I'll have more than one service-tier fetch method. It will typically be named something like "findXXXWithYYYY" or "findXXXWithChildren". That allows me to define a simple lazy-fetch find to get basic info, and a deep-fetch find when I need details.

The deep-fetch find will either manually reference the secondary objects to force-load them (since it's dealing with connected objects) or in cases where the ORM system supports "fetch sets", it may invoke a fetch set that includes the items that in the default fetch are lazy-load.
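
A rough plain-Java sketch of the two-tier idea with a shallow and a deep find follows. All names here (`UnitOfWork`, `Lazy`, `MemberRow`, `MemberDao`, `MemberService`) are hypothetical stand-ins, the data is hard-coded, and a real implementation would rely on the ORM's transactions and lazy proxies rather than these toy classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Toy transaction/session boundary (hypothetical).
class UnitOfWork {
    private boolean active = true;

    boolean isActive() { return active; }

    void end() { active = false; }
}

// Toy lazy reference: loads on first access, but only while connected.
class Lazy<T> {
    private final UnitOfWork uow;
    private final Supplier<T> loader;
    private T value;

    Lazy(UnitOfWork uow, Supplier<T> loader) {
        this.uow = uow;
        this.loader = loader;
    }

    T get() {
        if (value == null) {
            if (!uow.isActive()) {
                throw new IllegalStateException("lazy load attempted on a detached object");
            }
            value = loader.get();
        }
        return value;
    }
}

class MemberRow {
    final String name;
    final Lazy<List<String>> activities;

    MemberRow(String name, Lazy<List<String>> activities) {
        this.name = name;
        this.activities = activities;
    }
}

// DAO tier: simple lookups that run inside the caller's unit of work.
class MemberDao {
    MemberRow find(UnitOfWork uow, String name) {
        // Hard-coded data stands in for a database query.
        return new MemberRow(name, new Lazy<List<String>>(uow, () -> Arrays.asList("hiking", "chess")));
    }
}

// Service tier: each method is one transaction and returns a detached object.
class MemberService {
    private final MemberDao dao = new MemberDao();

    // Shallow find: the activities stay unloaded.
    MemberRow findMember(String name) {
        UnitOfWork uow = new UnitOfWork();
        try {
            return dao.find(uow, name);
        } finally {
            uow.end();
        }
    }

    // Deep find: touch the collection while still connected to force the load.
    MemberRow findMemberWithActivities(String name) {
        UnitOfWork uow = new UnitOfWork();
        try {
            MemberRow member = dao.find(uow, name);
            member.activities.get().size();
            return member;
        } finally {
            uow.end();
        }
    }
}
```

The caller picks the method that matches what the page needs, so the fetch depth is decided in the service tier rather than in the mapping.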
 
Klaas van Gelder
Ranch Hand
Posts: 111
Java Linux PHP
Interesting answer. I also make use of separate Service and DAO layers where each service method is transactional (by use of the Spring @Transactional annotation).

If I understand you correctly, you advise against lazy fetching of objects with child dependencies. I see some overlap with my suggestion in the first post to remove the collection fields from the Activity and Member classes and create a separate (ActivityMemberSubscription) entity class which maps to the link table. Then all queries can be done against a single table (apart from the lookup fields, which can be eagerly fetched without problems), and the results can be combined within the service method.
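
To make the link-table idea concrete, here is a minimal plain-Java sketch. Only the class name ActivityMemberSubscription comes from this thread; the field names, `SubscriptionDao`, and its methods are assumptions for illustration, and an in-memory list stands in for the link table (no Hibernate mapping is shown).

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// One row of the link table, promoted to its own class so it can carry
// extra data such as the moment of subscribing.
final class ActivityMemberSubscription {
    final long memberId;
    final long activityId;
    final LocalDateTime subscribedAt;

    ActivityMemberSubscription(long memberId, long activityId, LocalDateTime subscribedAt) {
        this.memberId = memberId;
        this.activityId = activityId;
        this.subscribedAt = subscribedAt;
    }
}

// In-memory stand-in for a DAO over the link table (hypothetical).
class SubscriptionDao {
    private final List<ActivityMemberSubscription> rows = new ArrayList<>();

    void subscribe(long memberId, long activityId, LocalDateTime when) {
        rows.add(new ActivityMemberSubscription(memberId, activityId, when));
    }

    // Both directions of the many-to-many are single-table lookups.
    List<Long> findActivityIdsByMember(long memberId) {
        List<Long> ids = new ArrayList<>();
        for (ActivityMemberSubscription row : rows) {
            if (row.memberId == memberId) {
                ids.add(row.activityId);
            }
        }
        return ids;
    }

    List<Long> findMemberIdsByActivity(long activityId) {
        List<Long> ids = new ArrayList<>();
        for (ActivityMemberSubscription row : rows) {
            if (row.activityId == activityId) {
                ids.add(row.memberId);
            }
        }
        return ids;
    }
}
```

A service method could then load the Member or Activity rows for the returned ids in a second query and combine the results.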

The question is whether the child collections are still modelled in the entity classes. You write
"For example, if I have an A class that relates to a Y class with children of class Z, the service method can do a fetch of a selected A with its associated Y and Zs. This is returned as a detached graph of entities"

Do you mean that child collection fields are still present on the parent entities but only populated in the service methods, by using multiple queries in a service method? And that you therefore write separate service methods to obtain, for example, a "naked" Member object without its child activities, and another service method to get a fully populated Member object with its activities?
Or

Also I want to mention that the Member object is saved in the session during login. I have the feeling that this should not be a complex object with child entities which can get stale and out of sync very quickly.
Maybe this is a case where a lazily fetched "naked" Member object should be used.

I think this is the way to go and it feels better than the "open session in view" method...



 
Tim Holloway
Saloon Keeper
Posts: 18367
56
Android Eclipse IDE Linux
I have no One Size Fits All rule. I do have certain guiding principles, though.

If two objects are literally parent-child (such as an invoice header and detail items), then I MAY do an eager fetch by default or I may not. It depends on requirements. If I need to amass lists of invoices without needing line items, I'll set it up for lazy-fetch. On the other hand, in cases where I'm expecting to almost always need the details, I may make the fetch an eager fetch, unless the details are either very large or very numerous. In extreme cases, I may even fetch the details as an independent list, although this can interfere with normal object modelling, so it's not something to do lightly. It's not uncommon that I'll do a lazy fetch of the parent, then decide I need the children and re-fetch eager using the original parent (or its key) as the argument for the detail fetch. Note that in many ORM situations, however, this sort of operation doesn't populate the original object, but instead returns a new copy of the object, so care needs to be taken.

If two objects are related one-to-many or many-to-many, I USUALLY won't eager-fetch. Unless it proves truly advantageous.

When working on a multi-request workflow you may have a simple record set to reference. In which case, I go the EJB route and only keep the key in my session, re-fetching the actual ORM object only while I'm actively running request logic and trusting to cache to make it efficient. If, on the other hand, this is an edit of a complex graph, I may either keep the entire disconnected graph in session (if I can spare the resources) or, alternatively, save the transitional graph independently and load it as needed before staging it back to primary objects at final commit time for the workflow. A third alternative is to use an extended transaction architecture, but those can be tricky and are often not available in the environment I'm working in.
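
The "only keep the key in my session" approach might look like the following plain-Java sketch. `LoginState`, `MemberSnapshot`, and the `"memberId"` attribute name are hypothetical, a map stands in for the HTTP session, and the fetch function stands in for a (cached) DAO or service lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Detached, per-request view of a member (hypothetical).
class MemberSnapshot {
    final long id;
    final String name;

    MemberSnapshot(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Stands in for the HTTP session: it stores only the member's key,
// never the entity itself, so nothing in it can go stale.
class LoginState {
    private final Map<String, Object> attributes = new HashMap<>();

    void login(long memberId) {
        attributes.put("memberId", memberId);
    }

    // Re-fetch the member on every request through the supplied lookup.
    MemberSnapshot currentMember(LongFunction<MemberSnapshot> fetchById) {
        Long id = (Long) attributes.get("memberId");
        if (id == null) {
            throw new IllegalStateException("not logged in");
        }
        return fetchById.apply(id);
    }
}
```

Because each request goes back to the lookup, changes made elsewhere are visible on the next request, at the cost of one extra fetch that a cache can usually absorb.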
 