Hibernate and millions of rows..

 
Ranch Hand
Posts: 580
I have read this thread https://coderanch.com/t/216713/ORM/java/Do-we-over-hibernate and this one https://coderanch.com/t/214767/ORM/java/ORM-suitable-big-apps, but I still have questions:

For example,
imagine a database table with millions of rows (say, 3,000,000).

I know Hibernate offers some optimizations
(such as "hibernate.jdbc.batch_size", disabling the second-level cache, or stateless sessions) for handling operations on large amounts of data.
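
For reference, the two configuration-level settings named above could be collected roughly like this. It is only a sketch: the class and method names are made up for illustration, and the batch size is a placeholder, not a recommendation.

```java
import java.util.Properties;

class HibernateBulkSettingsSketch {
    /** Collects the settings mentioned above; the batch size is a placeholder value. */
    static Properties bulkSettings(int batchSize) {
        Properties props = new Properties();
        // Number of statements Hibernate may group into one JDBC batch.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Switch the second-level cache off for this configuration.
        props.setProperty("hibernate.cache.use_second_level_cache", "false");
        return props;
    }

    public static void main(String[] args) {
        bulkSettings(50).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The same two keys can equally be written into a properties file; stateless sessions are an API choice rather than a configuration property.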

What is mostly missing is knowledge about Hibernate's optimization options when dealing with millions of rows.
(This thread should fill the gap :-)

What are your practical experiences with working with this much data?

What are best practices when working with millions of rows?

Should we avoid pure Hibernate and call the JDBC API directly (via Hibernate?),
or use stored procedures, when working with this much data?
How can we make it faster?

Where are the definite disadvantages of an ORM
when working with millions of rows?
Are there any?

What should we do or know when handling such scenarios?
[ December 15, 2008: Message edited by: nimo frey ]
 
nimo frey
Ranch Hand
Posts: 580
Okay,
I have tested this and, for now, settled on these configs to deal with this much data:

In my properties file:



The second-level cache should be disabled programmatically,
since you do not need to disable it in every scenario:



Now, let's look at all CRUD operations:

For SAVE operations:

- after flush, you should call clear
to clear the cache (the first-level cache? does it clear all references?)

public void save() {
    for (int n = 1; n <= 2000000; n++) {
        Item item = new Item();
        entityManager.persist(item);
        entityManager.flush();

        // But when I flush and clear inside the loop, does hibernate.jdbc.batch_size still work?
        // I flush and clear after every single record, so nothing can be batched. Am I right?

        entityManager.clear();
    }
}
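
On the batching question: flushing after every persist leaves at most one pending insert per flush, so there is nothing left to group. The usual pattern is to flush and clear once per batch, with the interval matching hibernate.jdbc.batch_size. Below is a minimal sketch of that loop. The Persister interface and the counting fake are hypothetical stand-ins for the EntityManager so the example runs without a database.

```java
class BatchedSaveSketch {
    /** Hypothetical stand-in for the three EntityManager calls used in the loop. */
    interface Persister {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /** Persists `total` entities, flushing and clearing once per batch; returns the flush count. */
    static int saveAll(Persister em, int total, int batchSize) {
        int flushes = 0;
        for (int n = 1; n <= total; n++) {
            em.persist(new Object()); // stand-in for new Item()
            if (n % batchSize == 0) {
                em.flush();  // send the pending inserts together
                em.clear();  // detach them so the first-level cache stays small
                flushes++;
            }
        }
        if (total % batchSize != 0) { // flush the final, partial batch
            em.flush();
            em.clear();
            flushes++;
        }
        return flushes;
    }

    /** Fake that only counts calls. */
    static class CountingPersister implements Persister {
        int persisted, flushed, cleared;
        public void persist(Object entity) { persisted++; }
        public void flush() { flushed++; }
        public void clear() { cleared++; }
    }

    public static void main(String[] args) {
        CountingPersister fake = new CountingPersister();
        int flushes = saveAll(fake, 10, 4);
        System.out.println(fake.persisted + " persisted, " + flushes + " flushes");
    }
}
```

With a real EntityManager the loop body stays the same; only the persist, flush and clear calls go to entityManager instead of the fake.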

For READ operations:

- The only thing I know is to set the boundaries of the selection.
If you set "first result" to 0 and "max results" to 2000,
then the result list contains the first 2000 records. When I call this method again from the same session (?), does it then return a result list with records 2001-4000 (and so on)? Am I right??
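
As far as the paging question goes, setFirstResult and setMaxResults are settings on one query and are not remembered by the session, so calling the method again with the same values returns the same rows. The caller has to advance the offset. Here is a small sketch of that loop. PageQuery and fakeTable are hypothetical helpers standing in for a real query, and the offset is zero-based.

```java
import java.util.ArrayList;
import java.util.List;

class PagingSketch {
    /** Hypothetical stand-in for a query run with setFirstResult(first) and setMaxResults(max). */
    interface PageQuery<T> {
        List<T> fetch(int firstResult, int maxResults);
    }

    /** Walks all pages by advancing the offset explicitly; returns the offsets requested. */
    static <T> List<Integer> readAll(PageQuery<T> query, int pageSize) {
        List<Integer> offsets = new ArrayList<>();
        int first = 0; // zero-based offset of the next page
        while (true) {
            List<T> page = query.fetch(first, pageSize);
            offsets.add(first);
            // the rows of this page would be processed here
            if (page.size() < pageSize) {
                break; // short or empty page: no more rows
            }
            first += pageSize;
        }
        return offsets;
    }

    /** In-memory fake table holding the row numbers 1..rows. */
    static PageQuery<Integer> fakeTable(int rows) {
        return (first, max) -> {
            List<Integer> page = new ArrayList<>();
            for (int i = first; i < Math.min(first + max, rows); i++) {
                page.add(i + 1);
            }
            return page;
        };
    }
}
```

For records 2001-4000 the second call therefore needs a first result of 2000. A real query should also carry a stable ORDER BY, otherwise the pages may overlap or skip rows.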



- The other thing I know is that when you call getReference, there is NO database hit:

public void read() {
    // But does a reference only work if this item is already in my cache? Am I right??
    Item item = entityManager.getReference(Item.class, 1);
}
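
On the cache question: as far as I understand the JPA contract, getReference does not need the item to be cached. It may hand back a proxy that knows only the id, and the database is hit when other state is first accessed (an EntityNotFoundException is thrown at that point if the row does not exist). The sketch below is a deliberately simplified model of that behaviour, not Hibernate's actual proxy code. LazyRef and its loader function are invented for illustration.

```java
import java.util.function.Function;

class LazyReferenceSketch {
    /** Simplified model of a reference proxy: knows the id, loads the row only on demand. */
    static class LazyRef<T> {
        private final long id;
        private final Function<Long, T> loader; // hypothetical "select by id"
        private T target;
        private boolean loaded;

        LazyRef(long id, Function<Long, T> loader) {
            this.id = id;
            this.loader = loader;
        }

        /** Answered from the reference itself; does not trigger a load. */
        long getId() {
            return id;
        }

        /** First access to the real state triggers exactly one load. */
        T get() {
            if (!loaded) {
                target = loader.apply(id);
                loaded = true;
            }
            return target;
        }
    }

    public static void main(String[] args) {
        int[] hits = {0};
        LazyRef<String> ref = new LazyRef<>(1L, id -> { hits[0]++; return "Item#" + id; });
        System.out.println("loads after creation: " + hits[0]);
        System.out.println(ref.get() + ", loads after access: " + hits[0]);
    }
}
```

This is why a reference is cheap when it is only needed to set an association, and no cheaper than a normal find once its fields are actually read.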


For DELETE operations:

I guess delete is fast enough (am I right? Are there any optimizations there, too?)

For UPDATE operations:

What optimizations exist for update statements?
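
One option for mass updates (and deletes) is to bypass entity loading altogether, either with a bulk HQL statement run through executeUpdate() or with plain JDBC batching. The following is a rough sketch of the JDBC variant. The table name item and the columns status and id are assumptions for illustration, and the caller is expected to manage the connection and the transaction.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchUpdateSketch {
    // Hypothetical table and column names.
    static final String SQL = "UPDATE item SET status = ? WHERE id = ?";

    /** Number of executeBatch() round trips needed for `rows` rows. */
    static int batchesNeeded(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    /** Updates the status of the given ids in batches; returns the number of batches sent. */
    static int updateStatus(Connection con, List<Long> ids, String status, int batchSize)
            throws SQLException {
        int sent = 0;
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            int pending = 0;
            for (Long id : ids) {
                ps.setString(1, status);
                ps.setLong(2, id);
                ps.addBatch(); // queue the statement instead of executing it
                if (++pending == batchSize) {
                    ps.executeBatch(); // one round trip for the whole batch
                    pending = 0;
                    sent++;
                }
            }
            if (pending > 0) { // send the final, partial batch
                ps.executeBatch();
                sent++;
            }
        }
        return sent;
    }
}
```

When every affected row gets the same new value, a single UPDATE with a suitable WHERE clause is simpler still and needs no batching at all.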


This thread should summarize all optimization strategies that can (or should) be applied when handling millions of rows.

Any suggestions, practical experiences, or best practices are very welcome :-)

We can categorize them as follows:
Category 1: Optimizations in: properties.xml
Category 2: Optimizations in: CRUD operations
Category 3: Optimizations in: (?)
[ December 15, 2008: Message edited by: nimo frey ]
 
Bartender
Posts: 10336


What are your practical experiences with working with this much data?

What are best practices when working with millions of rows?


Not using an ORM for such large bulk operations is what I'd consider best practice. Databases come with bulk data manipulation tools that are far better suited.
 
nimo frey
Ranch Hand
Posts: 580

Not using an ORM for such large bulk operations is what I'd consider best practice.



I can use conventional JDBC within Hibernate or use a stateless session.

Databases come with bulk data manipulation tools that are far better suited.



So JDBC, stored procedures, and PreparedStatements are not commonly used for such cases?

Hmm... I have never heard of "bulk data manipulation tools".

Do you know such tools for DB2 or MySQL?

I cannot find anything. Are these tools (an API?) integrated with Java? I had thought that JDBC or Hibernate was well suited for this amount of data.
 
Paul Sturrock
Bartender
Posts: 10336


I can use conventional JDBC within Hibernate or use a stateless session.


Yes. But I still wouldn't recommend processing millions of rows via an ORM.

I don't know about DB2, but I'd be surprised if it didn't have such tools. Bulk loading/unloading and data transformation tasks are as old as the hills.
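
On the MySQL side, one such tool is the LOAD DATA INFILE statement, which reads rows from a delimited text file. As a rough sketch of how that can sit next to Java, the program below only produces the file. The table and column names in the comment are hypothetical, and the values are assumed to contain no backslashes or line breaks.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BulkLoadFileSketch {
    /** One CSV line: numeric id, then the name in double quotes with embedded quotes doubled. */
    static String toCsvLine(long id, String name) {
        return id + ",\"" + name.replace("\"", "\"\"") + "\"";
    }

    /** Writes `rows` generated lines to `file`; returns the number of lines written. */
    static int writeRows(Path file, int rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int i = 1; i <= rows; i++) {
                out.write(toCsvLine(i, "item-" + i));
                out.write('\n');
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("items", ".csv");
        System.out.println(writeRows(file, 1000) + " rows written to " + file);
        // The file could then be loaded with a statement along these lines
        // (hypothetical table "item" with columns id and name):
        //   LOAD DATA LOCAL INFILE '/path/to/items.csv' INTO TABLE item
        //   FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
        //   LINES TERMINATED BY '\n' (id, name);
    }
}
```

The load itself then runs inside the database, without any per-row object mapping on the Java side.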
 
nimo frey
Ranch Hand
Posts: 580
Okay,
I will look into such tools and report back on whether they can be integrated with Java.

Bye
 