
ORM or JDBC for large data load

 
Ravi Danum
Ranch Hand
Posts: 154

Hello All,

What is better to use when loading a large amount of data into a database: @Entity or JDBC or some other method?

I have created @Entity objects (Spring), but hesitate to use an interface extending CrudRepository because it persists one object (record) at a time.

Would JDBC be better to use so that I can insert many records within one transaction? Or is there another way to persist several @Entity objects within a single transaction?

Thanks for any help.

-Ravi


 
gyank kannur
Ranch Hand
Posts: 43
We had a requirement to implement a search page with a large number of criteria (like the ones in a flight search). Using an ORM for it was a disaster. For large data it is better to use JDBC: if additional criteria come along, you can tune your SQL queries by hand.
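The "tune your SQL per combination of criteria" idea can be sketched as a small dynamic query builder. This is just an illustration: the flights table, the column names, and the example values are made up, and in real code the parameters would go through a PreparedStatement.

```java
import java.util.ArrayList;
import java.util.List;

public class SearchQueryBuilder {
    // Joins only the criteria the user actually filled in into a WHERE clause,
    // so each search runs a query shaped for exactly those conditions.
    static String buildWhere(List<String> clauses) {
        return clauses.isEmpty() ? "" : " WHERE " + String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        List<String> clauses = new ArrayList<>();
        List<Object> params = new ArrayList<>();
        String origin = "JFK";   // criterion the user supplied (example value)
        String airline = null;   // criterion left blank on the search page
        if (origin != null)  { clauses.add("origin = ?");  params.add(origin); }
        if (airline != null) { clauses.add("airline = ?"); params.add(airline); }
        String sql = "SELECT * FROM flights" + buildWhere(clauses);
        System.out.println(sql); // only the filled-in criteria appear in the SQL
    }
}
```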
 
Ravi Danum
Ranch Hand
Posts: 154

Many thanks, gyank kannur!

I appreciate your response. I will use JDBC.


-Ravi
 
Ramya Subraamanian
Ranch Hand
Posts: 178
You can also consider using batchUpdate in Spring's JdbcTemplate. This link should give an idea.
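As a rough sketch of that idea: `JdbcTemplate.batchUpdate(String, List<Object[]>)` is Spring's API, but the `person` table, the 500-row batch size, and the `chunk` helper below are illustrative assumptions, not anything from the thread. Only the chunking logic is runnable here, since the JDBC part needs a configured DataSource.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchInsertSketch {
    // Split the records into fixed-size chunks so each JDBC batch stays bounded
    // in memory; a few hundred rows per batch is a common starting point.
    static <T> List<List<T>> chunk(List<T> records, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            batches.add(records.subList(i, Math.min(i + batchSize, records.size())));
        }
        return batches;
    }

    // Hypothetical usage with Spring's JdbcTemplate (shown as a comment because
    // it needs a live DataSource to run):
    //
    //   for (List<Object[]> batch : chunk(allRows, 500)) {
    //       jdbcTemplate.batchUpdate(
    //           "INSERT INTO person (first_name, last_name) VALUES (?, ?)",
    //           batch);
    //   }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5);
        System.out.println(chunk(rows, 2).size()); // 5 rows in batches of 2 -> 3 batches
    }
}
```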
 
Ravi Danum
Ranch Hand
Posts: 154

Thanks to Ramya Subraamanian!

Great suggestion!

-Ravi
 
Tim Holloway
Saloon Keeper
Posts: 18359
JDBC is like assembly language. Back in the day (several decades ago now!), software companies used to brag about how their products were efficient because they were written in assembly language.

Well, it probably was - back then, when compilers rather stupidly translated line-by-line from high level to machine code. But that changed somewhere around 1985 when I started seeing products like IBM's Pascal/VS compiler, which not only generated frighteningly efficient machine code, but could re-optimize the code start-to-finish every time it compiled - something that would be cost-prohibitive to do on an assembly-language program of any size.

I also endured my share of non-optimal bubble sorts and linear searches working with assembly-language code, because while we weren't as pressured for time back then, even a simple Shellsort required about three times as much work to implement (and sort/search libraries were scarce then).

These days, not only do compilers do a superlative job of optimizing code, environments such as Java's JIT can actually analyse code for efficiency while it's running and re-optimize the code on the fly based on the workload. Under some circumstances, Java can outperform C/assembly code because of that.

A similar situation exists between JDBC and ORMs. I've seen benchmarks that put the throughput for ORM-based tests at double the rate of their raw JDBC equivalents. Again, that's because when things begin to scale, machines can juggle more variables than humans can, and machines can afford to do more work on optimizing than humans can - especially since the watchword of the day is usually "Just Git 'er Dun!"

Your Mileage May Vary, of course. If you have a rigidly captive environment, JDBC may outperform an ORM, especially when it's a smaller environment. The overhead of an ORM, like the overhead of a JVM, is substantial, so first your project has to be big enough to justify the start-up costs.

More important than the platform, however, is what you do with it. Choosing your algorithms (and in this case, your schema) can make a world of difference. In fact, much of the success of the NoSQL movement comes from the enhanced throughput that you can get when forgoing the advanced features that SQL databases offer - especially table joins. A favorite case in point that I often quote was a product where the data was almost but not completely pre-sorted. That's a nightmare scenario for Heap and Quick Sorts, and not much better for the much-maligned bubble sort. But a Shellsort was ideal for the data in question. The efficiency of the algorithm vastly dwarfed any efficiencies of automated or manual code generation.
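For readers who haven't met it, a minimal Shellsort looks like the sketch below. The point from the story above is visible in the structure: the late, small-gap passes are plain insertion sorts, and on nearly-sorted input they do almost no work.

```java
import java.util.Arrays;

public class ShellSortDemo {
    // Shellsort: insertion sort over diminishing gaps (halving gaps here;
    // other gap sequences exist). Early passes with large gaps move elements
    // long distances cheaply; the final gap-1 pass is a near-no-op on
    // almost-sorted data.
    static void shellSort(int[] a) {
        for (int gap = a.length / 2; gap > 0; gap /= 2) {
            for (int i = gap; i < a.length; i++) {
                int tmp = a[i];
                int j = i;
                while (j >= gap && a[j - gap] > tmp) {
                    a[j] = a[j - gap]; // shift larger element up by one gap
                    j -= gap;
                }
                a[j] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        // Almost-sorted input: only 3 and 9 are out of place.
        int[] data = {1, 2, 9, 4, 5, 6, 7, 8, 3, 10};
        shellSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```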
 