
Fetch 100 rows at a time using RowHandler in iBatis (Oracle 11g database) - Example

I need to fetch 2 million records from the Oracle database (one year of data).
When I retrieve them as a list, it takes many minutes and then the application hangs.



So, I tried implementing the iBatis `RowHandler` interface, overrode its `handleRow` method, and I am now able to get the result (one row at a time).
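The handler is roughly this (sketched with a local stand-in declaration for `com.ibatis.sqlmap.client.event.RowHandler` so the snippet compiles on its own; in the real code the iBatis interface is imported and the handler is passed to `SqlMapClient.queryWithRowHandler`, and the statement id below is a placeholder):

```java
// Stand-in for com.ibatis.sqlmap.client.event.RowHandler, declared locally
// so this sketch is self-contained; a real project imports the iBatis one.
interface RowHandler {
    void handleRow(Object valueObject);
}

// Counts rows as they stream in; iBatis calls handleRow once per row,
// so nothing accumulates in memory.
class CountingRowHandler implements RowHandler {
    private int rowCount = 0;

    public void handleRow(Object valueObject) {
        rowCount++;                       // exactly one row per invocation
        if (rowCount % 1000 == 0) {
            System.out.println("List Size -> " + rowCount);
        }
    }

    public int getRowCount() {
        return rowCount;
    }
}
```

Called in the real code as `sqlMapClient.queryWithRowHandler("selectLastYearOrders", null, handler);`.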

But I need to get n rows at a time, where n >= 1000, so I added the fetchSize="1000" and resultSetType="FORWARD_ONLY" attributes to my select statement.

E.g:
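Something like the following mapped statement (the statement id, result class and table are placeholders; fetchSize and resultSetType are the iBatis 2.x statement attributes mentioned above):

```xml
<!-- Sketch of the mapped statement; id, resultClass and the query
     are placeholders, fetchSize/resultSetType are as described above. -->
<select id="selectLastYearOrders"
        resultClass="com.example.Order"
        fetchSize="1000"
        resultSetType="FORWARD_ONLY">
    SELECT * FROM orders WHERE order_date &gt;= #startDate#
</select>
```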



But I am still getting only one row at a time in the `handleRow(Object obj)` method.



When the method is called during query execution, the logged `"List Size ->"` value always increases by one, but I expected it to jump in steps of 1000 (as I have set fetchSize="1000")...

When I googled, I found that there is a connection property (Driver.useCursorFetch) which can be used along with the "fetchSize" and "resultSetType" attributes.
But I think it is only for MySQL databases.
Reference :
http://www.yaldex.com/mysql_manual/ch23s04.html or
http://stackoverflow.com/questions/3870500/ibatis-querywithrowhandler-still-seems-to-fetch-all-rows.

What is the equivalent of the `Driver.useCursorFetch` property for an Oracle 11g database?
I need some configuration like the one below.
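For example, with MySQL the references above enable streaming via a connection property like this; what I am looking for is the Oracle 11g counterpart:

```
# MySQL Connector/J (from the linked references) - streaming needs the
# useCursorFetch connection property in addition to the fetch size:
jdbc:mysql://dbhost:3306/mydb?useCursorFetch=true
```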



Thanks in advance.
The first thing you should do is think really hard about how to avoid fetching millions of records. Sometimes you really need to, but in many cases it is feasible to move the processing to the database, either using a clever SQL statement or a stored procedure (in Oracle, you can create Java stored procedures, so you don't even need to know PL/SQL and could perhaps reuse some existing code). I'm pretty sure this would decrease the response time much more significantly than arriving at the optimal fetch size setting.

Regardless, the fetch size is the JDBC driver's responsibility. When you use JDBC result sets (as all ORM and persistence frameworks ultimately do), you can access only one row at a time, but - behind the scenes - the driver fetches them in batches of the specified fetch size whenever it runs out of records from the previous batch. I therefore don't think the rate at which the framework hands over records is a good indication of the fetch size in effect.

I don't think there is a substantial reason to assume that the fetch size is not being honored. You can verify the fetch size by activating the SQL trace facility in the Oracle database, though you need some privileges and knowledge to do so. Another way to gauge it would be to record the rate of arrival of new records with the fetch size set to 1 and to 100 (repeating several times). If there is no clear difference between the two, the fetch size is probably not in effect. Make sure there isn't other lengthy processing that would mask the time differences in the fetch operations, though.
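In plain JDBC terms, that comparison could look like this (a sketch only; the connection URL, credentials and query are placeholders, and the Oracle JDBC driver must be on the classpath):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Rough harness: run the same query with two fetch sizes and compare
// the elapsed times. A large gap suggests the fetch size is in effect.
public class FetchSizeProbe {

    // Milliseconds elapsed since the given System.nanoTime() reading.
    static long elapsedMillis(long startNanos) {
        return (System.nanoTime() - startNanos) / 1000000L;
    }

    // Scans the whole result set with the given fetch size, returns ms taken.
    static long timeFullScan(Connection conn, String sql, int fetchSize)
            throws SQLException {
        long start = System.nanoTime();
        Statement st = conn.createStatement();
        try {
            st.setFetchSize(fetchSize);   // batch-size hint to the driver
            ResultSet rs = st.executeQuery(sql);
            while (rs.next()) {
                rs.getObject(1);          // touch a column so rows materialize
            }
            rs.close();
        } finally {
            st.close();
        }
        return elapsedMillis(start);
    }

    public static void main(String[] args) throws SQLException {
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/orcl", "user", "password");
        try {
            String sql = "SELECT * FROM orders";  // placeholder query
            System.out.println("fetchSize=1:   " + timeFullScan(conn, sql, 1) + " ms");
            System.out.println("fetchSize=100: " + timeFullScan(conn, sql, 100) + " ms");
        } finally {
            conn.close();
        }
    }
}
```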
As Martin says, ask yourself if you really need to fetch 2 million records out of the database, across your network and into your application server. Can you get the final result you need via some clever SQL or PL/SQL in the database, which is designed to process large volumes of data efficiently, without moving them around your network?
 

chris webster wrote:As Martin says, ask yourself if you really need to fetch 2 million records out of the database, across your network and into your application server. Can you get the final result you need via some clever SQL or PL/SQL in the database, which is designed to process large volumes of data efficiently, without moving them around your network?



Thanks for the suggestion, Chris.
But, as per the requirement, I need to pull those records (2 million) through my application only, so I am looking at the above option (fetchSize) in the hope that it might help. We already proposed a PL/SQL approach, but they need the records to be pulled via the application.
 

Martin Vajsar wrote:The first thing you should do would be to think really hard how to avoid fetching millions of records. Sometimes you really need to, but in many cases it would be feasible to move the processing to the database, either using a clever SQL statement, or to a stored procedure (in Oracle, you can create Java stored procedures, so you don't even need to know PL/SQL and perhaps could reuse some of the existing code). I'm pretty sure this would decrease the response time much more significantly than arriving at the optimal fetch size setting.

Regardless, the fetch size is a JDBC driver responsibility. When you use JDBC's recordsets (as all ORM and persistence frameworks invariably do at the end), you can access only one row at a time, but - behind the scenes - the driver fetches them in batches specified by the fetch size when he runs out of records from the previous batch. I therefore don't think that increases of the number of records processed by the framework are a good indication of the fetch size in effect.

I don't think there is substantial reason to assume that the fetch size is not being honored. It is possible to verify the fetch size by activating the SQL trace capability in the Oracle database. However, you need some privileges and knowledge to be able to do so. Another possible way to guess at that would be to record the rate of arrival of new records with fetch size set to 1 and 100 (repeat several times). If there is no clear difference between the two, probably the fetch size is not in effect. Make sure that there isn't other lengthy processing that would mask the time differences in the fetch operations, though.



Thanks, Martin.
I tried setting the fetch size to 1 and then to 100 and repeated the process, but there was no clear difference (so the fetch size is not in effect). It would be great if there is an alternative approach available (not necessarily the fetchSize setting) to pull a huge number of records through iBatis.
 

Vijay Kandaswamy wrote:But, As per the requirement I need to pull those records(2 million) through my application only.


Why? That doesn't make sense.

So what you're really asking is: "Can I make pulling 2 million rows into my application any quicker?"

And the answer is: Probably not.

As Chris already said, databases are designed from the ground up to process large volumes of information; Java isn't.
So any solution that involves pulling vast quantities of raw data from your db into your program is unlikely to be optimal.

Winston
 

Vijay Kandaswamy wrote:
Thanks for the suggestion chris.
But, As per the requirement I need to pull those records(2 million) through my application only. So I am looking for the above option(fetchSize) believing that might help. We already proposed PL/SQL approach but they need the records to be pulled via application.


So what is the requirement? If, for example, you're doing something like creating a flat file (e.g. CSV), you could do this via PL/SQL and the UTL_FILE package - write the file straight from the database to an Oracle directory on the server, then just copy it to wherever it needs to go. Alternatively, if it's part of an ETL process, maybe a proper ETL tool would be a better approach, e.g. OWB, ODI or Informatica if you have them, or the open-source Talend if you don't already have an ETL tool.
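For the CSV case, the UTL_FILE route is only a few lines - a minimal sketch, assuming the directory object, file name, columns and table are placeholders, and that the Oracle directory has been created and granted beforehand:

```sql
-- Write rows straight to a CSV file on the database server.
-- EXPORT_DIR, orders.csv and the query are placeholders.
DECLARE
  f UTL_FILE.FILE_TYPE;
BEGIN
  f := UTL_FILE.FOPEN('EXPORT_DIR', 'orders.csv', 'w', 32767);
  FOR r IN (SELECT order_id, order_date, amount FROM orders) LOOP
    UTL_FILE.PUT_LINE(f, r.order_id || ','
                      || TO_CHAR(r.order_date, 'YYYY-MM-DD') || ','
                      || r.amount);
  END LOOP;
  UTL_FILE.FCLOSE(f);
EXCEPTION
  WHEN OTHERS THEN
    IF UTL_FILE.IS_OPEN(f) THEN
      UTL_FILE.FCLOSE(f);
    END IF;
    RAISE;
END;
/
```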

Just curious - I've seen plenty of situations where people insist they have to move vast amounts of data around, only to discover that they don't really need to!
Thanks, Chris. Let me try PL/SQL with the UTL_FILE package. Hopefully this will be a good solution for us.

chris webster wrote:

Vijay Kandaswamy wrote:
Thanks for the suggestion chris.
But, As per the requirement I need to pull those records(2 million) through my application only. So I am looking for the above option(fetchSize) believing that might help. We already proposed PL/SQL approach but they need the records to be pulled via application.


So what is the requirement? For example, if you're doing something like creating a flat file (e.g. CSV), you could do this via PL/SQL and the UTL_FILE package - write a file straight from the database to an Oracle directory on the server, then just copy the file to wherever it needs to go. Alternatively, it's part of an ETL process, maybe using a proper ETL tool would be a better approach e.g. OWB, ODI, Informatica if you have them, or the open source Talend if you don't already have an ETL tool.

Just curious - I've seen plenty of situations where people insist they have to move vast amounts of data around, only to discover that they don't really need to!

 


