
JDBC ResultSet limit - need to process 3,000,000,000 records

 
Balasubramaniam Muthusamy
Ranch Hand
Posts: 55
Hello Everyone,
Could someone please tell me what the limit of a ResultSet is? I have a program which needs to process almost 3 billion records. The query fetches all 3 billion rows, and I can see them in the ResultSet object while debugging, yet resultSet.next() returns false partway through processing. When I reduced the count to 1 billion it returned true.

Could you please guide me on how to resolve this issue? Thanks in advance.

Thanks
Bala
 
Christian Pflugradt
Ranch Hand
Posts: 74
As far as I know, there is no limit other than your memory. Are you trying to load all 3 billion records into one data structure, or are you processing a few at a time? It would help if you pasted some of your code here so people can see what you are doing (are you calling setFetchSize, for example?). I'm not sure I can help you further, but I would have a better idea of what might be wrong if you showed your code.
 
Balasubramaniam Muthusamy
Ranch Hand
Posts: 55
Hello,
Please find the code below
 
Christian Pflugradt
Ranch Hand
Posts: 74
Try whether the following code changes anything. I've added con.setAutoCommit(false) and st.setFetchSize(100), which should lessen the impact on your memory.
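For what it's worth, a minimal sketch of those two changes (the method name, FETCH_SIZE constant, and SQL are invented here for illustration; the original poster's actual code was not shown):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingRead {
    static final int FETCH_SIZE = 100; // rows per round trip -- a hint to the driver, not a guarantee

    // Counts rows one batch at a time instead of buffering the whole result in memory.
    static long countRows(Connection con, String sql) throws Exception {
        con.setAutoCommit(false);            // some drivers (e.g. PostgreSQL) only stream inside a transaction
        try (Statement st = con.createStatement()) {
            st.setFetchSize(FETCH_SIZE);     // ask the driver to fetch FETCH_SIZE rows at a time
            try (ResultSet rs = st.executeQuery(sql)) {
                long count = 0L;             // long, not int: can count past 2^31 - 1
                while (rs.next()) {
                    count++;                 // process one row here
                }
                return count;
            }
        }
    }
}
```

The key point is that the loop touches one row at a time and never holds the full result, so memory use stays flat regardless of row count.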
 
Dave Tolls
Ranch Hand
Posts: 2112
What database are you hitting, and what exact driver (and version)?

Since you haven't set it, the default fetch size (usually around 10) will apply, so it won't be that causing the issue.
The ResultSet doesn't know how many rows there are until it has brought back the last one.

I can't see auto-commit being an issue. It's a straightforward SELECT against a regular ResultSet, with no updatable or scrollable complications.
I'm assuming here that the JDBC driver doesn't apply some odd defaults to a ResultSet (like making it scrollable and/or updatable).

It could be an issue with the driver.
 
Balasubramaniam Muthusamy
Ranch Hand
Posts: 55
Thanks much for your replies. When I dug into this further, resultSet.next() returns true until the record count reaches 2,147,483,646, and returns false when the count moves to 2,147,483,647, which is the maximum int value. Is the integer maximum value the limit of a ResultSet?

Thanks
Bala
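2,147,483,647 is Integer.MAX_VALUE, so a driver that tracks the current row position in an int runs out of values exactly there. A small standalone illustration of that overflow (no database involved):

```java
public class IntLimitDemo {
    public static void main(String[] args) {
        int rowNum = Integer.MAX_VALUE;     // 2147483647, the last value an int can hold
        rowNum++;                           // wraps around to -2147483648
        System.out.println(rowNum);         // prints -2147483648
        // Any internal bookkeeping that assumes the row number keeps increasing
        // now breaks, which would make next() return false even though rows remain.
        long safe = 3_000_000_000L;         // a long comfortably counts past 2^31 - 1
        System.out.println(safe > Integer.MAX_VALUE);  // prints true
    }
}
```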
 
Tapas Chand
Ranch Hand
Posts: 583
Christian Pflugradt wrote:Try whether the following code changes anything. I've added con.setAutoCommit(false) and st.setFetchSize(100), which should lessen the impact on your memory.

Well, I am not in disagreement with Christian Pflugradt, but there are a few things to note here.
The doc says:

setFetchSize(int rows): Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for ResultSet objects generated by this Statement.


But there are cases when the database may choose to ignore hints even if they are valid. I have read this in a few blogs, like this one:
https://hoopercharles.wordpress.com/2011/09/30/invalid-hints-are-silently-ignored-an-invalid-use_hash-hint-transforms-a-sort-merge-join-into-a-nested-loops-join/

Now I am not sure whether a hint to a JDBC driver and a hint to a database are the same thing or different things.
 
Tim Moores
Bartender
Posts: 3137
I'm not a big fan of stored procedures, but for this many records I think you need to consider more efficient ways of processing them; stored procedures are one way.
 
Dave Tolls
Ranch Hand
Posts: 2112
Balasubramaniam Muthusamy wrote:Thanks much for your replies. When I dug into this further, resultSet.next() returns true until the record count reaches 2,147,483,646, and returns false when the count moves to 2,147,483,647, which is the maximum int value. Is the integer maximum value the limit of a ResultSet?

Thanks
Bala


That is most likely the case.
ints are used to index most Collections, which is probably what underlies the ResultSet implementation.

So, do you need all of these in one go, or is there a way of preprocessing in the SQL?
Or, possibly, Tim's stored-procedure suggestion?

That's a lot of data to take in in one shot.
 
Winston Gutkowski
Bartender
Posts: 10571
Balasubramaniam Muthusamy wrote:I have a program which needs to process almost 3 billion records.

Well, I think everyone else has covered the mechanics, but I'd think about the cause, or background.

Why are you processing 3 billion rows? Because they're there? Because you can? (And you certainly can.)

I used to work on the Sceptre system, which was the maintenance schedule for every single rivet in every aircraft in the fleets of several airlines; and I seem to remember that was about 10-12 billion rows (nobody was absolutely sure).

So I wonder what it is (other than possibly a search engine) that would actually need to store 3 billion items, let alone process every single one of them in sequence.

However, if it turns out that you DO have to, I'd definitely go with Tim's suggestion, because this is precisely what databases were designed to do. Java wasn't.

I suspect very strongly that there's a much simpler alternative though.

Winston
 
meenakshi sundar
Ranch Hand
Posts: 128
I do not see the use case here; it is rare to handle such a volume of data, and there are better ways of handling and manipulating it.
You could think about splitting the data to minimize latency in the architecture:

* Move some of the processing logic closer to the database, i.e. write a stored procedure
* Use caching

On the specific problem you mentioned:

You can restrict the number of rows returned by setting the maximum row count on a PreparedStatement.
Example:

ps.setMaxRows(50);


OR

You can use a SQL statement instead of the JDBC options; the pseudo-column rownum comes in handy here:

SELECT * FROM (SELECT a.*, rownum rnum FROM (SELECT column_name1, column_name2, ... FROM "TABLE NAME") a WHERE rownum <= 50) WHERE rnum > 0
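To page through the whole table rather than cap it, the rownum window above can be shifted in a loop. A sketch of building those windowed queries (the helper name and inner query are illustrative, not from the post):

```java
import java.util.Locale;

public class RownumWindow {
    // Wraps an inner SELECT in the Oracle rownum pagination idiom shown above.
    static String windowed(String innerSelect, long low, long high) {
        return String.format(Locale.ROOT,
                "SELECT * FROM (SELECT a.*, rownum rnum FROM (%s) a WHERE rownum <= %d) WHERE rnum > %d",
                innerSelect, high, low);
    }

    public static void main(String[] args) {
        long pageSize = 50;
        // First two windows: rows 1..50, then 51..100.
        for (long low = 0; low < 100; low += pageSize) {
            System.out.println(windowed("SELECT col1, col2 FROM big_table", low, low + pageSize));
        }
    }
}
```

Note that each window re-runs the inner query, so for billions of rows, keyset pagination on an indexed column (WHERE id > ?) generally scales much better than rownum windows.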

 
Balasubramaniam Muthusamy
Ranch Hand
Posts: 55
The issue has been resolved by upgrading the JDBC jar to the next version, and now I am able to process the expected number of records. Thanks everyone for your replies and suggestions.
 
Tapas Chand
Ranch Hand
Posts: 583
Balasubramaniam Muthusamy wrote:The issue has been resolved by upgrading the JDBC jar to the next version, and now I am able to process the expected number of records. Thanks everyone for your replies and suggestions.

That is good.
Which jar was being used previously, and which jar are you using now?
 