
Running Out of Memory

 
Corey McGlone
Ranch Hand
Posts: 3271
I'm working on a utility that takes in a file full of ID numbers, checks each ID against a database to see if it has been modified, and then mails a CSV file back to the requester with the results. The uploaded file can be quite large and, once it contains more than roughly 20,000 ID numbers, I consistently run into an OutOfMemoryError.

I've tried all sorts of things to reduce memory consumption but, no matter what I do, I always run out of memory at the same stage. Even when I make changes that should result in less memory usage, I see no difference in behavior, which I find very confusing. Here's a snippet of some of the relevant code I'm using. Any thoughts on what I could do to make this more efficient would be much appreciated.

I've tried periodically calling System.gc(), as well, but it made no impact on the result and degraded my performance considerably, so I've removed it.

Thanks, folks.
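(The original snippet didn't survive in this archive, but based on the description in this thread, the approach looks roughly like the minimal sketch below. `isModified` is a hypothetical stand-in for the real database lookup, and the file is simulated with a StringReader; none of these names are from the actual code.)

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class IdFileProcessor {

    // Hypothetical stand-in for the real database check.
    static boolean isModified(String id) {
        return id.endsWith("7");
    }

    // Reads every ID into memory first, so the total count is known up front.
    static List<String> loadIds(BufferedReader reader) throws IOException {
        List<String> ids = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String id = line.trim();
            if (!id.isEmpty()) {
                ids.add(id);
            }
        }
        return ids;
    }

    static String toCsv(List<String> ids) {
        StringBuilder csv = new StringBuilder("id,modified\n");
        int totalRowsProcessed = 0;
        for (String id : ids) {
            csv.append(id).append(',').append(isModified(id)).append('\n');
            totalRowsProcessed++;
            // Progress is only computable because ids.size() is known in advance.
            int percent = totalRowsProcessed * 100 / ids.size();
        }
        return csv.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("123456\n654327\n"));
        System.out.print(toCsv(loadIds(reader)));
    }
}
```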

 
fred rosenberger
lowercase baba
Bartender
Posts: 12196
you can use the -Xmx and -Xms flags to increase your heap size. This may not be a permanent fix, but it might let you limp through your initial struggles.

type "java -X" to see how to use them...
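(For what it's worth, the ceiling that -Xmx sets can be read from inside the running application via Runtime, which is handy when you don't control the launch script. A quick sketch:)

```java
public class HeapInfo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reports the -Xmx ceiling (or the JVM default if unset).
        long maxMb = rt.maxMemory() / (1024 * 1024);
        // totalMemory() is the heap currently committed by the JVM.
        long totalMb = rt.totalMemory() / (1024 * 1024);
        long freeMb = rt.freeMemory() / (1024 * 1024);
        System.out.println("max heap:  " + maxMb + " MB");
        System.out.println("committed: " + totalMb + " MB");
        System.out.println("free:      " + freeMb + " MB");
    }
}
```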
 
Corey McGlone
Ranch Hand
Posts: 3271
fred rosenberger wrote:you can use the -Xmx and -Xms flags to increase your heap size. This may not be a permanent fix, but it might let you limp through your initial struggles.


I appreciate that but, unfortunately, that's not an option. This code is running as a J2EE application and is hosted on a box which I do not have control over. My chances of getting them to raise the heap size for my application on that box are...well...let's say I'd be better off playing the lottery.
 
Paul Clapham
Sheriff
Posts: 21416
It isn't obvious to me why you have to read all 20,000 strings into memory before processing them. Couldn't you just do something like "read a string, process a string, until end of file"?

Although a mere 20,000 strings shouldn't blow out your memory unless they are extraordinarily long. So there's probably more to it than just that. I would recommend profiling the application but I suspect you might have problems arranging that too.
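(Paul's "read a string, process a string, until end of file" loop looks roughly like this minimal sketch; only one line is ever held in memory at a time. The per-ID processing step is a placeholder, not the thread's actual database call.)

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class StreamingProcessor {

    // Processes IDs one line at a time; returns how many were handled.
    static int processEach(Reader source) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String id;
            while ((id = reader.readLine()) != null) {
                // placeholder for the real per-ID database check
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        int n = processEach(new StringReader("123456\n234567\n345678\n"));
        System.out.println("processed " + n + " ids");
    }
}
```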
 
Corey McGlone
Ranch Hand
Posts: 3271
Paul Clapham wrote:It isn't obvious to me why you have to read all 20,000 strings into memory before processing them. Couldn't you just do something like "read a string, process a string, until end of file"?


I've considered this, as well. The reason I read all the Strings (which are numeric values 6-8 digits long, so they're not terribly large) into memory is so that I can get an accurate count of how many there are. This allows me to provide progress statistics. I read the numbers into a list and then null out the reference to the original file contents, so the net increase in memory usage should be negligible.

Still, as you said, 20,000 Strings of that size shouldn't really be causing that much of an issue.
 
John de Michele
Rancher
Posts: 600
Corey McGlone wrote:I've considered this, as well. The reason I read all the Strings (which are numeric values 6-8 digits long, so they're not terribly large) into memory is so that I can get an accurate count of how many there are. This allows me to provide progress statistics. I read the numbers into a list and then null out the reference to the original file contents, so the net increase in memory usage should be negligible.

Still, as you said, 20,000 Strings of that size shouldn't really be causing that much of an issue.


It would seem to me that keeping a running count would be just as accurate, and a much better use of resources. What happens when your file gets to 200,000 lines, or 2,000,000?

John.
 
Corey McGlone
Ranch Hand
Posts: 3271
John de Michele wrote:It would seem to me that keeping a running count would be just as accurate, and a much better use of resources. What happens when your file gets to 200,000 lines, or 2,000,000?


I do keep a running count of the number of lines I've processed (that's the variable "totalRowsProcessed"). But, in order to know my progress, which is expressed as a percentage, I need to know how many rows were in the original file as well.
 
Steve Fahlbusch
Bartender
Posts: 605
Use adaptive calculations: bytes processed / bytes in the file should act as a good approximation of the percent complete.

Get the size of the file from the OS.

Keep a count of the bytes read. (Bytes read / file size) * 100 = percent complete.
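(Steve's arithmetic as a sketch. Where fileSize comes from is an assumption: File.length() for a file on disk, or the request's Content-Length for an upload. The division is done in long arithmetic so bytesRead * 100 can't overflow an int on large files.)

```java
public class ByteProgress {

    // (bytes read / file size) * 100, in long arithmetic to avoid int overflow.
    static int percentComplete(long bytesRead, long fileSize) {
        if (fileSize <= 0) {
            return 0; // unknown or empty file: report no progress rather than divide by zero
        }
        return (int) (bytesRead * 100 / fileSize);
    }

    public static void main(String[] args) {
        long fileSize = 160000;   // e.g. new File("ids.txt").length()
        long bytesRead = 40000;
        System.out.println(percentComplete(bytesRead, fileSize) + "% complete");
    }
}
```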

 
Corey McGlone
Ranch Hand
Posts: 3271
Steve Fahlbusch wrote:Use adaptive calculations: bytes processed / bytes in the file should act as a good approximation of the percent complete.

Get the size of the file from the OS.

Keep a count of the bytes read. (Bytes read / file size) * 100 = percent complete.


I didn't really figure this was where my issue was - after all, once the numbers are loaded into that list, the list doesn't grow. If I'm getting past this portion of my code (which I know I am), then this shouldn't be an issue.

That said, the list was using memory that wasn't strictly necessary, so I went ahead and tried this. I read from my file one byte at a time and process each number as I get it. This got me all the way from 15% to 16% before dying with an OutOfMemoryError.

Given that, I'm pretty sure this isn't where my problem lies.
 
Corey McGlone
Ranch Hand
Posts: 3271
Just to close up this thread - I found my issue today. As usual, the problem lay in code that I wasn't even looking at. It was inside this method:



Inside that method, I was creating a CallableStatement and using it to invoke a stored procedure on the database I'm connected to. After processing, I was properly closing the ResultSet, but I had forgotten to close the CallableStatement. Adding cStmt.close() fixed all my memory issues.

Thanks for all the help, everyone.
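(For anyone finding this thread later: since Java 7, try-with-resources makes this class of leak much harder to write, because every resource declared in the header is closed automatically when the block exits, even on exception. The demo below uses a fake AutoCloseable in place of a real CallableStatement so it runs without a database; the names `cStmt` and `rs` just mirror the thread.)

```java
import java.util.ArrayList;
import java.util.List;

public class AutoCloseDemo {

    static final List<String> closed = new ArrayList<>();

    // Stand-in for a CallableStatement/ResultSet: records when close() runs.
    static class FakeStatement implements AutoCloseable {
        final String name;
        FakeStatement(String name) { this.name = name; }
        @Override public void close() { closed.add(name); }
    }

    static void processRow() {
        // try-with-resources guarantees close() runs when the block exits,
        // so the bug from this thread (a forgotten cStmt.close()) can't happen.
        try (FakeStatement cStmt = new FakeStatement("cStmt");
             FakeStatement rs = new FakeStatement("rs")) {
            // ... execute the stored procedure and read results here ...
        }
    }

    public static void main(String[] args) {
        processRow();
        // Resources are closed in reverse declaration order: rs, then cStmt.
        System.out.println("closed: " + closed);
    }
}
```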
 
Rahul P Kumar
Ranch Hand
Posts: 188
Corey McGlone wrote:Just to close up this thread - I found my issue today. As usual, the problem lay in code that I wasn't even looking at. It was inside this method:



Inside that method, I was creating a CallableStatement and using it to invoke a stored procedure on the database I'm connected to. After processing, I was properly closing the ResultSet, but I had forgotten to close the CallableStatement. Adding cStmt.close() fixed all my memory issues.

Thanks for all the help, everyone.

Does anyone have an explanation for how not closing a CallableStatement causes an out-of-memory error?
 
Mark Uppeteer
Ranch Hand
Posts: 159
the javadoc of the close method says:

"Releases this Statement object's database and JDBC resources immediately instead of waiting for this to happen when it is automatically closed. It is generally good practice to release resources as soon as you are finished with them to avoid tying up database resources."

Each open statement keeps holding the driver's buffers and, often, a server-side cursor, so creating one per row without closing it lets those resources pile up until the heap is exhausted. I am tempted to believe them, you never know what's under the hood ;)
 