We have a job in our application that reads data from the DB and creates an Excel workbook from it using Apache POI.
The problem we are facing is that sometimes the data is too big (maybe >20,000,00 rows), which results in an OutOfMemoryError.
The job writes all the data into the workbook in memory (creating multiple sheets, since most of the time the data exceeds 65,000 rows, the maximum per sheet in Excel) and writes it to disk only after all the data has been written.
Can you provide suggestions on how to handle this issue?
Two solutions I can think of:

1. Write 65k records to the workbook in memory and then flush it to disk, then load the same file and continue writing the next sheet (I am not sure whether the API supports this).
2. Create multiple Excel files, each holding 65k records, and merge them into one single file. Can you suggest how we could do this?
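For what it's worth, newer POI versions ship a streaming API, SXSSF (`SXSSFWorkbook` in `poi-ooxml`), that does essentially what the first option describes automatically: it keeps only a sliding window of rows on the heap and spills older rows to a temporary file. A minimal sketch, assuming `poi-ooxml` is on the classpath; the cell contents and the 65,000-row split are placeholders for your real DB values:

```java
import java.io.FileOutputStream;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

// Streaming export: SXSSFWorkbook holds only a window of rows in memory
// and flushes the rest to a temp file, so the full data set never has to
// fit on the heap at once.
public class StreamingExportSketch {

    static final int MAX_ROWS_PER_SHEET = 65_000;

    // Pure helper: how many sheets a given row count needs (ceiling division).
    static int sheetsNeeded(long totalRows, int maxRowsPerSheet) {
        return (int) ((totalRows + maxRowsPerSheet - 1) / maxRowsPerSheet);
    }

    static void export(long totalRows, String file) throws Exception {
        // Keep at most 100 rows in memory; older rows are flushed to disk.
        try (SXSSFWorkbook wb = new SXSSFWorkbook(100)) {
            Sheet sheet = null;
            for (long i = 0; i < totalRows; i++) {
                if (i % MAX_ROWS_PER_SHEET == 0) {
                    // Start a new sheet every MAX_ROWS_PER_SHEET rows.
                    sheet = wb.createSheet("Sheet" + (i / MAX_ROWS_PER_SHEET + 1));
                }
                Row row = sheet.createRow((int) (i % MAX_ROWS_PER_SHEET));
                row.createCell(0).setCellValue("row " + i); // real code: DB values
            }
            try (FileOutputStream out = new FileOutputStream(file)) {
                wb.write(out);
            }
            wb.dispose(); // delete the temporary files backing the stream
        }
    }
}
```

Note that SXSSF produces `.xlsx` files, where the per-sheet row limit is much higher than 65k, so splitting into sheets may not even be necessary.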
I am not 100% familiar with the Apache POI API, but is it possible to read only a limited number of records at a time and then resume reading? Splitting the work into batches of records should reduce your overall memory consumption.
In other words, first read the first 5,000 records and perform whatever processing you require, then read the next 5,000, and so on.
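The batching loop above can be sketched in plain Java; here the iterator stands in for your real DB cursor or result set, and the batch size of 5,000 is just the example figure from this answer:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Batch processing sketch: pull a fixed number of records at a time so
// only one batch is ever held in memory, regardless of total data size.
public class BatchProcessor {

    // Drain up to batchSize elements from the source into a fresh list.
    static <T> List<T> nextBatch(Iterator<T> source, int batchSize) {
        List<T> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize && source.hasNext()) {
            batch.add(source.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        // Simulate a large query result with 12,500 rows.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 12_500; i++) rows.add(i);

        Iterator<Integer> cursor = rows.iterator();
        int batches = 0, processed = 0;
        List<Integer> batch;
        while (!(batch = nextBatch(cursor, 5_000)).isEmpty()) {
            // ...write this batch to the current sheet, flush, etc.
            processed += batch.size();
            batches++;
        }
        System.out.println(batches + " batches, " + processed + " rows");
    }
}
```

With JDBC you would get the same effect by iterating the `ResultSet` with a reasonable `setFetchSize`, or by paging the query, rather than materializing all rows up front.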