Granny's Programming Pearls
"inside of every large program is a small program struggling to get out"
JavaRanch.com/granny.jsp
Win a copy of Reactive Streams in Java: Concurrency with RxJava, Reactor, and Akka Streams this week in the Reactive Progamming forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Liutauras Vilda
  • Junilu Lacar
  • Jeanne Boyarsky
  • Bear Bibeault
Sheriffs:
  • Knute Snortum
  • Tim Cooke
  • Devaka Cooray
Saloon Keepers:
  • Ron McLeod
  • Stephan van Hulst
  • Tim Moores
  • Tim Holloway
  • Carey Brown
Bartenders:
  • Piet Souris
  • Frits Walraven
  • Ganesh Patekar

POI: OutOfMemoryError on reading big excel file

 
Ranch Hand
Posts: 30
Spring Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I am getting an OutOfMemoryError while trying to read a excel file(.xlsm) of size ~2Mb.
I am running this application on a websphere server. Is there any way i could read this file with out changing any server configurations?

Have tried to use this approach, but the red line gave me OutOfMemoryError.

OPCPackage pkg = OPCPackage.open(new File("sample.xlsm"));
XSSFWorkbook wb_template;
wb_template = new XSSFWorkbook(
pkg
);

SXSSFWorkbook wb = new SXSSFWorkbook(wb_template);
 
Ranch Hand
Posts: 10191
3
Mac PPC Eclipse IDE Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Is increasing server heap space an option? If yes, try it!
 
Renjith Panikar
Ranch Hand
Posts: 30
Spring Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I can do it for local deployment(in my machine), but dont have control over the server to which it will be deployed after development.
 
Rancher
Posts: 43011
76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
From the documentation I gather that SXSSF is meant more for creating documents, not reading them: http://poi.apache.org/spreadsheet/how-to.html#sxssf

A low-memory approach to reading would be the XSSF event API: http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api

2 MB doesn't sound all that much, though. What are the JVM memory settings of the server?
 
Renjith Panikar
Ranch Hand
Posts: 30
Spring Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Not sure about the current heap size. It will be the default of websphere 7.
Even increasing the heap size as given below dint helped me.
Initial heap size = 256 MB
Maximum heap size = 2048 MB


Planning to go with http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api
 
Renjith Panikar
Ranch Hand
Posts: 30
Spring Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What is relId of a sheet?
I couldn't understand from apache doc.
I tried with both sheetName & Sheet number. Both dint worked out.

--apache doc
public java.io.InputStream getSheet(java.lang.String relId)
throws java.io.IOException,
InvalidFormatException
Returns an InputStream to read the contents of the specified Sheet.
Parameters:
relId - The relationId of the sheet, from a r:id on the workbook
 
Ulf Dittmer
Rancher
Posts: 43011
76
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Not sure. Maybe open the file manually (xlsx files are just zip files containing various XML documents) and see if somewhere in there you find a sheet attribute that looks like "r:id". (And, of course, see what that value is in various spreadsheets so as to figure out the logic behind it.)
 
Renjith Panikar
Ranch Hand
Posts: 30
Spring Java
  • Likes 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Sorry for the very very late update on this thread, but better late than never.
I have resolved this issue. Hope if i share it, it will be helpful for some one else.

Thanks Ulf Dittmer, followign reply helped me to resolve it

Maybe open the file manually (xlsx files are just zip files containing various XML documents) and see if somewhere in there you find a sheet attribute that looks like "r:id"



In detail,

Rename the .xlsx file as .zip. It will give you a zip file. Open it and you will see a folder called "xl". Inside "xl" folder concentrate on the following things.

worksheets folder : contains the xml representation of each sheet.
sharedStrings.xml : Common place for storing all the string values across sheets.
styles.xml : common place for holding the styles(color, formula...) applied to cells.
workbook.xml : Contains the information about all sheets in the .xlsx file


Now start the journey from workbook.xm. Inside this file all sheets are covered in <sheets> tag.
sheets are represented as <sheet> Eg: <sheet name="General" sheetId="15" r:id="rId1"/>
Take r:id, that will help you to access the General.xml which represents the General sheet of .xlsx file.

Once you have r:id, http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api will help you to read the XML file and get its contents.

SharedStringsTable sharedStringsTable = reader.getSharedStringsTable(); // using this we can access sharedStrings.xml
StylesTable stylesTable = reader.getStylesTable(); // using this we can access styles.xml

Since we are streaming the XML, we dont have to worry about memory consumption.

 
Ulf Dittmer
Rancher
Posts: 43011
76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks for reporting back how the issue was solved; that should help others.
 
And inside of my fortune cookie was this tiny ad:
Java file APIs (DOC, XLS, PDF, and many more)
https://products.aspose.com/total/java
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!