• Post Reply Bookmark Topic Watch Topic
  • New Topic

Excel is getting corrupt : Apache poi version poi-3.9  RSS feed

 
RatiKanta pal
Ranch Hand
Posts: 88
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi all,
Please find the following snip-net of code.The issue is when i am writing data 1 Million records with 60 column its working fine the moment i am double the columns to 120 excel file is getting corrupt.What I am doing wrong here.

I have also attached the screen sort of corrupt message.




excel_failed.jpg
[Thumbnail for excel_failed.jpg]
 
Ivan Jozsef Balazs
Rancher
Posts: 999
5
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Did you take a look at the generated file? Maybe you hit some limits of the format?
 
RatiKanta pal
Ranch Hand
Posts: 88
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,
If I am clicking yes on recover mode.Its getting open and i am also able to add more rows and column.Limit may be not the issue.
 
Ulf Dittmer
Rancher
Posts: 42972
73
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Not sure what's going on, but spreadsheets apps and their associated file formats are for human use - no human can (or wants) to work with 1 million rows of data. So I'd suggest to rethink what you're using the file for, and maybe come up with other strategies of storing and processing this data.
 
RatiKanta pal
Ranch Hand
Posts: 88
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,
We are developing a reporting tool.More than one million records are expected not all case but in some case.

I read somewhere instead of writing excel write its internal xml representation for solving this problem,But still same problem.

 
Ulf Dittmer
Rancher
Posts: 42972
73
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
More than one million records are expected

That's what I'd call a failed user design. Nobody can or wants to go through that much data - if you need statistics on it, generate them by the same process that generates the data now. But don't force people to handle all of it, that's what computers are for.
 
Campbell Ritchie
Marshal
Posts: 56518
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Putting one million records into anything suggests you need to get a database instead.

This is too difficult a question for “beginning” so I shall move it.
 
Ivan Jozsef Balazs
Rancher
Posts: 999
5
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
RatiKanta pal wrote:
I read somewhere instead of writing excel write its internal xml representation


Excel uses either an old style binary file format (giving the name to POI for that matter) which is more prone to restrictions or the newer, open office-inspired XML format. The latter might not have those stricter limitations of the file format, but hey, even if you manage to store millions of rows, the user interface program might choke on it. Rightly so.

You might be able to use that XML format to store the huge amount of data for other programs, but not really for direct human interface presentation. Having said that I won't imply it is a good choice over, say, a database.
 
Dave Tolls
Ranch Foreman
Posts: 3056
37
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
RatiKanta pal wrote:Hi,
We are developing a reporting tool.More than one million records are expected not all case but in some case.


That's not a report, that's a monster!

FrankenReport!

Where's my pitchfork and burning torch?
 
RatiKanta pal
Ranch Hand
Posts: 88
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi All,
Try to understand the scenario.Its banking data.We have some yearly reports to generate. Million records are normal guys.

In short we need to catch the monster .
 
Ulf Dittmer
Rancher
Posts: 42972
73
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Having a million records is normal. That's actually not a large data set. What's not normal is wanting to put them in Excel files, or presenting them to humans.
 
Ivan Jozsef Balazs
Rancher
Posts: 999
5
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The problem is not that amount of the data but the intention to put them into Excel format.
 
RatiKanta pal
Ranch Hand
Posts: 88
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Ulf Dittmer wrote:Having a million records is normal. That's actually not a large data set. What's not normal is wanting to put them in Excel files, or presenting them to humans.


Is there any way to do this.
 
Dave Tolls
Ranch Foreman
Posts: 3056
37
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
RatiKanta pal wrote:
Ulf Dittmer wrote:Having a million records is normal. That's actually not a large data set. What's not normal is wanting to put them in Excel files, or presenting them to humans.


Is there any way to do this.


Decide what the report is supposed to show and then put that summary data into the Excel sheet.
That data can be put together via queries on your database.

You're not alone here. I've seen too many places where the solution to their reporting was "stick it all in Excel and get Excel to produce the summary". And it can be a slog to convince them that that is madness.

If it's a banking system then there is a good database there that you are already drawing this data from.

In fact, I'd go down the "security of confidential information" route, as this is bank-related. IMO for reporting you should not be getting more data out of the database than is needed in the report. And you definitely should not be sticking tons of potentially sensitive data into Excel if you can possibly avoid it (and you always can).
 
RatiKanta pal
Ranch Hand
Posts: 88
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Dear All,
I ts being quiet frustrating .. can any one help me please!
 
Tony Docherty
Bartender
Posts: 3271
82
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
RatiKanta pal wrote:Dear All,
I ts being quiet frustrating .. can any one help me please!

People are trying to help you but you are ignoring their advice.
 
Steve Fahlbusch
Bartender
Posts: 612
7
Mac OS X Python
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Greetings,

I think it is time to step back and get some perspective.

What the great people have been trying to tell you is that your approach may not be the best.

You are attempting to do something that excel and poi were really not meant to do. But you are more than welcome to do so.....

So if you really want to do this in excel i see two possible ways to complete this, with three options.

From you, when you doubled the columns the build of the spreadsheet exceptioned out, so make 2 smaller spreadsheets and combine them. If you don't like that then get the poi code and update it so that can handle your data requirement ( but that of course will mean that future updates of poi will not contain your updates) so you could join the poi group and help everyone that uses poi.


Or you could do the reasonable thing and use a database to manage the detail records and export summary or summed records to excel (if necessary) or you could always export this to an access database (which holds up to 2 billion records) and produce better reports and charts.

hope this helps,
-steve
 
Dave Tolls
Ranch Foreman
Posts: 3056
37
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Now, I've never used POI for quite this size of data before, but why are you saving the sheet(s) into temp files?
You already have them in a workbook.
And then you're mucking about with the XML, "by hand", which I would expect to be a source of bugs.
I wouldn't be at all surprised if there's something in that SpreadsheetWriter that is breaking things for the larger data.

I'm not sure what exactly this code is trying to achieve, but from a brief look it would worry me a bit. Then again I am slightly paranoid about these things.

By the way:

Why are there two writers, 'out' and '_out'?
 
Tony Docherty
Bartender
Posts: 3271
82
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Dave Tolls wrote:By the way:

Why are there two writers, 'out' and '_out'?

That could be down to the forum's code box as it occasionally leaves out underscores - it's a strange bug that is known about, hopefully there will be a fix soon.
 
Ulf Dittmer
Rancher
Posts: 42972
73
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Yes, I'm seeing


But I concur that no matter what else is done, mucking around with a structured file format on this low a level seems a bad idea.
 
Dave Tolls
Ranch Foreman
Posts: 3056
37
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Tony Docherty wrote:
Dave Tolls wrote:By the way:

Why are there two writers, 'out' and '_out'?

That could be down to the forum's code box as it occasionally leaves out underscores - it's a strange bug that is known about, hopefully there will be a fix soon.


Ah, that makes sense because otherwise this looks like it wouldn't compile.
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!