Serializing a large array to disk

 
esteban gomez
Greenhorn
Posts: 3
Hi,

What is the most space-efficient way to save a three-dimensional array of floats to disk?

I ran a test using serialization, and my test case generates a 74 MB file. That size is not acceptable for me. Is there an alternative way to save the array data?

I then compressed the file manually with WinRAR and got a 15 MB file, but I'm looking to do this from Java.

The array dimensions are [61][366][722].
The code used was:

Thanks.

Regards.

Esteban.
 
manoj r patil
Ranch Hand
Posts: 182
Maybe you can customize serialization by always zipping the resulting file with the Java API, and doing the reverse when deserializing it.
 
Rob Spoor
Sheriff
Posts: 20893
Each float takes 4 bytes. That's a total of 61 * 366 * 722 * 4 = 64,477,488 bytes, or about 62,966 kB, or roughly 61.5 MB for the floats alone. On top of that comes the serialization overhead for the array objects etc. So without compression, you can't get the file below the size the array data itself requires.
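As a quick check of that arithmetic, here is a minimal sketch. The class and method names are made up for illustration; only the dimensions come from this thread.

```java
public class ArraySizeEstimate {
    // Raw payload in bytes for a float array with the given dimensions.
    // The cast to long avoids int overflow for larger arrays.
    public static long rawBytes(int d1, int d2, int d3) {
        return (long) d1 * d2 * d3 * Float.BYTES; // Float.BYTES is 4
    }

    public static void main(String[] args) {
        long bytes = rawBytes(61, 366, 722);
        System.out.println(bytes + " bytes");
        System.out.println(bytes / 1024 + " kB (rounded down)");
        System.out.println(bytes / (1024 * 1024) + " MB (rounded down)");
    }
}
```

This only counts the float values themselves; the serialized file will be somewhat larger because of stream headers and per-array metadata.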
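By compressing the stream as it is written (as manoj suggested), you can skip the intermediate 62 MB file, though, for example by wrapping the ObjectOutputStream in a GZIPOutputStream when writing and the ObjectInputStream in a GZIPInputStream when reading.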
With all zeros, it takes just a fraction longer but decreases your size from 64,701,414 bytes to just 106,964 bytes - that is from about 62 MB to just over 104 kB! However, some tests have shown that the savings can be much smaller, depending on the contents. After filling each element with random data, my GZIPped file was still 57,978,208 bytes.
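For reference, a minimal sketch of compressing during serialization could look like the following. This is not the code from the posts above; the class name `GzipArrayIO`, the method names, and the temp file are assumptions for illustration, and the small array is only a stand-in for the real [61][366][722] data.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipArrayIO {
    // Serialize the array straight into a GZIP stream, so no
    // uncompressed intermediate file is ever written.
    public static void write(float[][][] data, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new GZIPOutputStream(new BufferedOutputStream(new FileOutputStream(file))))) {
            out.writeObject(data);
        } // closing the outer stream also finishes the GZIP stream
    }

    // Reverse of write(): decompress while deserializing.
    public static float[][][] read(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new BufferedInputStream(new FileInputStream(file))))) {
            return (float[][][]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        float[][][] data = new float[4][5][6]; // small stand-in array, mostly zeros
        data[1][2][3] = 1.5f;

        File file = File.createTempFile("array", ".ser.gz"); // hypothetical file name
        file.deleteOnExit();

        write(data, file);
        float[][][] copy = read(file);

        System.out.println("round trip ok: " + Arrays.deepEquals(data, copy));
        System.out.println(file.length() + " bytes on disk");
    }
}
```

How much this saves depends entirely on the data: repetitive values compress well, while random-looking floats barely shrink, as the numbers above show.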
 