problem while unzipping the non-ascii characters

 
anil prakash
Greenhorn
Posts: 3
We are not able to retain Greek characters after zipping files that contain them. The following code shows how we zip the files on the Solaris platform.
ZipOutputStream out = null;
try
{
    FileOutputStream f = new FileOutputStream(zipFileName);
    out = new ZipOutputStream(new BufferedOutputStream(f));

    for (int i = 0; i < fileNameLists.getNumElements(); i++)
    {
        BufferedReader in = new BufferedReader(new FileReader(fileNameLists.getNameAt(i)));
        out.putNextEntry(new ZipEntry(getFileName(fileNameLists.getNameAt(i))));
        int c;
        while ((c = in.read()) != -1)
        {
            out.write(c);
        }
        in.close();
    }
    out.close();
If we open a file directly from the location where it is stored, the Greek (non-ASCII) characters display correctly, but our application needs to zip those files and save them on a Windows platform.
Once we zip the files, download them to Windows, and extract them, we see garbage characters instead of the Greek characters (perhaps because Windows uses its default character encoding, Cp1252).

We tried calling setEncoding("iso-8859-7") and setEncoding("UTF-8") on the ZipOutputStream in several ways, but it made no difference.

Is it because the zip utility uses the platform's default character encoding when reading?
We would be thankful if anyone could provide a solution for this.
 
Ulf Dittmer
Rancher
Posts: 42968
73
There are many bugs open in the Java bug database concerning zip files having non-ASCII characters in the file names.

I haven't heard of problems with such characters in the file contents, though. That would be a very serious bug.
 
Vlado Zajac
Ranch Hand
Posts: 245
Don't use FileReader (or any Reader or Writer at all); use FileInputStream and BufferedInputStream instead.

Mixing a Reader and an OutputStream without proper conversion is a bad thing, because Readers (and Writers) work with characters (Unicode) while Streams work with bytes.
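The byte-for-byte approach could look roughly like the sketch below. The class name ByteZip and the method zipFiles are hypothetical, and the entry name is simply the file's last path component (standing in for the poster's getFileName helper, which is not shown). It uses try-with-resources, so it assumes Java 7 or later.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ByteZip {
    // Hypothetical helper: copies each file into the zip as raw bytes,
    // so no character decoding or encoding ever takes place.
    public static void zipFiles(String zipFileName, String[] fileNames) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(
                new BufferedOutputStream(new FileOutputStream(zipFileName)))) {
            byte[] buffer = new byte[8192];
            for (String name : fileNames) {
                // Assumption: the entry name is the file name without its directory.
                out.putNextEntry(new ZipEntry(new File(name).getName()));
                try (InputStream in = new BufferedInputStream(new FileInputStream(name))) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
                out.closeEntry();
            }
        }
    }
}
```

Because the bytes are never turned into chars, the extracted file should be identical to the original, whatever encoding it was saved in. Whether it then displays correctly on Windows depends on the viewer being told the right encoding.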
[ April 26, 2006: Message edited by: Vlado Zajac ]
 
Edwin Dalorzo
Ranch Hand
Posts: 961

"Mixing Reader and OutputStream without proper conversion is a bad thing because Readers (and Writers) use characters (unicode) and Streams use bytes."


Actually, the encoding a reader uses by default depends on the encoding set in the system property "file.encoding". It is also possible to configure a reader or writer to use another encoding by means of the InputStreamReader and OutputStreamWriter constructors.

The problem actually looks like an encoding issue.

When saving characters by means of a Writer, use the OutputStreamWriter constructor that lets you specify the encoding. When reading the files back, use the InputStreamReader constructor with the same encoding.

Something like this for writing the file:
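A minimal sketch, assuming a hypothetical helper writeText and an encoding name chosen by the caller (for Greek text, "ISO-8859-7" or "UTF-8" would be typical choices):

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class EncodedWrite {
    // Hypothetical helper: writes text using an explicitly named encoding
    // instead of the platform default taken from "file.encoding".
    public static void writeText(String fileName, String text, String charsetName)
            throws IOException {
        try (Writer w = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(fileName), charsetName))) {
            w.write(text);
        }
    }
}
```

A call such as EncodedWrite.writeText("greek.txt", "\u03b1\u03b2\u03b3", "UTF-8") would store the three Greek letters as UTF-8 bytes regardless of the platform the code runs on.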


And something like this to read it back:
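Again only a sketch, with a hypothetical helper readText. The encoding name passed in is assumed to be the same one that was used when the file was written:

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

public class EncodedRead {
    // Hypothetical helper: decodes the file's bytes with an explicitly named
    // encoding and returns the whole content as a String.
    public static String readText(String fileName, String charsetName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), charsetName))) {
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

If the encoding named here differs from the one the file was written with, the decoded characters will generally be wrong, which matches the garbage described in the question.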


And I guess the zipping and unzipping should not be an issue.
[ April 26, 2006: Message edited by: Edwin Dalorzo ]
 