
accentuation in zip files

 
Edipo Faria
Greenhorn
Posts: 4

Hi,

My problem is with accented characters in the names of files and folders inside a zip archive.

In my research I found "java.util.zip.ZipEntry", but it has a bug that does not allow accented characters in entry names.
That bug will be fixed in JDK 7, but I do not have time to wait.

Now I am trying to use the Apache API, specifically: http://commons.apache.org/compress/

It has a method that lets us set the encoding, but I have not had success with it.

Here is an example of what I am trying to do:
http://pastebin.com/A9TpuuY7

The files:
"abc/êêñ.html";
"ççããââ.html";

will result in:
"abc/ÛÛ±.html"
"þþÒÒÔÔ.html"

Any suggestions?
 
Paul Clapham
Sheriff
Posts: 22841
And does that API use the encoding to encode the names of the files?

The problem here is that the zip format was designed back in the days when file names could only be ASCII. So when files with non-ASCII names started to show up, various systems used various ways to deal with that, and those various ways were not all consistent with each other. So that caused problems like what you're seeing there.

If that API does in fact use the encoding to encode the names of the files, then make sure that the encoding you choose is the same one which is going to be used by whoever tries to unzip the archive.
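The garbled names in the first post can be reproduced without any zip library. The sketch below is only an illustration under an assumption the thread does not confirm: that the names were written as windows-1252 bytes (ISO-8859-1 gives the same bytes for these characters) and then read back as code page 850. The class and method names are made up for this example, and the "IBM850" charset has to be available in the JDK in use.

```java
import java.nio.charset.Charset;

public class ZipNameMojibake {
    // Encode a name with one charset and decode the same bytes with another,
    // which is what happens when the zip writer and the unzip tool disagree.
    static String misread(String name, String writtenAs, String readAs) {
        byte[] raw = name.getBytes(Charset.forName(writtenAs));
        return new String(raw, Charset.forName(readAs));
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of the compiler's file encoding.
        String first = "abc/\u00EA\u00EA\u00F1.html";                 // abc/êêñ.html
        String second = "\u00E7\u00E7\u00E3\u00E3\u00E2\u00E2.html";  // ççããââ.html

        // Assumption: written as windows-1252, read as code page 850.
        System.out.println(misread(first, "windows-1252", "IBM850"));
        System.out.println(misread(second, "windows-1252", "IBM850"));
    }
}
```

Under that assumption the two results are the same strings reported in the first post, which suggests the unzip side is decoding the names with a DOS code page.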
 
Edipo Faria
Greenhorn
Posts: 4
Probably yes.

When I change the string passed to the method setEncoding(), the characters in the zip file change, but none of the strings on the list of possible values produced what I expected.

My files will be used on Windows XP SP2.

I do not have to use this API; any library that does what I need is fine.
 
Paul Clapham
Sheriff
Posts: 22841
Edipo Faria wrote: probably yes


Let me ask my question differently, then.

Does the documentation for that method specifically say that the encoding will be used to encode file names? (Note that this is a Yes/No question.)
 
Edipo Faria
Greenhorn
Posts: 4
Yes. The description of the method says:
"The encoding to use for filenames and the file comment."
 
Paul Clapham
Sheriff
Posts: 22841
Okay, then. I just noticed that you never said which encodings you tried, which is the most important question. Based on the Supported Encodings list, and assuming that the Windows environment in question is a Western European environment (not Japanese or Cyrillic or something), I would have tried CP1252.
 
Edipo Faria
Greenhorn
Posts: 4
Yes!
I got it!
Thanks very much!

The encodings that worked were "Cp850" (MS-DOS Latin-1) and "Cp858" (variant of Cp850 with the Euro character).

I had to try every encoding on the list.

With a lot of patience, I found these two.
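For readers on JDK 7 or later, here is a minimal sketch of the same idea using only the standard library. It is not the Commons Compress code from the thread: it uses the `ZipOutputStream(OutputStream, Charset)` and `ZipInputStream(InputStream, Charset)` constructors added in JDK 7, and the class and method names are made up for this example. It writes two empty entries with Cp850 names and reads the names back with the same charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipNameEncoding {
    // Write empty entries whose names are encoded with the given charset.
    static byte[] writeZip(List<String> names, Charset nameCharset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, nameCharset)) {
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read the entry names back, decoding them with the given charset.
    static List<String> readNames(byte[] zipBytes, Charset nameCharset) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes), nameCharset)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        Charset cp850 = Charset.forName("Cp850"); // the encoding that worked in this thread
        List<String> names = Arrays.asList(
            "abc/\u00EA\u00EA\u00F1.html",                  // abc/êêñ.html
            "\u00E7\u00E7\u00E3\u00E3\u00E2\u00E2.html");   // ççããââ.html
        byte[] zip = writeZip(names, cp850);
        System.out.println(readNames(zip, cp850).equals(names));
    }
}
```

With Commons Compress, the setEncoding() call discussed above plays the role of the charset argument here. Either way, the names only survive if the reader decodes them with the same encoding the writer used.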
 