posted 15 years ago
I have executed the following experiment:
- created several text files with English, Hungarian, Chinese, Japanese and
Korean names
- attempted to compress them using FilZip, WinZip and PKZip
- attempted to uncompress them using the above tools
My findings are:
- FilZip and WinZip cannot add files whose names contain non-English
characters (not even Hungarian, which uses Latin characters); they also cannot
list such files
- PKZip can add files with any names, but the names are transformed: all
non-Western European accented Latin characters are converted to the similar
character without an accent (e.g. ű->u, ő->o) and all non-Latin characters are
converted to question marks; NOTE: Accented Western European characters are
preserved (e.g. áéíóöúüñ), thus Spanish is supported
- WinZip cannot list non-Western European file names, but can extract the
files when "Extract all" is selected; non-Latin characters are then replaced
with an underscore (_). Since all non-Western European Latin characters were
already converted to non-accented ones during compression, these files are
listed and extracted, but without accents.
- FilZip and PKZip can display and extract all files but with transformation;
see above
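
The renaming described above can be imitated in a few lines. This is only a
rough model of what was observed, not PKZip's actual code: the function name
approximate_pkzip_name is made up for illustration, and Latin-1 is assumed as
a stand-in for "Western European", since the code page PKZip really uses was
not determined.

```python
import unicodedata


def approximate_pkzip_name(name: str) -> str:
    """Hypothetical model of the observed renaming (assumption, not PKZip code)."""
    out = []
    for ch in name:
        try:
            # Assumption: anything encodable in Latin-1 is kept unchanged.
            ch.encode("latin-1")
            out.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        # The first code point of the canonical decomposition is the base letter.
        base = unicodedata.normalize("NFD", ch)[0]
        if base.isascii():
            out.append(base)   # accented Latin letter: keep the bare letter
        else:
            out.append("?")    # non-Latin character: question mark
    return "".join(out)


if __name__ == "__main__":
    for sample in ["niño.txt", "tűző.txt", "日本語.txt"]:
        print(sample, "->", approximate_pkzip_name(sample))
```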
Summary: The ZIP format does not support Unicode in filenames. It might be
possible to pick one specific code page/character set that would be usable for
a specific language, but it is not known how, as the tested tools do not
provide control for this.
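
To get a feeling for which single code page could carry a given name, one can
simply try to encode the name with a few candidates. The sketch below is an
illustration only: the candidate list and the function name codecs_that_fit
are assumptions, and it says nothing about which code page a particular zip
tool will actually use.

```python
# Candidate code pages to try; this list is an arbitrary assumption.
CANDIDATE_CODECS = ["cp437", "latin-1", "iso8859-2", "shift_jis", "gbk", "euc_kr"]


def codecs_that_fit(name, codecs=CANDIDATE_CODECS):
    """Return the codecs from the list that can encode every character of name."""
    fits = []
    for codec in codecs:
        try:
            name.encode(codec)
        except UnicodeEncodeError:
            continue  # at least one character has no slot in this code page
        fits.append(codec)
    return fits


if __name__ == "__main__":
    for sample in ["niño.txt", "tűző.txt", "日本語.txt"]:
        print(sample, codecs_that_fit(sample))
```

A name that fits no single candidate (for example one mixing Hungarian and
Japanese characters) returns an empty list, which is the situation the summary
describes.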
Solution: No real solution. As a workaround, Spanish text should be used with
all accented characters replaced with their non-accented counterparts (ú->u,
ó->o, etc.), or files should be compressed using the ISO8859P1 character set
for filenames.
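
The first workaround (dropping the accents before the file goes into the
archive) could look roughly like this. It is a minimal sketch using Python's
standard zipfile module rather than any of the tools tested above; the helper
names strip_accents and add_with_plain_name are made up for illustration.

```python
import io
import unicodedata
import zipfile


def strip_accents(name):
    """Replace accented Latin letters with their non-accented base letter."""
    decomposed = unicodedata.normalize("NFD", name)
    # Drop the combining marks that the decomposition split off.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def add_with_plain_name(archive, name, data):
    """Store data under an ASCII-only version of name and return that name."""
    plain = strip_accents(name)
    # Raises UnicodeEncodeError if non-Latin characters are still present.
    plain.encode("ascii")
    archive.writestr(plain, data)
    return plain


if __name__ == "__main__":
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        add_with_plain_name(archive, "canción.txt", b"hola")
    print(zipfile.ZipFile(buffer).namelist())
```

Names that still contain non-Latin characters after stripping are rejected
rather than silently mangled, so the caller has to decide how to rename them.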
Note: PKZip is one of the first zip utilities for Windows; WinZip is the market
leader. If they cannot support Unicode, how could we?