
Corrupt file name after compression.

 
pawan chopra
Ranch Hand
Posts: 419
Mac jQuery Objective C
I am using the following code to zip a file whose name is shown in the attachment. The file name consists of Spanish characters. After compression the file name is different, and some characters are shown as +-, as shown in the attachment. Can anybody tell me how to resolve it?

[Attachment: zip-pic.JPG (file names)]
 
Campbell Ritchie
Marshal
Posts: 74004
332
Don't know. But it may be worthwhile unzipping the file and seeing whether it is restored correctly. Some characters don't display correctly on screen; Windows seems to be worse for that than other operating systems.
 
pawan chopra
Ranch Hand
Posts: 419
Mac jQuery Objective C
I have tried unzipping them, but the names are not restored correctly. I can see the correct file name when the file is not compressed; I don't know why this happens only after compression.
 
Campbell Ritchie
Marshal
Posts: 74004
332
Don't know. Sorry.
 
pawan chopra
Ranch Hand
Posts: 419
Mac jQuery Objective C
I think I have found the problem. I looked at the code in ZipOutputStream: it gets the byte array of the file name and then writes the name of the file. I tried doing the same in the program below. It prints negative values for special characters like é.





Can anyone tell me how I can resolve this?
 
Campbell Ritchie
Marshal
Posts: 74004
332
Bytes run from -128 (0x80) to +127 (0x7f). The characters used in Western European languages other than English are in the range 0x80-0xff, so they are regarded by two's complement as negative numbers. You can find the numbers in Unicode (1) and (2). I note some of those characters in no (2) are control characters.

Not sure what you are supposed to do next, but it has something to do with casting to a char, or casting to a char and doing a bitwise AND (&) with 0xff.
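A minimal sketch of the byte values being discussed, assuming the name is encoded as ISO-8859-1 or UTF-8 (the original program is not shown in the thread, so the class name `ByteSignDemo` and the sample string are made up for illustration). It shows that masking with `& 0xff` recovers the unsigned value. The negative numbers are only how Java prints a signed `byte`; they do not by themselves mean the name was encoded wrongly.

```java
import java.nio.charset.StandardCharsets;

public class ByteSignDemo {
    // Returns the unsigned value (0..255) of a byte by masking off the sign extension.
    static int unsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        // "café", written with an escape so the source file encoding does not matter.
        String name = "caf\u00e9";
        byte[] latin1 = name.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        // Print each ISO-8859-1 byte as a signed value and as unsigned hex.
        for (byte b : latin1) {
            System.out.println(b + " -> 0x" + Integer.toHexString(unsigned(b)));
        }
        // The same string has a different byte length in each encoding,
        // so writer and reader must agree on the charset.
        System.out.println("ISO-8859-1 length: " + latin1.length);
        System.out.println("UTF-8 length: " + utf8.length);
    }
}
```

The point of comparing the two lengths is that é is one byte in ISO-8859-1 but two bytes in UTF-8, so a tool that reads the name with a different charset than the one used to write it will show different characters.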
 
Ulf Dittmer
Rancher
Posts: 43026
76
Not sure if it's related, but there's a longstanding problem with the java.util.zip package in that it doesn't deal well with non-ASCII filenames. See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4244499 for more information.
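For readers on a newer runtime: Java 7 and later add `ZipOutputStream` and `ZipInputStream` constructors that take a `Charset` for entry names. The thread itself does not mention these, so treat the following as a sketch under that assumption rather than the fix the posters used. The class and method names (`ZipNameCharsetDemo`, `zipOneEntry`, `firstEntryName`) and the sample file name are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipNameCharsetDemo {
    // Writes a single entry to an in-memory zip, encoding the entry name with the given charset.
    static byte[] zipOneEntry(String entryName, byte[] content, Charset nameCharset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer, nameCharset)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content);
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the name of the first entry back, decoding it with the given charset.
    static String firstEntryName(byte[] zipBytes, Charset nameCharset) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes), nameCharset)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample name with Spanish characters, escaped so the source encoding does not matter.
        String name = "espa\u00f1ol-\u00e9.txt";
        byte[] zip = zipOneEntry(name, "hola".getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        System.out.println(name.equals(firstEntryName(zip, StandardCharsets.UTF_8)));
    }
}
```

This only shows that Java can round-trip the name when both sides use the same charset. Whether a third-party tool displays the name correctly still depends on which charset that tool assumes.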
 
pawan chopra
Ranch Hand
Posts: 419
Mac jQuery Objective C

Campbell Ritchie wrote:
Not sure what you are supposed to do next, but it has something to do with casting to a char, or casting to a char and doing a bitwise AND (&) with 0xff.



Actually, the Java API does all of this file-name handling internally, so I am not sure how to implement it myself while keeping the rest of the features working the same. Could you suggest a solution?

 
Ulf Dittmer
Rancher
Posts: 43026
76
Have you read the bug report I linked to? Are you certain that that is not the problem?
 
pawan chopra
Ranch Hand
Posts: 419
Mac jQuery Objective C

Ulf Dittmer wrote:Have you read the bug report I linked to? Are you certain that that is not the problem?




Thanks for the link, Ulf. Yes, I am facing the same problem, but I am very surprised that the bug still exists after 7 years; it was reported in 2001. Do you know of any specific reason it has not been fixed?
 
pawan chopra
Ranch Hand
Posts: 419
Mac jQuery Objective C
I have executed the following experiment:
- created several text files with English, Hungarian, Chinese, Japanese and
Korean name
- attempted to compress them using FilZip, WinZip and PKZip
- attempted to uncompress them using the above tools
My findings are:
- FilZip and WinZip cannot add files with non-English-only names (not even
Hungarian which uses Latin characters); they cannot list files
- PKZip can add files with any name, but names are transformed: all
non-Western European accented Latin characters are converted to similar
character without accent (e.g. ű->u, ő->o) and all non-Latin characters are
converted to question marks; NOTE: Accented Western European characters are
preserved (e.g. áéíóöúüñ), thus Spanish is supported
- WinZip cannot list non-Western European file names, but can extract the
files when "Extract all" is selected; but non-Latin characters are replaced
with underscore (_); since all non-Western European Latin characters are
converted to non-accented Western European ones during compression, these files
are listed and extracted but without accents.
- FilZip and PKZip can display and extract all files but with transformation;
see above

Summary: ZIP format does not support Unicode in filenames. It might be possible
to pick one specific code page/character set that would be usable for a
specific language, but it is not known how, as the tested tools do not provide
control for this.

Solution: No real solution. As a workaround, Spanish text should be used with all
accented characters replaced with their non-accented relatives (ú->u, ó->o, etc.), or
files should be compressed using the ISO8859P1 character set for filenames.

Note: PKZip is one of the first zip utilities for Windows; WinZip is the market
leader. If they cannot support Unicode, how could we?
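
The accent-replacement workaround described above (ú->u, ó->o) can be sketched with `java.text.Normalizer`. This is one possible approach, not something the report prescribes, and the class and method names (`AccentStripper`, `stripAccents`) are hypothetical. It decomposes each character into a base letter plus combining marks, then drops the marks. Letters that have no such decomposition (for example ø or ß) pass through unchanged.

```java
import java.text.Normalizer;

public class AccentStripper {
    // Decomposes accented letters (NFD) and removes the combining marks,
    // leaving the unaccented base letters.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "áéíóöúüñ", the sample characters from the report, written as escapes.
        String accented = "\u00e1\u00e9\u00ed\u00f3\u00f6\u00fa\u00fc\u00f1";
        System.out.println(stripAccents(accented));
    }
}
```

The result could be used as the zip entry name in place of the original file name, at the cost of losing the accents when the archive is extracted.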
 