
Encoding problem

 
Manuraj Kannanth
Greenhorn
Posts: 2
I have a file whose contents are written in French. I want to read the file, edit it, and write the contents back to the same file. How do I do that?

The content I read from the file is always an instance of ByteBlock. I then use:
byte[] blk = ((ByteBlock) content).bytes();
FileOutputStream fs = new FileOutputStream("a.txt");
fs.write (blk, 0, blk.length);
fs.close();
When writing the file I want to use an encoding such as ISO-8859-1. I can't use a Writer because my content is a byte stream. Likewise, I can't use BufferedOutputStream or DataOutputStream because they don't support encodings. How do I solve this? Please help me.
 
Joe Ess
Bartender
Posts: 9436
If you aren't translating the bytes into characters, there's no encoding involved. You should have a look at this article for the basics of working with Unicode, and at the I/O chapter of The Java Tutorial for how to use I/O in Java.
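A minimal sketch of that point, assuming the bytes in the block are already ISO-8859-1: written through a plain byte-oriented API, they come back unchanged, so no encoding needs to be named. The class name RawByteCopy, the method roundTrip, and the temporary file are made up for this illustration; ByteBlock from the original post is not used because its API is not shown in the thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of the original poster's code.
class RawByteCopy {

    // Writes the bytes exactly as given, then reads them back.
    // No charset is involved at any point.
    static byte[] roundTrip(Path file, byte[] blk) throws IOException {
        Files.write(file, blk);
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        // Sample data (assumption): "materiel" with e-acute, encoded as
        // ISO-8859-1, where e-acute is the single byte 0xE9.
        byte[] blk = "mat\u00e9riel".getBytes(StandardCharsets.ISO_8859_1);

        Path file = Files.createTempFile("raw-copy", ".txt");
        try {
            byte[] back = roundTrip(file, blk);
            // true when the file holds exactly the bytes that were written
            System.out.println(Arrays.equals(blk, back));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the bytes have to be edited as text first, they need to be decoded with the charset they were written in and encoded again with that same charset before being written.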
 
Manuraj Kannanth
Greenhorn
Posts: 2
Hi,
I converted my byte array to a character array. Then I wrote:
char[] c = new char[blk.length];
for (int i = 0; i < blk.length; i++) {
    c[i] = (char) blk[i];
}
FileOutputStream fos = new FileOutputStream("a.txt");
OutputStreamWriter osw = new OutputStreamWriter(fos, "ISO-8859-1");
BufferedWriter bw = new BufferedWriter(osw);
bw.write(c, 0, c.length);
bw.close();

Still, I am not getting the correct French words. The output is mat?riel instead of matériel. Please help me with this.
 
Joe Ess
Bartender
Posts: 9436
Originally posted by Manuraj Kannanth:

I converted my byte array to character array.


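And there lies the problem. Casting a byte to a char does not decode it with any character encoding. Java bytes are signed, so a byte above 0x7F (such as 0xE9, which is é in ISO-8859-1) is negative, and the cast turns it into a character like U+FFE9 rather than U+00E9. ISO-8859-1 has no byte for that character, so the writer outputs a "?".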
Why did you change to using a writer rather than an output stream? If the byte array is properly encoded it should just work.
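A small sketch of what the cast in the posted loop does to a non-ASCII byte, compared with decoding the bytes through an explicit charset. It assumes the input bytes are ISO-8859-1; the class and method names (DecodeBytes, castByte, maskByte, decodeLatin1, encodeLatin1) are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original poster's code.
class DecodeBytes {

    // The cast from the posted loop. A Java byte is signed, so 0xE9 is -23,
    // and widening -23 before narrowing to char gives U+FFE9, not U+00E9.
    static char castByte(byte b) {
        return (char) b;
    }

    // Masking keeps only the low 8 bits. For ISO-8859-1 that value is also
    // the Unicode code point, so this works for that charset only.
    static char maskByte(byte b) {
        return (char) (b & 0xFF);
    }

    // Decoding with an explicit charset; works for any supported charset.
    static String decodeLatin1(byte[] blk) {
        return new String(blk, StandardCharsets.ISO_8859_1);
    }

    // Encodes one char as ISO-8859-1. Characters the charset cannot
    // represent are replaced with '?' by String.getBytes(Charset).
    static byte encodeLatin1(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.ISO_8859_1)[0];
    }

    public static void main(String[] args) {
        // Sample data (assumption): "mat" followed by e-acute in ISO-8859-1.
        byte[] blk = {(byte) 0x6D, (byte) 0x61, (byte) 0x74, (byte) 0xE9};

        System.out.println(Integer.toHexString(castByte(blk[3]))); // ffe9
        System.out.println(Integer.toHexString(maskByte(blk[3]))); // e9
        System.out.println((char) encodeLatin1(castByte(blk[3]))); // ?

        // Decode, edit as text, then encode with the same charset.
        String edited = decodeLatin1(blk) + "riel";
        byte[] out = edited.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(out.length); // 8
    }
}
```

The resulting byte array can then go to a FileOutputStream as in the first post, with no Writer in between.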
Are you certain that the application you are using to view the output file supports Unicode? As Joel's Unicode article I linked to earlier points out, there are many stupid applications out there that don't support Unicode properly.
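One way to rule the viewer in or out is to look at the bytes in the output file directly. The sketch below prints a file as hex; the class name HexDump and the method toHex are made up for this illustration, and a.txt is the file name used earlier in the thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for inspecting a file independently of any viewer.
class HexDump {

    // Formats each byte as two lowercase hex digits, separated by spaces.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to avoid sign extension when the byte is widened to int.
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name from the thread; pass another path as
        // the first argument to inspect a different file.
        Path file = Paths.get(args.length > 0 ? args[0] : "a.txt");
        System.out.println(toHex(Files.readAllBytes(file)));
    }
}
```

In the dump, an é stored as ISO-8859-1 appears as e9 and one stored as UTF-8 appears as c3 a9. A 3f in that position means a literal "?" was written to the file, so the viewer is not at fault.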
 