Unicode Char Problem

 
Hemali da
Greenhorn
Posts: 15
I have an XML file that contains the é character. When I stream this file to a File, content is lost, so my XML file becomes invalid.

I have an InputStream available to me:

File f = new File("fos.xml");
FileOutputStream fos = new FileOutputStream(f);

InputStreamReader br = new InputStreamReader(returnStr, "UTF-8");
StringBuffer documentContent = new StringBuffer();
int len;
byte[] buf = new byte[1024];
while ((len = returnStr.read(buf)) != -1) {
    // documentContent.append(line);
    fos.write(buf, 0, len);
}

IDocument = IDocumentBuilder.parse(f);

where returnStr is my InputStream.

Please help me.
 
James Sabre
Ranch Hand
Posts: 781
The chances are that your XML file is not UTF-8 encoded. I don't know anything about IDocumentBuilder, but I bet it has a method that takes an InputStream. If so, you don't need to create a String from the InputStream; just feed it the InputStream and let it (the IDocumentBuilder) work out the correct character encoding from the <?xml version="1.0" encoding="some encoding"?> prefix.

Note: if there is no encoding="some encoding" in the prefix, then UTF-8 is assumed. You have to make sure that the file's actual encoding matches the one specified in the prefix, and, if none is specified, that the file is UTF-8 encoded.
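As a rough sketch of handing the parser the raw stream: the snippet below assumes the standard javax.xml.parsers.DocumentBuilder as a stand-in for IDocumentBuilder (whose API is not shown in this thread). The class name ParseFromStream and the sample XML are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseFromStream {

    // Parses XML straight from the raw bytes. No Reader or String is created,
    // so the parser itself picks the charset from the XML declaration.
    // Assumption: the JDK DocumentBuilder stands in for IDocumentBuilder.
    static Document parse(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample: a document declared and encoded as ISO-8859-1.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        Document doc = parse(new ByteArrayInputStream(bytes));
        String text = doc.getDocumentElement().getTextContent();

        // Print the code point of the last character instead of the character
        // itself, so the result does not depend on the console encoding.
        System.out.println(Integer.toHexString(text.charAt(3))); // prints e9
    }
}
```

The point of the design is that the bytes are never decoded by hand, so there is no place to pick the wrong charset; this only works if the declaration in the file matches the bytes actually in it.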
 
Hemali da
Greenhorn
Posts: 15
My XML file starts with

<?xml version="1.0" encoding="ISO-8859-1"?>

and I have tried passing the InputStream directly, but I am facing the same problem.
 
Raj S Kumar
Ranch Hand
Posts: 48
Please try declaring UTF-8 in your XML file.

Also, please ensure that the InputStream 'returnStr' is actually UTF-8 encoded to begin with.
 
Hemali da
Greenhorn
Posts: 15
I can't change the XML that way.
 
James Sabre
Ranch Hand
Posts: 781

Hemali da wrote: My XML file starts with

<?xml version="1.0" encoding="ISO-8859-1"?>

and I have tried passing the InputStream directly, but I am facing the same problem.



Since your encoding is ISO-8859-1, you should use that charset rather than "UTF-8" when decoding the stream, though I'm surprised that the parser cannot utilise the encoding specified in the prefix.

Are you sure that the file actually is ISO-8859-1 encoded? In the past I have been sent XML files that purport to be in one encoding (as specified by the encoding="xxxx") but are actually in a different encoding.
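One way to check what the bytes really are: é is the single byte 0xE9 in ISO-8859-1 but the two bytes 0xC3 0xA9 in UTF-8, so a strict UTF-8 decode will reject genuine ISO-8859-1 text that contains é. The sketch below is only a heuristic (every byte sequence is valid ISO-8859-1, so it can rule out UTF-8 but cannot prove ISO-8859-1), and the class and method names are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Returns true if the bytes decode as strict UTF-8 without any errors.
    // REPORT makes the decoder throw instead of silently substituting U+FFFD.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The same text encoded two ways; only the bytes for the accented letter differ.
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        System.out.println(isValidUtf8(latin1)); // prints false
        System.out.println(isValidUtf8(utf8));   // prints true
    }
}
```

If a file declared as ISO-8859-1 contains non-ASCII bytes and still passes this check, it is very likely UTF-8 that has been mislabelled.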
 
Raj S Kumar
Ranch Hand
Posts: 48
Hemali, please correct me if I am wrong.

In the code, you are trying to write the InputStream 'returnStr' to fos.xml. Is that correct?

Here is an example that reads from one file and writes to another. Specify the charset for both the input and output streams.
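A minimal sketch of that idea, not the poster's original code: the reader and the writer are each given an explicit charset, so nothing depends on the platform default. The class name CopyWithCharset is hypothetical, and in-memory streams stand in for the two files so the snippet runs on its own.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CopyWithCharset {

    // Decodes the input with inCs and encodes the output with outCs.
    // try-with-resources flushes and closes both streams when the copy ends.
    static void copy(InputStream in, Charset inCs, OutputStream out, Charset outCs)
            throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(in, inCs));
             Writer writer = new BufferedWriter(new OutputStreamWriter(out, outCs))) {
            char[] buf = new char[1024];
            int len;
            while ((len = reader.read(buf)) != -1) {
                writer.write(buf, 0, len);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Same charset on both sides: the bytes come out unchanged.
        ByteArrayOutputStream same = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(source), StandardCharsets.ISO_8859_1,
             same, StandardCharsets.ISO_8859_1);
        System.out.println(same.size()); // prints 4

        // Different output charset: the accented letter becomes two bytes in UTF-8.
        ByteArrayOutputStream transcoded = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(source), StandardCharsets.ISO_8859_1,
             transcoded, StandardCharsets.UTF_8);
        System.out.println(transcoded.size()); // prints 5
    }
}
```

For real files, FileInputStream and FileOutputStream would take the place of the in-memory streams. If the output charset differs from the one named in the XML declaration, the declaration has to be updated too, otherwise the copy is mislabelled.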
