special characters in xml file

 
Bhasker Reddy
Ranch Hand
Posts: 176
These Spanish characters are causing my XML parser to crash:
<TEXT>¿Prefieres tus facturas en español? Llama al 1-866-xxxxx</TEXT>
<TEXT>para más detalles.</TEXT>
They are giving me this error:
org.xml.sax.SAXParseException: Character conversion error: "Unconvertible UTF-8 character beginning with 0xbf" (line number may be too low).
at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
at org.apache.crimson.parser.InputEntity.fillbuf(InputEntity.java:1072)
at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(InputEntity.java:914)
at org.apache.crimson.parser.Parser2.maybeXmlDecl(Parser2.java:1048)
at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:520)
at org.apache.crimson.parser.Parser2.parse(Parser2.java:318)
at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:151)
at edocs.xcc.parser.implementation.ParserImp.parse(ParserImp.java:52)
at com.nortel.b2b.dps.parser.xmlparser.planCreator.create(planCreator.java:31)
at com.nortel.b2b.dps.parser.xmlparser.ObjectAgent.run(ObjectAgent.java:60)

Is there a way I can convert these special characters to ? or something like that?
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
I just went through something similar. Apparently it was caused by the way I was setting up the parser input, which led to the wrong character conversion being applied. I had been feeding the parse method the "ins" InputStream obtained by opening a FileInputStream, which caused an exception similar to yours. Using the following code, which creates a Reader and specifies UTF-8 encoding, worked OK.


... etc etc catching various parse exceptions and checking the line number from the LineNumberReader.
Bill
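
The code listing from the post above did not survive in this copy of the thread. A minimal sketch of the approach it describes (wrap the InputStream in a Reader with an explicit encoding, then give the parser an InputSource built on that Reader) might look like the following. The class name ReaderParseSketch, the parse helper, and the sample data are made up for illustration and are not Bill's original code. The sketch also assumes the file is really ISO-8859-1 rather than UTF-8: byte 0xbf is "¿" in ISO-8859-1 and cannot begin a UTF-8 character, which is consistent with the exception message.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ReaderParseSketch {

    // Hypothetical helper: decode the bytes ourselves with a known charset,
    // then hand the parser characters instead of raw bytes.
    static Document parse(InputStream ins, Charset charset) throws Exception {
        LineNumberReader reader =
                new LineNumberReader(new InputStreamReader(ins, charset));
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(reader));
        } catch (SAXParseException e) {
            // The reader is buffered, so this line number is only approximate.
            throw new IOException(
                    "Parse failed near line " + reader.getLineNumber(), e);
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample data standing in for the real file, encoded as ISO-8859-1.
        String xml = "<TEXT>\u00bfPrefieres tus facturas en espa\u00f1ol?</TEXT>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);

        Document doc = parse(new ByteArrayInputStream(latin1),
                StandardCharsets.ISO_8859_1);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

When the InputSource carries a character stream, the parser reads those characters directly and does no byte decoding of its own, so the charset given to InputStreamReader has to match how the file was actually written.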
 
Bhasker Reddy
Ranch Hand
Posts: 176
Do you guys know any good books on the SAX parser, or any good resources on the internet? Please let me know. I have to start working on parsing an XML file and converting it to a preprocessed text file.
--Thanks for your help
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
Well, as far as I know, the gold standard for books is still Harold's "Processing XML with Java". There are plenty of free tutorials on the web; a Google search for "xml java tutorial" found the Sun tutorial and the entire contents of Harold's book.
Bill
 
Bhasker Reddy
Ranch Hand
Posts: 176
My parse method only takes a fileName and the parser. How do I convert an InputStreamReader into a File?
 
author
Posts: 11962

Originally posted by Bhasker Reddy:
My parse method only takes a fileName and the parser. How do I convert an InputStreamReader into a File?


You don't. Except if you're willing to read the InputStreamReader's contents into a file...

The problem is apparently inside the parsing method so that's what you should be fixing (i.e. that's where you should use the InputStreamReader).
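
A rough sketch of what that fix could look like, assuming the parse method receives a file name and a DocumentBuilder: the method keeps its signature and opens the file itself through an InputStreamReader with an explicit charset. The class FileNameParseSketch and its parse method are hypothetical stand-ins for the poster's own code, which is not shown in the thread, and ISO-8859-1 is an assumption about how the file was written.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FileNameParseSketch {

    // Hypothetical stand-in for a parse(fileName, parser) method.
    // Callers still pass a file name; the Reader is created inside.
    static Document parse(String fileName, DocumentBuilder parser) throws Exception {
        try (Reader reader = new InputStreamReader(
                new FileInputStream(fileName), StandardCharsets.ISO_8859_1)) {
            InputSource source = new InputSource(reader);
            // The system id lets the parser resolve relative references
            // and report the file location in error messages.
            source.setSystemId(new File(fileName).toURI().toString());
            return parser.parse(source);
        }
    }

    public static void main(String[] args) throws Exception {
        // Write a small ISO-8859-1 sample file to parse.
        Path tmp = Files.createTempFile("sample", ".xml");
        Files.write(tmp, "<TEXT>para m\u00e1s detalles.</TEXT>"
                .getBytes(StandardCharsets.ISO_8859_1));

        DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = parse(tmp.toString(), parser);
        System.out.println(doc.getDocumentElement().getTextContent());

        Files.delete(tmp);
    }
}
```

If whoever produces the file can change it, another option is an XML declaration that names the real encoding, such as encoding="ISO-8859-1", so a parser reading the raw bytes decodes them correctly without any Reader.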
 