
Problem in parsing thru IBM SAX parser API

 
Greenhorn
Posts: 13
Hi!
I am facing a problem parsing XML using the IBM SAX parser API.
I am parsing the XML by overriding the startElement(), endElement() and characters() methods.
Everything works fine, except when I get the following content in the XML:
<SECTION TYPE="COMTEX-JIMDASH">
<PARAGRAPH TYPE="NORMAL">APO Priority=r APO Category=1700 � BLOCK � � BLOCK � KEYWORD: WASHINGTON � BLOCK � SUBJECT CODE: 1700</PARAGRAPH>
<PARAGRAPH TYPE="NORMAL" />
</SECTION>
Please note the BLOCK markers above. By BLOCK I mean a character that displays as a square figure (I am not able to type or paste that character here). When the parser reaches it, after startElement() control does not go to characters() as it otherwise would. Instead the parser throws an exception with the message: org.xml.sax.SAXParseException occured while parsing XML Invalid XML character. (Unicode: 0x8).
Control then passes to the fatalError(SAXParseException e) method of the HandlerBase class.
I am looking for a way to read these special characters.
Secondly, if I encounter a special character like this one, I want to continue parsing the XML. But, as per the API documentation:
The default implementation throws a SAXParseException. Application writers may override this method in a subclass if they need to take specific actions for each fatal error (such as collecting all of the errors into a single report): in any case, the application must stop all regular processing when this method is invoked, since the document is no longer reliable, and the parser may no longer report parsing events.

Please help me out.
Regards,
Dharmesh
 
Author and all-around good cowpoke
Posts: 13078
I think you are hitting an illegal character (as I recall, 0x08 is the ASCII backspace character, a control character that XML 1.0 does not allow), which is why it shows as a block. If you can't remove the illegal character at the source, you may have to run your input file through a "filter" to remove illegal characters.
Look at the java.io.FilterInputStream class. You could interpose a filter between your source and the XML parser.
Bill
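
A minimal sketch of the kind of filter suggested above, not a definitive implementation. The class name XmlControlCharFilter is hypothetical. It assumes the input uses an ASCII-compatible encoding such as UTF-8 or ISO-8859-1, where every byte below 0x20 really is a control character; it would corrupt UTF-16 input. It drops every byte below 0x20 except tab, line feed and carriage return, which are the only ones XML 1.0 permits.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: strips control bytes that XML 1.0 forbids.
// Assumes an ASCII-compatible encoding (UTF-8, ISO-8859-1, ...).
class XmlControlCharFilter extends FilterInputStream {

    XmlControlCharFilter(InputStream in) {
        super(in);
    }

    // Below 0x20, XML 1.0 allows only tab (0x09), LF (0x0A) and CR (0x0D).
    private static boolean isIllegal(int b) {
        return b >= 0 && b < 0x20 && b != 0x09 && b != 0x0A && b != 0x0D;
    }

    @Override
    public int read() throws IOException {
        int b;
        do {
            b = super.read();          // -1 signals end of stream
        } while (isIllegal(b));        // skip forbidden bytes
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n;              // end of stream
            }
            // Compact the legal bytes to the front of the region just read.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                byte c = buf[off + i];
                if (!isIllegal(c & 0xFF)) {
                    buf[off + kept++] = c;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // Every byte was dropped: read again instead of returning 0.
        }
    }
}
```

To interpose it, wrap the source stream before handing it to the parser, for example new InputSource(new XmlControlCharFilter(new FileInputStream(path))), where path stands for your own input file. Note that this silently discards the offending characters; it does not let the parser continue after fatalError() has been called.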