filtering illegal characters in xml documents

 
Greenhorn
Posts: 6
Consider the following XML file:
<?xml version="1.0"?>
<attribute>
<other1> > </other1>
<other2> < </other2>
<other3> & </other3>
</attribute>
I'm using JAXP to parse the document, and I get the following error when parsing it:
C:\DOMEcho>java -classpath '.\;C:\DOMEcho;C:\lib\crimson.jar;C:\lib\jaxp.jar;C:\
lib\xalan.jar;.' DOMEcho attribute.xml
Fatal Error: URI=file:C:/DOMEcho/attribute.xml Line=4: The content beginning "<
" is not legal markup. Perhaps the " " ( character should be a letter.
My investigation suggests that a character such as '>' (as in <other1> > </other1>) is invalid. Any ideas for solving this? Note that I cannot change '>' to its escaped equivalent, because the XML document is generated by Velocity (a publishing framework).
How can I get my documents to parse successfully? I have tried reading in the entire XML string and converting the illegal characters to their equivalents, but it doesn't work. I would appreciate it if someone could suggest a solution (or even share some code).
 
Saloon Keeper
Posts: 25467
You can do it one of two ways.
1. Use the "escape" sequences, such as &amp;, &lt;
2. Wrap the items in a CDATA section, for example: <other2><![CDATA[ < ]]></other2>

Once the XML parser has read the document, the unescaping will have been done for you. This is true for both character entities and CDATA sections.
[ July 12, 2002: Message edited by: Tim Holloway ]
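
The claim that both forms come out the same after parsing can be checked with a small JAXP program. This is only a sketch: the class name EscapeVsCdataDemo and the helper textOf are made up for illustration, and the XML strings are cut-down versions of the file in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original thread.
class EscapeVsCdataDemo {

    // Parses the XML string with JAXP and returns the text of the first <tag> element.
    static String textOf(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML did not parse", e);
        }
    }

    public static void main(String[] args) {
        // Option 1: escape sequences.
        String escaped =
                "<attribute><other2> &lt; </other2><other3> &amp; </other3></attribute>";
        // Option 2: CDATA sections.
        String cdata =
                "<attribute><other2><![CDATA[ < ]]></other2>"
                + "<other3><![CDATA[ & ]]></other3></attribute>";

        // After parsing, the application sees the same plain text either way.
        System.out.println(textOf(escaped, "other2").equals(textOf(cdata, "other2")));
        System.out.println(textOf(escaped, "other3").equals(textOf(cdata, "other3")));
    }
}
```

Both comparisons should print true, since the parser hands back the literal characters in either case.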
 
Ervin Loh
Greenhorn
Posts: 6
The XML document is dynamically generated, so I cannot put a CDATA section into it.
I have been planning to write a while loop that repeatedly parses the XML document and replaces the illegal characters with their escaped equivalents. Would this work?
When it encounters a SAXParseException, I'll call the getColumnNumber method (to get the column where the character is).
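
For comparison, here is a rough sketch of a different technique from the parse-and-retry loop: a single regex pre-pass over the generated string before it reaches the parser. It is a heuristic, not a general fix. It assumes that a '<' meant as text is never immediately followed by a letter, '/', '?' or '!', and that a '&' meant as text is never followed by something that looks like an entity. It leaves '>' alone, because a bare '>' is legal in element content. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original thread.
class LenientXmlPreprocessor {

    // Escapes '&' that does not start an entity reference, and '<' that does
    // not look like the start of a tag, comment, CDATA section or declaration.
    static String escapeStrayMarkup(String xml) {
        return xml
                .replaceAll("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)", "&amp;")
                .replaceAll("<(?![A-Za-z/?!])", "&lt;");
    }

    // Parses the XML string with JAXP and returns the text of the first <tag> element.
    static String textOf(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML did not parse", e);
        }
    }

    public static void main(String[] args) {
        // The file from the question, as the generator emits it.
        String raw = "<?xml version=\"1.0\"?>\n"
                + "<attribute>\n"
                + "<other1> > </other1>\n"
                + "<other2> < </other2>\n"
                + "<other3> & </other3>\n"
                + "</attribute>\n";

        String fixed = escapeStrayMarkup(raw);
        System.out.println(fixed);
        System.out.println("[" + textOf(fixed, "other2").trim() + "]");
        System.out.println("[" + textOf(fixed, "other3").trim() + "]");
    }
}
```

If the generated text can contain something like "a <b" (a '<' directly followed by a letter), this pre-pass will not catch it, and escaping the values inside the template itself is the more reliable route.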
 