SAX Parsing in JDK 1.6_14 and multiple lines in an element's value

 
Greenhorn
Posts: 28
Hello,

We're using a SAX parser and currently have a class that extends org.xml.sax.helpers.DefaultHandler.

We've overridden startElement(), endElement(), and characters().

In our characters() method, the current code (a lot of it) assumes that a single invocation of characters() delivers the element's complete value (the "Hello" in <world>Hello</world>) and then goes through a large if/else-if/else-if/... chain with try/catch blocks. Then endElement() is invoked and more if/else-if/... try/catch statements are executed.

I've read that characters() may be invoked multiple times if the element's value contains multiple lines, and that characters() should really just buffer its value. Only once endElement() is called is the element's value complete.

Because there is a lot of code, my question is whether the default behavior of the SAX parser can be overridden so that characters() is called exactly once, regardless of whether the element's value contains multiple lines.

We're not running Java (on Windows/Unix) with any special options; the command line is just "java -cp . MyParser my_data.xml".

Thanks,
Jim
 
Author and all-around good cowpoke
Posts: 13078
The behavior of characters() is hard-coded into the SAX API.

You could use the StAX parser approach instead. See the javax.xml.stream package.

As I understand it, when the StAX parser hands you a Characters event, all of the element text is available in that event, at least when the factory's IS_COALESCING property is enabled.
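
For illustration, here is a minimal sketch of reading an element's whole text with StAX. It uses the cursor API (XMLStreamReader) rather than the event API, and the class and method names (StaxTextDemo, readElementText) are made up for this example. It assumes the element contains only text; getElementText() throws if it meets a child element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; not part of the original poster's code.
class StaxTextDemo {

    // Returns the full text of the first element named elementName, or null if none is found.
    static String readElementText(String xml, String elementName) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one CHARACTERS event.
        // Coalescing is not the guaranteed default, so it is requested explicitly here.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    // Reads up to the matching end tag and returns all of the text in between.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

Calling StaxTextDemo.readElementText("<world>Hello\nagain</world>", "world") hands back the two-line value in one piece, so the per-element logic runs once per element.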

If you wrote a lot of code assuming a behavior of characters() that does not correspond to the API, regard it as a learning experience.
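
If you stay with SAX, the buffering approach mentioned in the question might look roughly like the sketch below. The names BufferingHandler, handleText and parse are hypothetical; the idea is that the existing if/else chain from characters() moves into handleText(), which is called once per element from endElement(). This sketch assumes text-only elements and does not try to handle mixed content.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler; a sketch of the buffering pattern, not the poster's actual class.
class BufferingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    final List<String> values = new ArrayList<String>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element; only accumulate here.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The value is complete only now, so process it once.
        handleText(qName, text.toString());
        text.setLength(0);
    }

    // Hypothetical hook: the logic that used to live in characters() would go here.
    protected void handleText(String qName, String value) {
        values.add(qName + "=" + value);
    }

    // Convenience method for the example: parses a string and returns the collected values.
    static List<String> parse(String xml) throws Exception {
        BufferingHandler handler = new BufferingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

The point of this design is that the large if/else chain only has to move, not change: it receives the same kind of string it expected before, but now it is guaranteed to be the whole value.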

Bill