jdk1.4.2 SAX DefaultHandler characters method problem

 
Greenhorn
Posts: 12
Hello,

I wrote my own handler by extending org.xml.sax.helpers.DefaultHandler.

I parse an XML document containing CDATA sections.
Each CDATA section contains XML code (1300 characters per section, always the same content for this test).

When I extract all the CDATA content with this:

I noticed that my "content" variable sometimes contained truncated data.
Am I extracting the CDATA sections the wrong way, or is there a bug?
 
author
Posts: 11962
If your XML looks like this:

there is no requirement for a SAX parser to invoke the characters() method only once with "this is some content" as the argument. It could invoke characters() just once or multiple times with smaller pieces of the whole "this is some content" string.

In other words, you'll have to maintain a StringBuffer, append content to it whenever the parser invokes characters(), and then deal with whatever you've gathered in the StringBuffer when the parser invokes startElement() or endElement() -- i.e. when all the content has been processed and the element either ends or, if it's a mixed-content element, a child element begins.
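The buffering idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ContentCollector, the static helper collect(), and the element name "item" are made up for the example and do not come from the original poster's code. It uses StringBuilder, which needs Java 5 or later; on JDK 1.4.2 a StringBuffer would be used the same way.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the complete text of every <item> element.
class ContentCollector extends DefaultHandler {
    // Holds all chunks delivered so far for the element being read.
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> contents = new ArrayList<String>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buffer.setLength(0); // start collecting afresh for each element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text or CDATA section,
        // so append each chunk instead of treating it as the whole content.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only now is the element's content known to be complete.
        if ("item".equals(qName)) {
            contents.add(buffer.toString());
        }
    }

    // Convenience helper (an assumption of this sketch): parse a string, return the collected texts.
    static List<String> collect(String xml) {
        try {
            ContentCollector handler = new ContentCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.contents;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }
}
```

Because the content is only read in endElement(), it no longer matters how many pieces the parser chose to deliver it in.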
 
Greenhorn
Posts: 6
This is exactly the approach Lasse Koskela described.

Sample method implementation:

// variable
private StringBuffer textBuffer = null;

public void startElement(String uri, String localName, String qName, Attributes atts)
{
    writeContent();
}

public void endElement(String uri, String localName, String qName)
{
    writeContent();
}

public void characters(char[] text, int start, int length) {
    String content = new String(text, start, length);
    if (!content.trim().equals("")) {
        if (textBuffer == null) {
            textBuffer = new StringBuffer(content);
        } else {
            textBuffer.append(content);
        }
    }
}

private void writeContent() {
    if (textBuffer == null)
        return;
    characters(textBuffer.toString());
    textBuffer = null;
}
 
Greenhorn
Posts: 3
Hi, I'm pretty new to Java and probably punching above my weight, but I have been parsing some XML with a certain amount of success (printing to the command line, which hides the StringBuffer issues above).

The above example doesn't compile because of this line in the writeContent method:

characters(textBuffer.toString());

I think it doesn't compile because it passes the wrong number of parameters to the method, and it passes a String instead of a char array. Could someone point me in the direction of how to solve this?
[ October 07, 2005: Message edited by: Nick Hayday ]
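One possible way around the compile error, offered as a sketch rather than as what the earlier poster intended: DefaultHandler has no characters(String) overload, so the gathered text can be handed to a separate method instead. The names BufferingHandler, handleText(), getPieces() and parse() are hypothetical and introduced only for this example. Unlike the posted code, this version appends every chunk and checks for whitespace-only content at flush time, so that spaces falling at a chunk boundary are not lost.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class BufferingHandler extends DefaultHandler {
    private StringBuffer textBuffer = null;
    private final List<String> pieces = new ArrayList<String>();

    public void startElement(String uri, String localName, String qName, Attributes atts) {
        writeContent(); // text before a child element is complete at this point
    }

    public void endElement(String uri, String localName, String qName) {
        writeContent(); // text before the closing tag is complete at this point
    }

    public void characters(char[] text, int start, int length) {
        if (textBuffer == null) {
            textBuffer = new StringBuffer();
        }
        textBuffer.append(text, start, length);
    }

    private void writeContent() {
        if (textBuffer == null) {
            return;
        }
        String content = textBuffer.toString();
        textBuffer = null;
        // Skip runs that consist only of whitespace (indentation, line breaks).
        if (content.trim().length() > 0) {
            handleText(content);
        }
    }

    // Hypothetical hook: receives each complete run of text exactly once.
    protected void handleText(String content) {
        pieces.add(content);
    }

    List<String> getPieces() {
        return pieces;
    }

    // Convenience helper (an assumption of this sketch) for trying the handler on a string.
    static List<String> parse(String xml) {
        try {
            BufferingHandler handler = new BufferingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getPieces();
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }
}
```

Calling characters(char[], int, int) again from writeContent() would also compile, but it would only push the text back into the same buffer, which is why a separate method is used here.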
 