You would have to write some methods with names like startElement(), endElement(), and characters(). The SAX parser would read through the document and call the methods you provided in this order:
startElement("note");
startElement("to");
characters("Tove");
endElement("to");
startElement("from");
characters("Jani");
endElement("from");
...
endElement("note");
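To make that call sequence concrete, here is a minimal sketch using the SAX classes that ship with the JDK. The class name SaxEventLog and the shortened note document are assumptions made up for illustration. The real callbacks take more parameters than the simplified calls shown above, and a parser is allowed to deliver one piece of text in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class: records each SAX callback as a line of text.
class SaxEventLog {

    // Parses the XML and returns the callbacks in the order the parser made them.
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("startElement(" + qName + ")");
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("endElement(" + qName + ")");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text so indentation does not clutter the log.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("characters(" + text + ")");
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // Shortened, assumed version of the note document discussed above.
        String xml = "<note><to>Tove</to><from>Jani</from></note>";
        events(xml).forEach(System.out::println);
    }
}
```

You never call startElement() yourself here; you hand the handler to the parser and it calls you back.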
The other kind of parser is a DOM parser. This kind runs through the whole document and builds a tree-like data structure to represent it; then you examine the tree to learn about the document. A DOM parser uses lots more memory to hold that tree. A SAX parser is generally faster, but harder for some people to understand.
Originally posted by alfred jones:
but why it is called "event based" ? this is misleading term.
It comes from the way in which these events are caused.
The SAX parser will look at the start of an XML file/stream and then work its way through it. Each time it meets something it has been looking for (the start of an element, an attribute, some text and so on) it will call the method appropriate for that particular event.
The idea is that the parser moves through the XML only once, and only from start to end. Once the end of the XML is reached, the parsing stops.
How a DOM parser will parse this ?
Actually most (all?) DOM parsers use SAX somewhere along the line. What the DOM parser does is use SAX to move through the XML, creating a tree-like structure of objects as it goes along.
The fact that it uses SAX is irrelevant to the user of the DOM because the user is only concerned with the tree-like structure which is produced at the end of the parsing process.
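As a rough illustration of building a tree from SAX events, here is a sketch of a handler that keeps a stack of open elements and attaches each new element to its parent. TinyTreeBuilder and its Node class are hypothetical names invented for this example. It is not how any particular DOM parser is implemented, and a real DOM node carries far more than a name, some text and a list of children.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that turns SAX events into a small tree of objects.
class TinyTreeBuilder extends DefaultHandler {

    // Hypothetical node type; the real org.w3c.dom.Node is much richer.
    static class Node {
        final String name;
        final StringBuilder text = new StringBuilder();
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    private final Deque<Node> open = new ArrayDeque<>(); // elements started but not yet ended
    private Node root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Node node = new Node(qName);
        if (open.isEmpty()) {
            root = node;                     // first element seen is the root
        } else {
            open.peek().children.add(node);  // otherwise attach to the current parent
        }
        open.push(node);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop(); // this element is finished; its parent becomes current again
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text always arrives inside some element, so the stack is not empty here.
        open.peek().text.append(ch, start, length);
    }

    static Node build(String xml) throws Exception {
        TinyTreeBuilder builder = new TinyTreeBuilder();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), builder);
        return builder.root;
    }

    public static void main(String[] args) throws Exception {
        Node note = build("<note><to>Tove</to><from>Jani</from></note>");
        for (Node child : note.children) {
            System.out.println(child.name + " = " + child.text);
        }
    }
}
```

The caller of build() only ever sees the finished tree, which is the point made above: the events happen behind the scenes.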
At this point it's really impossible to answer all your individual questions -- we'd be arguing over the meaning of many individual words. But I will tell you what SAX and DOM parsers are again.
A SAX parser goes through your XML file and calls methods that you supply at various points during the process. These method calls are "events". So a SAX parser turns an XML file into a series of method calls.
A DOM parser parses the entire file, creating many Java objects to represent the contents of the file. When you use a DOM parser, you call a single "parse()" method; the return value of this parse() method will be a big tree of those Java objects. You must then search through that tree of objects yourself to find the information you need about the file. A DOM parser turns an XML file into a Java data structure that you can then examine.
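For comparison, here is a small sketch of the DOM style using the JDK's DocumentBuilder. The class name DomLookup, the helper textOf() and the shortened note document are assumptions for illustration only. Notice that there are no callbacks: one parse() call returns the whole tree, and then you go looking in it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class showing the "parse first, search afterwards" style.
class DomLookup {

    // Returns the text of the first element with the given tag name.
    // Throws a NullPointerException if no such element exists (kept simple on purpose).
    static String textOf(String xml, String tagName) throws Exception {
        // One call builds the complete tree in memory.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        // After parsing, it is up to you to search the tree.
        return root.getElementsByTagName(tagName).item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<note><to>Tove</to><from>Jani</from></note>";
        System.out.println("to:   " + textOf(xml, "to"));
        System.out.println("from: " + textOf(xml, "from"));
    }
}
```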
you call a single "parse()" method; the return value of this parse() method will be a big tree of those Java objects.
Yes, that's exactly what the term "tree" means. I'm not just making it up, though -- it's a pretty standard computer-science term. One object is the "root" of the tree. It has references to some other objects which are "branches" or "internal nodes." Those branches can refer to other branches or to "leaf nodes", the ends of the tree that you were calling "fruits." And indeed, you can browse around in the tree and "pick the fruits."
As I said, that's the good part of using a DOM parser. The bad part is that this tree can take up a lot of memory for a big document, and building it can be slow. A SAX parser doesn't use all that memory, and it's generally faster.
Originally posted by alfred jones:
but why DOM consumes high memory ? is it just because of formation of tree comprising java objects ?
Imagine an XML document containing 2,000 nodes. When DOM is used to parse this document, there will be at least 2,000 objects stored in memory for the resulting DOM tree. There may be even more (depending on whether you counted text, attributes and so on in your original count). This is because each element, each attribute, and each bit of text (all of which are nodes in themselves) needs a separate object in memory. SAX, on the other hand, will only ever have in memory the objects to do with the current part of the document being processed. Although this may consist of several objects, it's unlikely to be as many as 2,000.
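To see how quickly the objects add up, here is a sketch that walks a DOM tree and counts every element, attribute and text node. DomNodeCount and the sample document with an id attribute are assumptions invented for this example. The count is only a lower bound on what a real DOM implementation keeps in memory.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class: counts the nodes a DOM parser created for a document.
class DomNodeCount {

    // Counts this node, its attributes, and everything beneath it.
    static int count(Node node) {
        int total = 1;
        NamedNodeMap attrs = node.getAttributes(); // null for anything that is not an element
        if (attrs != null) {
            total += attrs.getLength();
        }
        NodeList kids = node.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            total += count(kids.item(i));
        }
        return total;
    }

    static int countIn(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return count(doc.getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        // Three elements, one attribute and two pieces of text.
        String xml = "<note id=\"1\"><to>Tove</to><from>Jani</from></note>";
        System.out.println(countIn(xml) + " nodes");
    }
}
```

A document that looks like three elements is already six nodes once the attribute and the two pieces of text are counted, and each of them stays in memory for as long as you hold on to the tree.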