About parsing with DOM ...

 
Ranch Hand
Posts: 179
Hi

I'm trying to parse an XML file with DOM. So far everything seems OK, but I get a bunch of errors on my console saying that my elements and attributes are not declared in my XML file:

The "10" on the final row is what I was looking for; you will see it in my code.

First, my XML file. There is no DTD or any other file, just the one XML file:

Notice there are 10 "score" elements with id from 0 to 9.

Here is my Java code:


So how do I get rid of those errors?

[ August 12, 2004: Message edited by: Juhan Voolaid ]
[ August 14, 2004: Message edited by: Juhan Voolaid ]
 
Ranch Hand
Posts: 385
You are using a validating parser, but you didn't provide a DTD or XML schema to validate your document against. I think this is the problem. Either turn validation off or supply a schema.
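
A minimal sketch of the "turn validation off" option, assuming the standard JAXP API in `javax.xml.parsers`. The class name, the `parse` helper, and the sample XML are made up for illustration; they are not the code or the file from the first post.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class; not the original poster's code.
class NonValidatingParse {

    // Parses XML text without DTD validation, so elements and attributes
    // that are not declared anywhere are not reported as errors.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // this is the default; stated explicitly here
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample data, loosely modelled on the description in this thread.
        String xml = "<scores><score id=\"0\"><name>Jux</name><points>10</points></score></scores>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

The document must still be well-formed; only the check against declarations is skipped.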

best regards
 
Juhan Voolaid
Ranch Hand
Posts: 179
Gee, I didn't know it was that simple. But here is another problem with this example. I thought it was caused by those declaration errors, but it is not.

With the same example, the function getInfo(Element scores):
Element "scores" is the root element, and I make a NodeList of its "score" elements.
Each "score" has nodes called "name" and "points".
I don't know why I can't get access to "name" and "points", but the "score" element itself works fine.
I'll show you what I mean:

But


I don't know what is wrong.
 
Vladas Razas
Ranch Hand
Posts: 385
There are 2 problems. The first is that your sample code gets the name and value of the first node only. That would be "Name" and "Jux" (from your XML sample above). To get the second node's name, use item(1).

The second problem is tougher. XML was meant for documents, and DOM is not the easiest thing to work with (even Sun's site recommends JDOM, unless you want all of DOM's flexibility). Let's take this XML sample:

<dog> <name/> </dog>

You would think that DOM would create an element dog with a child node name. But it will create 5 elements:
<dog>
<#text> </#text>
<name/>
<#text> </#text>
</dog>

The parser does this so that you don't lose your whitespace characters. One possible solution is to write something that removes all <#text> nodes that consist only of whitespace. The second way is to get a factory that creates a parser which does that for you: look at DocumentBuilderFactory.setIgnoringElementContentWhitespace(). But there is a catch: for this to work (see the Javadoc) you need a validating parser, and for that you again need a schema. You may also want to look at DocumentBuilderFactory.setIgnoringComments(), in case users want to write comments in your XML.
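
A rough sketch of the factory route, under the assumption that the document carries a DTD (here an inline one) so that the validating parser can tell which whitespace is ignorable. The class name, the `countChildren` helper, and the DTD for the dog sample are invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class for the dog sample above.
class IgnoreWhitespaceDemo {

    // Parses with validation on and returns how many child nodes
    // the root element ends up with.
    static int countChildren(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);                       // needs a DTD in the document
            factory.setIgnoringElementContentWhitespace(true); // drop ignorable whitespace
            factory.setIgnoringComments(true);                 // drop comment nodes as well
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed inline DTD: dog contains exactly one name, name is empty.
        String xml = "<!DOCTYPE dog [<!ELEMENT dog (name)><!ELEMENT name EMPTY>]>"
                + "<dog> <name/> </dog>";
        System.out.println("child nodes of dog: " + countChildren(xml));
    }
}
```

Because the DTD says dog has element-only content, the spaces around name count as ignorable and should not show up as #text nodes.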

Well, Sun recommends using DOM only if you want to deal with all this. Otherwise they say you can use JDOM for simplicity, but that would add 1-2 MB to your runtime.

best regards

P.S. I wrote my own whitespace remover. I haven't tried the validating parser way yet.

Here is my remover. Pass it the document's root node.
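
A minimal sketch of what a remover along these lines might look like. It is an illustration only, not necessarily the code from this post; the class and method names are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are made up for this sketch.
class WhitespaceRemover {

    // Recursively removes text nodes that contain only whitespace.
    static void removeWhitespaceNodes(Node node) {
        NodeList children = node.getChildNodes();
        // Walk backwards: the NodeList is live, so removing a child
        // shifts the indexes of the nodes after it.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                removeWhitespaceNodes(child);
            }
        }
    }

    // Parses the XML, cleans it, and returns the root's child count.
    static int childCountAfterCleanup(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            removeWhitespaceNodes(doc.getDocumentElement());
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childCountAfterCleanup("<dog> <name/> </dog>"));
    }
}
```

One caveat: this also drops whitespace-only text in mixed content, for example a single space between two inline elements, so it suits data-style XML better than document-style XML.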

 
Vladas Razas
Ranch Hand
Posts: 385
Sorry,

<dog>
<#text> </#text>
<name/>
<#text> </#text>
</dog>

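is not 5 elements: it is one dog element with three child nodes (a #text node, the name element, and another #text node). But you get the idea.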
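
To see what the parser really produces for the dog sample, a small sketch like the following lists the child nodes of the root. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not code from this thread.
class ChildNodeDemo {

    // Returns the node names of the root element's direct children.
    static List<String> childNodeNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                names.add(children.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Text nodes report "#text" as their node name.
        System.out.println(childNodeNames("<dog> <name/> </dog>"));
    }
}
```

So item(0) on dog's child list is a whitespace #text node, and the name element sits at item(1).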
 
author
Posts: 30
You don't need to write any code. DOM includes the method normalize(), which combines all the child text nodes of an element into a single text node. Then your attempts to access those elements will work as expected.
 
Vladas Razas
Ranch Hand
Posts: 385
Does it remove whitespace?
 
Vladas Razas
Ranch Hand
Posts: 385
I've tried normalize(). It didn't work for me. I didn't 100% understand from the Javadoc what it does. I understood that it joins adjacent #text nodes, but does it eliminate them completely (those that consist of whitespace only)? I tried it on my XML and still got #text nodes between elements, like:

#text = "\n "
<elem1/>
#text = "\n "

Can you explain?

Thanks!
 
Tom Passin
author
Posts: 30
The parser has no way to know whether whitespace between elements is significant or not, unless there is a DTD or schema to tell it (and the parser is instructed to use it). Therefore in most cases the whitespace nodes are kept. Some parsers have parameters that can change how that kind of whitespace is handled, so if you are interested, read up on the docs for your parser.
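
A small sketch of what normalize() does and does not do, assuming the standard `org.w3c.dom` behaviour: it merges adjacent text nodes and removes empty ones, but a whitespace-only text node is not empty, so it stays. The class and method names are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class; not code from this thread.
class NormalizeDemo {

    // Returns the child count of <dog> at three points:
    // after parsing, after appending two text nodes, after normalize().
    static int[] childCounts() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader("<dog> <name/> </dog>")));
            Element dog = doc.getDocumentElement();
            int parsed = dog.getChildNodes().getLength();

            // Create adjacent text nodes by hand; a parser would not produce these.
            dog.appendChild(doc.createTextNode("a"));
            dog.appendChild(doc.createTextNode("b"));
            int appended = dog.getChildNodes().getLength();

            // Merges the trailing " ", "a" and "b" into one text node,
            // but keeps the whitespace-only node in front of <name/>.
            dog.normalize();
            int normalized = dog.getChildNodes().getLength();

            return new int[] {parsed, appended, normalized};
        } catch (Exception e) {
            throw new IllegalStateException("Could not run the demo", e);
        }
    }

    public static void main(String[] args) {
        int[] counts = childCounts();
        System.out.println(counts[0] + " -> " + counts[1] + " -> " + counts[2]);
    }
}
```

The count goes back to what it was right after parsing, which matches the observation that #text nodes are still there after normalize().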
 
Juhan Voolaid
Ranch Hand
Posts: 179
I also couldn't get that normalize() method to work, and I still have difficulties parsing my XML file. I also added a DTD to my XML file.
I get the same problems when doing this:

The problem is simply that, after parsing, my info array consists of empty strings.

Here again is my XML file "scores.xml":

The document type definition "scores.dtd":

And this is how I build the DOM document in Java:


I don't know what I should do. It seems to me that the normalize() method still doesn't work. I think I have problems with those whitespace nodes.

Please help.
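
One possible cause of the empty strings (an assumption, since it depends on code not shown here) is reading getNodeValue() on an element node, which returns null, or reading a whitespace-only #text node by index. A hedged sketch that avoids both by looking children up by tag name follows. Only the name getInfo(Element scores) comes from this thread; the return type, the `childText` and `parse` helpers, and the sample XML are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; the array layout is an assumption.
class ScoreReader {

    // Returns one row per "score" element: [name, points].
    static String[][] getInfo(Element scores) {
        NodeList scoreList = scores.getElementsByTagName("score");
        String[][] info = new String[scoreList.getLength()][2];
        for (int i = 0; i < scoreList.getLength(); i++) {
            Element score = (Element) scoreList.item(i);
            info[i][0] = childText(score, "name");
            info[i][1] = childText(score, "points");
        }
        return info;
    }

    // Looks the child up by tag name, so whitespace #text siblings do not matter.
    static String childText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        // getTextContent() is DOM Level 3 (Java 5 and later).
        return matches.getLength() == 0 ? "" : matches.item(0).getTextContent().trim();
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample data; the real scores.xml is not shown here.
        String xml = "<scores>\n  <score id=\"0\">\n    <name>Jux</name>\n"
                + "    <points>10</points>\n  </score>\n</scores>";
        for (String[] row : getInfo(parse(xml).getDocumentElement())) {
            System.out.println(row[0] + ": " + row[1]);
        }
    }
}
```

On a parser without getTextContent(), the text is the value of the element's first child node: getFirstChild().getNodeValue(), after checking that the child exists.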
 