Building a DOM tree from an HTML file

 
Greenhorn
Posts: 2
Hi all,
I have been given the task of building a DOM tree from an HTML file.
I have two questions about this:
1. Does anyone know a good way to build a DOM tree from an
HTML file? (The HTML is not well-formed, which is a problem for a DOM parser.)
2. Does anyone know a good API that can do this?
Thanks for your help.
Frank Piorko
 
Sheriff
Posts: 5782
Frank - anything that is not a well-formed XML document is not an XML document. You will first have to think about making it well-formed. Any XML parser will error out if you try to parse a malformed document.
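
To make that concrete, here is a minimal sketch using the standard JAXP `DocumentBuilder` from the JDK. The class name `WellFormedCheck` and the method `isWellFormed` are made up for this illustration and are not part of any library. It only shows that an XML parser rejects typical HTML shortcuts such as an unclosed `<br>`.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
public class WellFormedCheck {

    // Returns true if the markup parses as XML, false if the parser rejects it.
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser reports markup that is not well-formed as a SAXException.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Typical HTML: <br> is never closed, so this is not well-formed XML.
        System.out.println(isWellFormed("<p>Hello<br>World</p>"));
        // The XHTML spelling closes the empty element.
        System.out.println(isWellFormed("<p>Hello<br/>World</p>"));
    }
}
```

The default parser may also print a "Fatal Error" line to standard error for the rejected input.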
 
Ranch Hand
Posts: 47
Yeah, I am also searching for such a solution. I know HTML is not well-formed, but there must be some custom parser out there that builds a DOM tree from HTML.

 
Ajith Kallambella
Sheriff
Posts: 5782
Why not tweak the HTML and make it well-formed??
Remember - a malformed XML document isn't an XML document in the first place. So parsing has no meaning in that context!
 
Frank Piorko
Greenhorn
Posts: 2
I cannot make the HTML files well-formed by hand.
There are too many of them. Every few days the application
receives many HTML files from other programmers
who are not familiar with the XML/HTML problem.
 
Leverager of our synergies
Posts: 10065
Frank, as Ajith said, you can convert your HTML to XHTML (well-formed HTML). You do not need to do it by hand; just search for "Converting HTML to XHTML" on the Internet and you'll find something like this: http://www.vbxml.com/xhtml/articles/html_to_xhtml/default.asp
Or you can check this site: http://www.xmlsoftware.com/convert/
W4F looks good,
or HEX on http://www.xmlsoftware.com/parsers/
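
Another route that needs no extra download: the JDK ships a lenient HTML parser in `javax.swing.text.html.parser`, and its callbacks can be used to fill an `org.w3c.dom.Document`. What follows is only a rough sketch of that idea, not a recommendation over the tools listed here. The class name `HtmlToDom` is made up, and the sketch assumes that tag names are valid XML names and copies only attributes the Swing parser recognises. That parser targets old HTML (roughly 3.2), so real pages may need more care.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Enumeration;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, for illustration only.
public class HtmlToDom {

    public static Document parse(String html) {
        final Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        // Stack of currently open nodes; the document itself is at the bottom.
        final Deque<Node> open = new ArrayDeque<>();
        open.push(doc);

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                open.push(append(tag, attrs));
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                // Unknown end tags are reported here with an "endtag" marker; skip them.
                if (!attrs.isDefined(HTML.Attribute.ENDTAG)) {
                    append(tag, attrs);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                String name = tag.toString();
                boolean isOpen = open.stream().anyMatch(n -> n.getNodeName().equals(name));
                if (!isOpen) {
                    return; // stray end tag, nothing to close
                }
                while (!open.pop().getNodeName().equals(name)) {
                    // discard elements that were left open inside this one
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                Node parent = open.peek();
                if (parent != doc) { // a DOM document cannot hold text directly
                    parent.appendChild(doc.createTextNode(new String(data)));
                }
            }

            private Element append(HTML.Tag tag, MutableAttributeSet attrs) {
                Element el = doc.createElement(tag.toString());
                Enumeration<?> names = attrs.getAttributeNames();
                while (names.hasMoreElements()) {
                    Object key = names.nextElement();
                    if (key instanceof HTML.Attribute) { // skip parser-internal markers
                        el.setAttribute(key.toString(), String.valueOf(attrs.getAttribute(key)));
                    }
                }
                Node parent = open.peek();
                // A DOM document may have only one root element.
                if (parent != doc || doc.getDocumentElement() == null) {
                    parent.appendChild(el);
                }
                return el;
            }
        };

        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return doc;
    }
}
```

Calling `HtmlToDom.parse("<html><body><p>Hello<br>World</body></html>")` should give a `Document` whose root element is `html`, even though `<p>` and `<br>` are never closed in the input. From there the usual DOM calls such as `getElementsByTagName` work.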

[This message has been edited by Mapraputa Is (edited April 30, 2001).]
 