Remove tags in a DOM Document

 
Larry Homes
Greenhorn
Posts: 25
Hello

I wish to remove tags in a document, but not delete what is contained within them. So for example



I would like to get rid of the a tags but leave the rest intact. So the Document would then look like



Is there any simple way to do this?

My other question is about getting all the descendants of a node. I know you can use getChildNodes(), but that only retrieves the direct children, not recursively. I would also like to retrieve the children of the children, and so on. Is there a method for this?

Thanks
 
Paul Clapham
Sheriff
Posts: 22835
You can't do anything with the tags in a DOM document; you can only work with the nodes. That means you have to think of the document as a tree consisting of nodes, rather than as a text document consisting of tags and text.

In this case the document consists of an <a> element with three children. You want to convert that to a document whose root element is one of those three children. You will have to decide whether you want to create a new document, or whether you want to modify the existing document.

As for your second question, I don't think there's a DOM method to return all the descendant nodes of a node. You can express that easily in XPath, if you want to take that route, or you could write the recursive method if you like writing Java code better.
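The recursive route mentioned above could look roughly like the following. This is only a sketch: the class name `Descendants` and the method `descendantsOf` are made up for illustration, and it collects child nodes of every type (elements, text, comments) in document order, but not attributes, since attributes are not child nodes in the DOM.

```java
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the DOM API.
final class Descendants {

    /** Returns all descendants of the given node, depth-first, in document order. */
    static List<Node> descendantsOf(Node node) {
        List<Node> result = new ArrayList<>();
        collect(node, result);
        return result;
    }

    private static void collect(Node node, List<Node> out) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            out.add(child);      // record the direct child first
            collect(child, out); // then descend into its own children
        }
    }
}
```

Calling `Descendants.descendantsOf(doc.getDocumentElement())` would return every node below the root element, including whitespace-only text nodes if the parser kept them.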
 
Larry Homes
Greenhorn
Posts: 25
In my case, I think modifying the document is better than making a new one. The method that needs to do this receives an HTML page in the form of a DOM Document and needs to remove certain tags. For local modifications like this, modifying the Document in place seems like the better solution. But you didn't really mention how I would go about doing this.
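One way the in-place approach might go, as a minimal sketch: move each child of the unwanted element in front of it, then remove the now-empty element. The class and method names (`DomUnwrap`, `unwrapAll`, `unwrap`) are invented for this example, and it assumes the elements to remove are nested inside the page rather than being the document's root element, which is skipped here because a Document can hold only one element child.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the DOM API.
final class DomUnwrap {

    /** Removes every element with the given tag name but keeps its children in place. */
    static void unwrapAll(Document doc, String tagName) {
        // getElementsByTagName returns a live list, so copy it before changing the tree.
        NodeList live = doc.getElementsByTagName(tagName);
        List<Element> targets = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            targets.add((Element) live.item(i));
        }
        for (Element target : targets) {
            unwrap(target);
        }
    }

    /** Replaces one element with its own children, preserving their order. */
    static void unwrap(Element element) {
        Node parent = element.getParentNode();
        if (parent == null || parent.getNodeType() == Node.DOCUMENT_NODE) {
            // Assumption of this sketch: detached elements and the root element are left alone.
            return;
        }
        // insertBefore detaches the child from 'element' before re-inserting it under 'parent'.
        while (element.getFirstChild() != null) {
            parent.insertBefore(element.getFirstChild(), element);
        }
        parent.removeChild(element);
    }
}
```

With this, `DomUnwrap.unwrapAll(doc, "a")` would strip the a elements and leave their text and child elements where they were. Adjacent text nodes produced by the move can be merged afterwards with `doc.normalize()` if that matters.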

I am actually already using XPath for another purpose, so using XPath would probably be an ideal solution. The problem is that I don't know it well enough to come up with the expression you are talking about.


Any help is greatly appreciated.
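For reference, the XPath expression for "all descendants of this node" is most likely `descendant::node()`, and `//a` selects every a element in the document. A rough sketch using the standard `javax.xml.xpath` API follows; the wrapper class `XPathSelect` and its `select` method are made up for illustration, and the `//a` form assumes the document was parsed without namespaces.

```java
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

// Hypothetical wrapper around the standard JAXP XPath API.
final class XPathSelect {

    /** Evaluates an XPath expression relative to the context node and returns the matching nodes. */
    static NodeList select(Node context, String expression) throws XPathExpressionException {
        return (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, context, XPathConstants.NODESET);
    }
}
```

For example, `XPathSelect.select(someNode, "descendant::node()")` would return the element, text, and comment nodes below `someNode`, while `XPathSelect.select(doc, "//a")` would return the a elements that need to be removed.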
 