
Parsing HTML

 
vaibhav mukund
Greenhorn
Posts: 2
I want to read an HTML file and get only its content, with all the HTML tags removed. How would I write a parser for this?

Please help me...
Thanks in advance!
 
Joe Ess
Bartender
Posts: 9361
The Java API has an HTML parser built in: DocumentParser, in the javax.swing.text.html.parser package.
 
Rob Spoor
Sheriff
Posts: 20817
You may want to use ParserDelegator in the same package - it's a bit easier to instantiate, since you don't need a DTD object for it. And in the end it will create a DocumentParser anyway.
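
For anyone finding this thread later, here is a rough sketch of how the ParserDelegator suggestion could look in practice. It assumes the parser callback's handleText method is the right hook for collecting the text between tags; the class name HtmlTextExtractor and the method extractText are made up for this example, and joining the text chunks with a single space is just one possible choice.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of the Java API.
public class HtmlTextExtractor {

    // Reads HTML from the given reader and returns only the text content.
    public static String extractText(Reader reader) throws IOException {
        final StringBuilder text = new StringBuilder();

        // The callback is notified for every piece of the document;
        // only text runs are of interest here, so tags are simply ignored.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Assumption: separate consecutive text runs with one space.
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(data);
            }
        };

        // No DTD object is needed; the third argument tells the parser
        // to ignore any charset declared inside the document.
        new ParserDelegator().parse(reader, callback, true);
        return text.toString();
    }

    // Convenience overload for HTML that is already in a String.
    public static String extractText(String html) {
        try {
            return extractText(new StringReader(html));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

To process a file, you could pass a java.io.FileReader (or a reader with an explicit charset) to extractText(Reader). How whitespace inside the text is treated is up to the parser, so check the result against your own documents.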
 