How to convert HTML file content to .txt file format

 
Ranch Hand
Posts: 199
I have taken a web page and saved it on my system. I want that .html file to be converted to .txt format. Can anyone tell me how to do this in Java?
 
Saloon Keeper
Posts: 4038
For that you'd have to define what "conversion" means - removal of all HTML tags? There might be parts of the content that are loaded via AJAX; those would not be in the HTML file, and thus not in the result of the "conversion".
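
If "conversion" is taken to mean nothing more than stripping the tags, a minimal sketch might look like the following. The class name `TagStripper`, its method names, and the UTF-8 encoding are assumptions made for illustration. The regular expression is a crude heuristic: it does not decode entities such as `&amp;`, it keeps the contents of script and style blocks, and it can break on a `<` inside an attribute value. As noted above, it also sees nothing that the page loads via AJAX.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: crude tag removal, not a real HTML parser.
public class TagStripper {

    // Replace every <...> run with a space, then collapse runs of whitespace.
    public static String stripTags(String html) {
        String noTags = html.replaceAll("<[^>]*>", " ");
        return noTags.replaceAll("\\s+", " ").trim();
    }

    // Read an .html file (assumed to be UTF-8) and write the stripped text to a .txt file.
    public static void convert(Path htmlFile, Path txtFile) throws IOException {
        String html = new String(Files.readAllBytes(htmlFile), StandardCharsets.UTF_8);
        Files.write(txtFile, stripTags(html).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java TagStripper <input.html> <output.txt>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Running it with the saved page and a target file name as the two arguments writes the tag-free text to the target file.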
 
Bartender
Posts: 2856
If conversion means extracting the text out of the HTML, then you will need an HTML parser; you can write one or use an existing one.
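
Writing a parser from scratch may not be necessary: the JDK ships a basic HTML parser as part of Swing (`javax.swing.text.html.parser.ParserDelegator`), and third-party libraries such as jsoup exist as well. The sketch below uses the Swing parser to collect the text nodes of a document. The class name `HtmlTextExtractor` and the choice to put each text chunk on its own line are assumptions for illustration. The Swing parser targets an old HTML dialect, so treat this as a starting point rather than a robust solution for modern pages.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper built on the HTML parser bundled with the JDK (Swing).
public class HtmlTextExtractor {

    // Parse the HTML and return the text chunks it contains, one per line.
    public static String extractText(String html) throws IOException {
        final StringBuilder text = new StringBuilder();

        // The parser calls handleText for each run of character data it finds.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data).append('\n');
            }
        };

        try (Reader reader = new StringReader(html)) {
            // true = ignore any charset declared inside the document
            new ParserDelegator().parse(reader, callback, true);
        }
        return text.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String sample = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>";
        System.out.println(extractText(sample));
    }
}
```

To convert a saved page, read the .html file into a string (or pass a `FileReader` to the parser directly) and write the returned text to a .txt file.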
 
It is sorta covered in the JavaRanch Style Guide.