
HTML pretty printer

 
Adnan Chaudhry
Greenhorn
Posts: 4
I'm working on a project that converts an HTML file into a well-formatted version, to make the source code look 'nicer'.
Basically, I have to take care of indentation and remove redundant tags.
Any suggestions on how I might tackle this problem?
-Adnan
 
Cindy Glass
"The Hood"
Sheriff
Posts: 8521
Read in the file.
Parse it, looking for specific no-nos.
Replace those with prettier HTML.
Write out the file.
Pretty straightforward.
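
The four steps above could be sketched roughly as follows. This is only a minimal illustration, not a complete cleaner: the class name `HtmlCleaner`, its methods, and the choice of "no-no" (an inline tag such as `<b></b>` with nothing inside it) are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical helper: read, find one kind of "no-no", replace, write.
class HtmlCleaner {
    // Assumed "no-no": an inline tag that is opened and closed with only
    // whitespace in between, e.g. <b></b> or <font size="2"> </font>.
    private static final Pattern EMPTY_INLINE = Pattern.compile(
            "<(b|i|u|em|strong|font)(\\s[^>]*)?>\\s*</\\1>",
            Pattern.CASE_INSENSITIVE);

    // Removes empty inline tags, repeating until nothing changes so that
    // nested cases such as <i><b></b></i> are also cleaned up.
    static String clean(String html) {
        String previous;
        String current = html;
        do {
            previous = current;
            current = EMPTY_INLINE.matcher(previous).replaceAll("");
        } while (!current.equals(previous));
        return current;
    }

    // Steps 1 and 4: read the whole file, clean it, write the result.
    // UTF-8 is assumed here; real pages may declare another encoding.
    static void cleanFile(Path in, Path out) throws IOException {
        String html = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
        Files.write(out, clean(html).getBytes(StandardCharsets.UTF_8));
    }
}
```

For example, `HtmlCleaner.clean("<p>Hi <b></b>there</p>")` returns `<p>Hi there</p>`. A regular expression like this only copes with very simple patterns; anything that depends on nesting needs real parsing.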
 
Steve Deadsea
Ranch Hand
Posts: 125
The devil is in the details. Parsing is the hard part. The best approach is to learn to use lexer and parser generators.
http://dmoz.org/Computers/Programming/Languages/Java/Development_Tools/Translators/Lexer_and_Parser_Generators/
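
To give a feel for what the lexing step involves, here is a small hand-written sketch that splits HTML into tag and text tokens and indents them by nesting depth. It is not output from any generator; the class `HtmlIndenter`, its method names, and the short list of void tags are assumptions made for illustration. It deliberately ignores the hard cases (comments or attribute values containing `>`, `<script>` and `<pre>` content, unclosed tags such as a bare `<p>`), which is exactly where a proper lexer and grammar earn their keep.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: a naive tokenizer plus depth-based indentation.
class HtmlIndenter {
    // Assumed subset of tags that never have a closing tag.
    private static final Set<String> VOID_TAGS = new HashSet<>(
            Arrays.asList("br", "hr", "img", "input", "meta", "link"));

    static String indent(String html) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        int pos = 0;
        while (pos < html.length()) {
            String token;
            if (html.charAt(pos) == '<') {
                // Tag token: everything up to and including the next '>'.
                int end = html.indexOf('>', pos);
                if (end < 0) end = html.length() - 1;
                token = html.substring(pos, end + 1);
                pos = end + 1;
            } else {
                // Text token: everything up to the next '<'.
                int end = html.indexOf('<', pos);
                if (end < 0) end = html.length();
                token = html.substring(pos, end).trim();
                pos = end;
            }
            if (token.isEmpty()) continue; // skip whitespace between tags

            boolean closing = token.startsWith("</");
            if (closing && depth > 0) depth--; // closing tags step back out
            for (int i = 0; i < depth; i++) out.append("  ");
            out.append(token).append('\n');

            // Opening tags (not void, not self-closed, not <!...>) nest deeper.
            boolean opening = token.startsWith("<") && !closing
                    && !token.startsWith("<!") && !token.endsWith("/>")
                    && !VOID_TAGS.contains(tagName(token));
            if (opening) depth++;
        }
        return out.toString();
    }

    // Extracts the lower-cased tag name from a token such as "<P class=x>".
    static String tagName(String tag) {
        int i = 1;
        while (i < tag.length() && Character.isLetterOrDigit(tag.charAt(i))) i++;
        return tag.substring(1, i).toLowerCase(Locale.ROOT);
    }
}
```

Calling `HtmlIndenter.indent("<html><body><p>Hi<br>there</p></body></html>")` puts each tag and text run on its own line, indented two spaces per nesting level.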
 