• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Tim Cooke
  • paul wheaton
  • Paul Clapham
  • Ron McLeod
Sheriffs:
  • Jeanne Boyarsky
  • Liutauras Vilda
Saloon Keepers:
  • Tim Holloway
  • Carey Brown
  • Roland Mueller
  • Piet Souris
Bartenders:

Converting '>' to "& l t ;"

 
Greenhorn
Posts: 17
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I am reading in an html file and converting it to xml. The problem is that when I come across special characters like < > & ... it makes the xml invalid when you try to read it with explorer. I started to write code to convert '<' to "& l t ;" (without spaces) but I keep running into new ones every time I add one to the list. Is there a java class that I can use to convert these symbols or find a list of invalid xml characters and there appropriate alternative?
 
Leverager of our synergies
Posts: 10065
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You can use Tidy for automatic HTML -> XML conversion. It converts all illegal symbols into escape sequences.
[This message has been edited by Mapraputa Is (edited July 10, 2001).]
 
Don't get me started about those stupid light bulbs.
reply
    Bookmark Topic Watch Topic
  • New Topic