Escape HTML characters - what's the good practice?
posted 5 years ago
I have a line of text that needs to be displayed on a web page, printed in a PDF, and possibly also output as XML. This property is displayed in multiple places in my application. Recently we hit a case where the text contained a "less than" symbol (<), a well-known HTML special character. The text gets truncated in HTML but prints fine in the PDF. If I use StringEscapeUtils.escapeHtml in the JSP, the entire text is displayed without problems. But the property is rendered as HTML in a lot of places in the application. I could find all of them and use StringEscapeUtils there to escape the HTML characters, but that feels like bad design. What if I need to display this text somewhere else in the future? It could become an ongoing problem.
So I thought, why not wrap the escaping logic in the domain object that returns this text? But then the PDF reports started printing the literal &lt; instead of the symbol. Not a good option either.
I thought of the visitor pattern. But somehow I need to provide a context (e.g. HTML, PDF, XML) so that the text comes out correctly formatted for that output.
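To make the "provide a context" idea concrete, here is a minimal sketch of what I have in mind, not a finished design. The domain object keeps returning the raw text, and the escaping is chosen by the output context at render time. The names EscapeDemo, OutputFormat, escapeMarkup and render are made up for illustration, and treating PDF as plain text is an assumption about the PDF layer. The hand-written escaper is only there to keep the example self-contained; in the real application the HTML case could delegate to StringEscapeUtils.escapeHtml.

```java
import java.util.function.UnaryOperator;

// Illustrative sketch only: EscapeDemo, OutputFormat, escapeMarkup and render are made-up names.
class EscapeDemo {

    /** Each output context carries its own encoding rule for raw text. */
    enum OutputFormat {
        HTML(EscapeDemo::escapeMarkup),
        XML(EscapeDemo::escapeMarkup),
        PDF(UnaryOperator.identity()); // assumption: the PDF layer takes plain text as-is

        private final UnaryOperator<String> encoder;

        OutputFormat(UnaryOperator<String> encoder) {
            this.encoder = encoder;
        }

        String encode(String raw) {
            return encoder.apply(raw);
        }
    }

    /** Escapes the five characters that are special in both HTML and XML. */
    static String escapeMarkup(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** The domain object hands over raw text; the caller names the context. */
    static String render(String raw, OutputFormat format) {
        return format.encode(raw);
    }

    public static void main(String[] args) {
        String raw = "width < 10";
        System.out.println(render(raw, OutputFormat.HTML)); // width &lt; 10
        System.out.println(render(raw, OutputFormat.PDF));  // width < 10
    }
}
```

With something like this, the stored value is never escaped. Each view (JSP, PDF report, XML export) asks for the encoding it needs, so adding another output later means adding one more constant rather than hunting down every place the property is displayed.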