This is about a Java class. If you're interested, it's a Decorator class for an old and clunky JSP table widget called DisplayTag.
I need to HTML-encode data that is pulled from our database so that we never put raw and potentially malicious data directly into the browser. Currently we use the org.owasp.esapi.ESAPI library and call ESAPI.encoder().encodeForHTML(rawVal), but it doesn't play nice when unit testing and always fails due to some Reflection lookup failure. That kinda sucks.
I also heard that the ESAPI project is dead now. Is that true?
What is the best Java tool to encode my raw data into safe HTML Strings? What do you guys use?
What's the nature of the data being encoded? If all you need is replacement of special characters (such as <) with HTML entities, then simple string replacements could be used. But I suspect you need more than that...
There's some data being presented that's retrieved from the database, and that data was put in the database as the result of some user input somewhere or other. The purpose of the encoding is to ensure that if the user entered malicious data, we do not present it back to the browser as is. Perhaps the user entered some JS code; I don't want to put that back in the browser and have it interpreted as a runnable script that could result in some unwelcome action being taken against the application. Or perhaps they've entered an anchor link to somewhere we really don't want to go. Or perhaps an img tag with a huge picture of a bear. You know the deal.
The ESAPI library I mentioned is just a String to String transformation where HTML markup characters, such as < and the like, get replaced by their HTML-encoded equivalents, like &lt; or whatever it is.
In that case, I'd just write a simple method that uses String.replace to change all < and > characters to their HTML entity equivalents. I've seen some methods that also change quote characters, but I'm not sure that's necessary unless you will be using the text as attribute values. Would that satisfy the requirements?
Of course you still need to be careful where you put the text in your own markup. If you stick it inside <script> tags, well...
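A minimal sketch of the kind of method described above, not a vetted security library. The class name HtmlEscaper and the method name escapeHtml are hypothetical. Two things here go beyond what the post asked for and are assumptions on my part: the ampersand is escaped as well (and first, so the entities produced by the later replacements are not escaped a second time), and both quote characters are escaped so the result can also be used inside a quoted attribute value.

```java
public final class HtmlEscaper {

    private HtmlEscaper() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Replaces the characters that are significant in HTML text and in
     * quoted attribute values with their entity equivalents.
     * Returns an empty string for null input (an arbitrary choice for this sketch).
     */
    public static String escapeHtml(String raw) {
        if (raw == null) {
            return "";
        }
        // '&' must be replaced first; otherwise the '&' in "&lt;" and friends
        // would itself be escaped by a later step.
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;")
                  .replace("\"", "&quot;")
                  .replace("'", "&#39;");
    }
}
```

With this, HtmlEscaper.escapeHtml("<script>alert('hi')</script>") returns &lt;script&gt;alert(&#39;hi&#39;)&lt;/script&gt;, which the browser renders as text instead of running it. It only covers text placed in element content or quoted attribute values; as noted above, it does not make the data safe inside <script> tags.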
Thanks for that, Christian. I was just coming back to talk about that very thing.
From what Bear has told me, there doesn't appear to be much to it. Just a handful of character replacements are required. So as I already have Apache Commons available in the project, and I'm too lazy to roll my own, I think StringEscapeUtils will be a suitable replacement for the troublesome-to-test ESAPI.