As far as I am aware, all the content that is being attempted to be served is UTF-8. Well, maybe that's not the case. You probably need to gather more info. How about using getContentType() to find out whether whatever you're connecting to has given you any more info about the encoding? Typically this may be something like
"text/html; charset=UTF-8"
So you can try parsing the charset out of this field. If none is provided, the default is supposed to be ISO-8859-1. In practice, sadly, it is not that unusual for servers to fail to specify these fields correctly. The next line of defense is to initially assume the encoding is ISO-8859-1, use that to interpret the HTML that follows, and look for a meta tag which has the real encoding in it. E.g.
<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
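
For the first step, pulling the charset out of the Content-Type value, here is a rough sketch in Java. The class and method names (ContentTypeCharset, charsetOf) are made up for illustration, and it assumes you want to fall back to ISO-8859-1 whenever the header is missing, has no charset parameter, or names a charset the JVM doesn't know.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ContentTypeCharset {

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=UTF-8". Falls back to ISO-8859-1 when the value
    // is null, has no charset parameter, or names an unknown charset.
    public static Charset charsetOf(String contentType) {
        if (contentType == null) {
            return StandardCharsets.ISO_8859_1;
        }
        // parts[0] is the media type; the rest are parameters.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // Some servers quote the value: charset="UTF-8"
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the default.
                    return StandardCharsets.ISO_8859_1;
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8"));
        System.out.println(charsetOf("text/html"));
    }
}
```

If you are using a java.net.URLConnection, you would pass it the result of getContentType(), which may be null if the server sent nothing.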
Naturally, servers that fail to properly specify their encodings are EVIL.

But we may have to deal with them nonetheless...
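
One way to deal with them, as a sketch only: decode the start of the response body as ISO-8859-1 (every byte maps to some character, so this never fails, and ASCII markup comes through intact), then look for a charset inside a meta tag. The names here (MetaCharsetSniffer, sniff, the limit parameter) are invented for the example, and the regex is a deliberately simple approximation, not a real HTML parser.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaCharsetSniffer {

    // Finds charset=NAME inside a <meta ...> tag, case-insensitively.
    // Covers both <meta http-equiv=... content="...; charset=X"> and
    // <meta charset="X">.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([\\w.:+-]+)",
            Pattern.CASE_INSENSITIVE);

    // Looks at up to `limit` leading bytes of the body. Returns the charset
    // declared in a meta tag, or ISO-8859-1 if none is found or the
    // declared name is not one the JVM supports.
    public static Charset sniff(byte[] body, int limit) {
        int n = Math.min(body.length, limit);
        // Provisional decode: ISO-8859-1 maps each byte to one char.
        String head = new String(body, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported name: fall through to the default.
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        byte[] page = ("<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=UTF-8\"></head></html>")
                .getBytes(StandardCharsets.ISO_8859_1);
        Charset cs = sniff(page, 4096);
        // Re-decode the full body with the charset that was found.
        System.out.println(cs + ": " + new String(page, cs).length() + " chars");
    }
}
```

Once you have the charset, throw away the provisional string and decode the whole body again with the real one.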