
HTTPClient-character encoding

 
mj zammit
Ranch Hand
Posts: 49
I am using HTTPClient in my web application.
With HTTPClient I can use the getResponseCharSet method to retrieve the encoding of the response body.
I do not understand this sentence, which can be found at the URL HTTPClient-Character encoding:

If the response is known to be a String, you can use the getResponseBodyAsString method which will automatically use the encoding specified in the Content-Type header or ISO-8859-1 if no charset is specified.


Does this mean that it handles all types of encodings?
Also, what does "if the response is known to be a String" mean?
I do not understand how HTTPClient goes about it.

Any comments or suggestions will be greatly appreciated.
 
Ulf Dittmer
Rancher
Posts: 42970
The response body doesn't need to be a string (in other words, text) - it could be binary data, such as an image. That sentence explains which encoding will be used if it is treated as a string.
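
To make the quoted rule concrete, here is a minimal sketch in plain Java of "use the charset from the Content-Type header, otherwise ISO-8859-1". It is not HttpClient's actual code; the class CharsetSketch and the helpers resolveCharset and decodeBody are made-up names for illustration, and the header parsing is deliberately simplistic.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSketch {

    // Hypothetical helper (not part of HttpClient): return the charset named
    // in a Content-Type header value, or ISO-8859-1 when none is usable.
    public static Charset resolveCharset(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive match on the "charset=" parameter name.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Unknown or malformed charset name: use the fallback.
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Turn the raw body bytes into a Java String using the resolved charset.
    public static String decodeBody(byte[] body, String contentType) {
        return new String(body, resolveCharset(contentType));
    }

    public static void main(String[] args) {
        // Simulated response body: the text "café" encoded as UTF-8.
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Header names the charset, so the text decodes correctly.
        System.out.println(decodeBody(body, "text/html; charset=UTF-8"));

        // No charset in the header: the ISO-8859-1 fallback garbles the last letter.
        System.out.println(decodeBody(body, "text/html"));
    }
}
```

For binary content such as an image, no charset applies at all, so a conversion like this would not produce anything meaningful.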
 
mj zammit
Ranch Hand
Posts: 49
I see...
So basically, if the Content-Type header is not of type text/html with charset=UTF-8 (or, by default, ISO-8859-1), I should not use the getResponseBodyAsString method to retrieve the contents, but read them as bytes instead.
Am I correct?
 
Ulf Dittmer
Rancher
Posts: 42970
The way I read it (and mind you, I have no first-hand knowledge of how it actually works) is that you can always use getResponseBodyAsString IF the content is really text. Obviously you can't use it if the body is binary content.

All that sentence tells you is which encoding the library uses to transform the response body into a Java string. So you may be in trouble if a) the Content-Type header does not specify a charset AND b) the charset the server actually used for the body is NOT ISO-8859-1. If that's the case you're simply out of luck, and the web site that does it is crappy.
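
The trouble case can be reproduced without any HTTP traffic. The sketch below is plain Java with made-up names (MisdecodeSketch, decodeKnown, redecode); it assumes you somehow know the real charset, which is exactly the information the missing header fails to provide. In HttpClient 3.x the raw bytes would, as far as I know, come from getResponseBody() rather than getResponseBodyAsString().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MisdecodeSketch {

    // Decode raw body bytes with a charset the caller already knows.
    // Hypothetical helper: nothing is detected here.
    public static String decodeKnown(byte[] rawBody, Charset actual) {
        return new String(rawBody, actual);
    }

    // Undo an ISO-8859-1 mis-decoding. ISO-8859-1 maps every byte to exactly
    // one char, so encoding the string back recovers the original bytes.
    public static String redecode(String misdecoded, Charset actual) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, actual);
    }

    public static void main(String[] args) {
        String expected = "Gr\u00fc\u00dfe";

        // Simulated body: UTF-8 text sent without a charset in Content-Type.
        byte[] rawBody = expected.getBytes(StandardCharsets.UTF_8);

        // What an ISO-8859-1 fallback would hand back as a String.
        String fallback = new String(rawBody, StandardCharsets.ISO_8859_1);

        System.out.println(fallback.equals(expected));                                     // false
        System.out.println(decodeKnown(rawBody, StandardCharsets.UTF_8).equals(expected)); // true
        System.out.println(redecode(fallback, StandardCharsets.UTF_8).equals(expected));   // true
    }
}
```

So the fallback does not destroy the data, but repairing it still depends on knowing or guessing the real charset from some other source.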
 