
Unable to read charset using RSS feed

Rudy B Baylor
Posts: 6
Using package, I am trying to read an HTML page, which has Content-Type as

It is critical for me to be able to read the charset mentioned in the tag above.

Both urlConnection.getContentType() and urlConnection.getHeaderField("Content-Type") return just "text/html". I believe this is because those methods take their value from the HTTP response header rather than from the <meta> tag shown above.

Is there a way of getting the values of the <meta> tags beforehand, so that one can determine which charset to use while reading?
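For illustration, one possible approach is to peek at the first bytes of the stream, decode them as ISO-8859-1 (which maps every byte to a character, so the ASCII markup survives), and search for a charset declaration inside a <meta> tag. The sketch below is only that, a sketch: the class name MetaCharsetSniffer, the method sniff, the regular expression, and the fallback parameter are all hypothetical and not part of any library. It assumes Java 11+ (for readNBytes) and that the declaration appears within the first `limit` bytes.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a library class.
public class MetaCharsetSniffer {

    // Assumed pattern: finds "charset=NAME" inside a <meta ...> tag. It covers
    // <meta http-equiv="Content-Type" content="text/html; charset=NAME">
    // as well as <meta charset="NAME">. It is not a full HTML parser.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9_.:-]+)",
            Pattern.CASE_INSENSITIVE);

    // Peeks at up to `limit` bytes, then rewinds the stream so the caller
    // can read the page from the beginning with the returned charset.
    public static Charset sniff(BufferedInputStream in, int limit, Charset fallback)
            throws IOException {
        in.mark(limit);
        byte[] head = in.readNBytes(limit); // Java 11+
        in.reset();

        // ISO-8859-1 decodes any byte sequence without errors.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(text);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: fall through.
            }
        }
        return fallback;
    }
}
```

With a URLConnection this might be used as `new BufferedInputStream(urlConnection.getInputStream())` passed to `sniff(...)`, followed by wrapping the same stream in an InputStreamReader built with the returned charset.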

I need to read an HTML page and write it to an already initialized response object. For that, it is critical for me to determine the encoding of the HTML file.

Transferring bytes directly from the InputStream to the response OutputStream, irrespective of encoding, does not work: response.getWriter() has already been called, so response.getOutputStream() throws an IllegalStateException.
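Since only the Writer is available, one option would be to decode the incoming bytes with the page's charset and copy characters instead of bytes. The following is a minimal sketch under that assumption. The class CharsetCopy and its copy method are hypothetical names, and the charset is assumed to be known already. Any java.io.Writer works as the target, including the PrintWriter returned by response.getWriter().

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper, not a library class.
public class CharsetCopy {

    // Decodes `in` using `charset` and writes the resulting characters to `out`.
    // The bytes finally sent to the client are encoded by `out` itself, so the
    // source charset only has to be right for decoding.
    public static void copy(InputStream in, Charset charset, Writer out)
            throws IOException {
        Reader reader = new InputStreamReader(in, charset);
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        out.flush();
    }
}
```

Note that in a servlet the output encoding of the Writer is fixed once getWriter() has been called, so characters that the response encoding cannot represent would still be lost.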

Someone please advise ways to resolve the problem.

Thanks in advance
Joe Ess
Posts: 9406
Please do not post the same question more than once. It causes confusion and duplication of effort as the community tries to help everyone.