Okay, I tried this out and here's the update:
You are getting this XML in response to an HTTP request? Then it's possible that the encoding declared in the XML is different than the charset of the response. The rule in this case is that the charset of the response should be used by the parser, rather than the encoding declared by the XML.
Yes. This XML is an HTTP response. The charset of the response is utf-8 (as set in the header
Content-Type: text/xml;charset=utf-8). When I open the link in the browser and save it as XML, the XML declaration also shows the encoding as "utf-8":
<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
However you're passing an InputStream to the parser, so the parser has no way to find out what was the charset of the response. Try passing the URL of your HTTP request instead and let the parser deal with the response directly. Or alternatively, get the charset from the response and construct a Reader which uses that charset.
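For reference, the second suggestion (take the charset from the response and build a Reader with it) could look roughly like the sketch below. It uses only the standard library; the class name ResponseReaders and the helpers charsetOf and readerFor are made up for illustration, and falling back to UTF-8 when the header has no usable charset is an assumption, not something the parser or the HTTP stack mandates.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
final class ResponseReaders {

    // Extracts the charset parameter from a Content-Type header value such as
    // "text/xml;charset=utf-8". Falls back to UTF-8 (an assumption) when the
    // header is missing, has no charset parameter, or names an unknown charset.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: use the fallback below.
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Wraps the raw response body in a Reader that decodes with the response charset.
    static Reader readerFor(InputStream body, String contentType) {
        return new InputStreamReader(body, charsetOf(contentType));
    }
}
```

The resulting Reader could then be handed to the pull parser through its setInput(Reader) overload instead of the InputStream variant, so the parser never has to guess the encoding.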
As per the javadocs of the Pull parser bundled with the Android SDK, when I call setInput on the parser instance, the parser tries to determine the encoding based on certain conditions:
public abstract void setInput (InputStream inputStream, String inputEncoding)
Since: API Level 1
Sets the input stream the parser is going to process. This call resets the parser state and sets the event type to the initial value START_DOCUMENT.
NOTE: If an input encoding string is passed, it MUST be used. Otherwise, if inputEncoding is null, the parser SHOULD try to determine input encoding following XML 1.0 specification (see below). If encoding detection is supported then following feature http://xmlpull.org/v1/doc/features.html#detect-encoding MUST be true and otherwise it must be false.
Parameters
inputStream contains a raw byte input stream of possibly unknown encoding (when inputEncoding is null).
inputEncoding if not null it MUST be used as encoding for inputStream
I tried setting the encoding explicitly as "utf-8", but this still doesn't work; I get exceptions.
When I looked into the HTTP traffic using a sniffer (Charles Proxy) and tried to view the response XML, the tool told me that there is an invalid Unicode character in a CDATA section, so it cannot parse the XML to populate the view.
[Failed to parse data: org.xml.sax.SAXParseException: An invalid XML character(Unicode 0x12) was found in CDATA section.]
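U+0012 is a control character that XML 1.0 does not allow anywhere in a document, whatever the encoding. One possible workaround, sketched here as an assumption rather than a confirmed fix, is to drop such characters from the response text before it reaches the parser. The class name XmlCharFilter and its methods are invented for this example; the allowed ranges follow the Char production of the XML 1.0 specification.

```java
// Hypothetical helper class, not part of any library.
final class XmlCharFilter {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every code point that XML 1.0 forbids removed.
    static String strip(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isXml10Char(cp)) {
                out.appendCodePoint(cp);
            }
            // Advance by one or two chars depending on whether cp is supplementary.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

This silently discards data, so it only makes sense if the stray control characters carry no meaning in the feed. The cleaned string could then be wrapped in a StringReader and passed to the parser.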
Maybe I should try creating a Reader with the appropriate charset (utf-8) and passing that to the parser?