So considering the input as an XML document, you would be extracting the content of all text nodes. Was that what you meant by "plain text", Fred?
Note also that if you're extracting text nodes from an XML document, you should perform XML unescaping on them; consider a document whose markup contains "Fortnum &amp; Mason", where the text node represents the string "Fortnum & Mason".
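To make the unescaping point concrete, here is a minimal Python sketch. The sample document and its element names (shop, name, city) are made up for illustration, and the helper names are hypothetical. It assumes the input is well-formed XML. A real XML parser hands back text nodes already unescaped, whereas stripping tags by hand leaves the escaped form behind and needs a separate unescaping step.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape

# Hypothetical sample document; the element names are invented for this example.
DOC = "<shop><name>Fortnum &amp; Mason</name><city>London</city></shop>"


def text_nodes(xml_source):
    """Return the content of every text node, unescaped by the parser."""
    root = ET.fromstring(xml_source)
    return list(root.itertext())


def strip_tags_naively(xml_source):
    """Remove anything that looks like a tag; entities are left untouched."""
    return re.sub(r"<[^>]+>", "", xml_source)


parsed = text_nodes(DOC)          # parser output, already unescaped
naive = strip_tags_naively(DOC)   # still contains "&amp;"
fixed = unescape(naive)           # handles &amp;, &lt; and &gt;

print(parsed)
print(naive)
print(fixed)
```

The tag-stripping route also runs adjacent text nodes together and knows nothing about CDATA sections or character references, so letting the parser do the work is usually the safer choice.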