The AccessingFileFormats wiki page has a bunch of information on the subject. Look into Jakarta POI for converting DOC files to text, and JPedal or PDFTextStream for extracting text from PDFs.
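
Since you'd be handling more than one file format, here is a rough sketch of how the pieces could be wired together: pick an extractor by file extension and let each library sit behind a small interface. The names TextExtractors, TextExtractor, register and readUtf8 are all made up for illustration and are not part of POI, JPedal or PDFTextStream. Only a plain-text extractor is actually implemented here, using the standard library; the DOC and PDF extractors would be wrappers around whichever library you pick.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a file extension to the code that extracts its text.
class TextExtractors {

    /** Hypothetical interface: turns a document stream into plain text. */
    interface TextExtractor {
        String extract(InputStream in) throws IOException;
    }

    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    /** Registers an extractor for an extension such as "doc" or "pdf" (case-insensitive). */
    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Looks up the extractor for the file's extension and runs it on the stream. */
    String extract(String filename, InputStream in) throws IOException {
        int dot = filename.lastIndexOf('.');
        String ext = dot < 0 ? "" : filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = byExtension.get(ext);
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + filename);
        }
        return extractor.extract(in);
    }

    /** Plain-text extractor using only the standard library (requires Java 9+ for readAllBytes). */
    static String readUtf8(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
}
```

You'd register the plain-text case with something like register("txt", TextExtractors::readUtf8), then add a "doc" extractor that wraps POI and a "pdf" extractor that wraps JPedal or PDFTextStream. Check each library's own documentation for the exact classes to call; I've left those out on purpose rather than guess at them.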
By the way, you should make the topic of your posts more descriptive - "guidance" conveys nothing.
[ June 16, 2006: Message edited by: Ulf Dittmer ]