Java code to extract an embedded PDF from a DOC file

 
Sunil Baboo
Greenhorn
Posts: 16
How can we extract an embedded PDF file from a DOC file using Java?


Here's a link to one of these Word files that has documents embedded into it:
http://www.seattle.gov/purchasing/docs/bids/ITBCTY11150.doc

Is there a way to extract these embedded files with Java code, similar to extracting files from a ZIP file?

Any suggestions would be worth a lot to me.
Thanks in advance.
 
Lester Burnham
Rancher
Posts: 1337
Your best bet is probably the (open source) Apache POI library, or the (commercial) Aspose products. Those are about the only Java libraries that have intimate knowledge of the DOC/DOCX formats.

Using OpenOffice in server mode, and accessing it from a Java client, could be another possibility, assuming it has a way to extract embedded files programmatically.
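
If pulling in one of those libraries is not an option, a rough dependency-free fallback is to scan the raw bytes of the DOC file for PDF signatures. The sketch below is only a heuristic, not how Apache POI or Aspose work: it assumes each embedded PDF is stored uncompressed and in one contiguous run of bytes between a `%PDF-` header and its final `%%EOF` marker, which the DOC container format does not guarantee (a fragmented stream would yield a corrupt file). The class name `EmbeddedPdfCarver` and the output file names are made up for illustration. A format-aware library remains the more robust route.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: carves byte ranges that look like PDFs out of any binary file.
public class EmbeddedPdfCarver {
    private static final byte[] PDF_START = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PDF_END = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /** Returns the index of the first occurrence of pattern at or after from, or -1. */
    static int indexOf(byte[] data, byte[] pattern, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    /** Carves each range from a %PDF- header to the last %%EOF before the next header. */
    static List<byte[]> carvePdfs(byte[] data) {
        List<byte[]> pdfs = new ArrayList<>();
        int start = indexOf(data, PDF_START, 0);
        while (start >= 0) {
            int next = indexOf(data, PDF_START, start + PDF_START.length);
            int limit = next >= 0 ? next : data.length;
            // Incrementally updated PDFs hold several %%EOF markers; keep the last one in range.
            int end = -1;
            int pos = indexOf(data, PDF_END, start);
            while (pos >= 0 && pos + PDF_END.length <= limit) {
                end = pos + PDF_END.length;
                pos = indexOf(data, PDF_END, end);
            }
            if (end > 0) {
                pdfs.add(Arrays.copyOfRange(data, start, end));
            }
            start = next;
        }
        return pdfs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java EmbeddedPdfCarver <file.doc>");
            return;
        }
        byte[] doc = Files.readAllBytes(Paths.get(args[0]));
        List<byte[]> pdfs = carvePdfs(doc);
        for (int i = 0; i < pdfs.size(); i++) {
            Path out = Paths.get("embedded-" + (i + 1) + ".pdf");
            Files.write(out, pdfs.get(i));
            System.out.println("Wrote " + out + " (" + pdfs.get(i).length + " bytes)");
        }
    }
}
```

Run it against a downloaded copy of the Word file, for example `java EmbeddedPdfCarver ITBCTY11150.doc`, and check each output in a PDF viewer; anything that fails to open was most likely not stored contiguously and needs a proper container-aware extraction.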
 