
"parsing" PDF documents

 
Stephen Huey
Ranch Hand
Posts: 618
I see online that there are quite a few PDF libraries:

http://www.geocities.com/marcoschmidt.geo/java-libraries-pdf.html

I also see that some of them support extracting text from PDFs:

http://multivalent.sourceforge.net/Tools/doc/ExtractText.html

I see that JPedal claims it can extract form data, etc., but it costs some serious money, and I don't know whether it would be feasible for me.

My basic problem is that I have some PDFs that I would like to send to people so they can type in answers to questions and send the files back to me, and I need to get those answers out of the PDFs with Java. If you can't spend much money, would the best option be to just use one of these tools to extract the text and then parse that text?
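To make the "extract the text, then parse it" idea concrete, here is a rough sketch of the parsing half only. It assumes some extraction tool has already produced plain text and that each answer ends up on its own line as "Label: value". That layout is an assumption, since real extractors may order or wrap text differently. The class and method names (ExtractedTextParser, parseAnswers) are made up for illustration and do not belong to any PDF library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any PDF library.
final class ExtractedTextParser {
    private ExtractedTextParser() {}

    // Assumption: each answer sits on one line as "Label: value".
    // Lines without a label before the first colon are skipped.
    static Map<String, String> parseAnswers(String extractedText) {
        // LinkedHashMap keeps the answers in the order they appear in the text.
        Map<String, String> answers = new LinkedHashMap<>();
        for (String line : extractedText.split("\\R")) { // \R matches any line break
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // no colon, or nothing before it
            }
            String label = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (!label.isEmpty()) {
                answers.put(label, value);
            }
        }
        return answers;
    }
}
```

Calling ExtractedTextParser.parseAnswers on the extracted text gives a map from question label to typed answer. Only the first colon on a line is treated as the separator, so answers that contain colons themselves (URLs, times) stay intact.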

If you have any tips on tools that you particularly like, please let me know! Thanks...
 
Julian Kennedy
Ranch Hand
Posts: 823
The opinion I've seen (and my own as well) is that PDF is great if you want to print stuff or if you want really polished presentation.

If you want to get people to fill in forms so that you can collate the data, why not just use HTML (the web)? Then it's easy-peasy!

Jules
 
Stephen Huey
Ranch Hand
Posts: 618
Very true. I love to use the web, but alas, sometimes people want stuff like this...having other folks edit PDFs and submit them, etc.
 