
Problem using PDFBox to extract text from PDF documents

 
Konstantinos Vasileiou
Greenhorn
Posts: 16
Hi all,

I am trying to extract the textual content of PDF files from my Java code. I am trying to use PDFBox 0.7.3, and the examples I have found online so far are rather limited. Basically, I did something like this:


and I get the following exception:


Any suggestions from the more PDFBox-experienced users?
 
Ulf Dittmer
Rancher
Posts: 42975

org/fontbox/afm/AFMParser


Do you have that class on your classpath? Maybe PDFBox comes in several jar files.
 
Konstantinos Vasileiou
Greenhorn
Posts: 16

Ulf Dittmer wrote:

org/fontbox/afm/AFMParser


Do you have that class on your classpath? Maybe PDFBox comes in several jar files.




Yes, you are right. I needed to add the FontBox jar to my build path in order to make it work... Thanks!
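
For anyone hitting the same error, here is a minimal sketch of how one might check at runtime whether the class named in the exception is visible on the classpath. The class name `org.fontbox.afm.AFMParser` is taken from the exception quoted above; the helper `ClasspathCheck` and its method `isOnClasspath` are hypothetical names made up for this illustration and are not part of PDFBox or FontBox.

```java
// Hypothetical helper, not part of PDFBox or FontBox.
final class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without running its static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but could not be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name from the exception, with slashes replaced by dots.
        String name = "org.fontbox.afm.AFMParser";
        if (isOnClasspath(name)) {
            System.out.println(name + " is on the classpath");
        } else {
            System.out.println(name + " is missing; the FontBox jar is probably not on the classpath");
        }
    }
}
```

If the check reports the class as missing, adding the FontBox jar next to the PDFBox jar on the build path, as described above, should resolve it.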
 