
Problem converting PDF to text in Java

 
knazeer ahmed
Greenhorn
Posts: 8
Hi,
I am having a problem converting a PDF to text.
I have a scanned document in PDF format and want to extract the data from it. I tried PDFBox and FontBox, but they only work when the PDF contains real text, not text inside an image.

Can anyone help me with this?


Thanks and Regards,
Nazeer
 
Rob Spoor
Sheriff
Posts: 20893
You will need to use an OCR library for retrieving text from any kind of image.
 
Ulf Dittmer
Rancher
Posts: 42970
... something like Tesseract.
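
To illustrate the OCR suggestion above, here is a minimal sketch that calls the Tesseract command-line tool from Java. It assumes the `tesseract` executable is installed and on the PATH, and that each page of the scanned PDF has already been rendered to an image (for example with PDFBox's PDFRenderer). The class name `OcrSketch`, the file name `page-1.png`, and the language code `eng` are placeholders for illustration, not anything from the posts above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: a sketch of one way to OCR a page image, not a definitive implementation.
final class OcrSketch {

    // Builds the Tesseract command line. Passing "stdout" as the output base
    // makes Tesseract print the recognized text instead of writing a file.
    static List<String> tesseractCommand(Path image, String language) {
        return List.of("tesseract", image.toString(), "stdout", "-l", language);
    }

    // Runs Tesseract on one page image and returns the recognized text.
    // Assumption: the `tesseract` executable is installed and on the PATH.
    static String ocr(Path image, String language) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(tesseractCommand(image, language))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // drop progress messages
                .start();
        String text = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder input: an image of one scanned page.
        Path image = Paths.get(args.length > 0 ? args[0] : "page-1.png");
        System.out.print(ocr(image, "eng"));
    }
}
```

Calling the command-line tool keeps the sketch free of extra dependencies; a Java wrapper library for Tesseract would be an alternative if launching an external process is not an option. OCR accuracy depends heavily on scan quality, so the output usually needs checking.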
 