
Reading Text from a PDF file

 
A C Arun
Greenhorn
Posts: 2
Hi,

Can anyone help me read a PDF properly using iText, PDFBox, or a similar library?

I am able to read the lines, but the problem creeps in when there is a watermark, a shadow, or some other formatting applied to the text.

Thanks in advance.

AC
 
Ulf Dittmer
Rancher
Posts: 42970
iText has no provisions for extracting text from PDFs, but either PDFBox or JPedal can do it; check their documentation for details. PDFBox in particular has a class called PDFTextStripper for this.
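
For what it's worth, a minimal sketch of the PDFTextStripper approach might look like the following. It assumes PDFBox 2.x on the classpath (where the class lives in org.apache.pdfbox.text and documents are opened with PDDocument.load; other versions differ). The class name PdfTextDump and the method extractText are made up for illustration and are not part of PDFBox.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical example class; assumes PDFBox 2.x.
public class PdfTextDump {

    // Returns the text of all pages in the given PDF file.
    public static String extractText(File pdfFile) throws IOException {
        // PDDocument is Closeable, so try-with-resources releases the file.
        try (PDDocument document = PDDocument.load(pdfFile)) {
            PDFTextStripper stripper = new PDFTextStripper();
            // Order text by its position on the page rather than by the
            // order in which it appears in the content stream.
            stripper.setSortByPosition(true);
            return stripper.getText(document);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java PdfTextDump <file.pdf>");
            return;
        }
        System.out.println(extractText(new File(args[0])));
    }
}
```

Keep in mind that sorting by position only changes the order of the output. If a watermark is drawn as real text on the page, it will most likely still show up in the result, so some filtering afterwards may be needed.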
 