
Extract only the PDF Page Text Content using iText 5.0.5

 
Divya Kambhatla
Greenhorn
Posts: 18
Hi,

I want to extract the text from a PDF using iText 5.0.5. The problem is that when I extract the text, everything comes out, including page numbers, figure titles, and page titles. I am completely new to the iText API. Could anyone please tell me whether there is a method or interface in iText that extracts ONLY the body text? Or at least let me know whether page numbers, page titles, and figure titles also count as page text?

Thanks in advance!
Divya.
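
To make the goal concrete, here is a minimal sketch of the position-based filtering I have in mind, written in plain Java without any iText calls. It assumes each piece of extracted text comes with a vertical position on the page; `TextChunk`, `BodyTextFilter`, and the band heights are hypothetical names and values used only for illustration. As far as I can tell, later iText 5 releases offer something similar in the `com.itextpdf.text.pdf.parser` package (`RegionTextRenderFilter` combined with `FilteredTextRenderListener`), but I have not confirmed that 5.0.5 includes them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a positioned piece of text reported by a PDF parser.
final class TextChunk {
    final String text;
    final float y; // baseline height in points, measured from the bottom of the page

    TextChunk(String text, float y) {
        this.text = text;
        this.y = y;
    }
}

final class BodyTextFilter {
    // Keeps only the chunks whose baseline lies strictly between the footer band
    // (bottom of the page) and the header band (top of the page).
    static String extractBody(List<TextChunk> chunks, float pageHeight,
                              float headerHeight, float footerHeight) {
        float top = pageHeight - headerHeight;
        List<String> kept = new ArrayList<>();
        for (TextChunk c : chunks) {
            if (c.y > footerHeight && c.y < top) {
                kept.add(c.text);
            }
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        // Assumed A4 page (842 pt tall) with a 60 pt header band and a 50 pt footer band.
        List<TextChunk> page = new ArrayList<>();
        page.add(new TextChunk("Chapter 3: Results", 800f));        // page title
        page.add(new TextChunk("The measurements show ...", 600f)); // body text
        page.add(new TextChunk("... a steady increase.", 580f));    // body text
        page.add(new TextChunk("42", 30f));                         // page number
        System.out.println(extractBody(page, 842f, 60f, 50f));
    }
}
```

This would only drop text that sits in the top and bottom bands. Figure titles are placed inside the body area, so position alone would not separate them from ordinary paragraphs.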
 
Divya Kambhatla
Greenhorn
Posts: 18

In addition to the above, could anyone please explain how a PDF page can be split into its header, footer, and trailer? When I analysed the iText source code, I came across these terms and also a PdfBody class, but I do not understand how I would create a PdfBody and extract the text content from it.

Thank You,
Divya.
 