iText: pdf conversion into other formats
Hello Bruno,

Does iText support converting pdf into other formats?

Or, if not, do you have any experience/preferences in conversion tools?

Cheers,

Gian
 

Gian Franco wrote: Does iText support converting pdf into other formats?



Given the nature of PDF (PDF Creation is supposed to be a "One-Way Process"),
I don't believe in converting PDF into other formats (unless you're talking about
rendering PDF to a raster format).

iText can make a best effort to extract the text from a PDF, and if the PDF is "tagged",
it can convert the PDF to XML, but I don't trust any software that claims it
can convert PDF to Word, Excel, RTF, HTML,...

It's sufficient to look inside the PDF and to inspect the PDF syntax to understand
why they are promising something that is (in many cases) impossible.
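
To illustrate that point, here is a minimal sketch in plain Java. It does not use iText: the content stream is a hand-written, hypothetical fragment, and the names NaiveTextExtractor, Chunk, inStreamOrder and inReadingOrder are made up for this example. It only shows that a page stores positioned runs of text rather than paragraphs, and that the drawing order need not match the reading order, so any converter has to guess the structure.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NaiveTextExtractor {

    // Hypothetical, hand-written content stream fragment. Each line sets an
    // absolute position with the Tm operator and shows a string with Tj.
    // The strings are deliberately drawn out of reading order.
    static final String CONTENT_STREAM =
        "BT\n" +
        "/F1 12 Tf\n" +
        "1 0 0 1 150 700 Tm (World) Tj\n" +
        "1 0 0 1 72 680 Tm (Second line) Tj\n" +
        "1 0 0 1 72 700 Tm (Hello) Tj\n" +
        "ET\n";

    // A positioned run of text: the only "structure" this page carries.
    static final class Chunk {
        final double x;
        final double y;
        final String text;

        Chunk(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    // Matches only the simplified "1 0 0 1 x y Tm (text) Tj" form used above.
    private static final Pattern SHOW_TEXT =
        Pattern.compile("1 0 0 1 (\\S+) (\\S+) Tm \\(([^)]*)\\) Tj");

    // Returns the chunks in the order in which they appear in the stream.
    static List<Chunk> parse(String stream) {
        List<Chunk> chunks = new ArrayList<>();
        Matcher m = SHOW_TEXT.matcher(stream);
        while (m.find()) {
            chunks.add(new Chunk(
                Double.parseDouble(m.group(1)),
                Double.parseDouble(m.group(2)),
                m.group(3)));
        }
        return chunks;
    }

    // What you get if you trust the drawing order.
    static String inStreamOrder(String stream) {
        StringBuilder sb = new StringBuilder();
        for (Chunk c : parse(stream)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(c.text);
        }
        return sb.toString();
    }

    // A guess at reading order: top to bottom, then left to right.
    // Exact comparison of y is enough for this toy input; real pages
    // would need a tolerance and still break on columns, tables, etc.
    static String inReadingOrder(String stream) {
        List<Chunk> chunks = parse(stream);
        chunks.sort(Comparator.comparingDouble((Chunk c) -> -c.y)
                              .thenComparingDouble(c -> c.x));
        StringBuilder sb = new StringBuilder();
        Chunk prev = null;
        for (Chunk c : chunks) {
            if (prev != null) {
                sb.append(prev.y == c.y ? ' ' : '\n');
            }
            sb.append(c.text);
            prev = c;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(inStreamOrder(CONTENT_STREAM));
        System.out.println("---");
        System.out.println(inReadingOrder(CONTENT_STREAM));
    }
}
```

Even in this toy case the "right" order is only a heuristic. Nothing in the stream says which chunks form a paragraph, a heading or a table cell, which is the information a Word or HTML converter would need.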

Gian Franco wrote: Or, if not, do you have any experience/preferences in conversion tools?



No, that would be against my religion ;-)
 

Bruno Lowagie wrote: No, that would be against my religion ;-)

I was thinking of PDF -> TIFF, we have two systems where the latter
format prevails so we're thinking of adding conversion of the former...

Cheers,

Gian

See http://www.coderanch.com/t/497492/java/java/Convert-PDF-files-Tiff-files for one approach to do PDF -> TIFF conversion using PDFBox.
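
As a rough sketch of the last step only: once some renderer has turned a page into a BufferedImage, the standard javax.imageio API can encode it as TIFF, assuming Java 9 or later (earlier JDKs ship no TIFF writer). The rendering itself is not shown here; fakeRenderedPage is a hypothetical stand-in for whatever the renderer produces, and PageToTiff and encodeAsTiff are names invented for this example.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

final class PageToTiff {

    // Hypothetical stand-in for a rendered PDF page: a white canvas with a
    // black frame. A real renderer would supply the image instead.
    static BufferedImage fakeRenderedPage(int width, int height) {
        BufferedImage page =
            new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.setColor(Color.BLACK);
        g.drawRect(10, 10, width - 20, height - 20);
        g.dispose();
        return page;
    }

    // Encodes the image as TIFF in memory. ImageIO.write returns false
    // when no writer for the requested format is installed.
    static byte[] encodeAsTiff(BufferedImage page) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(page, "tiff", out)) {
            throw new IOException(
                "No TIFF writer available (needs Java 9 or later)");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] tiff = encodeAsTiff(fakeRenderedPage(200, 100));
        System.out.println("Encoded " + tiff.length + " bytes of TIFF data");
    }
}
```

Compression, resolution and multi-page output are left out; they would need an explicit ImageWriter and ImageWriteParam rather than the one-line ImageIO.write call.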
 

Gian Franco wrote: I was thinking of PDF -> TIFF, we have two systems where the latter
format prevails so we're thinking of adding conversion of the former...



OK, you are referring to conversion in the sense of "rendering".
I saw that Paulo (co-developer of iText) has been adding TIFF creation functionality,
but it's untested and undocumented, and certainly not capable of rendering PDFs to TIFFs yet.

Right now, I would say that PDF-to-TIFF isn't a priority for us, because:
- PDF rendering is not our core business
- there are other tools that do this, and why would we want to compete?

Then again: we've always said the same about parsing PDF, and
the PDF parser is getting better and better with every new release...
And if we can parse a PDF, we could render it, although I don't believe
any third-party company can compete with Adobe with respect to viewers:
stuff like transparency is just too difficult. Some people prefer other tools
to Adobe Reader, saying that Adobe Reader is bloated, but as soon as
the PDF has special features, you'll see that third-party viewers fail when
compared to Adobe Reader.