Finding Page Count in any file

 
Greenhorn
Posts: 2
Hi,

I want to find the page count of any file (doc, ppt, pdf, image, etc.). Is there a universal API that can handle any file regardless of its extension? It's a bit urgent. Please let me know your thoughts on this.
I looked at the iText API for converting files to PDF, but I don't think it can convert every kind of file to PDF.

Thanks,
Srinivas.
 
Bartender
Posts: 3323
Welcome to the Ranch.

This is not possible in general. Every file type has a different format. Some have no concept of pages, and those that do each represent page breaks in their own way.

Take a plain text file (i.e. .txt) as an example: the number of pages in that file depends on how many lines the printer will print on each piece of paper. This varies with the paper size, the font size, how close the printer can print to the edge of the paper, etc.
Or take an image file: where are the page breaks in an image file?
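To make the plain text point concrete, here is a minimal sketch of a page estimate where the layout is an explicit input. The class name TextPageEstimate and the parameters charsPerLine and linesPerPage are made up for illustration; they stand in for whatever the paper size, font and margins work out to on a given printer.

```java
import java.util.List;

/** Estimates printed pages for plain text under an assumed page layout. */
final class TextPageEstimate {
    private TextPageEstimate() {}

    /**
     * Returns how many pages the lines would need if long lines wrap at
     * charsPerLine characters and a new page starts every linesPerPage lines.
     * Both limits are assumptions supplied by the caller, not file properties.
     */
    static int estimatePages(List<String> lines, int charsPerLine, int linesPerPage) {
        if (charsPerLine <= 0 || linesPerPage <= 0) {
            throw new IllegalArgumentException("layout limits must be positive");
        }
        long printedLines = 0;
        for (String line : lines) {
            // Ceiling division gives the wrapped line count; an empty
            // line still occupies one printed line.
            int wrapped = (line.length() + charsPerLine - 1) / charsPerLine;
            printedLines += Math.max(1, wrapped);
        }
        // Ceiling division again: a partly filled last page still counts.
        return (int) ((printedLines + linesPerPage - 1) / linesPerPage);
    }
}
```

The same 100-line file comes out as 2 pages at 60 lines per page and 3 pages at 40 lines per page, so the "page count" is a property of the layout you choose, not of the file.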
 
Rancher
Posts: 43081
If this were my problem I'd look into what Apache Tika can do. Its sole purpose is to handle structured document formats and to put abstractions on top of them as far as possible. This might be a good start (just ignore what it says about Alfresco).
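As a rough sketch of how that could look: once a parser such as Tika has filled in a document's metadata, you still have to decide which field counts as "pages", because the field name differs by format. The helper below works on a plain Map so that it runs without any dependency. The class name PageCountLookup is hypothetical, and the three key names are my assumption about what Tika typically reports for paged documents, office documents and presentations; check them against the metadata your Tika version actually produces.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;

/** Picks a page-like count out of already extracted document metadata. */
final class PageCountLookup {
    private PageCountLookup() {}

    // Assumed key names, tried in order. Verify them against real parser output.
    static final List<String> CANDIDATE_KEYS =
            List.of("xmpTPg:NPages", "meta:page-count", "meta:slide-count");

    /**
     * Returns the first candidate key that holds a non-negative integer,
     * or an empty result when the format exposes no page count at all
     * (plain text and most images, for example).
     */
    static OptionalInt pageCount(Map<String, String> metadata) {
        for (String key : CANDIDATE_KEYS) {
            String value = metadata.get(key);
            if (value == null) {
                continue;
            }
            try {
                int n = Integer.parseInt(value.trim());
                if (n >= 0) {
                    return OptionalInt.of(n);
                }
            } catch (NumberFormatException ignored) {
                // Value was not numeric; fall through to the next key.
            }
        }
        return OptionalInt.empty();
    }

    // With Tika on the classpath, the map could be filled roughly like this
    // (untested here, shown only as a pointer):
    //   Metadata md = new Metadata();
    //   new AutoDetectParser().parse(stream, new BodyContentHandler(-1), md, new ParseContext());
    //   for (String name : md.names()) map.put(name, md.get(name));
}
```

An empty result is the honest answer for formats with no notion of pages, which fits the point made earlier in the thread.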
 