
Document Conversion

 
Sahil Sharma
Ranch Hand
Posts: 152
Hi,

I have a requirement to convert a document (DOC or PDF) into an XML file. Are there any APIs available that can help me achieve this?

Thanks
 
Prafulla N. Patil
Ranch Hand
Posts: 106
Your requirement is not entirely clear to me. Is there XML content inside the DOC or PDF file, and you want to create an .xml file from it?

Apache POI's HWPF component (the Java API for handling Microsoft Word files) can help you read Word files. You can then use one of Java's XML generation APIs to create the XML files.
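
For the XML generation half of that suggestion, here is a minimal sketch using only the JDK's StAX writer (javax.xml.stream). It assumes the text has already been pulled out of the Word file (for example with POI) and sorted into headings and paragraphs. The `DocToXml` and `Block` names, the `document` root element, and the sample content are made up for illustration and are not part of POI or any other library.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class DocToXml {

    /** Hypothetical holder for one extracted piece of the document. */
    static final class Block {
        final String kind; // used as the element name, e.g. "heading" or "paragraph"
        final String text; // the extracted text content

        Block(String kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Writes the blocks as child elements of a single <document> root. */
    static String toXml(List<Block> blocks) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartDocument();
        w.writeStartElement("document");
        for (Block b : blocks) {
            // Assumption: kind is already a valid XML element name.
            w.writeStartElement(b.kind);
            w.writeCharacters(b.text); // escapes &, < and > in the text
            w.writeEndElement();
        }
        w.writeEndElement();
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample data standing in for text extracted from a real document.
        List<Block> blocks = List.of(
                new Block("heading", "Introduction"),
                new Block("sub-heading", "Scope"),
                new Block("paragraph", "Covers DOC & PDF input."));
        System.out.println(toXml(blocks));
    }
}
```

Deciding which piece of text is a heading and which is a paragraph is the hard part and is not shown here; the sketch only covers turning already-classified text into XML.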
 
Sahil Sharma
Ranch Hand
Posts: 152
The document will contain ordinary content (text or images), but in a particular structure, e.g. headings, sub-headings, paragraphs, etc.
 
Ulf Dittmer
Rancher
Posts: 42972
PDFs are tough; the best you will be able to do is to extract any text they contain, but structural information will be lost (unless you're prepared to invest a lot of time).
 