Java And OpenDocument Files

 
Arthur Buliva
Ranch Hand
Posts: 101
Hi Friends,

Let's say you have a file; let's call it document.odt.

I have noticed that .odt files are not "stand-alone" files like .txt or .dat files. Rather, they are ZIP archives that contain several other files defining the document.

Before I get off track, let me get to my question.

document.odt contains some text, say "Hello World". I want to edit this text through Java and save the resulting text. How can this be achieved?

Thanks and regards.
 
Bartender
Posts: 9626
OpenOffice has a Java API.
 
Arthur Buliva
Ranch Hand
Posts: 101
And how can I use the API to open/edit files through a TextArea on my Java application?
 
Ulf Dittmer
Rancher
Posts: 43075
Accessing ODF files through the OO Java API and using the data in GUI elements are separate activities. You'll need to dig into the API to figure out how to access the data that you need for the GUI. The AccessingFileFormats wiki page links to a number of articles about the OO Java API and the ODF file format.
 
Arthur Buliva
Ranch Hand
Posts: 101
Dear Dittmer,

I am simply lost in the array of suggestions in those pages...
 
Jesper Young
Java Cowboy
Posts: 16084

The OpenOffice Java API

  • OpenOffice can read a number of file formats, and makes them accessible through its API. A starting point might be this article and of course the OO developer site
  • Some introductory information about the OO file format can be found here and here - basic Java code for reading OO files is here
  • Reading an OpenOffice file is not as simple as reading a plain text file, simply because OpenOffice contains a lot more features than a plain text editor.
    [ October 31, 2007: Message edited by: Jesper Young ]
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    So what's your suggestion on how to open them?
     
    Ulf Dittmer
    Rancher
    Posts: 43075
    Either use the OO Java API to convert the file to some other format that you do know how to open, or, based on the articles linked above, write code that opens and processes the files.

    From my cursory look at the articles, OO files appear to be zipped-up XML files. Since Java has APIs for dealing with ZIP and XML files, getting at the actual contents shouldn't be too hard. Making sense of those is a different matter, of course. I'd recommend starting with a simple document to see whether you can extract the information you need.
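
As a first experiment along those lines, the entries inside an .odt file can be listed with the standard java.util.zip classes. This is only a minimal sketch: the class name OdtEntryLister and the method listEntries are made up for illustration, and document.odt is the example file name from the question.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: lists what is inside an ODF package.
class OdtEntryLister {

    /** Returns the names of all entries in the given ZIP package. */
    static List<String> listEntries(Path odt) throws IOException {
        List<String> names = new ArrayList<>();
        // ZipFile is closed automatically at the end of the try block
        try (ZipFile zip = new ZipFile(odt.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the example file name when no argument is given
        Path odt = Paths.get(args.length > 0 ? args[0] : "document.odt");
        for (String name : listEntries(odt)) {
            System.out.println(name);
        }
    }
}
```

Running it against a simple document should show names such as content.xml and META-INF/manifest.xml, which is a quick way to confirm that the file really is a ZIP archive.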
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    I guess this leads to the next question, my good people.

    How do I use the Java API to zip/unzip file packages?
     
    Ulf Dittmer
    Rancher
    Posts: 43075
    Examples of using the ZIP API, as well as just about all other java.* classes, can be found at the Developer's Almanac: http://www.exampledepot.com/egs/java.util.zip/pkg.html
     
    Arthur Buliva
    Ranch Hand
    Posts: 101


    from the reference you gave me has solved my most immediate problem. Thanks!
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    This is a sample output:



    and the code



    retrieves only the first entry, whereas I want the file called content.xml (if not the entire .odt file) to be extracted into a specified folder. What do I need to modify in the code?
     
    Joanne Neal
    Rancher
    Posts: 3742
    You need a loop. ZipInputStream.getNextEntry() returns null when there are no more entries, so you can use that as the loop controller.
    If you only want to output certain files, then you need to check the name of each ZipEntry object before you write it out. Check the API docs to see if there is a method that gets the name of the ZipEntry object.
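
A minimal sketch of such a loop follows, assuming only one named entry (for example content.xml) is wanted. The class name SingleEntryExtractor and the method extract are illustrative names, not part of any library; the calls they make (ZipInputStream.getNextEntry, ZipEntry.getName, Files.copy) are standard JDK methods.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: pulls one named entry out of a ZIP package.
class SingleEntryExtractor {

    /**
     * Copies the entry called wantedName from zipFile into target.
     * Returns true if the entry was found, false otherwise.
     */
    static boolean extract(Path zipFile, String wantedName, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(zipFile);
             ZipInputStream zin = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() returns null once every entry has been visited
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(wantedName)) {
                    // The stream is positioned at this entry's data;
                    // the copy stops at the end of the entry
                    Files.copy(zin, target, StandardCopyOption.REPLACE_EXISTING);
                    return true;
                }
            }
        }
        return false;
    }
}
```

A call such as SingleEntryExtractor.extract(Paths.get("document.odt"), "content.xml", Paths.get("content.xml")) would then write just that one file.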
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    My redo of the code is




    What could be the issue here? It returns lots of nulls and empty folders.
     
    Joanne Neal
    Rancher
    Posts: 3742
    You create a new FileOutputStream object every time through the loop, but you only close the last one after you exit the loop. Put the out.close() call inside the loop.
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    After doing that it renders the files as folders/directories instead of just files.
     
    Ulf Dittmer
    Rancher
    Posts: 43075

    After doing that it renders the files as folders/directories instead of just files.


    It looks as if the code is supposed to extract the files that are part of the ODF file and write them to disk, each in the physical directory that matches its logical location inside the ODF file. Doesn't it do that? If not, what does it do, and where does it go wrong?
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    Yes, this is what this code is supposed to do:

    From the list of files inside the ODT file, which are

    mimetype
    Configurations2/statusbar/
    Configurations2/accelerator/current.xml
    Configurations2/floater/
    Configurations2/popupmenu/
    Configurations2/progressbar/
    Configurations2/menubar/
    Configurations2/toolbar/
    Configurations2/images/Bitmaps/
    content.xml
    styles.xml
    meta.xml
    Thumbnails/thumbnail.png
    settings.xml
    META-INF/manifest.xml

    I need to extract the files as they are in the odt file. For instance, thumbnail.png is to be extracted in a folder called Thumbnails.

    So far, the code



    returns



    Sorry if I am going back on my progress, but I hope it's in the best interest of clarity here.

    Thanks.
     
    Ulf Dittmer
    Rancher
    Posts: 43075
    There are a couple of problems with the code.

    Firstly, you're creating the FileOutputStream (FOS) before creating the directories. That won't work, because the FOS constructor tries to access the file.

    Secondly, if the name does not contain a "/", then it's a top-level file, and no directories should be created.

    Thirdly, if the name ends with a "/", then it's an empty directory, and you should not open a FOS and try to copy bytes.

    Lastly, if the name contains a "/" somewhere in the middle, then it's a file, and mkdirs should only be called with the part up to the last "/".

    You can list the entry names your code should be expecting via "jar tf filename.odt"
    [ November 17, 2007: Message edited by: Ulf Dittmer ]
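
The four points above can be sketched as follows. This is not the code from the thread, only one possible shape for it: OdtUnpacker and unpack are hypothetical names, ZipEntry.isDirectory() is used to detect names ending in "/", and InputStream.transferTo assumes Java 9 or later. The check against entries that resolve outside the target directory is an extra safety measure that was not discussed above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: unpacks a whole ODF package, keeping its layout.
class OdtUnpacker {

    /** Unpacks every entry of odt into destDir, recreating the directories. */
    static void unpack(Path odt, Path destDir) throws IOException {
        Path root = destDir.toAbsolutePath().normalize();
        try (InputStream in = Files.newInputStream(odt);
             ZipInputStream zin = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                Path target = root.resolve(entry.getName()).normalize();
                // Extra precaution: refuse names that would land outside destDir
                if (!target.startsWith(root)) {
                    throw new IOException("Entry outside target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    // Name ends with "/": an empty directory, no bytes to copy
                    Files.createDirectories(target);
                    continue;
                }
                // Create only the parent directories, and do it before
                // the output stream is opened
                Files.createDirectories(target.getParent());
                // One stream per entry, closed before the next entry is read
                try (OutputStream out = Files.newOutputStream(target)) {
                    zin.transferTo(out);
                }
            }
        }
    }
}
```

For a top-level entry such as content.xml the parent is the target directory itself, so no extra directories are created for it.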
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    Eureka!

    Thanks Dittmer



    Has solved my problem
     
    Arthur Buliva
    Ranch Hand
    Posts: 101


    Has crudely but successfully 'ripped' up the xml file. Now the task is to 'reverse engineer' this class to save the changes.
     
    Ulf Dittmer
    Rancher
    Posts: 43075
    I think you'd be better off with a real XML parser. Scanning an XML file like you showed above gets tricky in the presence of nested elements. Something like the following would do the trick (it doesn't extract the files, just reads them in-place, but it's easy to adapt to your situation).
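
A minimal sketch of that approach (not the code originally posted) could use the JDK's DOM parser to read content.xml directly from the package. The names OdtTextReader and readParagraphs are invented for this example, and it assumes the text of interest sits in text:p elements in the ODF text namespace.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: reads paragraph text without unpacking the file.
class OdtTextReader {

    // Namespace URI of the ODF "text" vocabulary (the text: prefix)
    private static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Returns the text of every text:p element found in content.xml. */
    static List<String> readParagraphs(Path odt)
            throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Needed so that elements can be looked up by namespace
        factory.setNamespaceAware(true);

        List<String> paragraphs = new ArrayList<>();
        try (ZipFile zip = new ZipFile(odt.toFile())) {
            ZipEntry content = zip.getEntry("content.xml");
            if (content == null) {
                throw new IOException("No content.xml in " + odt);
            }
            try (InputStream in = zip.getInputStream(content)) {
                Document doc = factory.newDocumentBuilder().parse(in);
                NodeList nodes = doc.getElementsByTagNameNS(TEXT_NS, "p");
                // NodeList indices start at 0
                for (int i = 0; i < nodes.getLength(); i++) {
                    paragraphs.add(nodes.item(i).getTextContent());
                }
            }
        }
        return paragraphs;
    }
}
```

getTextContent() flattens nested elements such as spans into plain text, which is convenient for reading but loses formatting.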

     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    How is this class used?

    This is what I am getting:

    Is there something I am missing?
     
    Ulf Dittmer
    Rancher
    Posts: 43075
    The for loop should run from 0, not 1. Silly typo.
     
    Arthur Buliva
    Ranch Hand
    Posts: 101
    Great


    is what I get.

    Now, my main aim in going through the unzipping process is the editing part. I am building a simple online odt editor. So on saving, I was thinking of working with the content.xml file, editing whatever I want, then zipping up the entire package and giving the resulting file an .odt extension. That would achieve the intended result.
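
That save step might look roughly like the sketch below, which copies every entry of the original package into a new one and swaps in the edited content.xml. OdtRepacker, replaceContent and writeStored are made-up names. One detail goes beyond what was discussed above: the ODF packaging rules expect the mimetype entry to come first and to be stored uncompressed, so the sketch writes it that way. It also assumes Java 9 or later (readAllBytes, transferTo) and that source and target are different files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: writes a new package with an edited content.xml.
class OdtRepacker {

    /** Copies source to target, replacing content.xml with newContent. */
    static void replaceContent(Path source, Path target, byte[] newContent) throws IOException {
        try (ZipFile zip = new ZipFile(source.toFile());
             OutputStream fileOut = Files.newOutputStream(target);
             ZipOutputStream zout = new ZipOutputStream(fileOut)) {

            // Write "mimetype" first and uncompressed
            ZipEntry mimetype = zip.getEntry("mimetype");
            if (mimetype != null) {
                byte[] data;
                try (InputStream in = zip.getInputStream(mimetype)) {
                    data = in.readAllBytes();
                }
                writeStored(zout, "mimetype", data);
            }

            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.equals("mimetype")) {
                    continue; // already written above
                }
                // A fresh ZipEntry lets the output stream compute sizes and CRC
                zout.putNextEntry(new ZipEntry(name));
                if (name.equals("content.xml")) {
                    zout.write(newContent);
                } else if (!entry.isDirectory()) {
                    try (InputStream in = zip.getInputStream(entry)) {
                        in.transferTo(zout);
                    }
                }
                zout.closeEntry();
            }
        }
    }

    /** Writes an uncompressed entry; STORED needs size and CRC set up front. */
    private static void writeStored(ZipOutputStream zout, String name, byte[] data) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(data);
        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        entry.setSize(data.length);
        entry.setCompressedSize(data.length);
        entry.setCrc(crc.getValue());
        zout.putNextEntry(entry);
        zout.write(data);
        zout.closeEntry();
    }
}
```

Writing to a new file and renaming it afterwards also avoids corrupting the original if something fails halfway through.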
     