Win a copy of Functional Reactive Programming this week in the Other Languages forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Reading from .doc or .docx file

 
Sawan Mishra
Ranch Hand
Posts: 47
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

I understand the above program but my problem is reading from
Ms-word(.doc file or .docx file) and writing result to console gives
unexpected output.
How can I read from .doc file and write content to console correctly??

thanks in advance
with regards
 
Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15492
43
Android IntelliJ IDE Java Scala Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Microsoft Word .doc and .docx files are not simple text files that you can read this way with a FileReader.

You'll need a library that understands the specific MS word file formats, such as Apache POI.
 
Ulf Dittmer
Rancher
Posts: 42968
73
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Those are structured file formats which contain much else besides the plain text. You need to use a library like Apache POI (which can extract the plain text, and also provides an API to get at the structured content).
 
Paweł Baczyński
Bartender
Posts: 1827
33
Firefox Browser IntelliJ IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Don't read doc as a regular text file!
http://stackoverflow.com/questions/7102511/how-read-doc-or-docx-file-in-java
 
Tony Docherty
Bartender
Posts: 2989
59
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You need to use a library that understands the format that doc and docx files are saved in. Fortunately there are free libraries available such as POI which can be found at http://poi.apache.org/
 
Campbell Ritchie
Sheriff
Pie
Posts: 50258
79
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Too difficult for “beginnign”: moving.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic