Reading Word Documents Using Java

 
Greenhorn
Posts: 12
Hi all,

I have to read Word documents (97-2003 format) using Java. The data is in tabular form, as name-value pairs. I have already implemented this with Apache POI, but I am facing an issue: when I edit a Word 2003 document in Office 2007, the POI API throws a NullPointerException related to LittleEndian. According to the Apache site, this is a bug that will be resolved in a coming release:

http://apache-poi.1045710.n5.nabble.com/Unable-to-get-paragraphs-from-test-doc-td2314726.html

My question: has anybody used OpenOffice for reading Word documents? If so, please share some sample code.

Dear friends, please share your opinions.

Thanks,
sushil
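
One quick check that may help narrow this down is whether the file saved from Office 2007 is still a binary .doc (an OLE2 compound document) or has become a ZIP-based .docx. The sketch below looks only at the leading bytes of the file. The class name `WordFormatSniffer` and its methods are hypothetical and are not part of POI; the two signatures are the standard OLE2 and ZIP file headers.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of POI): classifies a file by its leading bytes.
class WordFormatSniffer {

    // OLE2 compound document signature, used by binary .doc files (Word 97-2003).
    static final byte[] OLE2 = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                                (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};
    // ZIP local file header signature; .docx files are ZIP packages.
    static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    static String sniff(byte[] header) {
        if (startsWith(header, OLE2)) return "OLE2 (binary .doc)";
        if (startsWith(header, ZIP)) return "ZIP (possibly .docx)";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the document to inspect.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] header = new byte[8];
            int n = in.read(header);
            // An empty file yields n == -1, which is treated as a zero-length header.
            System.out.println(sniff(Arrays.copyOf(header, Math.max(n, 0))));
        }
    }
}
```

If the result is OLE2, the file is still in the old binary format, so any parsing failure would have to come from the contents rather than from the container type.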
 
Rancher
Posts: 43011
The discussion is from 2005; whatever it describes as being in the future has most likely happened by now. Why do you think your problem is related to it, and why do you think endianness has anything to do with it? Can you post an SSCCE?
 
sushil grover
Greenhorn
Posts: 12
Hi Ulf,

Thanks for your reply. I have tested with the latest POI release, 3.7, as well as the 3.8 beta.

Below is the code snippet:

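import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.*; // Range, Paragraph, Table, TableRow, TableCell
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

InputStream fis = new FileInputStream(fileName);
POIFSFileSystem fs = new POIFSFileSystem(fis);
HWPFDocument doc = new HWPFDocument(fs);

Range range = doc.getRange();
for (int i = 0; i < range.numParagraphs(); i++) {
    Paragraph tablePar = range.getParagraph(i); // the exception is thrown here
    if (tablePar.isInTable()) {
        Table table;
        try {
            table = range.getTable(tablePar);
        } catch (Exception e) {
            // skip paragraphs for which no table can be obtained
            continue;
        }
        // walk every cell of the table, row by row
        for (int rowIdx = 0; rowIdx < table.numRows(); rowIdx++) {
            TableRow row = table.getRow(rowIdx);
            for (int colIdx = 0; colIdx < row.numCells(); colIdx++) {
                TableCell cell = row.getCell(colIdx);
            }
        }
    }
}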

I am getting the following exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 16
at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:46)
at org.apache.poi.hwpf.sprm.SprmOperation.getOperand(SprmOperation.java:98)
at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.unCompressPAPOperation(ParagraphSprmUncompressor.java:87)
at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:63)
at org.apache.poi.hwpf.model.PAPX.getParagraphProperties(PAPX.java:136)
at org.apache.poi.hwpf.usermodel.Range.getParagraph(Range.java:828)
at com.jp.processor.Docfile_Reading.main(Docfile_Reading.java:62)

Sometimes it is a NullPointerException during uncompression instead.
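
On the endianness question: the top frame of the trace is LittleEndian.getShort, but an ArrayIndexOutOfBoundsException at that point only says that an offset beyond the end of a byte array was requested. It does not by itself point to a byte-order problem. The sketch below reproduces that failure mode with a hypothetical stand-in for a little-endian 16-bit read. It is not POI's actual source, and the 16-byte buffer is an assumption chosen to match the index in the trace.

```java
// Hypothetical stand-in for a little-endian 16-bit read; not POI's actual source.
class LittleEndianDemo {

    // Reads two bytes starting at offset, least significant byte first.
    static short getShort(byte[] data, int offset) {
        int low = data[offset] & 0xFF;
        int high = data[offset + 1] & 0xFF;
        return (short) ((high << 8) | low);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[16]; // assumed 16-byte property buffer, indices 0..15
        buffer[14] = 0x34;
        buffer[15] = 0x12;

        // In range: bytes 0x34, 0x12 decode to 0x1234.
        System.out.println(Integer.toHexString(getShort(buffer, 14)));

        try {
            // The caller asks for an operand that starts past the end of the buffer.
            getShort(buffer, 16);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```

Read this way, the trace would suggest that the paragraph property data in the file is shorter than the operand the parser expects to find, so the byte-order helper is where the problem surfaces rather than where it starts.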

