Regarding .DOC file

 
Greenhorn
Posts: 5
Is it possible to recognize a table in a .DOC file?
The following is the code I tried, using the Apache POI jar file:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.model.StyleSheet;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Section;
import org.apache.poi.hwpf.usermodel.Table;

public class DocReader {

    public void readDocFile() throws IOException {
        Table t = null;
        Paragraph para = null;
        File docFile = null;
        HWPFDocument doc = null;
        WordExtractor docExtractor = null;

        try {
            docFile = new File("D:\\Kiran\\Programs\\table.doc");
            // A FileInputStream obtains input bytes from a file.
            FileInputStream fis = new FileInputStream(docFile.getAbsolutePath());
            // An HWPFDocument reads the document from the FileInputStream.
            doc = new HWPFDocument(fis);
            docExtractor = new WordExtractor(doc);
            Range range = doc.getRange();
            StyleSheet styleSheet = doc.getStyleSheet();

            int numberOfParagraphs = range.numParagraphs();
            int numberOfSections = range.numSections();
            int numberOfRuns = range.numCharacterRuns();

            // Section indexes run from 0 to numberOfSections - 1.
            for (int i = 0; i < numberOfSections; i++) {
                Section section = range.getSection(i);
                int numberOfColumns = section.getNumColumns();
                numberOfParagraphs = section.numParagraphs();
                numberOfRuns = section.numCharacterRuns();

                System.out.println("Columns: " + numberOfColumns);
                System.out.println("Paragraphs: " + numberOfParagraphs);
                System.out.println("Runs: " + numberOfRuns);

                for (int paragraphIndex = 0; paragraphIndex < numberOfParagraphs; paragraphIndex++) {
                    Paragraph paragraph = null;
                    try {
                        paragraph = section.getParagraph(paragraphIndex);
                    }
                    catch (Exception exception) {
                        System.out.println("Ignore paragraph exception: " + exception.toString());
                    }

                    if (paragraph != null) {
                        System.out.println("--------- paragraph " + paragraph.text() + " ---------");
                        // till here it is working fine
                        try {
                            // here I am trying to get the data from a table, but I am unable to read it;
                            // this line is not working at all
                            Table table = range.getTable(paragraph.getParagraph(i));
                            // this line is not printing at all
                            System.out.println("table: ");
                            if (table != null) {
                                System.out.println("table rows: " + table.numRows());
                            }
                        }
                        catch (IllegalArgumentException exception) { // not in paragraph
                        }

                        int styleIndex = paragraph.getStyleIndex();
                    }
                }
            }
        }
        catch (Exception exep) {
            System.out.println(exep.getMessage());
        }

        // This array stores each line from the document file.
        String[] docArray = docExtractor.getParagraphText();

        for (int i = 0; i < docArray.length; i++) {
            if (docArray[i] != null) {
                System.out.println("Line " + i + " : " + docArray[i]);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        DocReader reader = new DocReader();
        reader.readDocFile();
    }
}






Regards

Praveen
 
Rancher
Posts: 43011
That's a lot of code to read and understand in this unformatted layout. Please edit your post (by clicking the little paper-and-pencil icon) and post formatted code surrounded by CODE tags: UseCodeTags

Also, tell us in detail what the code currently does. "not at all working" is not a useful problem description.
[ April 01, 2008: Message edited by: Ulf Dittmer ]
 
praveen valaboju
Greenhorn
Posts: 5
// A FileInputStream obtains input bytes from a file.
FileInputStream fis = new FileInputStream(docFile.getAbsolutePath());
// An HWPFDocument reads the document from the FileInputStream.
HWPFDocument doc = new HWPFDocument(fis);
docExtractor = new WordExtractor(doc);
Range range = doc.getRange();

for (int paragraphIndex = 0; paragraphIndex < numberOfParagraphs; paragraphIndex++) {
    Paragraph paragraph = null;
    try {
        paragraph = section.getParagraph(paragraphIndex);
    }
    catch (Exception exception) {
        System.out.println("Ignore paragraph exception: " + exception.toString());
    }
    System.out.println(paragraph.text());

    if (paragraph != null) {
        // till here it is working fine
        try {
            // here I am getting the data from a table, but I am unable to read the data
            Table table = range.getTable(paragraph.getParagraph(i));
            // this line is not printing at all
            System.out.println("table: ");
            if (table != null) {
                System.out.println("table rows: " + table.numRows());
            }
        }
        catch (IllegalArgumentException exception) { // not in paragraph
        }

I can read the data from a Word file, but I am having a problem identifying the tables in a .doc file, and I need only the table data.
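
A likely reason the "table: " line never prints is that Range.getTable(Paragraph) throws an IllegalArgumentException, which the empty catch block then hides. It does this when the paragraph is not inside a table, is not the first paragraph of its table, or was not obtained from the same Range that getTable is called on. Below is a minimal sketch of one way to read only the table data. It assumes a reasonably recent Apache POI with the HWPF classes (the poi-scratchpad jar) on the classpath. The class name DocTableSketch and the helper stripCellMark are made up for illustration, and the document path is taken from the command line. Paragraphs are read from the document Range itself, tested with isInTable(), and a whole table is skipped once it has been printed.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Table;
import org.apache.poi.hwpf.usermodel.TableCell;
import org.apache.poi.hwpf.usermodel.TableRow;

// Hypothetical class name, used only for this sketch.
class DocTableSketch {

    // Cell text read through HWPF normally ends with the cell mark
    // character (\u0007); remove it so the output is readable.
    static String stripCellMark(String cellText) {
        return cellText.replace("\u0007", "").trim();
    }

    // Prints every table in the document, one row per line, cells separated by tabs.
    static void printTables(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            HWPFDocument doc = new HWPFDocument(in);
            Range range = doc.getRange();

            int p = 0;
            while (p < range.numParagraphs()) {
                // Take the paragraph from the same Range that getTable is called on.
                Paragraph paragraph = range.getParagraph(p);
                if (!paragraph.isInTable()) {
                    p++;
                    continue;
                }

                // This paragraph is the first one of a table, because every
                // earlier table has been skipped as a whole (see below).
                Table table = range.getTable(paragraph);
                System.out.println("table rows: " + table.numRows());
                for (int r = 0; r < table.numRows(); r++) {
                    TableRow row = table.getRow(r);
                    for (int c = 0; c < row.numCells(); c++) {
                        TableCell cell = row.getCell(c);
                        System.out.print(stripCellMark(cell.text()) + "\t");
                    }
                    System.out.println();
                }

                // Jump past all paragraphs that belong to this table.
                p += table.numParagraphs();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        printTables(args[0]); // path to the .doc file
    }
}
```

Run it with the path to the .doc file as the only argument. If a catch block around getTable is kept, printing the exception message instead of ignoring it should show which of the conditions above was hit.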
 