
Lucene and Office 2007

 
Ranch Hand
Posts: 41
Has anyone indexed Office 2007 documents with Lucene?

I'm trying to index a .docx document and make it full-text searchable. Indexing works for *.doc documents, but full-text search doesn't seem to work for *.docx documents.

If someone could point me in the right direction, I would appreciate it.
 
Rancher
Posts: 42975
Which library are you using to read DOC files? Apache POI? If so, its current release doesn't support the XML-based Office 2007 formats yet. You could try the beta version of POI, which supports DOCX to a certain degree.
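
If the beta doesn't work out, one possible fallback is to pull the text out yourself. A .docx file is a ZIP archive, and the main body normally lives in the entry word/document.xml, with the visible text inside w:t elements. The following is only a rough sketch using the JDK's own ZIP and XML classes, not POI's API; the class name DocxText and the method extract are made up for illustration, and it ignores headers, footers, footnotes, and anything else stored outside the main document part. It needs Java 9 or later for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: extracts plain text from the main part of a .docx file.
public class DocxText {
    // Namespace of WordprocessingML elements such as w:p and w:t.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static String extract(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the body is stored at the conventional path.
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                // Reads only the current ZIP entry, not the whole archive.
                byte[] xml = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));
                return paragraphsToText(doc);
            }
        }
        throw new IOException("word/document.xml not found; not a DOCX file?");
    }

    // One output line per w:p paragraph. Runs inside a paragraph are joined
    // without a separator, because Word may split a single word across runs.
    private static String paragraphsToText(Document doc) {
        StringBuilder text = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            if (i > 0) {
                text.append('\n');
            }
            Element paragraph = (Element) paragraphs.item(i);
            NodeList runs = paragraph.getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
        }
        return text.toString();
    }
}
```

The string returned by DocxText.extract(new FileInputStream("report.docx")) (the file name is just an example) can then go into the same Lucene text field you already fill for *.doc files, so the rest of your indexing code should not need to change.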
 
Seamus Minogue
Ranch Hand
Posts: 41
I'll take a look at that :-) Thanks
 