Search bug (this site)

 
Ranch Hand
Posts: 17424
Just noticed a bug in the search feature on this site (2.1.8 test?). If you include a "noise" word like "in" in the search criteria, you get no matches.

Here's an example:

Go to the search form.
Fill in keywords with "method in class".
If needed, select "Search All Terms".
Do search.

0 results

Do the same search with "method class"

66 or so records.

What I think is happening is that the indexer is dropping the common noise words out of the index (good thing) but the search is not doing the same thing. So, since the word "in" is not indexed, it can't be found with an "all terms" match. IMHO, the proper behavior would be to drop noise words from the search criteria since people will unknowingly add them.
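The suspected mismatch can be reproduced in a few lines. This is only a sketch in plain Java, not JForum or Lucene code: the noise-word list, the class name, and the method names are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class NoiseWordMismatch {
    // Hypothetical noise-word list, for illustration only.
    static final Set<String> NOISE_WORDS = Set.of("a", "in", "of", "the");

    // Index side: lowercase, split on non-alphanumerics, drop noise words.
    static Set<String> indexTerms(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !NOISE_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Search side: split on whitespace and keep every word, noise included.
    static List<String> naiveQueryTerms(String query) {
        return Arrays.asList(query.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // "Search All Terms": every query term must be present in the index.
    static boolean matchesAll(Set<String> indexed, Collection<String> queryTerms) {
        return indexed.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        Set<String> indexed = indexTerms("Calling a method in class Foo");
        // "in" was never indexed, so the all-terms match fails.
        System.out.println(matchesAll(indexed, naiveQueryTerms("method in class"))); // false
        System.out.println(matchesAll(indexed, naiveQueryTerms("method class")));    // true
    }
}
```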
[originally posted on jforum.net by monroe]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
Another nice feature, since you are already working on the search, would be a choice between OR and AND for linking the words ;)

The danger with "noise" words is that they are language dependent. A word that is noise in one language may be a valuable and important abbreviation in another, so I don't know if it's the best choice to always kill them.
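One way to picture the concern is a noise-word list keyed by language, so that a word is only dropped when it is noise in the language being searched. The sketch below is an assumption-laden illustration: the two tiny lists, the language codes, and the method name are invented here and do not come from JForum or Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class PerLanguageNoiseWords {
    // Hypothetical, deliberately tiny lists; real ones would be much longer.
    static final Map<String, Set<String>> NOISE_BY_LANGUAGE = Map.of(
            "en", Set.of("a", "in", "of", "the"),
            "de", Set.of("der", "die", "das", "in"));

    // Drop only the words that are noise in the given language.
    // An unknown language falls back to dropping nothing.
    static List<String> dropNoise(String query, String language) {
        Set<String> noise = NOISE_BY_LANGUAGE.getOrDefault(language, Set.of());
        List<String> kept = new ArrayList<>();
        for (String word : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!word.isEmpty() && !noise.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // "die" is an article in German but a meaningful word in English.
        System.out.println(dropNoise("die hard", "en")); // [die, hard]
        System.out.println(dropNoise("die hard", "de")); // [hard]
    }
}
```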
[originally posted on jforum.net by Sid]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
I don't think it is a bug, as you are requiring the "in" word. OK, but if we look at how Google does it, it removes noise words.

I'll see if there is something in Lucene that says "this word is noise, don't use it".

Rafael
[originally posted on jforum.net by Rafael Steil]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
OK, if I meant to require the "in" word, then why doesn't this thread show up? It has the phrase "method in class". So is the bug that "in" is not indexed?
[originally posted on jforum.net by monroe]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
I get your point. It is not indexed because the Analyzer removes it, but the search query is built using a different approach (just split the words and include all of them).
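A common way to close this kind of gap is to run the query text through the same analysis step as the indexed text, so both sides agree on which words exist. The following is a plain-Java sketch of that idea under stated assumptions, not the actual JForum change and not Lucene's API: `analyze`, `matchesAllTerms`, and the noise-word list are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class SharedAnalysisSearch {
    // Hypothetical noise-word list; a real analyzer ships its own.
    static final Set<String> NOISE_WORDS = Set.of("a", "in", "of", "the");

    // One analysis step, used for both the indexed text and the query.
    static Set<String> analyze(String text) {
        Set<String> terms = new LinkedHashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !NOISE_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // "All terms" match. A query with nothing left after analysis
    // is treated as matching nothing rather than everything.
    static boolean matchesAllTerms(String document, String query) {
        Set<String> queryTerms = analyze(query);
        return !queryTerms.isEmpty() && analyze(document).containsAll(queryTerms);
    }

    public static void main(String[] args) {
        String post = "Fill in keywords with method in class";
        System.out.println(matchesAllTerms(post, "method in class")); // true
        System.out.println(matchesAllTerms(post, "method class"));    // true
        System.out.println(matchesAllTerms(post, "in"));              // false
    }
}
```

Treating a noise-only query as "no results" is a design choice made for this sketch; returning an error message to the user would be another reasonable option.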

I'll see how to solve this.

Rafael
[originally posted on jforum.net by Rafael Steil]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
Ok, bug fixed.

https://jforum.dev.java.net/source/browse/jforum/src/net/jforum/search/LuceneSearch.java?rev=1.38&r1=1.37&r2=1.38

Rafael
[originally posted on jforum.net by Rafael Steil]
 