
"Did u mean" functionality using java

 
s mahen perera
Ranch Hand
Posts: 101
I'm not sure whether this is the correct forum for this question, but here goes.

I need to implement "Did you mean" functionality programmatically using some Java API.
Basically, I need to integrate this into a search engine that we have developed. If the user enters a misspelt search word, we must be able to say "Did you mean XXXXX", where XXXXX is the corrected word, and then show search results for the corrected word.
I hope I am clear.

Thanks for any feedback and replies.
 
Ulf Dittmer
Rancher
Posts: 42968
One approach that comes to mind is to maintain a dictionary of the words and search phrases that you want to cover. Whenever a search turns up few results, check the dictionary for a word or phrase that is close to the original search phrase (e.g. using the weighted Levenshtein distance as a measure of "closeness"). If that word or phrase has a lot more hits than the original one, offer it as a "did you mean..." alternative.
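
The dictionary lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the plain, unweighted Levenshtein distance rather than a weighted one, the class name DidYouMean, the method names, and the maxDistance cut-off are hypothetical, and the hit counts are made-up sample values standing in for whatever the search engine reports.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper class; not part of any library mentioned in this thread.
final class DidYouMean {

    /** Plain (unweighted) Levenshtein distance, computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the dictionary term closest to the query, provided it is within
     * maxDistance edits and has more hits than the query itself.
     * Assumes the dictionary keys are already lower-case.
     */
    static Optional<String> suggest(String query, Map<String, Integer> hitCounts, int maxDistance) {
        String q = query.toLowerCase(Locale.ROOT);
        int queryHits = hitCounts.getOrDefault(q, 0);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        int bestHits = -1;
        for (Map.Entry<String, Integer> entry : hitCounts.entrySet()) {
            String term = entry.getKey();
            int hits = entry.getValue();
            if (term.equals(q) || hits <= queryHits) {
                continue; // only offer terms that are more popular than the query
            }
            int distance = levenshtein(q, term);
            if (distance > maxDistance) {
                continue;
            }
            // Prefer the smaller distance; break ties by the larger hit count.
            if (distance < bestDistance || (distance == bestDistance && hits > bestHits)) {
                best = term;
                bestDistance = distance;
                bestHits = hits;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        Map<String, Integer> hitCounts = new LinkedHashMap<>(); // sample data only
        hitCounts.put("lucene", 120);
        hitCounts.put("levenshtein", 45);
        hitCounts.put("metaphone", 30);
        System.out.println("Did you mean " + suggest("lucine", hitCounts, 2).orElse("(no suggestion)"));
    }
}
```

Scanning the whole dictionary like this is linear in its size, so for a large vocabulary some kind of index in front of the distance check would probably be needed.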
 
s mahen perera
Ranch Hand
Posts: 101
Thanks for that, Ulf! I appreciate it.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13071
Years ago I coded a phonetic lookup tool to help a legal transcription service resolve different spellings of the names of witnesses and others. There were lots of variations, since the text came from court reporters listening to witnesses.

Here is an online demo.

Code for the Metaphone and other phonetic match algorithms is in the Apache Commons project here.

Bill
 
s mahen perera
Ranch Hand
Posts: 101
Wow, super tool! I will check that out. Thanks!
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Metaphone is cool, Bill. I never got past Soundex, which was invented in 1918. I wonder what language they were programming in back then. That Wikipedia page had a link to other phonetic algorithms of interest.
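
For anyone who has not seen Soundex, here is a minimal hand-rolled sketch of the common American variant: keep the first letter, map the following consonants to digits, collapse repeated codes, and pad to four characters. The class name SoundexSketch is hypothetical and this is not the Commons Codec implementation, which may differ in edge cases such as non-ASCII input.

```java
// Hypothetical sketch class; not the Commons Codec Soundex implementation.
final class SoundexSketch {

    // Digit for each letter A..Z; '0' marks letters that carry no code (vowels, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    static String soundex(String word) {
        StringBuilder out = new StringBuilder(4);
        char lastCode = '0';
        for (int i = 0; i < word.length() && out.length() < 4; i++) {
            char c = Character.toUpperCase(word.charAt(i));
            if (c < 'A' || c > 'Z') {
                continue; // skip anything that is not an ASCII letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);   // the first letter is kept as-is
                lastCode = code; // but its code still suppresses an immediate repeat
            } else if (c == 'H' || c == 'W') {
                // H and W are skipped and do not separate two equal codes
            } else if (code == '0') {
                lastCode = '0';  // a vowel separates two equal codes
            } else {
                if (code != lastCode) {
                    out.append(code);
                }
                lastCode = code;
            }
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Robert")); // R163
        System.out.println(soundex("Rupert")); // R163
    }
}
```

Two spellings that produce the same code, such as Robert and Rupert here, would be treated as candidates for the same name.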
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13071
Metaphone is indeed cool - Lawrence Philips identified a real need when he came up with it.

I had an astonishing amount of interest when I first put that demo up, including many inquiries from folks who had to match pronunciation in other languages, which would of course require completely new code. I have often wondered how many managed to make it work in their chosen language.

Bill
 
Stefan Wagner
Ranch Hand
Posts: 1923
The German Java Magazin just had an article about such searches, mostly dealing with Lucene:
http://lucene.apache.org/java/docs
 
Ulf Dittmer
Rancher
Posts: 42968
William Brogden wrote:
I have often wondered how many managed to make it work in their chosen language.

I just noticed that the DoubleMetaphone algorithm (which has been part of Commons Codec for a while) has provisions to detect and adapt to Slavo-Germanic languages. I'll definitely be looking into that, since I have a current need for something that works with German phrases.
 