
Phonetic String search for mis-spelled strings

 
Nina Anderson
Ranch Hand
Posts: 148
Hey Guys,

I have a web application that has a search page. When a user enters a search string, I query my SQL Server database for results.

However, if the user mis-spells a word in the search string, the database does not find anything. Does anyone know how to handle intelligent phonetic searches?

I'm CLUE-LESS, so I'll appreciate any insight...
 
Paul Sturrock
Bartender
Posts: 10336
Eclipse IDE Hibernate Java
What you are talking about is not really the domain of database searches. You could implement this sort of logic with lots of LIKE queries (or some such approach), but that just sounds like hard work. Instead, if you want free-text searching, you might consider using Lucene, which includes fuzzy searching.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
This is a problem I know a lot about due to my work with a legal service firm transcribing documents where people's names are spelled all sorts of ways. Actually, it is an old problem in computing.

You have to do some extra work to create a list of phonetically encoded words and use that list to look up possible matches.

This servlet demos one solution based on the "metaphone" phonetic coding algorithm.

The Apache Commons Project has gathered several phonetic encoding scheme toolkits for free download.
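The "list of phonetically encoded words" idea can be sketched roughly as follows. This is only an illustration under some assumptions: it uses a hand-written classic Soundex encoder instead of Metaphone (Soundex is much shorter to show, but cruder), and the class name PhoneticIndex and its methods are made up for this example. In a real application you would probably plug in an encoder from Commons Codec instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: groups words by a phonetic code so that
// differently spelled, similar-sounding words land in the same bucket.
class PhoneticIndex {
    // Soundex digit for each letter A..Z; '0' marks letters that are not coded.
    private static final String CODES = "01230120022455012623010202";

    private final Map<String, List<String>> byCode = new HashMap<>();

    // Classic 4-character Soundex (swapped in here for Metaphone).
    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (char raw : word.toUpperCase(Locale.ROOT).toCharArray()) {
            if (raw < 'A' || raw > 'Z') {
                continue; // ignore anything that is not a plain letter
            }
            char code = CODES.charAt(raw - 'A');
            if (out.length() == 0) {
                out.append(raw); // the first letter is kept as-is
                prev = code;
            } else {
                if (code != '0' && code != prev) {
                    out.append(code);
                }
                if (raw != 'H' && raw != 'W') {
                    prev = code; // H and W do not separate equal codes
                }
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    // Store a known word under its phonetic code.
    void add(String word) {
        byCode.computeIfAbsent(soundex(word), k -> new ArrayList<>()).add(word);
    }

    // Return all stored words whose code matches the query's code.
    List<String> lookup(String query) {
        return byCode.getOrDefault(soundex(query), Collections.emptyList());
    }

    public static void main(String[] args) {
        PhoneticIndex index = new PhoneticIndex();
        index.add("Anderson");
        index.add("Smith");
        System.out.println(index.lookup("Andersen"));
        System.out.println(index.lookup("Smyth"));
    }
}
```

The codes would be computed once for the words already in the database (for instance stored in an extra column), and the user's search term is encoded the same way at query time.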

Bill
 
Ulf Dittmer
Rancher
Posts: 42972
The Metaphone algorithm Bill mentions is very helpful. If the words (phrases, names, ...) are not English, have a look at DoubleMetaphone instead, which is also implemented by the Commons Codec library.

Another helpful algorithm for comparing words for similarity is the Levenshtein/Damerau Edit Distance.
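As a rough sketch of the edit distance idea, here is plain Levenshtein distance in Java (insertions, deletions and substitutions each cost 1). Note that this is not the Damerau variant, so a swapped pair of adjacent letters counts as two edits here. The class and method names are just made up for the example.

```java
// Hypothetical helper class; plain Levenshtein distance, no transpositions.
class EditDistance {
    // Number of single-character inserts, deletes and substitutions
    // needed to turn a into b, using two rolling rows of the DP table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1,   // insert
                                 prev[j] + 1),      // delete
                        prev[j - 1] + cost);        // substitute or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
        System.out.println(levenshtein("fonetic", "phonetic")); // prints 2
    }
}
```

A search could then accept any stored word whose distance to the search term is below some small threshold, say 2, but that means comparing against every candidate, so it is usually combined with a phonetic code or some other pre-filter.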
 
Nina Anderson
Ranch Hand
Posts: 148
I found out that SQL Server already has a Full-Text Search engine for this.

I hope it helps someone out there!!
http://www.eggheadcafe.com/articles/20010422.asp
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
Just glancing at that SQL Server page, it appears that there is no provision for phonetic matching or any other handling of mis-spelled words.

Bill
 