Phonetic String search for mis-spelled strings

 
Ranch Hand
Posts: 148
Hey Guys,

I have a web application that has a search page. When a user enters a search string, I query my SQL Server database for results.

However, if the user mis-spells a word in the search string, the database does not find anything. Does anyone know how to handle intelligent phonetic searches?

I'm CLUE-LESS, so I'll appreciate any insight...
 
Bartender
Posts: 10336
What you are talking about is not really the domain of database searches. You could implement this sort of logic with lots of LIKE queries (or some such approach), but that just sounds like hard work. Instead, if you want free-text searching, you might consider using Lucene, which includes fuzzy searching.
 
Author and all-around good cowpoke
Posts: 13078
This is a problem I know a lot about due to my work with a legal service firm transcribing documents where people's names are spelled all sorts of ways. Actually, it is an old problem in computing.

You have to do some extra work to create a list of phonetically encoded words and use that list to look up possible matches.
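To make the idea of a phonetically encoded lookup list concrete, here is a minimal sketch. It swaps in classic American Soundex because that fits in a few lines of plain Java; it is not the Metaphone coding discussed in this thread, and the class and method names (`PhoneticIndex`, `soundex`, `add`, `lookup`) are made up for illustration. It also assumes plain A-Z words and ignores every other character.

```java
import java.util.*;

// Hypothetical sketch: index words by a phonetic code, then look up by the code of the query.
class PhoneticIndex {
    // Soundex digit per letter A..Z; '0' marks vowels and Y, '-' marks H and W.
    private static final String CODES = "0123012-02245501262301-202";

    private final Map<String, List<String>> index = new HashMap<>();

    // Classic American Soundex: first letter plus three digits, zero-padded.
    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char last = '0'; // code of the previous letter that was not H or W
        for (char raw : word.toUpperCase(Locale.ROOT).toCharArray()) {
            if (raw < 'A' || raw > 'Z') continue;      // assumption: skip non-letters
            char code = CODES.charAt(raw - 'A');
            if (out.length() == 0) {
                out.append(raw);                        // keep the first letter as-is
                last = code;
            } else if (code != '-') {                   // H and W do not break a run
                if (code != '0' && code != last) out.append(code);
                last = code;                            // a vowel resets the run
            }
            if (out.length() == 4) break;
        }
        while (out.length() > 0 && out.length() < 4) out.append('0');
        return out.toString();
    }

    // Store the word under its phonetic code.
    void add(String word) {
        index.computeIfAbsent(soundex(word), k -> new ArrayList<>()).add(word);
    }

    // Return every stored word whose code matches the code of the query.
    List<String> lookup(String query) {
        return index.getOrDefault(soundex(query), Collections.emptyList());
    }

    public static void main(String[] args) {
        PhoneticIndex idx = new PhoneticIndex();
        for (String name : new String[] {"Smith", "Robert", "Jones"}) idx.add(name);
        System.out.println(idx.lookup("Smyth"));   // candidates sharing the code of "Smyth"
        System.out.println(idx.lookup("Rupert"));  // candidates sharing the code of "Rupert"
    }
}
```

In a database-backed search, the same code would typically live in an extra indexed column filled in when a row is written, so the query becomes an exact match on the code of the user's input.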

This servlet demos one solution based on the "metaphone" phonetic coding algorithm.

The Apache Commons Project has gathered several phonetic encoding scheme toolkits for free download.

Bill
 
Rancher
Posts: 43028
The Metaphone algorithm Bill mentions is very helpful. If the words (phrases, names, ...) are not English, have a look at DoubleMetaphone instead, which is also implemented by the Commons Codec library.

Another helpful algorithm for comparing words for similarity is the Levenshtein/Damerau Edit Distance.
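For anyone who wants to see what the edit distance looks like in code, here is a minimal sketch of the plain Levenshtein distance (insertions, deletions, and substitutions only). The class and method names are hypothetical. It does not implement the Damerau variant, which also counts a swap of two adjacent characters as a single edit.

```java
// Hypothetical sketch: plain Levenshtein distance using two rolling rows.
class EditDistance {
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a yields the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1,   // insertion
                                 prev[j] + 1),      // deletion
                        prev[j - 1] + cost);        // substitution or match
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // 3
        System.out.println(levenshtein("search", "serach"));  // 2
    }
}
```

A search page could treat stored terms within a small distance of the query (say 1 or 2, chosen by experiment) as candidate matches. Since a typical typo like "serach" already scores 2 here, the Damerau variant may be the better fit for typed input.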
 
Nina Anderson
Ranch Hand
Posts: 148
I found out that SQL Server already has a Full-Text Search engine for this.

I hope it helps someone out there!
http://www.eggheadcafe.com/articles/20010422.asp
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
Just glancing at that SQL Server page, it appears that there is no provision for phonetic matching or any other handling of mis-spelled words.

Bill
 