
Java String Search Algorithm/Implementation

 
Phoenix Kilimba
Ranch Hand
Posts: 64
Dear Sirs et Madames,
I am trying to implement a "search engine" of sorts that finds one or more locations based on what the user types into a search field. For example, I have a location called IKWIRIRI (a little village in Tanzania). Allowing for spelling errors from the user, I need a search algorithm or implementation that can take an input like IWKIRII or IRKWIIR and at least give me back a list of the best possible matches. Is this possible? Is there a ready-made API or implementation that would let me do this? If not at the Java level, is it possible at the database level (MySQL)? I know this is a Java forum, so forgive me for squeezing this question in.

Thanks in advance,
 
Paul Beckett
Ranch Hand
Posts: 96
It may not be exactly what you want, but there are algorithms out there for "sounds like" matches (e.g. Soundex and Double Metaphone). See Apache Commons Codec for an implementation.

Each algorithm produces an encoded form of a string. You could then compare the encoded form of the "real" name with the encoded form of the search string.
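
To make the "encode both sides, then compare" idea concrete, here is a minimal sketch. It is not the Commons Codec code: the `soundex` method below is a simplified, hand-written Soundex so the example runs with the JDK alone, and the class name `PhoneticLookup`, the helper names `buildIndex` and `lookup`, and the second place name are all made up for illustration. With Commons Codec on the classpath you should be able to pass `new DoubleMetaphone()::encode` as the encoder instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

class PhoneticLookup {

    // Soundex digit for each letter A..Z; '0' marks letters that get no digit.
    private static final String CODES = "01230120022455012623010202";

    // Simplified Soundex (illustrative stand-in for a library encoder).
    static String soundex(String word) {
        String s = word.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (s.isEmpty()) return "";
        StringBuilder out = new StringBuilder().append(s.charAt(0));
        char last = CODES.charAt(s.charAt(0) - 'A');
        for (int i = 1; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c == 'H' || c == 'W') continue;      // H and W do not separate equal codes
            char code = CODES.charAt(c - 'A');
            if (code != '0' && code != last) out.append(code);
            last = code;                             // a vowel resets 'last' to '0'
        }
        while (out.length() < 4) out.append('0');    // pad to four characters
        return out.toString();
    }

    // Group the known names by their encoded form.
    static Map<String, List<String>> buildIndex(List<String> names,
                                                Function<String, String> encoder) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        for (String name : names) {
            index.computeIfAbsent(encoder.apply(name), k -> new ArrayList<>()).add(name);
        }
        return index;
    }

    // Encode the user's input the same way and return every name sharing that code.
    static List<String> lookup(Map<String, List<String>> index,
                               Function<String, String> encoder, String query) {
        return index.getOrDefault(encoder.apply(query), List.of());
    }

    public static void main(String[] args) {
        // "Samplesville" is a made-up entry, only there to give the index a second key.
        List<String> places = List.of("Ikwiriri", "Samplesville");
        Map<String, List<String>> index = buildIndex(places, PhoneticLookup::soundex);
        System.out.println(lookup(index, PhoneticLookup::soundex, "Ikwereri"));
    }
}
```

A sound-alike misspelling such as "Ikwereri" lands on the same code as "Ikwiriri" in this sketch. A scrambled input such as IWKIRII does not (it has one R fewer, so its code differs), so phonetic matching on its own will probably not catch every typo.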
 
Ulf Dittmer
Rancher
Posts: 42972
I've had good success using the Double Metaphone implementation in the library Paul mentioned. I'd advise against using Soundex: it's primarily useful for English, and it can't deal with misspellings at the beginning of a word. Double Metaphone addresses both of these issues.
 