String searching

 
Ranch Hand
Posts: 70
I'm trying to implement a language filter on a message board. Essentially, I'll have a list of words in a database that are flagged as inappropriate, and I need to search the incoming string from each post for those words. I only need to detect whether any of them exist. Of course, I can just search for each individual word with a while loop, but that seems really inefficient. Is there any way I can search a string for ANY of, say, 10 different words?
 
Ranch Hand
Posts: 529
I can't think of any other way to do this. If the string is not too long and you're only looking for 10 or so words, this processing should not take long. The indexOf() method is pretty fast.
Does anyone else have ideas?
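
One idea, offered only as a sketch: java.util.regex can look for any of several words in a single find() call by joining them into one alternation. The class and method names below (AnyWordPattern, compile, containsAny) are made up for illustration, and case-insensitive matching is an assumption. Whether this is actually faster than repeated indexOf() calls would need measuring, and it still matches substrings, not whole words.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds one pattern that matches ANY of the given words.
class AnyWordPattern {
    static Pattern compile(String[] words) {
        if (words.length == 0) {
            // An empty alternation would match every message, so reject it.
            throw new IllegalArgumentException("need at least one word");
        }
        StringBuilder alternation = new StringBuilder();
        for (String word : words) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            // Pattern.quote treats the word as literal text, not as regex syntax.
            alternation.append(Pattern.quote(word));
        }
        // Case-insensitive matching is an assumption, not something the thread asks for.
        return Pattern.compile(alternation.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True if at least one of the words occurs anywhere in the message.
    static boolean containsAny(Pattern pattern, String message) {
        return pattern.matcher(message).find();
    }
}
```

The pattern can be compiled once when the word list is loaded and then reused for every incoming post.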

 
Ranch Hand
Posts: 241
Hi, everyone.
Barry, two things: your code will only do the "do something" if all of the taboo words exist in the string; I think Ben wants to not post the message if any of the naughty words exists in the string.
Second, even if Ben changes the &&'s to ||'s, Java will still have to do an indexOf() on every word in the list every time, even if the first one in that list invalidates the message. Could Ben do something like this?
1. Put the list of sought words in an array of Strings.
2. Set up a boolean, badWordFound, to false. Set up an int counter to zero.
3. Go through a while loop: While badWordFound is false, AND the array still has Strings to look for in it, loop.
4. If the array's word at position "counter" is found in the message string, set badWordFound = true.
5. Increment the counter
6. **End of while piece of code**
7. If badWordFound is true, do something; otherwise, call postMessage().
Possible, no?
Art

[This message has been edited by Art Metzer (edited December 05, 2001).]
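
The numbered steps above could be sketched roughly as follows. This is only an illustration: the class name BadWordLoop, the method containsBadWord, the placeholder words, and the case-insensitive comparison are assumptions, not anything posted in the thread.

```java
import java.util.Locale;

// Hypothetical sketch of the early-exit loop described in the steps above.
class BadWordLoop {
    static boolean containsBadWord(String message, String[] badWords) {
        // Lowercasing both sides is an assumption; drop it for exact-case matching.
        String lower = message.toLowerCase(Locale.ROOT);
        boolean badWordFound = false;   // step 2
        int counter = 0;                // step 2
        // Step 3: loop while nothing has been found and words remain.
        while (!badWordFound && counter < badWords.length) {
            // Step 4: indexOf() returns -1 when the word is absent.
            if (lower.indexOf(badWords[counter].toLowerCase(Locale.ROOT)) != -1) {
                badWordFound = true;
            }
            counter++;                  // step 5
        }
        return badWordFound;
    }

    public static void main(String[] args) {
        String[] badWords = {"darn", "heck"};   // placeholder words
        String post = "Well, heck.";
        if (containsBadWord(post, badWords)) {  // step 7
            System.out.println("Post rejected");
        } else {
            System.out.println("Post accepted");
        }
    }
}
```

Because the test is a plain substring search, a harmless word such as "check" would also trip the placeholder "heck".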
 
Ben Roy
Ranch Hand
Posts: 70
That is essentially what I have implemented for now. Turns out the really interesting part of the problem is in filtering out legitimate words that contain naughty ones. Like glass, bass, mass, etc. All of those words are OK, but under the filters we've discussed here the posts would be blocked.
 
Ben Roy
Ranch Hand
Posts: 70
At first I was thinking of finding the word, then checking the char before and after it to see if they were spaces. But then... if I just checked for " " + myNaughtyWord + " " in the first place, I could save a lot of yucky mucking around.
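
A rough sketch of that padded-space check, with one addition that is not in the post: punctuation is turned into spaces first and the whole message is padded at both ends, so a word at the start of the post or right before a comma still has a space on each side. The names WholeWordFilter, normalize, and containsWholeWord are hypothetical, and the [^a-z] character class assumes plain ASCII text.

```java
import java.util.Locale;

// Hypothetical sketch of matching whole words by padding them with spaces.
class WholeWordFilter {
    // Lowercases the text, replaces each run of non-letters with one space,
    // and pads both ends so every word is surrounded by spaces.
    static String normalize(String message) {
        // ASCII letters only: an assumption that would need revisiting for other alphabets.
        String lettersOnly = message.toLowerCase(Locale.ROOT).replaceAll("[^a-z]+", " ");
        return " " + lettersOnly + " ";
    }

    // True if any listed word appears in the message as a whole word.
    static boolean containsWholeWord(String message, String[] badWords) {
        String padded = normalize(message);
        for (String word : badWords) {
            if (padded.contains(" " + word.toLowerCase(Locale.ROOT) + " ")) {
                return true;    // stop at the first hit
            }
        }
        return false;
    }
}
```

With the placeholder word "heck", a post containing "check" passes this filter, while "Oh, heck!" is caught.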
 
"The Hood"
Sheriff
Posts: 8521

Originally posted by Ben Roy:
Like glass


Discrimination!!!

[This message has been edited by Cindy Glass (edited December 06, 2001).]
 