
How to find if a string contains a string from an enum or hash without a loop

 
Meir Yan
Ranch Hand
Posts: 599
Hello all,
I have a small dilemma. I'm trying to filter an InputStream by checking whether each line I read contains any string from a collection of strings,
but I can't find a way to do that without looping over the collection, and I would like to avoid the loop.

For example, if I have:
line.indexOf(.....here I don't know what to put ...)
If I store the string collection in a hash, some kind of list, or an enum, I will need to loop each time.
How can I avoid it?
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
If the collection of Strings to check for is in a HashSet, then you can just say

if (aHashSet.contains(aString)) ...

and there's no loop involved, either in your code or in the HashSet.
 
Meir Yan
Ranch Hand
Posts: 599
Hello, and thanks for the fast reply.
Maybe I didn't explain myself well, or I just don't understand (probably the latter).
How can that work?
Say I have a line that contains:
line = "blah blah with thisIsCommand_1 foo"
Now I have a hashtable that looks like:
hashSet = [thisIsCommand_1,thisIsCommand_1]
[thisIsCommand_2,thisIsCommand_2]
[thisIsCommand_3,thisIsCommand_3]

Now say I'd like to check whether my string "line" contains one of the hash elements. How can it be done?
I don't know what to look for in the string; I only know that I have a hashtable containing some strings, one of which I am looking for in the given string.
Thanks
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
Ah. Sure, you will have to loop over the words in the line (you might use the String.split() method to break out the words). I thought you were asking about looping over the words to which you were comparing them.

I suppose you could always say something crazy like

Set<String> tokens = new HashSet<String>(Arrays.asList(theLine.split(" ")));
if (!Collections.disjoint(tokens, theWordSet)) ...

but I think that's unlikely to be better than looping yourself.
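
For comparison, "looping yourself" over the words of the line could look roughly like the sketch below. The class and method names are made up for illustration, and it assumes the commands always appear as whole whitespace-separated words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not from the original thread.
final class LineWordFilter {
    // Returns true if any whitespace-separated token of the line is in the set.
    static boolean containsAnyWord(String line, Set<String> words) {
        for (String token : line.trim().split("\\s+")) {
            if (words.contains(token)) {   // hash lookup, no scan of the set
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> commands = new HashSet<>(Arrays.asList(
                "thisIsCommand_1", "thisIsCommand_2", "thisIsCommand_3"));
        System.out.println(
                containsAnyWord("blah blah with thisIsCommand_1 foo", commands));
    }
}
```

The loop runs over the words of the line, not over the set, so its cost depends on the line length rather than on how many commands there are.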
 
Henry Wong
author
Sheriff
Posts: 23284
[Meir]: Now say I'd like to check whether my string "line" contains one of the hash elements. How can it be done? I don't know what to look for in the string; I only know that I have a hashtable containing some strings, one of which I am looking for in the given string.


I don't see a way of avoiding the loop. You'll need to test each member of the container to see if it is a substring of the line.
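
A minimal sketch of that loop, with hypothetical names. Unlike splitting the line into words, it also finds a command embedded inside a longer token.

```java
import java.util.Collection;

// Hypothetical helper, not from the original thread.
final class SubstringFilter {
    // Returns the first candidate that occurs anywhere in the line, or null.
    static String firstMatch(String line, Collection<String> candidates) {
        for (String candidate : candidates) {
            if (line.contains(candidate)) {   // substring test, like indexOf(...) >= 0
                return candidate;
            }
        }
        return null;
    }
}
```

Which candidate is returned when several occur depends on the iteration order of the collection passed in.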

Henry
 
Bauke Scholtz
Ranch Hand
Posts: 2458
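This is simple: put the words of the line in a collection and call retainAll on it with the collection of strings you are looking for.

Keep in mind that retainAll itself already does a loop; see its implementation in AbstractCollection.java.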
[ October 10, 2006: Message edited by: Bauke Scholtz ]
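
The snippet from this post is not preserved. A retainAll-based check might look like the sketch below; the class and method names are assumptions, not the poster's code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical reconstruction of the idea, not the original code.
final class RetainAllFilter {
    // Returns the tokens of the line that are also in the word set.
    static Set<String> matches(String line, Set<String> words) {
        Set<String> tokens = new HashSet<>(Arrays.asList(line.trim().split("\\s+")));
        tokens.retainAll(words);   // iterates over tokens internally, keeping those in words
        return tokens;
    }
}
```

An empty result means the line contains none of the words; the loop has only moved inside the library call.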
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
If you're going to check the same set of words against millions of lines, you could loop through the words once and build a big hairy regular expression, compile it once, match with it many times. That might be better than looping through all the words of every sentence.

Regex will look something like (word)|(word)|(word) ... You can find the matching group to tell which word matched:
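
The original snippet is not preserved here. A rough sketch of the idea, with hypothetical names, could be the following. It quotes each word with Pattern.quote so that regex metacharacters in a word are matched literally.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the code from the original post.
final class AlternationFilter {
    private final List<String> words;
    private final Pattern pattern;

    AlternationFilter(Collection<String> wordsToFind) {
        words = new ArrayList<>(wordsToFind);
        StringBuilder sb = new StringBuilder();
        for (String w : words) {                 // loop over the words once, up front
            if (sb.length() > 0) sb.append('|');
            sb.append('(').append(Pattern.quote(w)).append(')');
        }
        pattern = Pattern.compile(sb.toString()); // compile once, match many times
    }

    // Returns the word that matched first in the line, or null if none did.
    String firstMatch(String line) {
        Matcher m = pattern.matcher(line);
        if (!m.find()) return null;
        for (int g = 1; g <= m.groupCount(); g++) {
            if (m.group(g) != null) {            // group g corresponds to words.get(g - 1)
                return words.get(g - 1);
            }
        }
        return null;
    }
}
```

Build one AlternationFilter per word list and reuse it for every line.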

I borrowed this trick from the Fitnesse Wiki.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
[Stan]: That might be better than looping through all the words of every sentence.

In theory it could be, but in practice I don't think so. The java.util.regex.Pattern class seems to compile this into a series of Branch nodes, which is roughly analogous to replacing a loop with a series of if / else if / else statements. So the "loop" is hidden now, but it's still there. There are pattern-matching algorithms that could do this sort of thing more efficiently, e.g. Aho-Corasick or Rabin-Karp, but they don't seem to be implemented in the Pattern class. (Boyer-Moore is, but it doesn't help us where alternation is concerned.) Maybe there's another regex package out there that already implements a fast multi-pattern matching algorithm. Otherwise it's roll your own, I guess. Probably not a good idea, since this is posted in the Beginner forum.
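
For the curious, a roll-your-own Aho-Corasick matcher might look something like the sketch below. It is untuned and written only to illustrate the idea (a trie of the keywords plus failure links, so the line is scanned once regardless of how many keywords there are); all names are made up.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Illustrative sketch only; names and structure are assumptions.
final class AhoCorasick {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;    // node for the longest proper suffix that is also a trie path
        String word;  // a keyword ending here (own, or inherited via fail links)
    }

    private final Node root = new Node();

    AhoCorasick(Collection<String> words) {
        // 1. Build a trie of all keywords.
        for (String w : words) {
            if (w.isEmpty()) continue;
            Node n = root;
            for (char c : w.toCharArray()) {
                n = n.next.computeIfAbsent(c, k -> new Node());
            }
            n.word = w;
        }
        // 2. Breadth-first pass to compute failure links.
        Queue<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node n = queue.remove();
            for (Map.Entry<Character, Node> e : n.next.entrySet()) {
                Node child = e.getValue();
                Node f = n.fail;
                while (f != root && !f.next.containsKey(e.getKey())) {
                    f = f.fail;
                }
                Node target = f.next.get(e.getKey());
                child.fail = (target != null) ? target : root;
                if (child.word == null) {
                    child.word = child.fail.word;  // a shorter keyword may end here too
                }
                queue.add(child);
            }
        }
    }

    // Returns a keyword occurring in the text (the one whose occurrence
    // ends earliest), or null if no keyword occurs.
    String firstMatch(String text) {
        Node n = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (n != root && !n.next.containsKey(c)) {
                n = n.fail;                        // fall back to a shorter suffix
            }
            Node step = n.next.get(c);
            n = (step != null) ? step : root;
            if (n.word != null) return n.word;
        }
        return null;
    }
}
```

Build the matcher once for the keyword set and call firstMatch on each line. For a handful of keywords the plain loop is likely simpler and fast enough.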
 