It's odd that Pattern.matches() and pattern.matcher(input).matches() give such different results. Pattern.matches() compiles a new Pattern, gets a Matcher for the input and calls matches() on it.
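(The snippet isn't reproduced here. As a rough sketch: the Javadoc says Pattern.matches(regex, input) behaves exactly like Pattern.compile(regex).matcher(input).matches(), so the method amounts to something like the following. The class and helper names are made up for illustration.)

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesEquivalence {
    // Illustrative stand-in for what Pattern.matches(regex, input) does.
    static boolean matchesLikePatternMatches(String regex, CharSequence input) {
        Pattern p = Pattern.compile(regex);   // compiled afresh on every call
        Matcher m = p.matcher(input);
        return m.matches();                   // must match the entire input
    }

    public static void main(String[] args) {
        String regex = ".*needle.*";
        String text = "hay needle hay";
        // Both calls should agree for any regex/input pair.
        System.out.println(Pattern.matches(regex, text));
        System.out.println(matchesLikePatternMatches(regex, text));
    }
}
```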
So in fact, they are the same (save one method call).
Oh wait, of course. In the second case the Pattern object is compiled once, outside of the loop, while Pattern.matches() recompiles it on every call. That's a bit of a cheat, isn't it?
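To make the difference concrete, here is a minimal sketch of the two loop shapes. I don't have the original benchmark, so the class name, method names and the counting logic are my own assumptions; only the placement of Pattern.compile() matters.

```java
import java.util.List;
import java.util.regex.Pattern;

class CompileOnce {
    // Recompiles the regex for every input, which is what calling
    // Pattern.matches() inside a loop does.
    static int countRecompiling(String regex, List<String> inputs) {
        int hits = 0;
        for (String s : inputs) {
            if (Pattern.matches(regex, s)) {
                hits++;
            }
        }
        return hits;
    }

    // Compiles once, outside the loop, and reuses the Pattern.
    static int countPrecompiled(String regex, List<String> inputs) {
        Pattern p = Pattern.compile(regex);
        int hits = 0;
        for (String s : inputs) {
            if (p.matcher(s).matches()) {
                hits++;
            }
        }
        return hits;
    }
}
```

Both methods return the same count; the second just avoids repeating the compilation work.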
Matt Cartwright wrote: PS: I wanted to attach the source code I've used. Unfortunately .java, .zip, .tar and .gz are not valid extensions.
Could you try renaming the file to *.txt or *.jpg? Or could you please send your test example to me via private message? I haven't found any performance measurements of regular expressions on the internet yet.
By the way, matching ".*" + needle + "*.*" means quite a bit of work, including a huge amount of backtracking, especially for strings that don't contain needle.
Here's why: the first ".*" greedily matches all the way to the end of the string (the dot matches every character except line terminators). Then the engine tries to match "needle", fails, backs up one character and tries again, and keeps doing this, moving backwards through the string, until it finds the last occurrence of "needle" (or reaches the beginning of the string, having made the whole round trip). Then the final ".*" matches the rest of the string.
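You can see the effect of that backtracking by capturing what the leading ".*" ends up holding. This is only a sketch: I've added a capturing group and simplified the pattern to "(.*)needle.*", and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyPrefix {
    // Returns what the leading greedy ".*" consumed,
    // or null if the string doesn't match at all.
    static String greedyPrefix(String text) {
        Matcher m = Pattern.compile("(.*)needle.*").matcher(text);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The greedy prefix stops just before the LAST "needle".
        System.out.println("[" + greedyPrefix("a needle b needle c") + "]");
    }
}
```

For "a needle b needle c" the captured prefix is "a needle b ", i.e. everything up to the last occurrence.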
Moreover, the regex ".*needle*.*" doesn't actually do what you want: "needle*" doesn't mean "needle" zero or more times, it means "needl" followed by zero or more "e" characters. If you want to find one or more repetitions of the substring "needle" within the string, you should use ".*?(needle)+.*" instead.
The leading ".*?" is a reluctant (lazy) match: it doesn't try to grab the whole string and then backtrack. Instead, it tries to match the next part of the expression after each character it consumes.
So could you please try your test with this pattern?
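Here is a small sketch of both points, quantifier scope and reluctant matching. The class and method names and the sample strings are mine, and the capturing group around ".*?" is added only so the prefix can be inspected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierScope {
    static final Pattern STAR_ON_E = Pattern.compile(".*needle*.*");
    static final Pattern GROUPED = Pattern.compile(".*?(needle)+.*");

    // "needle*" only repeats the final 'e', so "needl" alone is enough.
    static boolean starOnE(String text) {
        return STAR_ON_E.matcher(text).matches();
    }

    // "(needle)+" requires the whole word at least once.
    static boolean grouped(String text) {
        return GROUPED.matcher(text).matches();
    }

    // Returns what the reluctant prefix consumed before "needle",
    // or null if the string doesn't match.
    static String reluctantPrefix(String text) {
        Matcher m = Pattern.compile("(.*?)(needle)+.*").matcher(text);
        return m.matches() ? m.group(1) : null;
    }
}
```

With these, starOnE("a needl b") is true while grouped("a needl b") is false, and reluctantPrefix("a needle b needle c") stops at the first occurrence, giving "a ".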
Next, indexOf() only searches for the substring. Comparing it with a regular expression that has to match the whole string is rather unfair, since with matches() the regular expression must account for every character of the string.
The closest equivalent of text.indexOf("needle") >= 0 would be Pattern.compile("needle").matcher(text).find();
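If you also want the position, not just a yes/no answer, something like the sketch below should line up with indexOf(). The class and method names are hypothetical, and Pattern.quote() is my own addition so that a needle containing regex metacharacters is still treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsIndexOf {
    // Regex-based counterpart of text.indexOf(needle):
    // index of the first occurrence, or -1 if there is none.
    static int indexViaRegex(String text, String needle) {
        // Pattern.quote() makes the needle a literal, as indexOf() treats it.
        Matcher m = Pattern.compile(Pattern.quote(needle)).matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String text = "hay needle hay";
        System.out.println(text.indexOf("needle"));
        System.out.println(indexViaRegex(text, "needle"));
    }
}
```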