
String.equals() versus regex

 
Tom Scott
Greenhorn
Posts: 21
Hi all,

I need to compare two Strings.

String a = "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}"; // constant

String b; // b varies during execution but always has the same length and format, with other characters in place of 'x'.

I have written:

if (!a.equals(b))

...in my code.

I have been advised that this test could somehow be improved by using regular expressions. I am unsure what precisely is being suggested.

Can anyone explain precisely why this would be the case and how it would be done? I had always assumed that String.equals would be the most efficient way to compare the strings.

Thanks in advance.
Tom
 
Peter Chase
Ranch Hand
Posts: 1970
Sounds improbable. Surely, to compare strings for exact equality, you have to compare every character, and regular expressions cannot help. A regular expression might be faster for more sophisticated string comparisons, but not for exact equality, I think.

If you happened to know that certain parts of the string were more likely to differ than others, then you might win with a carefully constructed regex, I suppose. Or you might win with your own customised comparison method.
 
Tom Scott
Greenhorn
Posts: 21
Many thanks for replying.

In this specific example the { } and - characters are expected to be the same in all strings compared and to be in the same positions.

I am unsure how a regular expression would take advantage of this fact. I suppose I can imagine an equality test that doesn't bother looking at the 'fixed' characters, but it wouldn't be a true equality test. What if the assumption about these characters turned out to be wrong due to bad data or changing system behaviour?

Anyway - thanks again!

Tom
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Let's try some examples ... I'm guessing at your rules so they may be a bit off ...

That kind of thing is pretty easy to do with regular expressions - once you've made your first one work. Look at the JavaDoc for Pattern and see if you can make up a pattern that matches something simple, then build it up to this.
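A minimal sketch of building such a pattern up, assuming the 'x' positions hold hexadecimal digits (the thread never states which characters are allowed, so that part is a guess). The class and method names are made up for illustration:

```java
import java.util.regex.Pattern;

final class GuidPattern {
    // Step 1: something simple, anything wrapped in braces.
    private static final Pattern BRACED = Pattern.compile("\\{.*\\}");

    // Step 2: the full shape {8-4-4-4-12}.
    // Assumption: every 'x' is a hex digit; adjust the character class if not.
    private static final Pattern GUID = Pattern.compile(
            "\\{[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\\}");

    static boolean isBraced(String s) {
        return s != null && BRACED.matcher(s).matches();
    }

    static boolean looksLikeGuid(String s) {
        // matches() requires the whole string to fit the pattern.
        return s != null && GUID.matcher(s).matches();
    }
}
```

Note that this checks the format of one string. It does not say whether two strings are equal, so it complements a.equals(b) rather than replacing it.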
 
Peter Chase
Ranch Hand
Posts: 1970
Originally posted by Tom Scott:
what if the assumption about these chars turned out to be wrong due to bad data or changing system behaviour


You have to determine whether you really do need to compare all characters. If you do, nothing will beat String.equals(). If you don't, a custom-written equality test or a regex might theoretically do a bit better - if that little bit of performance really matters to you.
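For illustration, a custom-written test of that kind might look roughly like the sketch below. GuidCompare and sameVariableChars are hypothetical names, and the fixed positions are worked out from the 38-character layout in the original post. Whether this is measurably faster than String.equals() would have to be benchmarked; it is not claimed here.

```java
final class GuidCompare {
    // Positions of '{', '-', '-', '-', '-' and '}' in
    // "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}" (length 38).
    static boolean isFixedPosition(int i) {
        return i == 0 || i == 9 || i == 14 || i == 19 || i == 24 || i == 37;
    }

    // Compares only the variable characters. This is NOT a true equality
    // test: differences at the fixed positions go unnoticed.
    static boolean sameVariableChars(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (!isFixedPosition(i) && a.charAt(i) != b.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```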

If you use assertions, you might consider an assertion that documents and checks your assumptions about the format. This would alert you to problems during development, but could be turned off for production code.
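A sketch of such an assertion, assuming only that the braces and hyphens sit at the positions shown in the original post and that any character may fill the other slots. FormatAssumption, hasExpectedShape and sameId are invented names:

```java
import java.util.regex.Pattern;

final class FormatAssumption {
    // Documents the assumed layout: '{', 8 chars, then 4-4-4-12 separated by '-', then '}'.
    private static final Pattern SHAPE =
            Pattern.compile("\\{.{8}-.{4}-.{4}-.{4}-.{12}\\}");

    static boolean hasExpectedShape(String s) {
        return s != null && SHAPE.matcher(s).matches();
    }

    static boolean sameId(String a, String b) {
        // Evaluated only when assertions are enabled (java -ea);
        // skipped entirely otherwise, so production pays nothing for it.
        assert hasExpectedShape(a) : "unexpected format: " + a;
        assert hasExpectedShape(b) : "unexpected format: " + b;
        return a.equals(b);
    }
}
```

Here the comparison itself is still a.equals(b); the assertions would matter most if it were swapped for a shortcut that relies on the fixed characters.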
[ October 18, 2007: Message edited by: Peter Chase ]
 
Ilja Preuss
author
Sheriff
Posts: 14112
It depends on what was meant by "improved upon".

Regular expressions certainly won't help in making the code faster, or more expressive.

They *will* help if exact equality is in fact not what you want to test for.
 
Adam Schaible
Ranch Hand
Posts: 101
Depending on where this equality check takes place, you may consider using the == operator. Since all compile-time constant strings are interned, the comparison would be just as intuitive and drastically more efficient (although a 50% reduction of 2 microseconds is just 1 microsecond).

It's not directly testing equality of the strings, but it does follow the transitive logic pattern (i.e. if a=b and b=c, then a=c).

If your strings are not compile-time constants, you could always call String.intern() prior to calling your equality test method.
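A small sketch of what that looks like in practice. InternDemo and sameReference are hypothetical names, and the GUID value is made up:

```java
final class InternDemo {
    // A compile-time constant, so the JVM interns it automatically.
    static final String A = "{01234567-89ab-cdef-0123-456789abcdef}";

    // == compares references, so b must be interned first.
    static boolean sameReference(String b) {
        return A == b.intern();
    }

    public static void main(String[] args) {
        // Built at run time, so b is a distinct object with the same characters.
        String b = new StringBuilder("{01234567-89ab-cdef-")
                .append("0123-456789abcdef}")
                .toString();

        System.out.println(A == b);          // false: different objects
        System.out.println(A.equals(b));     // true: same characters
        System.out.println(A == b.intern()); // true: both are the pooled instance
    }
}
```

Two caveats worth weighing: intern() has to hash and look up the string itself, so it is not free, and String.equals() already returns true immediately when both references are the same object.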

Just my $.02
 