• Post Reply Bookmark Topic Watch Topic
  • New Topic

String: Lexicographically comparing them  RSS feed

 
Shyam Prasad Murarka
Ranch Hand
Posts: 209
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Dear Readers,
I am implementing my own compareTo() method for comparing two Strings. I have done it and it works. I am not quite satisfied and was looking for improvements, if any.


If you read this post, and find that there is no room for improvement, then please post a message saying so. Otherwise, I will be led into thinking that no one has read this topic.(I am assuming this is the best way of comparing it, so nobody would reply if I am right)
 
Joni Salonen
Ranch Hand
Posts: 53
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Why would you write a method of your own when the standard API provides us with lexicographic comparators?

There's String.CASE_INSENSITIVE_ORDER for, well, case-insensitive order, and java.text.Collator for true alphabetic order (e.g. in English a < � < b).
[ May 06, 2006: Message edited by: Joni Salonen ]
 
Robert Hill
Ranch Hand
Posts: 94
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Joni Salonen:
Why would you write a method of your own when the standard API provides us with lexicographic comparators?

There's String.CASE_INSENSITIVE_ORDER for, well, case-insensitive order, and java.text.Collator for true alphabetic order (e.g. in English a < � < b).

[ May 06, 2006: Message edited by: Joni Salonen ]


It is called learning how to program.
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
38
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
There's lots of room for improvement. There are two (sometimes conflicting) ways to go: more efficient, or more clear. My recommendations are for things that help both of these areas, but there are ways that would push one of these two to the extreme at the expense of the other -- depends what you want to do.

For example toLowerCase() is not a terribly expensive method to call, but neither is it free, so it's worth avoiding the calls if you can. You can do that by comparing the characters first, and only calling toLowerCase() if they're unequal. Similarly, all those calls to String.length() are unneeded.

Another thing is that the control flow you've chosen is confusing and hard to follow. do-while loops are rarely used because they're often hard to understand, and especially here, where you're just stepping an integer to a predertermined limit, they're vastly inferior to the more straightforward "for" loop, which puts all loop control on one line.

Finally, compareTo() is not required to return +/-1 or 0; it can return any positive or negative value. Therefore a nice shortcut to compare n1 and n2 is to return n1-n2, which returns a value of the right sign, or zero, as needed in each case.
 
Shyam Prasad Murarka
Ranch Hand
Posts: 209
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Dear Readers,
So there is an eficient way, after all. I have read the suggestions and I shall improve my code as soon as possible and post it here. Thanks very much.
But, I still haven't understood the last suggestion:


Finally, compareTo() is not required to return +/-1 or 0; it can return any positive or negative value. Therefore a nice shortcut to compare n1 and n2 is to return n1-n2, which returns a value of the right sign, or zero, as needed in each case.



When you say that I have to return n1-n2, does it refer to the Person references(in my program)? How and why will it return an Integer. I am confused!
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
38
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I meant that you can, for example, replace this:

result = (currentName.length()<paraName.length())?-1:1;

with this, which compiles to less bytecode and is easier to type and to read:

result = currentName.length() - paraName.length()
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hmmm, I'm hesitant to recommend that technique in general, though it works out perfectly well here. The problem is that it can lead to problems with overflow which can be somewhat difficult to detect. If you have two ints n1 and n2, n1 - n2 may give a result outside the range of an int - which means the result will be mangled by losing the most significant bit (effectively either adding or subtracting 2^32, whichever is needed). E.g. 2000000000 - -1000000000 does not yield 3000000000 as expected; it yields -1294967296. For this reason I recommend using > and < to compare integers, unless you are sure that overflow will not be an issue.

So why does it work out OK in this problem? Because the numbers being compared are both positive ints, and the largest possible difference between two positive ints is Integer.MAX_VALUE. Which fortunately still fits into the range of an int. Similarly there's another opportunity to use subtraction to compare two chars, and that works out OK because the result of char - char will be widened to an int, which is more than large enough to accommodate the difference between two chars.

If you follow the reasoning in the preceding paragraph, then you can use similar logic to decide on a case by case basis whether or not subtraction is a safe way to compare numbers. However if you're not sure, for any reason, then you're probably better off avoiding the subtraction and just relying on > and < instead. The code may be a little longer, but it will also probably be safer.
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!