• Post Reply Bookmark Topic Watch Topic
  • New Topic

Calling all Jeff's - or anyone else with an analytical mind :-)  RSS feed

 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Likes 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Back at the start of June, I got into an interesting debate with Jeff Verdegan on the virtues of instanceof vs getClass() when writing equals() methods for hierarchies. I've always preferred instanceof because it seems more "objective" to me; he likes getClass() because it's simple and safe (and there's no denying that).

During our discussion, Mike Simmons kindly pointed me to this article by Martin Odersky (which I'd seen before), that details a very neat instanceof solution that uses a canEqual() method (apparently quite well-known in Scala circles; it's near the bottom of the page).

However, I think I may have improved on it, and I'd like any deep thinkers out there to tell me if I'm right or not. I've tested it quite hard and found no flaws so far; but given the number of possibilities, I may well be missing the one that bursts my balloon.

Assumptions are as follows:
  • You don't override equals() unless you intend to restrict comparisons to a sub-branch of the hierarchy (the idea being that you don't have to make it final).
  • Whenever you override equals(), you ALSO override canEqual() to check for the current class.

  • Code is as follows. I've commented it with my reasoning, which hopefully you can follow.

    For the top-level class in your hierarchy:
    and for a subclass that overrides equals():and you can add as many "SubClass" levels as you like (well until you break the stack limit, anyway ).

    The reason for canEqual(), coupled with the 'that.canEqual(this)' check is to fulfill the "transitivity" requirement (if x.equals(y) and y.equals(z), then x.equals(z)).

    Basically, the difference between Odersky's approach and mine is that mine always bubbles up to the top level, whereas his eliminates current class or type differences immediately, so I suspect mine will be slower.
    On the other hand, mine checks superclasses in hierarchical sequence (ie, top-down), so I suspect it might be more conducive to subtler "subtree logic". However, I have absolutely no evidence for that beyond a "feeling".

    All comments gratefully received; even if it's just to tell me I'm an idiot and burst my bubble (but if you do, please say why ).

    Winston
     
    Mike Simmons
    Ranch Hand
    Posts: 3090
    14
    • Likes 2
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Hm, I agree. I just went through the exercise of starting with the Odersky implementation and eliminating all redundancies I could find, and I ended up with something very much like what you jut showed. Which is cool since I hadn't really grokked what you were writing about at first; I did mine pseudo-independently, and arrived at the same place. I think this is probably the preferred way to do it. At least, I prefer it.

    The only other improvement I came up with is that canEqual() might as well be protected. There's no need for anyone outside the hierarchy to ever call it. This keeps the API cleaner for clients to understand.

    Winston Gutkowski wrote:Basically, the difference between Odersky's approach and mine is that mine always bubbles up to the top level, whereas his eliminates current class or type differences immediately, so I suspect mine will be slower.

    I don't think there's any reason to assume that differences in subclass fields are more important than differences in superclass fields. Depends on the domain I guess - but I see no reason to think this method will be slower in general.
     
    Mike Simmons
    Ranch Hand
    Posts: 3090
    14
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Oh, and Winston: please be forthright when crossposting to other sites. [link]
     
    Chan Ag
    Rancher
    Posts: 1090
    14
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Thank you, Winston and Mike for sharing your knowledge with us. As for me, I will try to use these guidelines when I'm working on similar stuff. Perhaps there is a lot more to know, but I'm sure with you guys ( and others ) around, things won't remain very complicated for long.

    Thank you and best regards,
    Chan.
     
    Winston Gutkowski
    Bartender
    Posts: 10575
    66
    Eclipse IDE Hibernate Ubuntu
    • Likes 1
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Mike Simmons wrote:The only other improvement I came up with is that canEqual() might as well be protected. There's no need for anyone outside the hierarchy to ever call it. This keeps the API cleaner for clients to understand.

    Absolutely. Oddly enough, I came to the same conclusion myself after posting.

    I don't think there's any reason to assume that differences in subclass fields are more important than differences in superclass fields. Depends on the domain I guess - but I see no reason to think this method will be slower in general.

    Well, his method avoids a few calls to "super.equals()" (and that first "if (this == other)" check). Mind you, if your hierarchy is so enormous that it makes a lot of difference, you'd probably want to look at your design anyway.

    Thanks very much for your efforts Mike, I really appreciate it. And it's nice to know that we've come to similar conclusions independently.

    Winston
     
    Winston Gutkowski
    Bartender
    Posts: 10575
    66
    Eclipse IDE Hibernate Ubuntu
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Mike Simmons wrote:Oh, and Winston: please be forthright when crossposting to other sites. [link]

    Ooops, quite right. My apologies. Didn't get a sausage out of them though. I suspect the site is semi-dormant.

    Winston
     
    Winston Gutkowski
    Bartender
    Posts: 10575
    66
    Eclipse IDE Hibernate Ubuntu
    • Likes 1
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Chan Ag wrote:Thank you, Winston and Mike for sharing your knowledge with us.

    You're most welcome. Glad it interests you; it makes me feel slightly less geeky.

    As for me, I will try to use these guidelines when I'm working on similar stuff. Perhaps there is a lot more to know...

    Fine, but do make sure you test well. Like I say, I think it's right, but I haven't yet proved it - at least, not to my satisfaction - so: Caveat programmer.

    Also, the style that uses getClass() has two great merits to it:
  • It's simple.
  • Its safe.

  • That simplicity comes at some loss of "objectivity", but you should probably ask yourself if you really need it.

    Winston
     
    Martin Vajsar
    Sheriff
    Posts: 3752
    62
    Chrome Netbeans IDE Oracle
    • Likes 2
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Winston Gutkowski wrote:That simplicity comes at some loss of "objectivity", but you should probably ask yourself if you really need it.

    That's what I wanted to ask you. So here I go - in how many cases did you need something like this?

    I've mucked about with equals a few times and always got burned. In my opinion, anything but the getClass() technique strongly violates the KISS principle. Having equals bad is horrible mostly because it manifests itself in some code far, far away from the original sin (HashSet cannot be buggy! It must be something else - someone removed that object from my set...), and secondly (this might actually be just my own disposition), you don't expect equals to be wrong - equals is such a stupid simple method!

    (I've once spent several hours discovering my compareTo was not transitive. I've even resorted to debug the internals of TreeSet. I'm not going to forget for some time.)
     
    Winston Gutkowski
    Bartender
    Posts: 10575
    66
    Eclipse IDE Hibernate Ubuntu
    • Likes 1
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Martin Vajsar wrote:That's what I wanted to ask you. So here I go - in how many cases did you need something like this?

    A very fair question: Truth be told - not often. If I had, I'd have come up with the above long before now.

    However, I've usually got around it by making my equals() methods final.

    I've mucked about with equals a few times and always got burned. In my opinion, anything but the getClass() technique strongly violates the KISS principle.

    And here's where we disagree. KISS is great - I'm a huge believer in keeping things simple - but it MUST be correct.

    As Einstein said (paraphrased): "Everything should be as simple as possible, but no simpler."

    Yes, canEquals() is simple; but to my mind, it's inherently flawed: It assumes that every class lives in its own little world and has no interaction with any other, which is diametrically opposed to the idea of Object-Orientation.

    Objects (or classes) inherit - that's their nature - but the minute you use getClass(), you're saying: "I know this method is public, but you might just as well have made it private, because it doesn't interact with anything else."

    The simplest example I can give you is that an anonymous subclass of a class that uses getClass() is NOT EQUAL to its superclass. And that, to me, is a sin.

    Having equals bad is horrible mostly because it manifests itself in some code far, far away from the original sin (HashSet cannot be buggy! It must be something else - someone removed that object from my set...), and secondly (this might actually be just my own disposition), you don't expect equals to be wrong - equals is such a stupid simple method!

    Yeah, you'd think so wouldn't you? What could be simpler than asking: "am I equal to some other object?".

    Trouble is: it isn't simple; and both methodologies (getClass() or making the method final) simply side-step the issue instead of dealing with it.

    (I've once spent several hours discovering my compareTo was not transitive. I've even resorted to debug the internals of TreeSet. I'm not going to forget for some time.)

    Oh, don't get me started on Comparable...

    Thanks very much for your input Martin. Much appreciated.

    Winston
     
    Winston Gutkowski
    Bartender
    Posts: 10575
    66
    Eclipse IDE Hibernate Ubuntu
    • Likes 1
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Martin Vajsar wrote:Having equals bad is horrible mostly because it manifests itself in some code far, far away from the original sin (HashSet cannot be buggy! It must be something else - someone removed that object from my set...)

    Or that your object violates another principle (which I've so far only seen mentioned in Martin Odersky's article) - Defining equals() in terms of mutable fields.

    Basically, if you want to put an object into a HashSet, it better be well nigh immutable; otherwise you're in for trouble - getClass() or no getClass().

    Winston
     
    Martin Vajsar
    Sheriff
    Posts: 3752
    62
    Chrome Netbeans IDE Oracle
    • Likes 2
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Winston Gutkowski wrote:
    I've mucked about with equals a few times and always got burned. In my opinion, anything but the getClass() technique strongly violates the KISS principle.

    And here's where we disagree. KISS is great - I'm a huge believer in keeping things simple - but it MUST be correct.

    I was actually trying to express a slightly bit different idea - that equality among different classes is inherently not simple. It's not overwhelmingly difficult, but there are a few points that one has always to keep in mind, and - especially when returning to an old code to make just a small modification, it is really easy to forgot about some of them (for me, it's mostly transitivity).

    The canEqual concept is great - and surprisingly simple to understand or visualize, since it allows to define boundaries in the class hierarchy that cannot be surpassed by equality. The problem is that it is not well known. The next Java developer won't learn about it from javadoc or Java tutorials, and when he finds it in the code, he might misunderstand it (why the heck have two methods - equals and canEqual? Can't I do with just one of them?). If canEqual would be included in Java 8 or 9, well, that would be great. The alternative is to have all developers read this discussion.

    Winston Gutkowski wrote:Or that your object violates another principle (which I've so far only seen mentioned in Martin Odersky's article) - Defining equals() in terms of mutable fields.

    This is actually mentioned in the javadoc of Set and Map, so I was aware of it from the beginning. But I've now looked up the Set interface in the Java Collections tutorial, and it is not mentioned there. It's a serious omission, in my opinion.

    But I have a small advantage here - I strongly prefer immutable objects. They make life so much easier! When I do need to compare mutable objects, I sometimes create another method (different from equals) for that purpose, to avoid exactly this pitfall.
     
    Winston Gutkowski
    Bartender
    Posts: 10575
    66
    Eclipse IDE Hibernate Ubuntu
    • Likes 1
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Martin Vajsar wrote:I was actually trying to express a slightly bit different idea - that equality among different classes is inherently not simple.

    Oh, my apologies.

    It's not overwhelmingly difficult, but there are a few points that one has always to keep in mind, and - especially when returning to an old code to make just a small modification, it is really easy to forgot about some of them (for me, it's mostly transitivity).

    Yup, it's an absolute bugger.

    The canEqual concept is great - and surprisingly simple to understand or visualize, since it allows to define boundaries in the class hierarchy that cannot be surpassed by equality. The problem is that it is not well known. The next Java developer won't learn about it from javadoc or Java tutorials, and when he finds it in the code, he might misunderstand it (why the heck have two methods - equals and canEqual? Can't I do with just one of them?). If canEqual would be included in Java 8 or 9, well, that would be great. The alternative is to have all developers read this discussion.

    Actually, that's not a bad idea at all. All you'd need is:defined in Object; and the rest would be up to authors of Java books and tutorials.

    I can even see a theme: The Three Musketeers - equals(), canEqual() and hashCode() - "All for one; and one for all".

    Winston
     
    With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.
    • Post Reply Bookmark Topic Watch Topic
    • New Topic
    Boost this thread!