Why exactly override equals() and hashCode() method ?

 
Greenhorn
Posts: 3
Hello to all.

Something is bugging me. Please help me figure out exactly why we override the equals() and hashCode() methods. I read that we must override both because hash collections use them.

From what I understand, hashCode() is needed because a hash collection uses its value to choose the bucket an element is added to (I read somewhere that the default value is based on the object's address, maybe?), and equals() is used to compare a newly added object with the ones already there.

Frankly, I am not 100% sure I have this right, or why the collection needs both methods and exactly when it uses each one.
 
Sheriff
Posts: 14620
Welcome to the Ranch!

It's all about completeness and correctness, really. There might be some cases where you could potentially get by with using a class that only has equals() overridden, as long as it is never used in a scenario where the contract between equals() and hashCode() is assumed to be adhered to.

It's like a car that's supposed to be all wheel drive (AWD) but only ever runs in two-wheel drive (2WD) mode because its AWD is broken. If you never drive the car in conditions where you need AWD, then you'll never know the difference. However, if you get stuck in deep snow or mud and decide to go to true AWD (or expect the onboard computer to automatically switch to AWD for you), you'd be in a lot of trouble. The same kind of thing is true for a class that doesn't honor the equals()-hashCode() contract.

Hash codes are used to make searching in a hashtable fast. The equals() method is used to find the correct item. To simplify the concept, think of a huge box of colored balls. There are multiple colors, say 16 distinct colors. Think of the color as being the hashCode. Each of these balls will also be labeled with a number. The range of numbers that balls can be labeled with is from 0 to 255. Additionally, each color will have a specific range of consecutive numbers assigned to it (0-15 for white balls, 16-31 for yellow, 32-47 for green, etc.). This means that balls are labeled in such a way that if two balls have the same number label they are guaranteed to have the same color. However, it's not guaranteed that if two balls have the same color that they will have the same number. This rule is similar to the contract between equals() and hashCode, where two objects that are equals() are guaranteed to have the same hashCode() but two objects with the same hashCode() are not necessarily equals().

This is why these rules matter when sorting/searching relies on hashCode() and equals(). So you have a big box of mixed colored balls. Now imagine 16 smaller boxes. You're going to take balls from the big box and separate them according to color into the 16 smaller boxes, each small box containing only balls of the same color. This is what happens in a hash table, where each "bucket" will contain objects with the same hashCode() value. Since the range of colors is small, separating the balls according to color is a relatively fast and easy task because you don't have to look at the number, which has a wider range of values.

Now say you want to find a #5 ball. Because of the color-number scheme, you know a #5 ball is going to be a white ball. This helps you quickly eliminate 15 other small boxes to look in and you go directly to the box of white balls. Then you look through the balls in that box until you find one that's labeled #5. Similarly, a hashCode allows a search to quickly eliminate many items and zero in on a few possibilities in a "bucket" of things with the same hashCode. Then equals() is used to find the exact item you're looking for.

Now, if the color-number rules were not followed and it's possible that a white ball is labeled #20, then someone expecting to find a #20 ball in the box of yellow balls will never find it. Same thing goes if the hashCode-equals contract is violated.
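The ball analogy can be sketched in a few lines of Java. This is a toy model under stated assumptions, not how java.util.HashMap is written: the class names BallBoxes and Ball are made up for illustration, the color index stands in for hashCode(), and the number label stands in for equals().

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the colored-ball analogy (hypothetical names throughout).
class BallBoxes {
    static final int COLORS = 16;

    // A ball carries a number label (the "equals" part) and a color (the "hashCode" part).
    static final class Ball {
        final int number;
        final int color;
        Ball(int number, int color) {
            this.number = number;
            this.color = color;
        }
    }

    // One small box per color.
    private final List<List<Ball>> boxes = new ArrayList<>();

    BallBoxes() {
        for (int i = 0; i < COLORS; i++) {
            boxes.add(new ArrayList<>());
        }
    }

    // Sorting step: a ball goes into the box that matches its color.
    void add(Ball ball) {
        boxes.get(ball.color).add(ball);
    }

    // Search step: the number tells us which color to expect, so only one box is scanned.
    boolean contains(int number) {
        int expectedColor = number / COLORS; // 0-15 white, 16-31 yellow, ...
        for (Ball ball : boxes.get(expectedColor)) {
            if (ball.number == number) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BallBoxes boxes = new BallBoxes();
        boxes.add(new Ball(5, 0));   // white #5: follows the rule
        boxes.add(new Ball(20, 0));  // white #20: breaks the rule, #20 should be yellow

        System.out.println(boxes.contains(5));   // true
        System.out.println(boxes.contains(20));  // false: the search looks in the yellow box
    }
}
```

The mislabeled ball is still in the collection, but the search never looks in the box that holds it.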

Does that make sense?

 
Marshal
Posts: 67036
Welcome to the Ranch

Let's imagine you had a String class which hadn't overridden hashCode(), but had overridden equals(). Let's imagine you put something into a HashMap like this:-

The Unicode escape means ’ = apostrophe. That code would probably tell you null.
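A minimal sketch of the kind of scenario described above, not the poster's original code. The classes TextNoHash and TextWithHash are hypothetical stand-ins for a String-like class, one overriding only equals() and one overriding both methods.

```java
import java.util.HashMap;
import java.util.Map;

class EqualsWithoutHashCodeDemo {

    // Hypothetical String-like class that overrides equals() only.
    static final class TextNoHash {
        final String value;
        TextNoHash(String value) { this.value = value; }

        @Override
        public boolean equals(Object o) {
            return o instanceof TextNoHash && ((TextNoHash) o).value.equals(value);
        }
        // hashCode() is deliberately not overridden, so Object's version is inherited.
    }

    // Same class, but honoring the equals()-hashCode() contract.
    static final class TextWithHash {
        final String value;
        TextWithHash(String value) { this.value = value; }

        @Override
        public boolean equals(Object o) {
            return o instanceof TextWithHash && ((TextWithHash) o).value.equals(value);
        }

        @Override
        public int hashCode() { return value.hashCode(); }
    }

    public static void main(String[] args) {
        Map<TextNoHash, Integer> broken = new HashMap<>();
        broken.put(new TextNoHash("Campbell"), 1);
        // The two keys are equals(), but their inherited hash codes almost certainly differ,
        // so this very probably prints null.
        System.out.println(broken.get(new TextNoHash("Campbell")));

        Map<TextWithHash, Integer> working = new HashMap<>();
        working.put(new TextWithHash("Campbell"), 1);
        System.out.println(working.get(new TextWithHash("Campbell"))); // 1
    }
}
```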

Now, that sort of error doesn't happen because String does override hashCode(). You can however get that very same problem if you use a mutable object as a “K” in a hash collection. What happens is that the hash collection is backed by an array and it uses the hash code to find the bucket. There is more information in this old post of mine, but what if you changed the state of your “K” object? You would then have a different hash code, and you would probably search in the “wrong” bucket and not find your “K‑V” pair. If you did find the right bucket, the hash collection first checks that the hash codes are the same, only then applying equals(). If equals() returns true, then the collection has found the correct pair. But any differences in the hash code will cause the collection to fail to find the correct pair.
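The mutable-key problem described above can be reproduced with a small sketch. MutableKey is a hypothetical class written only for this illustration; it overrides both methods correctly, and the trouble comes purely from changing its state after it has been used as a key.

```java
import java.util.HashMap;
import java.util.Map;

class MutableKeyDemo {

    // Hypothetical key class whose hash code depends on a mutable field.
    static final class MutableKey {
        int id; // mutable on purpose, to show the problem

        MutableKey(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).id == id;
        }

        @Override
        public int hashCode() { return Integer.hashCode(id); }
    }

    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");
        System.out.println(map.get(key)); // value

        key.id = 2; // the hash code changes while the entry stays in its old bucket

        System.out.println(map.get(key));               // null: hash code no longer matches
        System.out.println(map.get(new MutableKey(1))); // null: equals() now fails against the stored key
        System.out.println(map.size());                 // 1: the pair is still there, just unreachable
    }
}
```

Making key classes immutable, or at least never changing the fields used by equals() and hashCode() while the object is in a hash collection, avoids this.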

The collection uses the hash code's rightmost n bits, after some bitwise jiggery‑pokery with shift operators and the exclusive OR operator, as an array index. That works very nicely when the size of the array is exactly 2ⁿ. If you find out about the behaviour of HashMap, you will see it doubles its array's size whenever it detects that its backing array is “full” depending on its load factor, so its capacity can always be expressed as 2ⁿ.
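The bit manipulation can be sketched roughly as follows. This is an illustration of the idea rather than a copy of the library source: OpenJDK's HashMap (Java 8 and later) mixes the high 16 bits into the low 16 bits in a similar way, but the method names spread and bucketIndex here are made up for this example.

```java
class BucketIndexDemo {

    // Fold the high bits into the low bits with a shift and an exclusive OR,
    // so that hash codes differing only in their high bits still spread out.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // Keep the rightmost n bits. This only works as a valid index calculation
    // when capacity is a power of two (2^n), because capacity - 1 is then a mask of n one-bits.
    static int bucketIndex(int hashCode, int capacity) {
        return spread(hashCode) & (capacity - 1);
    }

    public static void main(String[] args) {
        int h = "hello".hashCode();
        // Doubling the capacity uses one more bit of the same hash code.
        System.out.println(bucketIndex(h, 16));
        System.out.println(bucketIndex(h, 32));
    }
}
```

Because the mask is a simple bitwise AND, the index is always between 0 and capacity - 1, even for negative hash codes.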
 
Saloon Keeper
Posts: 21474
Here's a very specific case where you'd override equals() and hashCode().

When using a framework such as Enterprise JavaBeans/Java Persistence API (JPA), an object may represent one row of a database table (an Entity in JPA terminology). For performance's sake, Entity objects may be stored in a cache. However, half the point of having a database is to be able to update the database if you want to. So if the equals() and hashCode() results for an Entity before and after the update were different (since one or more data properties have changed), then the persistence manager (Entity Manager) wouldn't be able to properly track things. It would consider the two Entity objects to represent completely different things.

So by overriding equals() you can make it so that the EntityManager only compares the database key fields instead of all fields in the record. Since hashCode is a quick check method used to avoid tedious equals() searches, it, too, must be overridden because when two objects are equals() they must, by definition, have the same hashCode value. The hashCode, incidentally, makes it possible to do things like rapidly search a (HashMap) cache. As I said, it avoids tedious - and inefficient - linear searches using equals(). Only when two objects have the same hashCode is it worth doing an equals() check.
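A minimal sketch of key-based equality, assuming a hypothetical Customer class with a database key field named id. A real JPA entity would also carry mapping annotations, which are left out here so the example runs on its own, and a HashSet stands in for the cache.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EntityEqualityDemo {

    // Hypothetical entity: identity is the database key only, not the other columns.
    static final class Customer {
        private final Long id;  // database key, assumed to be assigned before use
        private String name;    // ordinary data column, free to change

        Customer(Long id, String name) {
            this.id = id;
            this.name = name;
        }

        void setName(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Customer)) return false;
            Customer other = (Customer) o;
            return id != null && id.equals(other.id); // compare key fields only
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id); // built from the same field as equals()
        }
    }

    public static void main(String[] args) {
        Set<Customer> cache = new HashSet<>();
        Customer customer = new Customer(42L, "Old Name");
        cache.add(customer);

        customer.setName("New Name"); // an update that does not touch the key

        System.out.println(cache.contains(customer));                       // true
        System.out.println(cache.contains(new Customer(42L, "Whatever")));  // true: same key
        System.out.println(cache.contains(new Customer(43L, "New Name")));  // false: different key
    }
}
```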
 
Jonh Dash
Greenhorn
Posts: 3
First, thank you all. It is a great honor that some of the best repliers answered my first question so quickly.

It makes a lot of sense, Junilu Lacar. It is a good analogy, and it clears up exactly how these collections work.

I have seen similar code examples and now better understand why that happens.

It is a great example of when I need to override them.
Now I finally understand why hash collections need both methods to work correctly and why it's important to override them. It definitely makes a lot of sense now and I can say with confidence that I know when and why they are used.

Thanks again for your time and help.
 
Jonh Dash
Greenhorn
Posts: 3
While we are on the subject, I would like to ask: I am now reading Effective Java by Joshua Bloch and am looking for other interesting books. Can you recommend some others, such as books on best practices, or books that explain topics like this one and other things you have to know?
 
Campbell Ritchie
Marshal
Posts: 67036
Remember that other people using your code will think you have overridden hashCode() if they see you have overridden equals().
I think there is so much to read in Effective Java (Joshua Bloch) that you won't get onto any other books for a week at least.
 
Saloon Keeper
Posts: 11018
It's not a book, but when I started out Java programming, I learned a lot from this site: http://www.javapractices.com

I don't know how it's held up over the years, but a quick glance indicates that they're keeping it updated.
 
Campbell Ritchie
Marshal
Posts: 67036
The first page I looked at on that website, Stephan, said not to bother too much with coding conventions. That worries me; I think you should stick strictly to coding conventions, even if you have to create your own conventions as you go. If you have to work in a team, that team will determine their conventions which everybody sticks to. They have a page about, “basic style errors,” which I think describes, “serious design errors,” but maybe that is only a naming difference. They show Hungarian notation and say you will see it in old code.

So I would take that website with a pinch of salt. You are right about updating; it was last updated in September 2018.
 
Campbell Ritchie
Marshal
Posts: 67036
Despite my misgivings, there is good stuff on that website, Stephan.
 