Abhishek Bose wrote:
1.Why it is necessary to override hashcode, and what if we do not override equals()..
2.also explain these in context of hashtable...!!!
Let's extend the Automobile example a bit.
You're the owner of a rental car company, with 4 rental-car lots. One is near the airport, one is near the train station, one near the ferry, and one near the bus-depot. You have a bunch of cars, and you need to decide which lot each car should go into.
You could allocate them by color: all the red cars at the airport, the blue ones at the train station, etc. Assuming an even distribution of colours, you'd have about the same number of cars at each lot. But your customers might complain: they want a red car, but all they find at the bus-depot is the gold ones.
You could allocate them by make ... BMW at the airport, Ford at the train-station, etc. Again, this might evenly distribute the number of cars into the 4 lots, but customer choice is again limited.
You could allocate them by model: compact, convertible, truck, SUV ... with the same problem.
So instead, you decide to create a hash function ... the output of which will determine the lot of each car:
Color: Red=0, Blue=1, Green=2, Gold=3Make: BMW=0, Toyota=1, Ford=2, Chevy=3Model: Compact=0, Convertible=1, Truck=2, SUV=3
Lot # = (Color + Make + Model) % 4
4 colours x 4 makes x 4 models = 64 varieties, with 16 varieties in each lot.
A customer at the airport (lot #0) could find a red compact BMW, or a blue Ford convertible, or a gold Chevy truck, etc. If they are looking for a particular model of car (convertible) they should find 4 kinds: Red-Chevy, Blue-Ford, Green-Toyota, Gold-BMW.
Here, the simple hashing function distributed the cars into four different lots. If 1000 new BMW's arrived, they all wouldn't end up in the same lot, they would be distributed roughly equally into the 4 lots. This is one of the properties we look for in a hashing function - it should produce an even distribution of values, even if the input is highly correlated (all BMW's, ... or all convertibles, ... or all blue).
Searching. A hash function should make it easier to find an item. Let's say one of our rental cars was involved in an accident. The police don't know exactly which one, but know it was a green Chevy convertible. Which lot to we search in? Green=2, Chevy=3, Convertible=1, so the lot # is 2 ... so the police should look for the car in the ferry lot. As opposed to looking at all cars, the search gets narrowed down to 1/4 of the entire set.
Our hashing function takes make, model, and colour of an Automobile and turns it into a lot #. Our hash table is the rental car lots; cars with the same hashcode go into and will be found the same place. Can two cars have the same hashcode? Of course! Will two red cars have the same hashcode? Not usually. Will two cars with the same colour, make, and model have the same hashcode? Yes. Will two cars with the same hashcode have the same colour, make, and model? Not likely.
A customer owns a gold, BMW convertible at home. They fly into our city, and want to rent the same kind of car as they have at home. Fortunately, gold-BMW-convertible hashes to 0, which is the airport lot, so they can find that kind of car at the airport.
If we don't override the hashcode(), a default hashcode is provided ... the object's memory location. This is not in anyway related to our car's make, model, and colour. Which lot would it go into? Any of them! How would we find it? We'd have to search through every lot! By providing a hashcode that relates to the search criteria, we can narrow down our search.
The hashcode is not used to
test for equality. The equals() function is used for that. Using the hashcode would not speed things up, and would likely slow things down. To test if two Automobiles are equal, we just check if they are both gold, both BMW's, and both convertibles. If we added the hashcode into the mix at the start, we'd have to get the first car's make, model and colour, do some math, get the second car's make, model and colour, do some math, and if they are the same, then we'd check if they are the same make, model and colour. We've done more work! But, it the objects have been segregated into different lots, a search for a particular object can be much, much faster if we just look in the appropriate lot.