Stephen Bell

Ranch Hand
since May 11, 2013

Recent posts by Stephen Bell

Campbell Ritchie wrote: I thought put() runs in amortised constant time.



Interesting, as I came back to ask exactly this.  I'm trying to get my head around the time complexities of the put and get operations of a HashMap, and my understanding thus far is:

get operations are 'expected O(1)' because, with good hashing, most get operations would be expected to complete in O(1) time, but a get could take O(log n) if multiple elements share a bucket and are stored in a balanced tree, or O(n) if they are stored in a linked list.  So, expected O(1) with worst case O(n).

put operations are 'amortised O(1)' because, while the load factor has not yet been exceeded, a put would again be expected to be O(1) (with similar reasoning to the get operations above), rising to O(n) when the load factor is exceeded and re-hashing is required.  Averaged over many insertions, that occasional O(n) resize still works out to constant time per put.
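If it helps to see the amortisation concretely, here's a minimal sketch (my own, assuming the default capacity of 16 and load factor of 0.75) that times each put(); most calls are fast, with occasional spikes where the map re-hashes:

    import java.util.HashMap;
    import java.util.Map;

    public class PutTimingDemo {
        public static void main(String[] args) {
            Map<Integer, Integer> map = new HashMap<>(); // capacity 16, load factor 0.75
            for (int i = 0; i < 100; i++) {
                long start = System.nanoTime();
                map.put(i, i);
                long elapsed = System.nanoTime() - start;
                // Spikes should roughly coincide with the resize thresholds (12, 24, 48...)
                System.out.printf("put #%d took %d ns%n", i, elapsed);
            }
        }
    }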

Am I on the right track with these assumptions?


Thanks :-)
3 years ago
That sounds interesting, Tim.  Thanks for the information, much appreciated.
3 years ago
Thanks Mike,

This is just what I was looking for: an actual code example showing the problem that this initial value fixes.

Cheers



3 years ago
Here is one of many:

https://courses.cs.ut.ee/2017/OOP/spring/Main/HashCode

It appears that this quote is taken from the book 'Effective Java' by Joshua Bloch.

Re the subclassing example I gave, just to clarify: if we have Animal (superclass) and then Dog and, let's say this time, Cat subclasses (each extending Animal), all of the classes take two parameters, 'int noOfLegs' and 'int age', in their constructors.  The equals() method has been overridden using both 'noOfLegs' and 'age' to determine equality, and the hashCode() method has been overridden using the code in the initial post with our two parameters.  Now suppose we did the following...
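    // illustrative; the constructor arguments are noOfLegs and age as described
    Animal dog = new Dog(4, 2);
    Animal cat = new Cat(4, 2); // same field values as the dog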



Now, obviously calling
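    dog.equals(cat)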



would return false, as they are clearly not 'equal' (because their runtime classes are different).  However, calling:
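    dog.hashCode();
    cat.hashCode();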



would result (assuming both subclasses are using the hashCode() method inherited from their superclass and haven't yet implemented their own) in both returning the same hash code, and therefore in both objects being shoved into the same bucket in our HashMap.

However, if we override hashCode() in each of the subclasses but this time use a distinct initial value in each, then calling hashCode() as above would yield different values, and therefore we would have less chance of the two objects going into the same bucket.
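In code, the two overrides might look something like this (my own sketch; 7 and 11 are just arbitrary distinct odd primes, and noOfLegs/age are the fields from the example above):

    // In Dog:
    @Override
    public int hashCode() {
        int result = 7;                  // Dog's initial value
        result = 31 * result + noOfLegs;
        result = 31 * result + age;
        return result;
    }

    // In Cat, identical except for the seed:
    @Override
    public int hashCode() {
        int result = 11;                 // Cat's initial value
        result = 31 * result + noOfLegs;
        result = 31 * result + age;
        return result;
    }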

Like I say, it's just an example and I only mention it in the context of trying to understand this initial value (I don't think this is what is being alluded to in the book, but it's the only real example I can come up with where the initial value would make any difference).

Cheers
3 years ago
Thanks Paul, appreciate your reply.

I realise that two distinct objects can return the same hash code and that this doesn't break the hashCode contract (as I stated above).  But say we had an Animal class with Dog and Bird subclasses: it's perfectly possible (and valid) for a Dog and a Bird to return the same hash code, and if they do, they will be placed into the same HashMap bucket.  Here, we could use different initial values to further reduce the possibility of a collision.  We know a Dog and a Bird can't possibly be the same object, so I guess there's no harm in trying to get them to hash to different buckets!  And this is the only reason I can think of for the existence of that initial value (although, to be honest, we could just as easily return 'super.hashCode() * someArbitraryValue' from the subclasses too!)

I keep seeing this quote all over the internet...

A nonzero initial value is used in step 1 so the hash value will be affected by initial fields whose hash value, as computed in step 2.a, is zero. If zero were used as the initial value in step 1, the overall hash value would be unaffected by any such initial fields, which could increase collisions. The value 17 is arbitrary



(Where the 17 is the 7 in my example.)  I understand the first part here, but what I don't get is how this in any way reduces the possibility of collisions (apart from the use case I stated).
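To convince myself of the first part, I knocked up a little sketch (the hash() helper below is my own, just mimicking the combining step from the recipe) showing how a zero initial value makes leading zero-hash fields invisible:

    public class SeedDemo {
        // Mimics the combining step from the recipe; 'initial' is the seed in question
        static int hash(int initial, int... fieldHashes) {
            int result = initial;
            for (int h : fieldHashes) {
                result = 31 * result + h;
            }
            return result;
        }

        public static void main(String[] args) {
            // With seed 0, a leading field whose hash is 0 contributes nothing:
            System.out.println(hash(0, 0, 5));    // 5
            System.out.println(hash(0, 0, 0, 5)); // 5  -- a collision
            // With a nonzero seed, the leading zero-hash fields still "count":
            System.out.println(hash(7, 0, 5));    // 6732
            System.out.println(hash(7, 0, 0, 5)); // 208542
        }
    }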

While I do understand that we can just ignore it, there clearly is a reason why this initial value exists; I'm simply trying to find that reason!

Thanks again for the reply

3 years ago
Hi All,

I've recently been looking more into the inner workings of Java's HashMap which led me to revisit the hashCode() and equals() methods.

Now, specifically concerning the hashCode() method: I've seen various examples that use an 'initial' value, which should be a non-zero (odd prime) number.  This is not the multiplier, which is also a prime number.  The same 'initial value' is present if I let my IDE generate the method for me.  Take the following example if you will...
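    // illustrative: the two int fields are just hypothetical examples
    @Override
    public int hashCode() {
        int result = 7;                  // <-- the 'initial value' in question
        result = 31 * result + noOfLegs; // 31 is the (separate) prime multiplier
        result = 31 * result + age;
        return result;
    }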



I've been trying to figure out in what situation(s) this initial value might be beneficial in reducing bucket collisions but I can't seem to find any actual examples anywhere.

Is it possible someone could help me out here?  Why this initial value?  Could someone possibly show a simple code example where there would be more collisions if this value were either...

  • Omitted altogether / set to 0
  • Set to any value other than an odd prime


I've been running my own tests, and as yet they're inconclusive.

The only situation I can think of is sub-classing.  If the equals() method in the super-class were to use the getClass() method to compare instances, then obviously an instance of the sub-class could not be 'equal' to an instance of the super-class.  However, they could still return the same hash code... so if we then overrode the hashCode() method in the sub-class, we could use a different initial value to get a different hash code, potentially avoiding a collision.  I realise that the two objects returning the same hash code does not break the hashCode contract, but if we know the objects aren't of the same type, then we may as well take action.

Am I on the right track, or are there other (non sub-classing) situations where this initial value is actually beneficial?

Thanks all, your thoughts are really appreciated!
3 years ago

Paul Clapham wrote:
I don't believe the compiler will look through your class hierarchy



Thanks Paul, yes, I agree that is probably the case; I just wanted to make sure I was on the right page and understanding things.  It confused me because I know that when casting objects the compiler does seem to look through the hierarchy, and making classes final can affect whether the compiler will allow something to compile or throw an error.  I thought it might be the same when using generics, but I guess not.

Thanks very much!!
6 years ago
Just out of interest, I've noticed that the opposite will not compile (assigning an upper-bounded ArrayList to a lower-bounded List reference), even when I would expect it to...
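    // e.g. (the only type this list can really hold is SemiDet):
    List<? extends SemiDet> upperBoundedList = new ArrayList<>();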

       

So, the above list can only really be of type 'SemiDet', as that is the 'lowest' class in my hierarchy, i.e., it has no subclasses.  Now, if I try this...
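    List<? super SemiDet> lowerBoundedList = upperBoundedList; // does not compile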

       

... it doesn't compile, even though 'upperBoundedList' must be of type 'SemiDet' and 'lowerBoundedList' can be assigned an ArrayList with a type of SemiDet all the way up to 'Object'.

I think I understand why, but I'd like to run it by you guys.  I think it's because, although SemiDet has no subclasses, the compiler doesn't take this into consideration.  Because I've declared my upperBoundedList with an upper-bounded wildcard, <? extends SemiDet>, it just assumes that subclasses could exist and therefore won't allow the assignment.  (Say I had a TwoBedSemiDet that extended SemiDet; then upperBoundedList could be of type TwoBedSemiDet, which obviously wouldn't be assignable to my lowerBoundedList reference, as its lower bound is SemiDet.)

It's odd, though, that it still doesn't compile even if I make SemiDet final and therefore exclude the possibility of subclassing it.

As always, thanks for your valuable help! :-)
6 years ago

Henry Wong wrote:

Can someone with deep knowledge of generics explain this one? First, it is using the diamond operator. This means that the type of the generics (that is being instantiated) is inferred. However, it is also using the wildcard. So, what is the type of the generic? How is it inferred? I have seen code like this before, and this always bothered me....



That's interesting, because I've wondered this too!  I've always assumed that the inferred type of the instantiated ArrayList is the actual upper/lower bound, so in the following example...
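    // e.g., diamond on the right, wildcard in the target type:
    List<? extends Residential> list = new ArrayList<>();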



I've always assumed the inferred type is 'Residential' - can anyone confirm this?
6 years ago
Piet,

I think I just got what you mean :-)

I'm forgetting about 'Object' and not factoring that in.  So, in fact, upperBList isn't compatible with all possible types of lowerBList, because lowerBList also includes 'Object'!

Thank you so much for pointing that out - I can't believe I didn't see it; I've been trying to understand this for hours!!

Regards
6 years ago
Hi Piet, and many thanks for the reply!

Yes, I've been staring at the compiler message for ages and I can't figure it out.

I realise that the last 4 examples I gave compile OK - that's what was confusing me: they seem to be equivalent to the line that doesn't compile (i.e., all the possible types of 'lowerBList' are valid for 'upperBList', or so it seems).  I know that lists with upper bounds are not modifiable (well, apart from clearing them or removing elements), but I just can't understand for the life of me why the assignment doesn't compile - it's very confusing :-(

So the compiler cannot guarantee that '? extends Building' is correct, and therefore it won't compile.



Could you clarify this part for me please?

Many thanks for your help, really appreciate it!
6 years ago
Sure, the compiler throws this at me...

incompatible types: List<CAP#1> cannot be converted to List<? extends Building>
  where CAP#1 is a fresh type-variable:
    CAP#1 extends Object super: Residential from capture of ? super Residential



Many thanks
6 years ago
Greetings all,

I'm having a few issues understanding something regarding upper- and lower-bounded lists, and I would really appreciate it if someone could help me understand what's going on here!

Let's say I have a simple class hierarchy: Building > Residential > House > SemiDet (SemiDet extends House, House extends Residential, and Residential extends Building).
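In code, the skeleton looks something like this:

    class Building { }
    class Residential extends Building { }
    class House extends Residential { }
    class SemiDet extends House { }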

Then let's say I create a List with a lower-bounded type...
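    List<? super Residential> lowerBList = new ArrayList<>();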



Now, the type of the above List can only be Residential or Building.  So, let's say I now create another List reference (this time upper-bounded) to which I will attempt to assign my first List...
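    List<? extends Building> upperBList = lowerBList; // does not compile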



I'm trying to figure out why the above fails to compile.  The type of upperBList can be Building, Residential, House or SemiDet, and lowerBList can be either Residential or Building, so should this not be assignable?

In other words, all of the following are valid...
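    // each concrete type that lowerBList could hold is fine on its own:
    List<? extends Building> ok1 = new ArrayList<Residential>();
    List<? extends Building> ok2 = new ArrayList<Building>();
    List<? super Residential> ok3 = new ArrayList<Residential>();
    List<? super Residential> ok4 = new ArrayList<Building>();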



Am I missing something!?

Thanks all, appreciate any advice that could improve my understanding :-)
6 years ago
Thanks for the replies all, much appreciated!
8 years ago