This generates the following disassembled code for i++ and j++ (the rest of the disassembly has been removed):
This is what I understand about the following assembly code:
mov r10,7d5d0e898h : moves the pointer to the IncrementClass object into register r10
inc dword ptr [r10+74h] : increments the 4-byte value at address [r10+74h] (i.e. i)
mov r11d,dword ptr [r10+70h] : moves the 4-byte value at address [r10+70h] into register r11d (i.e. moves the value of j into r11d)
inc r11d : increments r11d
mov dword ptr [r10+70h],r11d : writes the value of r11d back to [r10+70h] so it is visible to other threads
lock add dword ptr [rsp],0h : locks the memory address at the stack pointer rsp and adds 0 to it
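For reference, a minimal class that would produce disassembly like the above might look as follows. This is a sketch based on the discussion, not the original source: the class name, field names, and field layout are assumptions.

```java
// Hypothetical sketch of the class under discussion: `i` is a plain int,
// `j` is volatile, so only j++ is followed by the `lock add [rsp],0h`
// store barrier in the JIT output shown above.
class IncrementClass {
    int i;           // plain field: a single `inc [r10+74h]`, no barrier
    volatile int j;  // volatile field: load, inc, store, then lock'd add

    void increment() {
        i++;  // one inc instruction on x86, but still not atomic across threads
        j++;  // a volatile read-modify-write: not atomic either
    }

    public static void main(String[] args) {
        IncrementClass c = new IncrementClass();
        // Warm the method up so the JIT compiles increment() to native code.
        for (int k = 0; k < 1_000_000; k++) c.increment();
        System.out.println(c.i + " " + c.j); // prints "1000000 1000000"
    }
}
```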
The JMM states that before each volatile read there must be a load memory barrier and after every volatile write there must be a store barrier. My questions are:
1. Why isn't there a load barrier before the read of j into r11d?
2. How does the lock add to [rsp] ensure the value of j in r11d is propagated back to main memory? All I can find in the Intel specs is that LOCK gives the CPU an exclusive lock on the specified memory address for the duration of the operation.
3. Why is i++ not atomic or thread safe? The instruction incrementing i, "inc dword ptr [r10+74h]", writes directly to memory, so every other thread should be able to see the new value. From what I understand, when the CPU writes to memory as above the value is held in a cache line and doesn't go all the way to main memory, so an explicit instruction is needed for it to be written out, which I believe is the LOCK prefix. But how does a LOCK on the stack pointer ensure the value in the cache gets written to memory?
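To see concretely why i++ on a plain field is not thread safe even though it compiles to a single inc instruction, here is a small demo (the class and field names are made up for illustration). The read-modify-write performed by `inc [mem]` has no LOCK prefix, so two cores can interleave and lose updates:

```java
// Demonstrates lost updates on a plain int field: `inc [mem]` without a
// LOCK prefix is not atomic between CPUs, so the final count is almost
// always less than the 2 * N increments actually performed.
class PlainIncrementRace {
    static int counter;              // plain field: no volatile, no lock
    static final int N = 1_000_000;

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int k = 0; k < N; k++) counter++; // read, add 1, write back: racy
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // On a multicore machine this usually prints a value below 2,000,000.
        System.out.println("counter = " + counter + " (up to " + (2 * N) + " expected)");
    }
}
```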
First, your example looks flawed to me in that (although this may not be the issue in this case) volatile does not guarantee a memory barrier, even on CPUs with the weakest memory models. If Java is sure at runtime that the volatile variable is not accessed by multiple threads (ignoring class-loader static initialisation), then it can optimise the memory barrier away. This is really clever and efficient.
So in your example I think it's a valid optimisation to ignore volatile. Could we add a second thread reading/updating j and check the result?
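The suggested experiment could look roughly like this (a sketch; the class and field names are assumptions). A second thread spins reading the volatile j, so the JIT can no longer prove single-threaded access, and the volatile guarantee forces the write to become visible to the reader:

```java
// A second thread reads volatile `j` while the main thread writes it.
// Because `j` is volatile, the reader is guaranteed to eventually see the
// write; with a plain field, the spin loop could in principle run forever.
class VolatileVisibility {
    static volatile int j;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (j == 0) { /* spin until the write to j becomes visible */ }
            System.out.println("reader saw j = " + j);
        });
        reader.start();
        Thread.sleep(100);   // give the reader time to start spinning
        j = 42;              // volatile write: published with a store barrier
        reader.join(1000);
        if (reader.isAlive()) {
            throw new IllegalStateException("reader never saw the volatile write");
        }
    }
}
```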
After that we're into Intel's memory model, which from memory is quite strong, with one caveat: normally I just look at the Java bytecode and assume they get the assembler right ;-). I would tweak the Java code and retest first, and then we can look at it again.
PS: i++ is not thread safe in Java, but that doesn't mean it won't appear safe in a particular circumstance (you're arguing about a given CPU), i.e. you may get away with it on given hardware / OS / Java version etc., but Java is meant to be portable.
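For portable thread safety, the usual fix is to make the increment itself atomic rather than relying on a particular CPU's behaviour. A minimal sketch using java.util.concurrent.atomic (class names here are invented for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Portable thread-safe increment: AtomicInteger.incrementAndGet() is
// implemented with a LOCK'd instruction on x86, making the entire
// read-modify-write atomic across cores.
class AtomicIncrement {
    static final AtomicInteger counter = new AtomicInteger();
    static final int N = 1_000_000;

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int k = 0; k < N; k++) counter.incrementAndGet();
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.get()); // prints "2000000" every run
    }
}
```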