Originally posted by Jeroen T Wenting: the location in (type of) memory where the String your program receives is stored.
No, actually, but this is a very common misconception. Jeroen is referring to the fact that "String literals are stored in the string pool," which is correct. But all that means is that a reference to the String is stored in an internal Hashtable-like data structure. The string literal itself is just an ordinary String, stored in ordinary heap memory.
To answer the original poster's question correctly: the first line constructs a brand-new String object, while the second line refers to an existing String that is constructed when the class is loaded and initialized. The first version is a needless expense: because Strings are immutable there is essentially never any reason to copy one.
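To make the difference concrete, here is a minimal sketch. The class and method names are made up for illustration and are not from the original question. It compares the two forms by identity (==) and by content (equals()):

```java
// Hypothetical demo class; the names are illustrative only.
class LiteralVsNew {
    // Returns a freshly constructed copy of the literal.
    static String constructed() {
        return new String("Hello");
    }

    // Returns the String object that the literal itself refers to.
    static String literal() {
        return "Hello";
    }

    public static void main(String[] args) {
        String a = constructed();
        String b = literal();
        System.out.println(a == b);         // false: a is a separate object
        System.out.println(a.equals(b));    // true: same characters
        System.out.println(b == literal()); // true: the literal is always the same object
    }
}
```

The only observable difference is object identity, which is why the extra copy buys nothing for an immutable type.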
So, does it mean that if "Hello" has not been instantiated previously then String a = "Hello" will create the String object (as in 'String a = new String("Hello");') and store reference 'a' in the data structure?
String a = "Hello" will always just use a preexisting String. That literal "Hello" is compiled into a String object that is created when the class is loaded and initialized. All literals "Hello" (at least, all literals "Hello" used by all classes loaded by a given class loader) will use the same object whenever the quoted String "Hello" appears in the code.
A literal "Hello" in class A will be a different object from literal "Hello" in class B, is this correct? Does each class have its own String pool? Or, is the String pool shared across different classes in a single JVM?
Within one class, if both method M1 and M2 have 'String literal = "Hello"', then does the 'literal' reference actually point to the same object (i.e. the one that is created when the class is first initialized)?
All the classes that are loaded by a given class loader definitely share a single literal pool. Early versions of the language spec were unclear on whether different class loaders may have their own string literal pools; I'm not aware of whether this was clarified in the most recent spec version.
But unless you're doing something fancy with multiple class loaders, every literal "Hello" in a program definitely refers to the same single String object -- and they may still even if you are.
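A quick way to check this for yourself, assuming everything is loaded by the same class loader (the ordinary case). ClassA, ClassB and the method names below are hypothetical, chosen only to mirror the question about two methods and two classes:

```java
// Hypothetical classes; each method returns its own "Hello" literal.
class ClassA {
    static String m1() { return "Hello"; }
    static String m2() { return "Hello"; }
}

class ClassB {
    static String greeting() { return "Hello"; }
}

class SharedLiteralDemo {
    public static void main(String[] args) {
        // Two methods in the same class: same object.
        System.out.println(ClassA.m1() == ClassA.m2());       // true
        // Two different classes: still the same object.
        System.out.println(ClassA.m1() == ClassB.greeting()); // true
    }
}
```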
I'm pretty sure there's just one string intern pool, as described in String.intern(). Classes may contain additional references to the strings which are referenced in the intern pool, but that doesn't change that there's just one intern pool.
There are, however, several other pools which are referenced in the specs. These include a constant pool which is part of the class file structure, and a runtime constant pool, which is an in-memory representation of the same information (after a class is loaded). These pools are not the same as the intern pool described in String.intern() and elsewhere. The constant pools contain all compile-time constants for a class (not just Strings), and there's one pool per class. (Well, one constant pool per class file, and one runtime constant pool per loaded class.) The String intern pool contains only Strings, and there should be only one intern pool in the whole JVM. As far as I've been able to determine, anyway.
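For what it's worth, the intern pool can be observed directly through String.intern(). This is only a sketch with a made-up class name: it builds "Hello" at runtime, so the result is not a literal, and then asks the pool for the canonical instance:

```java
// Hypothetical demo class; the names are illustrative only.
class InternDemo {
    // Assembles "Hello" at runtime, so the result is a new String, not the literal.
    static String buildAtRuntime() {
        return new StringBuilder("Hel").append("lo").toString();
    }

    public static void main(String[] args) {
        String literal = "Hello";
        String built = buildAtRuntime();
        System.out.println(built == literal);          // false: distinct objects
        System.out.println(built.intern() == literal); // true: intern() returns the pooled instance
    }
}
```

Nothing here touches the per-class constant pools; it only shows that the pool behind String.intern() is the one that literals resolve to.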
EFH, when you say earlier references were unclear about this, is it possible this was confusion between the different types of pools? Or was there something else which suggested there might be multiple intern pools? There are, of course, various parts of the specs that can be ambiguous; the JLS and JVMS are not perfect. But is there something specific about this issue which is (or was) ambiguous?
[EFH]: But unless you're doing something fancy with multiple class loaders, every literal "Hello" in a program definitely refers to the same single String object -- and they may still even if you are.
Based on past discussions here, plus testing, it seems that the only case that we have found where two identical literals can refer to different instances is this one: if you use different classloaders to load and unload classes, you can create a situation in which a literal refers to one instance, and then later, an identical literal refers to a different instance. This is only possible if the literals (or rather, the classes that contain them) are never in the JVM at the same time. Once you unload a class, the intern pool does not prevent GC, so it's possible to forget the old reference. In which case an identical literal may later refer to a different instance. Which is mostly academic, as there are very few ways to compare the identities of two instances which are not in memory at the same time, and usually no reason to do so. Except for the occasional discussion such as this one...
The arguments that I recall hinged, I think, on the fact that the relevant JLS section (I'm looking at 3.10.5 now) just doesn't mention class loaders at all, and therefore perhaps ClassLoaders threw a monkey wrench into the works. But now that I look at it, I can't see how you could infer that. I'd say this is just me being overcautious.
The String constant pool is described in more detail in the JVMS than it is in the JLS. The JVMS indicates that there may indeed be more than one, at least that's how I read it.
The way I read the JLS also leads me to believe that using the String constructor explicitly MAY create a heap instance of String containing a copy/clone of the instance retained in the String constant pool for the relevant classloader rather than just a new reference to that instance.
[Jeroen]: The String constant pool is described in more detail in the JVMS than it is in the JLS. The JVMS indicates that there may indeed be more than one, at least that's how I read it.
What do you mean by "String constant pool"? There are several types of pools described in the JVMS, and "String constant pool" sounds like a conflation of two (or three) different things. As I said earlier, there is a constant pool in every class file, and a runtime constant pool associated with each individual class that has been loaded in the JVM. Neither of these is for Strings only. There is also a pool described in the API for String.intern() (and elsewhere) which is for Strings only. This is the one I generally call the intern pool, or String intern pool, and it is NOT the same as the previously mentioned constant pool or runtime constant pool. Generally the intern pool is the one that comes up most often in discussion, because it's the one responsible for making String constants evaluate to the same instance. So - which of these are you referring to?
The only parts of the JVMS that I could find referring to the intern pool (albeit indirectly) are here and here. Which mostly just repeat the info presented elsewhere. Is there something more you're referring to?
[Jeroen]: The way I read the JLS also leads me to believe that using the String constructor explicitly MAY create a heap instance of String containing a copy/clone of the instance retained in the String constant pool for the relevant classloader rather than just a new reference to that instance.
Well, using a constructor is certainly supposed to create a new instance, period. And as far as I know that's always the case. (Not that they couldn't have designed the language differently, but they didn't.) I'm not sure why there would be any question about this, or how it relates to the other questions. Why "MAY"? [ July 17, 2006: Message edited by: Jim Yingst ]
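A small sketch of what "a new instance, period" looks like in practice. The class and helper names are hypothetical. Each constructor call yields its own object, even when the argument is the same literal:

```java
// Hypothetical demo class; the names are illustrative only.
class ConstructorDemo {
    // Wraps the String(String) constructor so each call is easy to compare.
    static String copyOf(String s) {
        return new String(s);
    }

    public static void main(String[] args) {
        String literal = "Hello";
        String first = copyOf(literal);
        String second = copyOf(literal);
        System.out.println(first == literal);     // false: a copy, not the literal
        System.out.println(first == second);      // false: two separate copies
        System.out.println(first.equals(second)); // true: identical contents
    }
}
```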
As you can see from the responses above, the differences are subtle. For the functionality of your program it usually doesn't matter: the two statements have the same result (your String variable points to a String object with the text "Hello").
You should never use the new String("...") constructor with a string literal: it is unnecessary, and it only makes your code longer and less efficient. [ July 17, 2006: Message edited by: Jesper Young ]