(Level: beginner)
Several Java classes, in particular String, Integer, Character, Long, Byte, and Short, have internal caches to help avoid creating redundant objects for frequently used values.
In the case of String, it's actually built into the language (or rather the compiler - more below); for all the other classes mentioned, these caches are used when you call their valueOf() factory method - which all of them have - or when the object is boxed from a primitive value.
(If you're unfamiliar with autoboxing, you should probably read up on it before continuing.)
In my opinion, the fact that they have become public knowledge is unfortunate, because it prompts a lot of questions from beginners who start obsessing about them when there really is no need. The fact is that they're an implementation decision, and they are ONLY concerned with saving space.
That said, read on.
.
THE STRING POOL
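The Java compiler creates a cache called the 'String pool' for all String literals (strictly speaking, the compiler records each distinct literal in the class file and the JVM holds the pooled objects at runtime). Thus, if you write String s1 = "Hello"; and String s2 = "Hello"; you will get the same reference in both s1 and s2.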
The side effect of this is that if you compare the two with '==', for example System.out.println(s1 == s2); your program will print out true, since both s1 and s2 refer to the same object. However, this is a very bad habit to get into.
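As a minimal sketch of the pooling behaviour described above (the class and method names are made up for this illustration), the following shows that two identical literals share one object, and that equals() gives the same answer without relying on the pool:

```java
// Illustrative only: StringPoolDemo and its method names are hypothetical.
class StringPoolDemo {

    // Two identical literals refer to the same pooled String object.
    static boolean literalsShareReference() {
        String s1 = "Hello";
        String s2 = "Hello";
        return s1 == s2; // reference comparison, used here only to expose the pool
    }

    // equals() compares contents, which is what ordinary code should do.
    static boolean literalsAreEqual() {
        String s1 = "Hello";
        String s2 = "Hello";
        return s1.equals(s2);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareReference()); // true
        System.out.println(literalsAreEqual());       // true
    }
}
```

The '==' comparison appears only to make the shared reference visible; it is not a recommendation.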
Unless you're studying for the SCJP exam, the only thing you really need to know about String pooling (i.e., caching) is that you're better off not using the new keyword with String literals. In fact, you're generally better off not using it at all, since there is usually a better alternative (see the CONCLUSION section at the end). But, for illustration purposes only: String s = new String("Hello"); and String s = "Hello"; essentially do the same thing - create a String with the word 'Hello' in it - but in the first case the new keyword forces the JVM to create a new String object unnecessarily.
Since Strings - in fact, all of the classes listed above - are immutable, there's no reason to create multiple objects that contain the same value (at least, in 11 years of writing Java, I've never found one), which is why these caches were created in the first place.
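To make the cost of the new keyword concrete, here is a small sketch (hypothetical class and method names) showing that new String("Hello") produces a second object with the same contents as the pooled literal:

```java
// Illustrative only: NewStringDemo and its method names are hypothetical.
class NewStringDemo {

    // 'new' always allocates a fresh object, even though a pooled one exists.
    static boolean newCreatesDistinctObject() {
        String pooled = "Hello";
        String copy = new String("Hello"); // unnecessary extra object
        return pooled != copy;
    }

    // The contents are identical, so equals() still reports equality.
    static boolean contentsStillEqual() {
        String pooled = "Hello";
        String copy = new String("Hello");
        return pooled.equals(copy);
    }

    public static void main(String[] args) {
        System.out.println(newCreatesDistinctObject()); // true
        System.out.println(contentsStillEqual());       // true
    }
}
```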
.
WRAPPER CACHES
The other classes listed above also employ caches for their most commonly used values. However, unlike String, they're not generated by the compiler, but are part of the class itself.
For the Character class, the cache holds all the values associated with the standard ASCII character set; for Byte, Short and Integer, it holds the values -128 to +127. Character, Short and Integer might cache values in a larger range, and Long caches values, but does not specify which values. Thus, if you write Integer i1 = 100; and Integer i2 = 100; or Integer i1 = Integer.valueOf(100); and Integer i2 = Integer.valueOf(100); the JVM will place the same reference in both fields.
These caches are used when the wrapper objects are created using autoboxing, as well as when they are created using the wrapper class's valueOf() method. In both cases, the same internal cache is used, so they will only return a new object if they need to. Thus, you should rarely (if ever) have any reason to use the new keyword with any of them. Because the sizes of the caches can be increased beyond the range -128 to +127, the behaviour of the '==' operator on boxed wrapper class objects can vary from implementation to implementation, and so it should not be used.
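The following sketch (class and method names are hypothetical) illustrates both points: boxing and valueOf() draw on the same cache for small values, while values outside -128 to +127 may or may not be shared, which is why equals() is the only dependable comparison:

```java
// Illustrative only: WrapperCacheDemo and its method names are hypothetical.
class WrapperCacheDemo {

    // Autoboxing and valueOf() use the same cache for values in -128..127.
    static boolean smallValuesShareReference() {
        Integer boxed = 100;                     // autoboxed
        Integer viaFactory = Integer.valueOf(100);
        return boxed == viaFactory;              // shown only to expose the cache
    }

    // equals() works regardless of whether the values happen to be cached.
    static boolean sameValue(Integer x, Integer y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        System.out.println(smallValuesShareReference()); // true

        Integer big1 = 1000;
        Integer big2 = 1000;
        // The result of '==' here depends on the cache size of the JVM in use,
        // so no particular output should be expected.
        System.out.println(big1 == big2);
        System.out.println(sameValue(big1, big2)); // true
    }
}
```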
Note: The documentation for valueOf() in both Double and Float suggests that they also implement internal caches, but I see no sign of it as yet. However, you may save yourself a lot of refactoring if caches are ever implemented by treating those classes as though they already have them - i.e., by using valueOf() (or boxing) in your code, rather than creating Doubles or Floats with new.
You should also use valueOf() or boxing to create Boolean values, even though, strictly speaking, the class does not have a "cache". Or you can use the two static constants, Boolean.FALSE and Boolean.TRUE.
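As a brief sketch of the advice above (hypothetical class and method names), Boolean.valueOf() and boxing hand back the two shared constants, and Double values created with valueOf() or boxing are compared with equals() like everything else:

```java
// Illustrative only: BooleanAndDoubleDemo and its method names are hypothetical.
class BooleanAndDoubleDemo {

    // Boolean.valueOf() returns the shared constants rather than new objects.
    static boolean valueOfReturnsConstants() {
        return Boolean.valueOf(true) == Boolean.TRUE
            && Boolean.valueOf(false) == Boolean.FALSE;
    }

    // Boxing a boolean goes through the same constants.
    static boolean boxingReturnsConstant() {
        Boolean flag = true;
        return flag == Boolean.TRUE;
    }

    // For Double, make no assumption about caching; compare by value.
    static boolean doublesEqualByValue() {
        Double viaFactory = Double.valueOf(1.5);
        Double boxed = 1.5;
        return viaFactory.equals(boxed);
    }

    public static void main(String[] args) {
        System.out.println(valueOfReturnsConstants()); // true
        System.out.println(boxingReturnsConstant());   // true
        System.out.println(doublesEqualByValue());     // true
    }
}
```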
.
WARNING:
Although internal caches can help to eliminate duplicate objects, they are NOT an excuse to use '==' instead of equals().
If you're in any doubt about this, read the AvoidTheEqualityOperator page.
.
ADDITIONAL STUFF
intern()
If you're really space-conscious about Strings, you can also use the intern() instance method to eliminate duplicates. For example: String s = new String("Hello").intern(); actually points 's' to a pool String.
The mechanics go something like this:
1. The compiler sees the literal "Hello" and checks whether it's already in the pool. If it is (because it already found a similar literal), it uses the existing reference; if not, it adds a new String containing 'Hello' to the pool.
2. At runtime, the JVM creates a new String object (because of the new keyword) containing 'Hello'.
3. It then calls the intern() method on that newly-created object, which discovers that there is already a String in the pool with the same value, so it returns its reference - and THAT is what gets assigned to 's'.
4. The String object created in Step 2 eventually gets garbage-collected, because there's nothing pointing to it any more.
Pretty tortuous, eh? But the upshot is that 's' contains the reference to the String that's in the pool, not a duplicate.
Personally, I haven't found much use for it; but if you find yourself building lots of Strings, particularly piece-by-piece, it may be useful for eliminating duplicates.
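Here is a small sketch of intern() in action (class and method names are hypothetical). It covers both the new String(...).intern() case walked through above and the piece-by-piece case, where a String assembled with a StringBuilder starts out as a separate object and intern() returns the pooled one:

```java
// Illustrative only: InternDemo and its method names are hypothetical.
class InternDemo {

    // intern() on a freshly created String returns the pooled instance.
    static boolean internReturnsPooledString() {
        String s = new String("Hello").intern();
        return s == "Hello"; // shown only to expose the pool
    }

    // A String built at runtime is a separate object until it is interned.
    static boolean builtStringIsSeparateUntilInterned() {
        String built = new StringBuilder("Hel").append("lo").toString();
        boolean separateBefore = built != "Hello";
        boolean pooledAfter = built.intern() == "Hello";
        return separateBefore && pooledAfter;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledString());          // true
        System.out.println(builtStringIsSeparateUntilInterned()); // true
    }
}
```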
.
String.valueOf()
Unlike the other classes listed above, String's valueOf() methods (and there are several) do NOT explicitly state whether they check the String pool before creating a String to return. Indeed, String.valueOf(Object) is really meant for displaying values, and so has one very annoying feature: String.valueOf(someOtherString) will return the String "null" if 'someOtherString' is null, when what you usually want when creating Strings is for it to return an empty (i.e., 0-length) String or throw an Exception.
I leave you to sort that one out for yourselves. Alternatively, see the CONCLUSION section below.
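For what it's worth, here is one possible way to sort it out, sketched with hypothetical helper names (displayForm, emptyIfNull and requireText are not part of any library). It shows the "null" result from String.valueOf(Object), then the two alternatives mentioned above: returning an empty String, or throwing an exception via java.util.Objects:

```java
import java.util.Objects;

// Illustrative only: ValueOfNullDemo and its helper methods are hypothetical.
class ValueOfNullDemo {

    // String.valueOf(Object) turns a null reference into the text "null".
    static String displayForm(Object value) {
        return String.valueOf(value);
    }

    // Alternative 1: substitute an empty String for null.
    static String emptyIfNull(String value) {
        return Objects.toString(value, "");
    }

    // Alternative 2: reject null outright with a NullPointerException.
    static String requireText(String value) {
        return Objects.requireNonNull(value, "value must not be null");
    }

    public static void main(String[] args) {
        String someOtherString = null;
        System.out.println(displayForm(someOtherString));           // null
        System.out.println(emptyIfNull(someOtherString).isEmpty()); // true
        System.out.println(requireText("Hello"));                   // Hello
    }
}
```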
.
CONCLUSION
The internal caches that have been described, while worth knowing about (and maybe also interesting), are not that important. Indeed, the Java Tutorials make no mention of them beyond showing you how to initialise numbers and Strings correctly.
Unless you plan on creating thousands upon thousands of similar Strings or Integers or Characters in your program, the chances are that you won't run into any problems because you didn't use them.
Furthermore, if you code in a natural, simple style, you'll probably end up using them without even realising it. It's when you start getting "clever" that you run into problems (and for more information on that, read this).
However, there are a couple of simple rules you can follow if you want to avoid creating duplicate objects, at least with the classes listed above:
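1. Use an appropriate literal/primitive or valueOf() for everything except Strings, viz: Integer i = 42; or Integer i = Integer.valueOf(42);
2. Use literals or direct assignment for Strings, i.e.: String s = "Hello"; or String s = someOtherString; or intern() if you think that what you're creating might be a duplicate: String s = someBuiltString.intern();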
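The two rules above can be sketched as follows. The class, constant and method names are invented for this illustration; the point is only the style of creation, not the particular values:

```java
// Illustrative only: CreationRulesDemo and its members are hypothetical.
class CreationRulesDemo {

    // Rule 1: literals/primitives (boxing) or valueOf() for the wrapper classes.
    static final Integer COUNT = 42;                  // boxing
    static final Long TOTAL = Long.valueOf(42L);      // factory method
    static final Character LETTER = 'A';              // boxing
    static final Boolean ENABLED = Boolean.TRUE;      // shared constant

    // Rule 2: literals or direct assignment for Strings.
    static final String GREETING = "Hello";

    // Rule 2, continued: intern() when a built String might be a duplicate.
    static String canonical(String built) {
        return built.intern();
    }

    public static void main(String[] args) {
        String built = new StringBuilder("Hel").append("lo").toString();
        String s = canonical(built);
        // Always compare with equals(), whatever the caches may be doing.
        System.out.println(s.equals(GREETING));   // true
        System.out.println(COUNT.equals(42));     // true
    }
}
```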
But above all:
NEVER write code that uses '==' instead of equals() because you THINK objects are cached.