(Level: beginner)

Several Java classes, in particular String, Integer, Character, Long, Byte and Short, have internal caches to help avoid creating redundant objects for frequently used values.

In the case of String, it's actually built into the language (or rather the compiler - more below); for all the other classes mentioned, these caches are used when you call their valueOf() factory method - which all of them have - or when the object is boxed from a primitive value.

(If you're unfamiliar with autoboxing, you should probably read up on it before continuing.)

In my opinion, the fact that they have become public knowledge is unfortunate, because it prompts a lot of questions from beginners who start obsessing about them when there really is no need. The fact is that they're an implementation decision, and they are ONLY concerned with saving space.

That said, read on.

.

THE STRING POOL

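The Java compiler records every String literal in the class file, and at runtime the JVM keeps one shared instance of each in a cache called the 'String pool'. Thus, if you assign the same literal to two String variables, s1 and s2, you will get the same reference in both s1 and s2.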

The side effect of this is that if you compare s1 and s2 with '==' and print the result, your program will print out true, since both s1 and s2 refer to the same object. However, relying on this is a very bad habit to get into.
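A minimal sketch of the behaviour described above. The names s1 and s2 come from the text; the literal "Hello" and the class name StringPoolDemo are assumptions made purely for illustration.

```java
class StringPoolDemo {
    // Both variables are initialised from the same literal, so both
    // receive the reference to the single pooled String.
    static boolean literalsShareReference() {
        String s1 = "Hello";
        String s2 = "Hello";
        return s1 == s2; // reference comparison, used here ONLY to expose the pool
    }

    // The comparison you should actually write in real code.
    static boolean literalsAreEqual() {
        String s1 = "Hello";
        String s2 = "Hello";
        return s1.equals(s2);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareReference()); // true
        System.out.println(literalsAreEqual());       // true
    }
}
```

Both lines print true, but only the equals() version keeps working when one of the Strings does not come from a literal.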

Unless you're studying for the SCJP exam, the only thing you really need to know about String pooling (i.e., caching) is that you're better off not using the new keyword with String literals. In fact, you're generally better off not using it at all, since there is usually a better alternative (see the CONCLUSION section at the end). But, for illustration purposes only: initialising a String with new String("Hello") and initialising it with the plain literal "Hello" essentially do the same thing - create a String with the word 'Hello' in it - but in the first case the new keyword forces the JVM to create a new String object unnecessarily.
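As a rough sketch of the difference (the class and variable names are hypothetical, chosen only for this example):

```java
class NewStringDemo {
    // 'new' always produces a fresh object, so the two references differ
    // even though the literal "Hello" is already in the pool.
    static boolean newMakesDistinctObject() {
        String viaNew = new String("Hello"); // unnecessary extra object
        String viaLiteral = "Hello";         // reuses the pooled String
        return viaNew != viaLiteral;
    }

    // The contents are the same either way.
    static boolean contentsStillEqual() {
        String viaNew = new String("Hello");
        String viaLiteral = "Hello";
        return viaNew.equals(viaLiteral);
    }

    public static void main(String[] args) {
        System.out.println(newMakesDistinctObject()); // true
        System.out.println(contentsStillEqual());     // true
    }
}
```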

Since Strings - in fact, all of the classes listed above - are immutable, there's no reason to create multiple objects that contain the same value (at least, in 11 years of writing Java, I've never found one), which is why these caches were created in the first place.

.

WRAPPER CACHES

The other classes listed above also employ caches for their most commonly used values. However, unlike String, they're not generated by the compiler, but are part of the class itself.

For the Character class, the cache holds all the values associated with the standard ASCII character set; for Byte, Short and Integer, it holds the values -128 to +127. Character, Short and Integer might cache values in a larger range, and Long caches values, but does not specify which values. Thus, if you box the same in-range value twice (for example, the int 127 into two Integer fields, or the char 'a' into two Character fields), the JVM will place the same reference in both fields.

These caches are used when the wrapper objects are created using autoboxing, as well as when they are created using the wrapper class's valueOf() method. In both cases, the same internal cache is used, so they will only return a new object if they need to. Thus, you should rarely (if ever) have any reason to use the new keyword with any of them. Because the sizes of the caches can be increased beyond the range -128 to +127, the behaviour of the '==' operator on boxed wrapper class objects can vary from implementation to implementation, and so '==' should not be used to compare them.
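A small sketch of what this looks like in practice. The class and method names are hypothetical, and 127 and 'a' are simply values that fall inside the guaranteed ranges; the value 1000 is an arbitrary out-of-range example.

```java
class WrapperCacheDemo {
    // Boxing an int in the range -128 to +127 goes through the Integer cache.
    static boolean boxedIntegersShareReference() {
        Integer i1 = 127;
        Integer i2 = 127;
        return i1 == i2; // reference comparison, for demonstration only
    }

    // valueOf() uses the same cache as boxing.
    static boolean valueOfSharesReference() {
        return Integer.valueOf(127) == Integer.valueOf(127)
                && Character.valueOf('a') == Character.valueOf('a');
    }

    // equals() gives the right answer whether or not the value is cached.
    static boolean equalsWorksFor(int value) {
        Integer a = value;
        Integer b = value;
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(boxedIntegersShareReference()); // true
        System.out.println(valueOfSharesReference());      // true
        System.out.println(equalsWorksFor(1000));          // true

        Integer big1 = 1000;
        Integer big2 = 1000;
        // Outside the guaranteed range the result depends on the JVM
        // and its settings, so no particular output is promised here.
        System.out.println(big1 == big2);
    }
}
```

The last line is the reason '==' cannot be trusted on boxed values: its result is not fixed by the language.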

Note: The documentation for valueOf() in both Double and Float suggests that they also implement internal caches, but I see no sign of it as yet. However, if caches are ever implemented for them, you may save yourself a lot of refactoring by treating these classes as though they already have them - i.e., by using valueOf() (or boxing) in your code, rather than creating Doubles or Floats with new.

You should also use valueOf() or boxing to create Boolean values, even though, strictly speaking, the class does not have a "cache". Or you can use the two static constants, Boolean.FALSE and Boolean.TRUE.
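For Boolean this can be sketched as follows (the class and method names are hypothetical):

```java
class BooleanConstantsDemo {
    // Boolean.valueOf(boolean) is documented to return Boolean.TRUE or
    // Boolean.FALSE, so no new object is ever created.
    static boolean valueOfReturnsConstants() {
        return Boolean.valueOf(true) == Boolean.TRUE
                && Boolean.valueOf(false) == Boolean.FALSE;
    }

    public static void main(String[] args) {
        Boolean viaBoxing = true;                   // boxing
        Boolean viaValueOf = Boolean.valueOf(true); // factory method
        Boolean viaConstant = Boolean.TRUE;         // static constant
        System.out.println(viaBoxing.equals(viaValueOf)
                && viaValueOf.equals(viaConstant)); // true
        System.out.println(valueOfReturnsConstants()); // true
    }
}
```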

.



WARNING:

Although internal caches can help to eliminate duplicate objects, they are NOT an excuse to use '==' instead of equals().

If you’re in any doubt about this, read the AvoidTheEqualityOperator page.


.

ADDITIONAL STUFF

intern()

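If you're really space-conscious about Strings, you can also use the class's intern() instance method to eliminate duplicates. For example, calling intern() on a String created with new String("Hello") and assigning the result to 's' actually points 's' to a pool String.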

The mechanics go something like this:

  1. The compiler sees the literal "Hello" and checks whether it's already in the pool. If it is (because it already found an identical literal), it uses the existing reference; if not, it adds a new String containing 'Hello' to the pool.
  2. At runtime, the JVM creates a new String object (because of the new keyword) containing 'Hello'.
  3. It then calls the intern() method on that newly-created object, which discovers that there is already a String in the pool with the same value, so it returns its reference - and THAT is what gets assigned to 's'.
  4. The String object created in Step 2 eventually gets garbage-collected, because there's nothing pointing to it any more.

Pretty tortuous, eh? But the upshot is that 's' contains the reference to the String that's in the pool, not a duplicate.

Personally, I haven't found much use for it; but if you find yourself building lots of Strings, particularly piece-by-piece, it may be useful for eliminating duplicates.
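The steps above can be sketched like this. The variable 's' and the literal "Hello" come from the text; the class name, method names and the piece-by-piece StringBuilder example are assumptions added for illustration.

```java
class InternDemo {
    // new String("Hello") creates a fresh object; intern() then hands back
    // the pooled String with the same contents.
    static boolean internReturnsPooledString() {
        String s = new String("Hello").intern();
        return s == "Hello"; // reference comparison, for demonstration only
    }

    // A String assembled at runtime is a separate object until it is interned.
    static boolean builtStringIsPooledOnlyAfterIntern() {
        String built = new StringBuilder("Hel").append("lo").toString();
        boolean distinctBefore = built != "Hello";
        boolean sameAfter = built.intern() == "Hello";
        return distinctBefore && sameAfter;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledString());          // true
        System.out.println(builtStringIsPooledOnlyAfterIntern()); // true
    }
}
```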

.

String.valueOf()

Unlike the other classes listed above, String's valueOf() methods (and there are several) do NOT explicitly state whether they check the String pool before creating a String to return. Indeed, String.valueOf(Object) is really meant for displaying values, and so has one very annoying feature: calling it with a variable, say 'someOtherString', will return the String "null" if 'someOtherString' is null, when what you usually want when creating Strings is for it to return an empty (i.e., 0-length) String or throw an Exception.

I leave you to sort that one out for yourselves. Alternatively, see the CONCLUSION section below.
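One possible way to sort it out, offered only as a sketch: the helper methods emptyIfNull() and requireText(), and the class name, are hypothetical and not part of any library; 'someOtherString' is the name used in the text.

```java
import java.util.Objects;

class ValueOfNullDemo {
    // What String.valueOf(Object) does with a null reference.
    static String describe(String someOtherString) {
        return String.valueOf(someOtherString); // "null" when the argument is null
    }

    // Hypothetical helper: substitute an empty String for null.
    static String emptyIfNull(String s) {
        return s == null ? "" : s;
    }

    // Hypothetical helper: fail fast instead of producing "null".
    static String requireText(String s) {
        return Objects.requireNonNull(s, "text must not be null");
    }

    public static void main(String[] args) {
        String someOtherString = null;
        System.out.println(describe(someOtherString));             // null
        System.out.println(emptyIfNull(someOtherString).length()); // 0
        try {
            requireText(someOtherString);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage()); // text must not be null
        }
    }
}
```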

.



CONCLUSION

The internal caches that have been described, while worth knowing about (and maybe also interesting), are not that important. Indeed, the Java Tutorials make no mention of them beyond showing you how to initialise numbers and Strings correctly.

Unless you plan on creating thousands upon thousands of similar Strings or Integers or Characters in your program, the chances are that you won't run into any problems because you didn't use them.

Furthermore, if you code in a natural, simple style, you'll probably end up using them without even realising it. It's when you start getting "clever" that you run into problems (and for more information on that, read this).

However, there are a couple of simple rules you can follow if you want to avoid creating duplicate objects, at least with the classes listed above:

1. Use an appropriate literal/primitive (and let boxing do the work) or valueOf() for everything except Strings.

2. Use literals or direct assignment for Strings, or intern() if you think that what you're creating might be a duplicate.
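A short sketch of both rules in one place. All names and values here are hypothetical and chosen only to illustrate the pattern.

```java
class AvoidDuplicatesDemo {
    // Rule 2, intern() variant: a String assembled at runtime is swapped
    // for its pooled equivalent.
    static String assembleAndIntern(String left, String right) {
        return (left + right).intern();
    }

    public static void main(String[] args) {
        // Rule 1: primitives/literals with boxing, or valueOf().
        Integer count = 42;                 // boxing
        Long total = Long.valueOf(42L);     // factory method
        Character initial = 'W';            // boxing
        Boolean done = Boolean.TRUE;        // static constant

        // Rule 2: literals or direct assignment for Strings.
        String greeting = "Hello";
        String sameGreeting = greeting;     // no new object
        String assembled = assembleAndIntern("Hel", "lo");

        // Compare with equals(), never '=='.
        System.out.println(count.equals(42));            // true
        System.out.println(total.equals(42L));           // true
        System.out.println(initial.equals('W'));         // true
        System.out.println(done.equals(Boolean.TRUE));   // true
        System.out.println(greeting.equals(sameGreeting)
                && greeting.equals(assembled));          // true
    }
}
```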


But above all:

NEVER write code that uses '==' instead of equals() because you THINK objects are cached.



CategoryWinston
     