Does substring of a string literal create a new string literal or object on heap?

 
Ranch Hand
Posts: 35
Hi,

As the subject says, does taking a substring of a string literal (a string in the string pool) create a new string in the string pool or an object on the heap? Or neither?



Yes, I know that the substring is not being assigned to anything. My initial guess is that a new, albeit short-lived, object is created on the heap.

Can someone with greater knowledge of the VM chime in here?

Thanks.

Les
 
Bartender
Posts: 1849
I think you can test this easily enough:



Let me know how it turns out.
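The test code from this post is not shown above, so the following is only a minimal sketch of the kind of check being suggested, not the original. The class name, the method name and the "hamburger" literal are made up for illustration. It compares a run-time substring against the pooled literal with the same characters, first by content and then by identity.

```java
// Hypothetical stand-in for the missing test; all names here are illustrative.
class SubstringPoolCheck {
    static final String LITERAL = "hamburger"; // a literal, so it lives in the string pool

    /** Returns a substring computed at run time from the pooled literal. */
    static String runtimeSubstring() {
        return LITERAL.substring(4, 8); // "urge"
    }

    public static void main(String[] args) {
        String sub = runtimeSubstring();

        // Same characters as the literal "urge"?
        System.out.println(sub.equals("urge"));

        // Same object as the pooled literal "urge"? If substring put its
        // result in the pool, this would print true.
        System.out.println(sub == "urge");

        // intern() returns the pooled instance for these characters.
        System.out.println(sub.intern() == "urge");
    }
}
```

On the JDKs I am aware of this prints true, false, true: the substring has the right content but is a separate heap object, and only intern() hands back the pooled one.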
 
Janeice DelVecchio
Bartender
Posts: 1849
Also, while you're testing, you may find that the behavior is different when substring returns the whole string. You might be interested in trying that, too. ;)
 
Ranch Hand
Posts: 104
The gist of your question is how Java handles the internal character memory when you make a call to substring. I believe part of the confusion is that the behavior of Java changed in Java 7. There is a discussion of this at

http://www.javamex.com/tutorials/memory/string_memory_usage.shtml

According to the article cited above, which by the way is not official Oracle documentation and may not be an unimpeachable authority, substring in older versions of Java tried to save memory by reusing the internal content of a string. A string object is basically two things: an internal array of characters and an object that serves as a wrapper around the array. Even in earlier versions of Java, each call to substring produced a unique wrapper object, but the internal character content was shared. On one hand, this was an excellent idea because it kept the overall memory use small and there was no performance overhead for making a redundant copy of the character content. On the other hand, it had a potential problem: if you took a tiny substring of a huge string, let the huge string go out of scope, and kept the substring around, the huge internal character array would stay reachable through the substring and could not be garbage collected for as long as the substring lived. According to the article, later versions of substring make a unique (and redundant) copy of the content of the string.
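To make the two strategies concrete, here is a toy model. It is not the JDK source; SharedSlice and CopiedSlice are hypothetical classes invented for this sketch. The first keeps a reference to the parent's array plus an offset and count, the second copies just the characters it needs. As far as I know, the switch from the first style to the second is usually dated to Java 7 update 6, but treat that as an assumption rather than something this thread establishes.

```java
import java.util.Arrays;

// Toy model only: these nested classes are illustrative, not java.lang.String.
class SubstringModels {

    /** Old style: the substring shares the parent's array. */
    static final class SharedSlice {
        final char[] value;
        final int offset;
        final int count;

        SharedSlice(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        SharedSlice substring(int begin, int end) {
            // No copy: a new wrapper around the same array.
            return new SharedSlice(value, offset + begin, end - begin);
        }

        int retainedChars() { return value.length; }

        @Override public String toString() { return new String(value, offset, count); }
    }

    /** New style: the substring owns a private copy of its characters. */
    static final class CopiedSlice {
        final char[] value;

        CopiedSlice(char[] value) { this.value = value; }

        CopiedSlice substring(int begin, int end) {
            return new CopiedSlice(Arrays.copyOfRange(value, begin, end));
        }

        int retainedChars() { return value.length; }

        @Override public String toString() { return new String(value); }
    }

    public static void main(String[] args) {
        char[] huge = new char[1_000_000];
        Arrays.fill(huge, 'x');

        SharedSlice shared = new SharedSlice(huge, 0, huge.length).substring(10, 15);
        CopiedSlice copied = new CopiedSlice(huge).substring(10, 15);

        // The shared slice keeps the whole million-char array alive...
        System.out.println(shared + " retains " + shared.retainedChars() + " chars");
        // ...while the copied slice retains only its own five.
        System.out.println(copied + " retains " + copied.retainedChars() + " chars");
    }
}
```

Both slices read back as the same five characters; the difference is only in how much backing storage each one keeps reachable.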

You might wonder why the old style was ever a problem and argue that nobody would ever do such a foolish thing in the first place. Well, I kind of agree, but I can think of at least one counterexample. Consider the case where you get a ResultSet from a database call, pull out a couple of string values, and discard the result set. If the internal data representation of the result set was a big string, and the copies were substrings, it would lead to just the memory bloat described above. I'm willing to bet that at least one early implementation of a database API got burned by that behavior until somebody figured out what was happening and wrote a work-around.
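For what it's worth, the work-around I remember from the Java 6 days was to wrap the substring in new String(...), which on those JDKs copied only the characters actually needed and so cut the tie to the big array. A minimal sketch follows; the class name, method name and sample data are made up for illustration. On newer JDKs, where substring already copies, the wrapper should be harmless but redundant.

```java
// Illustrative helper; not taken from any database API.
class DetachSubstring {

    /**
     * Returns the requested range as a String that, on old JDKs with
     * shared backing arrays, no longer references the huge parent array.
     */
    static String detachedSubstring(String huge, int begin, int end) {
        return new String(huge.substring(begin, end));
    }

    public static void main(String[] args) {
        // Stand-in for a large internal buffer holding many field values.
        StringBuilder buffer = new StringBuilder("id=42;");
        for (int i = 0; i < 100_000; i++) {
            buffer.append('.');
        }
        String huge = buffer.toString();

        String id = detachedSubstring(huge, 3, 5);
        System.out.println(id);
    }
}
```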

All that being said, I wrote a quick experiment. The attached code starts with a big string and makes 100 calls to substring, creating an array of strings of 99999 characters each. The total length of all strings created by substring is 9999900. The measured memory use is about 19.3 megabytes, or about 2 bytes per character. This result is consistent with the idea that each substring call makes a unique instance of the string content. When I tried this with a Java 6 JVM, the memory use for the array was well under 1 megabyte, which would be consistent with the idea that the internal character array was reused.
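The attachment is not reproduced here, so the following is a rough sketch of an experiment along the lines just described rather than the original code. The class and method names are invented, and the Runtime-based measurement is only approximate because gc() is just a hint to the JVM. One caveat: on Java 9 and later, compact strings store plain Latin-1 text at one byte per character, so with an all-ASCII string like this one I would expect a figure nearer 1 byte per character than 2.

```java
import java.util.Arrays;

// Sketch of the experiment described above; names and structure are assumptions.
class SubstringMemoryExperiment {
    static final int BIG_LENGTH = 100_000;
    static final int CALLS = 100;

    /** Builds the big source string. */
    static String bigString() {
        char[] chars = new char[BIG_LENGTH];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    /** Makes CALLS substrings of BIG_LENGTH - 1 characters each. */
    static String[] makeSubstrings(String big) {
        String[] parts = new String[CALLS];
        for (int i = 0; i < CALLS; i++) {
            parts[i] = big.substring(1); // drops the first character
        }
        return parts;
    }

    /** Rough heap usage; gc() is only a request, so treat the result as approximate. */
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        String big = bigString();

        long before = usedMemory();
        String[] parts = makeSubstrings(big);
        long after = usedMemory();

        long totalChars = 0;
        for (String part : parts) {
            totalChars += part.length();
        }

        System.out.println("total characters:    " + totalChars);
        System.out.println("approx. bytes used:  " + (after - before));
        System.out.println("approx. bytes/char:  " + (double) (after - before) / totalChars);
    }
}
```

If the substrings share the parent's array, the bytes-per-character figure should come out close to zero; if each one has its own copy, it should be in the region of one or two.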





 
Janeice DelVecchio
Bartender
Posts: 1849
I disagree.

From the JDK 1.7 code:

/**
 * Returns a new string that is a substring of this string. The
 * substring begins at the specified <code>beginIndex</code> and
 * extends to the character at index <code>endIndex - 1</code>.
 * Thus the length of the substring is <code>endIndex-beginIndex</code>.
 * <p>
 * Examples:
 * <blockquote><pre>
 * "hamburger".substring(4, 8) returns "urge"
 * "smiles".substring(1, 5) returns "mile"
 * </pre></blockquote>
 *
 * @param      beginIndex   the beginning index, inclusive.
 * @param      endIndex     the ending index, exclusive.
 * @return     the specified substring.
 * @exception  IndexOutOfBoundsException  if the
 *             <code>beginIndex</code> is negative, or
 *             <code>endIndex</code> is larger than the length of
 *             this <code>String</code> object, or
 *             <code>beginIndex</code> is larger than
 *             <code>endIndex</code>.
 */
public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > count) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    if (beginIndex > endIndex) {
        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
    }
    return ((beginIndex == 0) && (endIndex == count)) ? this :
        new String(offset + beginIndex, endIndex - beginIndex, value);
}

 
Bartender
Posts: 10780

Gary W. Lucas wrote:...The total length of all strings created by substring is 9999900. The measured memory use is about 19.3 megabytes, or about 2 bytes per character. This result is consistent with the idea that each substring call makes a unique instance of the string content. When I tried this with a Java 6 JVM, the memory use for the array was well under 1 megabyte, which would be consistent with the idea that the internal character array was reused.


Beautiful analysis. Well done Gary. That merits a 2-cow post (very rare).

@Les: And the reason is that Strings in Java are immutable (look it up); so the class can re-use its internals - in this case, the array of characters - as it sees fit, without anyone (especially the designers) having to worry about what happens if somebody changes something.

That said, this sort of "memory poking" isn't likely to make you a better Java programmer. Within reason, if you're worried about how much memory something takes up, or "where it's stored", you're using the wrong language.

HIH

Winston
 
Sheriff
Posts: 26770

Gary W. Lucas wrote:When I tried this with a Java 6 JVM...



Janeice DelVecchio wrote:From the JDK 1.7 code...



In other words, the answer to the OP's question is JVM-dependent. There's probably a reason why Oracle changed the behaviour between Java 6 and Java 7, but chances are it isn't relevant to the code you're writing.
 
Janeice DelVecchio
Bartender
Posts: 1849
It's not JVM-dependent the way the question was posed. The JVMs have different internal workings, but the substring method acts the same String-wise.

Here's the resource I was looking at; it explains the difference:
http://www.programcreek.com/2013/09/the-substring-method-in-jdk-6-and-jdk-7/

But EITHER WAY you get a new String(), not one from the pool. You only get the SAME String from the String pool when the substring spans the whole length of the original.

 
Les Hartzman
Ranch Hand
Posts: 35
Glad I could provide some discussion on this topic :-)

Thank you Janeice for the simple test. I should have thought of that first.

But I am surprised about the whole string substring returning the pool string. Based on the shorter substrings I would have expected a new object.

@Winston, yes, I do know that strings are immutable. Clearly the example did not reflect an attempt to change the contents of the original string, just what would happen with a substring of a string literal. Had I tried Janeice's little test first, it would have been clear that the substring is not created within the pool.

Les
 
Rancher
Posts: 3742

Les Hartzman wrote:But I am surprised about the whole string substring returning the pool string. Based on the shorter substrings I would have expected a new object.


It's an easy optimisation to do. If the begin index is zero and the end index is the length of the String, then return the String itself; otherwise create a new String.
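A small sketch to see that optimisation from the outside. The class and method names are made up for illustration, and as far as I know the Javadoc does not promise this behaviour, so treat the result as an implementation detail of the JDKs it has been tried on.

```java
// Illustrative check; identity results reflect implementation behaviour, not a documented guarantee.
class WholeStringSubstring {

    /** True when substring(begin, end) hands back the very same object as s. */
    static boolean returnsSameObject(String s, int begin, int end) {
        return s.substring(begin, end) == s;
    }

    public static void main(String[] args) {
        String s = "smiles";

        // Full range: begin is 0 and end is the length.
        System.out.println(returnsSameObject(s, 0, s.length()));

        // Proper substring "mile": a different object.
        System.out.println(returnsSameObject(s, 1, 5));
    }
}
```

On the JDKs I am aware of this prints true and then false, which matches the ternary in the substring source quoted earlier in the thread.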
 
Paul Clapham
Sheriff
Posts: 26770

Janeice DelVecchio wrote:But EITHER WAY you get a new String(), not one from the pool. You only get the SAME String from the String pool when it's substring of the whole length.



That's true, as far as the question goes. But whether you create a String with 12,000 characters or a String with 12 characters, the String objects are both the same size. It's the underlying char arrays which take up most of the space, and you can see from the article you linked to that the underlying char arrays are very different between Java 6 and Java 7.

I don't know if that's what Les had in mind or not; they didn't say why they asked the question.
 
Janeice DelVecchio
Bartender
Posts: 1849
I totally agree that the workings in the background are different, but I read the OP's question as "do I get a new String on the heap, or a String from the pool?"

I didn't speculate on the reason for the question.
 
Les Hartzman
Ranch Hand
Posts: 35
Just to set things straight, I was just asking about the creation of a new object on the heap or in the pool.

However, I've really enjoyed the discussion about the finer details and nuances.

Les
 
Winston Gutkowski
Bartender
Posts: 10780

Les Hartzman wrote:Just to set things straight, I was just asking about the creation of a new object on the heap or in the pool.


Personally, I think it's a bit of a pity that Sun/Oracle ever let slip the fact that there is such a thing as a "String pool", because now beginners tend to obsess about it, when in fact it's really of very little importance. I'm not sure that it's even mentioned in the tutorials (if it is, I've never seen it); they just show you how to initialise Strings properly.

It's an implementation detail, nothing more - a way to help save space when it can be saved - and the fact that it's there makes almost no difference to you as a programmer. And isn't the whole point of object-orientation to hide implementation?

And the same is true of the number caches used by some wrapper classes.

Winston
 
Les Hartzman
Ranch Hand
Posts: 35
The only thing I'd disagree with you about is that it could be important if you don't know that there is a single instance of a literal string no matter how many identifiers you declare with the same literal. Is that the right way to do things? No. But it doesn't mean it won't happen.

Les
 
Winston Gutkowski
Bartender
Posts: 10780
71
Hibernate Eclipse IDE Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Les Hartzman wrote:The only thing I'd disagree with you about is that it could be important if you don't know that there is a single instance of a literal string no matter how many identifiers you declare with the same literal. Is that the right way to do things? No. But it doesn't mean it won't happen.


Actually, I'd suggest that because there's a String pool, it doesn't actually matter how many identifiers you have for the same literal, since they won't take up any extra space - well, not much.

What it does mean is that you should use:
String hello = "hello";
rather than:
String hello = new String("hello");
when initialising.

However, unless you end up doing the latter millions of times, you're not likely to run into any memory issues. And if you're using '==' to compare two Strings (which you shouldn't) because you think they're in the pool, then you're missing the whole point of why it's there.
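A quick sketch of the points above, with a made-up class name. Two identifiers initialised with the same literal refer to one pooled object, new String("hello") always creates a separate one, and equals() compares the characters regardless of where either object lives.

```java
// Illustrative only; the class and field names are invented for this sketch.
class LiteralVersusNew {
    static final String A = "hello";             // pooled literal
    static final String B = "hello";             // same pooled literal
    static final String C = new String("hello"); // a separate object on the heap

    public static void main(String[] args) {
        System.out.println(A == B);      // true: both names refer to the one pooled instance
        System.out.println(A == C);      // false: new always creates a distinct object
        System.out.println(A.equals(C)); // true: same characters, which is what you normally care about
    }
}
```

This is why equals() is the comparison to reach for: it gives the same answer whether or not either String happens to come from the pool.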

My 2¢.

Winston
 