
String Concatenation Operator & Garbage Collection (K&B7 CH5 Q9)

 
Raul Saavedra
Ranch Hand
Posts: 38
1
I want to submit this as a possible errata, because two parts of the book seem inconsistent to me.

The book states that the correct answer to question 5.9 (Kindle version of the book, location 6265) is B = About 1000 objects created.

But chapter four, section "String Concatenation Operator" (location 5025) states: "The previous code can be read as “Add the values of b and c together, and then ****take the sum and convert it to a String**** and concatenate it with the String from variable a.”"

That suggests that a temporary String object is created representing the number, and then it gets concatenated to the string on the left of the + operator.

If the above is truly the case, then the answer to Self-test question 5.9 cannot be approximately 1000 but 2000, or in fact maybe 3000? (For example, if an Integer wrapper object is created, and then its .toString() method is called to get the string to concatenate, then we have the Integer object and the String resulting from its toString() method, so two intermediate objects right there.)

In any case, I think either the answer to question 5.9 might not be 1000, or if it is, then said quote from chapter four is rather misleading or wrong. From the book I'm now not really sure how many objects get created exactly (and then become eligible for GC) in the simple code: String s = "Result=" + 100;

If this errata submission is incorrect, then still a clarification on that would be most welcome.

PS. Happy Holidays!
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Hi Raul,

First of all, that's a great observation. Have a cow!

Honestly, I don't have any idea what the correct answer to this question is these days. And it's also hard to check for yourself. With another question it's fairly easy: you compile and run the program and you know the answer. That's why questions about how many objects are eligible for GC are a real nightmare, certainly in combination with string concatenation. It's also the reason why I moved your post out of the errata thread, so we can discuss this question more easily. If it turns out to be an errata item, I'll post a link to this thread in the errata thread (and update the errata overview accordingly). Hope that's ok with you!

It's all about this question: given the following code (and if GC doesn't run), how many objects will exist in memory when the loop is done? Possible choices are: less than 10, about 1000, about 2000, about 3000 or about 4000.
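A rough reconstruction of the code under discussion, pieced together from the descriptions later in this thread (a StringBuilder named sb, a String named s, and a loop of 1000 iterations that assigns " " + i to s and appends it). The class name, the method wrapper, and the initial value of s are assumptions; the book's exact listing may differ.

```java
// Hypothetical reconstruction of question 5.9; not the book's exact listing.
class Question59Sketch {
    static String run() {
        StringBuilder sb = new StringBuilder(); // assumed type, per the thread
        String s = "";                           // assumed initial value
        for (int i = 0; i < 1000; i++) {
            s = " " + i;      // a new String is produced on every iteration
            sb.append(s);     // copies the characters of s into sb
        }
        // "Done with loop": the question asks how many objects exist here
        // if the garbage collector never ran.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(run().length());
    }
}
```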


If I have a look at the javadoc of the String class, I see this statement about the + operator: "The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuilder(or StringBuffer) class and its append method. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java.". And if you look at the decompiled code, that's exactly what you see:


So with each iteration you create a new StringBuilder object. On this instance you invoke the append method with i as parameter. I checked the source code of the append method, no other objects are created to append an int parameter. Finally you invoke the toString method and create a new String object (representing the value of the StringBuilder). So based on this analysis the correct answer should be "About 2000" (with each operation you create 2 new objects: 1 StringBuilder and 1 String). But I wonder if you need to know the underlying implementation of the String concatenation operator...
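As a sketch of the expansion described above: compilers targeting Java 8 or earlier typically turn a non-constant concatenation into a StringBuilder chain, while javac from Java 9 onward emits an invokedynamic call instead, so the bytecode you see depends on the compiler version. The class and method names below are illustrative only, and the expanded form is an assumption about the compiler, not something the language requires.

```java
class ConcatExpansionSketch {
    // What the source code says.
    static String viaOperator(int i) {
        return " " + i;
    }

    // Roughly what an older javac generates for the method above:
    // one StringBuilder plus the String returned by toString().
    static String viaBuilder(int i) {
        return new StringBuilder().append(" ").append(i).toString();
    }

    public static void main(String[] args) {
        // Both forms yield strings with the same contents.
        System.out.println(viaOperator(42).equals(viaBuilder(42)));
    }
}
```

Either way, the String that ends up in s is one new object per evaluation; the StringBuilder, where one is used, is a short-lived helper.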

Kind regards,
Roel
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 35279
384
Eclipse IDE Java VI Editor
I agree with the book that the answer is 1000. I looked at the bytecode to confirm. (You can see how I got the bytecode and what it actually is on my blog.) The gist is that there is only one String created from


Raul Saavedra wrote:That suggests that a temporary String object is created representing the number, and then it gets concatenated to the string on the left of the + operator.

I think the authors mean this logically rather than on the implementation level. I also did a test with

This one is more interesting. The bytecode shows that Java is smart enough to use a StringBuilder. There are two calls to the StringBuilder. The String (a) is passed to the constructor. The combined b+c is added to the StringBuilder as an int directly. It is not converted to a String first.


The compiler optimizes for you. This isn't something anyone would need to know for the exam, so the book simplifies and covers what is logically happening. Remember that the exam doesn't expect you to know this level of detail so don't read into the questions!
 
Raul Saavedra

Jeanne Boyarsky wrote:This one is more interesting. The bytecode shows that Java is smart enough to use a StringBuilder.


So there was indeed an intermediate StringBuilder object created there. Which suggests the right answer, as in Roel's analysis, ought to be 2000 and not 1000.

In your first example (String s = " " + 3) did you mean there was no StringBuilder created? If not, why not? And why would it have created one for your second example then? Maybe because your second example has String and int variables (as in the book question, by the way), while the first example has only literals? Maybe things don't get handled the same way by the compiler because of that (e.g. String s = "" + 3; is not treated the same way as String s = "" + i;). Or maybe because the second example has more than one "+" operator? Why use an auxiliary StringBuilder in one case and not in the other?

To be honest, it's still not clear to me what's going on. From your second example, it seems reasonable to think that in general an intermediate auxiliary object is created, and therefore the answer ought to be 2000. Yet from the first example, it seems the answer ought to be 1000.
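One possible explanation for the literal-versus-variable difference asked about here: an expression made only of literals, such as " " + 3, is a compile-time constant, so the compiler can store the finished string " 3" and no concatenation happens at run time. With a non-final variable the string has to be built while the program runs. The sketch below probes this by comparing references; the class and method names are made up for illustration.

```java
class ConstantFoldingSketch {
    // Only literals: the compiler folds this into the constant " 3".
    static boolean literalOnly() {
        String s = " " + 3;
        return s == " 3";     // same interned constant
    }

    // A non-final variable: the concatenation happens at run time.
    static boolean withVariable() {
        int i = 3;
        String s = " " + i;
        return s == " 3";     // equal contents, but a different object
    }

    public static void main(String[] args) {
        System.out.println("literal only, same object:  " + literalOnly());
        System.out.println("with variable, same object: " + withVariable());
    }
}
```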

PS. Thanks much for the cow, Roel!
 
Raul Saavedra

In fact I'd like to add: if there is variability in this question's correct answer depending on the expression used to build the String ("" + 3 as opposed to "" + i or s + (i + j)), then that is something not covered by the book. And if knowing how they get treated differently is not expected for the exam, then these types of questions ought to be removed from the book and hopefully from the exam as well. Otherwise, there should be a better, more precise clarification of why things actually happen the way they do behind the scenes for each case.
 
Jeanne Boyarsky
Raul Saavedra wrote:So there was indeed an intermediate StringBuilder object created there. Which suggests the right answer, as in Roel's analysis, ought to be 2000 and not 1000

No. The StringBuilder is in the more complex example. In the question, we only have String s = " " + 3 where there is no StringBuilder. The actual bytecode for a main method with just that statement is:



There are two variables. One is the parameter to the main method. The other is String s.

Raul Saavedra wrote:And why would it have created it for your second example then?

Because the second example has more to concatenate, so Java optimizes it. The original example is a straightforward concatenation of two things.

Raul Saavedra wrote:Maybe because your second example has String and int variables, as in the book question by the way, while the first example has only literals?

Nope. This is the bytecode for





Raul Saavedra wrote:Or maybe because the second example has more than one "+" operator??? Why use an auxiliary StringBuilder in one case and not in the other?

Yes. In the first example, there's nothing to optimize since it is all done in one statement.

This is why the exam doesn't go into bytecode. It is confusing, especially for a beginner.

One thing that might make you feel better: on the real exam, there is a beta period. Oracle looks at questions that had comments (for example "this is confusing.") They also look at questions a lot of people got wrong. So a question like this that got a strong showing for 1000 vs 2000 would probably be removed. Or at least have choice "around 2000" removed. The book is intended to be harder than the exam.
 
Raul Saavedra
Thanks for the clarifications, Jeanne.

So in conclusion, if we have (as in the Exam Watch at location 5042 in the book):
int b = 2;
System.out.println("" + b + 3);

Then a StringBuilder gets created plus the final string. (I'm assuming the extra parentheses in "" + (b + 3) make no difference with respect to the creation of the auxiliary StringBuilder, though I'm not completely sure.)

But if we had:
int b = 2;
System.out.println("" + b);

Then no StringBuilder gets created, just the final String. Is that correct?
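Whatever the compiler does with a StringBuilder, the parentheses do change the value that gets produced, because + is evaluated left to right. A small sketch (class and method names are hypothetical):

```java
class GroupingSketch {
    // Evaluated as ("" + b) + 3: two string concatenations.
    static String leftToRight(int b) {
        return "" + b + 3;
    }

    // The int addition inside the parentheses happens first,
    // then a single concatenation.
    static String grouped(int b) {
        return "" + (b + 3);
    }

    public static void main(String[] args) {
        int b = 2;
        System.out.println(leftToRight(b)); // 23
        System.out.println(grouped(b));     // 5
    }
}
```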

Even with what you said ("I think the authors mean this logically rather than on the implementation level") I would still point out, an apparent inconsistency still stands in the book. Answer to question 5.9 = 1000 being correct means the statement near location 5025 is rather misleading, and it truly doesn't always apply, as shown by your own bytecodes. Precisely the simpler case (String s = "" + i), as in the self-test question, is the exception, but it was never explained that there was such an exception. So I would still suggest that this is something to be fixed/changed in the book: either question 5.9, or the explanations and choice of examples near location 5025.
 
Roel De Nijs
Hi Raul,

My initial post probably added more to the confusion than it cleared up. Apologies!

Let's try to set things straight!

You know String is immutable. So when you are concatenating a string with other strings and/or primitives/objects, you'll end up with a bunch of temporary strings which take up memory for no reason. To avoid wasting precious memory, the Java compiler performs an optimization. It creates a mutable StringBuilder, so no temporary strings are created.

For the certification exams (like OCAJP7 and OCPJP7) you do not need to know about this optimization performed by the Java compiler (nor the bytecode)! You just have to evaluate the (source) code you see. Illustrated with the examples of your previous post:

You will have 2 String objects: "" and "2".


You will have 3 String objects: "", "2" and "23".

So time for another mock question: given the following code (and if GC doesn't run), how many objects will exist in memory when the code is done? And for a bonus point: how many will be eligible for GC?


Hope it helps!
Kind regards,
Roel
 
Roel De Nijs
Raul Saavedra wrote:Even with what you said ("I think the authors mean this logically rather than on the implementation level") I would still point out, an apparent inconsistency still stands in the book. Answer to question 5.9 = 1000 being correct means the statement near location 5025 is rather misleading, and it truly doesn't always apply, as shown by your own bytecodes. Precisely the simpler case (String s = "" + i), as in the self-test question, is the exception, but it was never explained that there was such an exception. So I would still suggest that this is something to be fixed/changed in the book: either question 5.9, or the explanations and choice of examples near location 5025.

The statement in chapter 4 (location 5025) is indeed misleading/wrong as it implies the sum is first turned into a String before being concatenated to the other String (which would double the number of objects). So I would propose the following fix.

Currently: The previous code can be read as "Add the values of b and c together, and then take the sum and convert it to a String and concatenate it with the String from variable a."

Should be: The previous code can be read as "Add the values of b and c together, and then take the sum and concatenate it with the String from variable a."

Thoughts?

Kind regards,
Roel
 
Raul Saavedra
Roel De Nijs wrote:For the certification exams (like OCAJP7 and OCPJP7) you do not need to know about this optimization performed by the Java compiler (nor the bytecode)!


Hi Roel,

If that is the case, I tend to think the best errata addition would be to actually remove question #5.9. The correct answer for it depends on a compiler optimization, but that's an exceptional case not covered at all by the instructional material in the book. In fact, it contradicts the explanations given in the book (not to mention, it is stuff not needed for the cert. exams after all, as you indicate.)

The actual problem is that the treatment of when and how many objects get created in string concatenation expressions is (necessarily and purposefully) incomplete in the book. At least some of it is outside the scope of the exams, so that's ok. But then the self-test asks for that one exceptional case which was not covered at all. It's universally nonpedagogical and absurd to expect correct answers for nonobvious or exceptional material that had been surgically left out of instruction and explanation.

So if the question remains in the self-test, then the explanations in the book should cover that exceptional case. Omitting its explanation leaves the text in conflict with the right answer. The book could simply show it as an additional case and explain what is going on. At least a small note along the lines of:

"Beware that besides the result String, in some cases an intermediate auxiliary object gets created (e.g. a StringBuilder) to process the concatenation more efficiently, but in some simple cases like the following (and then an example as in question 5.9 would follow) just the resulting String gets created, without the need for this extra auxiliary object or intermediate String objects."


How such a comment gets phrased, or how the treatment of those special cases gets presented by the book, is something that ought to be revised by the authors. There might be more exceptions that we haven't dealt with here yet (I don't know if in certain cases wrapper number objects get involved, for example, or whether the strings from their toString() methods do get created, even if a StringBuilder is used), or maybe the authors might choose to start with the simplest case (no auxiliary object created) and then progress to the more complex cases/expressions where auxiliary objects or intermediate Strings (to be concatenated, but different from the end result) get created along the way. But that is up to the authors, so maybe for a revised edition of the book.

For errata purposes, my simplest take would be to remove question 5.9 altogether. Also I would possibly include a clarification note in chapter 4, section "String Concatenation Operator", near location 5025 (Kindle), along the lines of the note quoted above, even if without an example.

Kind regards,
Raul
 
Roel De Nijs
Raul Saavedra wrote:
Roel De Nijs wrote:For the certification exams (like OCAJP7 and OCPJP7) you do not need to know about this optimization performed by the Java compiler (nor the bytecode)!


If that is the case, I tend to think the best errata addition would be to actually remove question #5.9. The correct answer for it depends on a compiler optimization, but that's an exceptional case not covered at all by the instructional material in the book. In fact, it contradicts the explanations given in the book (not to mention, it is stuff not needed for the cert. exams after all, as you indicate.)


The correct answer on question 5.9 (about 1000) does not depend on any compiler optimization at all! In this question you'll have a new StringBuilder instance (sb), a new String instance (s), a String literal (" ") and then a loop which creates 1000 new String instances (by concatenating " " with the loop counter) and appends each one to sb. So that results in about 1000 objects in memory. And that's based just on the code of this question, without any knowledge about what the Java compiler does behind the scenes.

Let me refer back to your initial post:
Raul Saavedra wrote:But chapter four, section "String Concatenation Operator" (location 5025) states: "The previous code can be read as “Add the values of b and c together, and then ****take the sum and convert it to a String**** and concatenate it with the String from variable a.”"

That suggests that a temporary String object is created representing the number, and then it gets concatenated to the string on the left of the + operator.

If the above is truly the case, then the answer to Self-test question 5.9 cannot be approximately 1000 but 2000, or in fact maybe 3000? (For example, if an Integer wrapper object is created, and then its .toString() method is called to get the string to concatenate, then we have the Integer object and the String resulting from its toString() method, so two intermediate objects right there.)

Based on the statement in chapter 4, you had doubts about the correct answer of question 5.9. And you were right! Based on that statement the correct answer should have been 2000 (because for each integer an extra String would be created before concatenation). That's why I thought (and still think) it was such a great observation. But there is no String conversion (and in the proposed fix this part will be removed), so that brings the number of created objects back to about 1000. The initial correct answer.


Raul Saavedra wrote:At least a small note along the lines of:

"Beware that besides the result String, in some cases an intermediate auxiliary object gets created (e.g. a StringBuilder) to process the concatenation more efficiently, but in some simple cases like the following (and then an example as in question 5.9 would follow) just the resulting String gets created, without the need for this extra auxiliary object or intermediate String objects."

As already mentioned in my previous post: to answer the mock questions in the book and the questions on the actual exam you do not need to know about any optimizations performed by the Java compiler (nor which bytecode is created)! So adding such a note/comment is completely irrelevant. It will only needlessly increase confusion, because you don't need to know this for the exam(s).

Kind regards,
Roel
 
Raul Saavedra
Roel De Nijs wrote:and then a loop which creates 1000 new String instances (by concatenating " " with the loop counter) and appends each one to sb. So that results in about 1000 objects in memory.


That is precisely where I disagree, Roel.

Let me try to elaborate on why I think chapter 4 is misleading by showing you which objects it suggests get created and then become eligible for GC.

Here's what chapter 4 says:
String a = "String";
int b = 3;
int c = 7;
System.out.println(a + (b + c));
"The previous code can be read as “Add the values of b and c together, and then ****take the sum and convert it to a String**** and concatenate it with the String from variable a.”"

The extra asterisks and bold above are of course only mine, but that's the crucial part. Let's list the strings that the book suggests get created:
A) "Take the sum and convert it to a string" ---> that would be an intermediate or auxiliary string with the value "10"
B) "Concatenate it with the string from variable a" ---> that results in another new string: "String10"

So in the execution of that code the explanation suggests there have been three different strings in memory thus far:
1) The original value of a = "String"
2) The intermediate string that gets created: "10"
3) The result string (the value to be printed) which is "String10"
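The literal, step-by-step reading enumerated above can be written out explicitly. This sketch only spells out that reading so the three strings are visible; it makes no claim about what the compiler actually generates for a + (b + c). The class and method names are hypothetical.

```java
class LogicalReadingSketch {
    // The book's sentence taken literally, one step per line.
    static String stepByStep(String a, int b, int c) {
        int sum = b + c;                        // 10: a primitive, not an object
        String converted = String.valueOf(sum); // "10": the intermediate string
        return a.concat(converted);             // "String10": the result string
    }

    // The expression exactly as it appears in chapter 4.
    static String asWritten(String a, int b, int c) {
        return a + (b + c);
    }

    public static void main(String[] args) {
        System.out.println(stepByStep("String", 3, 7));
        System.out.println(asWritten("String", 3, 7));
    }
}
```

Both methods return strings with the same contents; the open question in this thread is whether the intermediate "10" exists as a separate object when the expression is written with +.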

Now, using that exact same interpretation of what goes on, here's the code inside the for loop in question 5.9. Instead of a space I will use an underscore, so that the different strings stand out more clearly:
s = "_" + i;
sb.append(s);

And now here's the repetition of what we did above according to what the book suggests.
A) "Take the value of i and convert it to a string". As we did with the sum above, we have an int value to the right of a plus operator, and there's a string to the left of the plus, just as above. So we must create a string with the value that we have on the right (that's what the book suggests must happen, at least how I read it). So in other words, if we are in iteration, let's say, i=100 in the for loop of question 5.9, then we create an intermediate string with the value 100, and that would be the string "100" // <---- But this is the issue, because somehow apparently it's not created, as Jeanne's bytecodes show
B) "Concatenate it with the literal string "_"" ---> that results in string "_100". Notice this is a different string compared to the previous one.

So in the execution of that loop alone, we would have had two new strings thus far:
-The intermediate/auxiliary string that represents the value of i, and that string is "100"
-The result string and new value to be assigned to s, which is "_100"

But we shouldn't forget that the original value of s in this iteration was "_99" (from the previous loop iteration). Variable s will now hold the reference to the new string "_100", so the string "_99" will hang loose and become eligible for GC. So how many objects would be eligible for GC after this iteration?
1) The alleged intermediate/auxiliary string representing the value of i: "100"
2) The older value of s, which was "_99"

Similarly, in the next iteration, strings "101" and "_100" would become eligible for GC. In the next, "102" and "_101", and so forth. The statement invoking the string buffer method makes no difference in this respect. So two strings per iteration, and 1000 iterations make the total about 2000 GC-eligible objects.

The fact is, that intermediate string 1) apparently does not get created, as Jeanne's bytecodes show (and in spite of what the text suggests, at least from how I read it). That is how I see it.

Hope to have better explained where I see the point of this errata submission.

Kind regards,
Raul
 
Raul Saavedra
Jeanne Boyarsky wrote:I agree with the book that the answer is 1000. I looked at the bytecode to confirm. (You can see how I got the bytecode and what it actually is on my blog.) The gist is that there is only one String created from



By the way, only now I'm realizing, Jeanne, your bytecode is not like the one in question 5.9 because you used "" (an empty String) and not " " (<-- a space, as in question 5.9).
Now I wonder if that little thing precisely can possibly make or break the deal about whether there are 1000 or 2000 GC eligible objects created in 5.9. Could you please check the bytecode using a string with a space and not an empty string?
 
Roel De Nijs
Raul Saavedra wrote:Hope to have better explained where I see the point of this errata submission.

You definitely have! A clear explanation of the current issue with the statement in chapter 4 (about sum and string concatenation) and the correct answer of question 5.9.

You suggest removing question 5.9, but I suggested this fix (in one of my previous posts), which alters the statement in chapter 4.
Roel De Nijs wrote:The statement in chapter 4 (location 5025) is indeed misleading/wrong as it implies the sum is first turned into a String before being concatenated to the other String (which would double the number of objects). So I would propose the following fix.

Currently: The previous code can be read as "Add the values of b and c together, and then take the sum and convert it to a String and concatenate it with the String from variable a."

Should be: The previous code can be read as "Add the values of b and c together, and then take the sum and concatenate it with the String from variable a."

So when this fix is applied in the next edition of the book, there won't be any String conversion of the sum anymore: you simply take the sum and concatenate it with the existing String. So if we apply this new statement to your excellent explanation from the previous post, we get:

And now here's the repetition of what we did above according to what the book (will) suggest (once the errata item is applied to the text).
A) "Take the value of i". No extra objects are created.
B) "Concatenate it with the literal string "_"" ---> that results in a new string "_100".

So in the execution of that loop alone, we would have had one new string thus far:
-The result string and new value to be assigned to s, which is "_100"


So just 1 new object per iteration, 1000 iterations --> results in about 1000 objects in memory (which is the current correct answer of question 5.9). So there is no need to remove question 5.9 from the book if the erroneous statement in chapter 4 has been corrected.

Kind regards,
Roel
 
Raul Saavedra
See my very last and short post, Roel. Only now I'm realizing Jeanne used "" and not " " (a space as in question 5.9). Waiting for her reply.
 
Roel De Nijs
Raul Saavedra wrote:Now I wonder if that little thing precisely can possibly make or break the deal about whether there are 1000 or 2000 GC eligible objects created in 5.9. Could you please check the bytecode using a string with a space and not an empty string?

That doesn't matter! If a primitive is concatenated to a String, it's not converted to a String first. When you do the same with any object, the toString method is called first.

Illustrated with this code snippet:
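A minimal sketch of such a snippet, written to be consistent with the output listed next. The class names, the helper methods, and the Loud class are assumptions made for illustration, and the hash code printed for the plain Object will vary from run to run.

```java
class ToStringSketch {
    // Hypothetical class whose toString announces that it was called.
    static class Loud {
        @Override
        public String toString() {
            System.out.println("in toString");
            return "me";
        }
    }

    static String withInt(int i) {
        return "_" + i;          // primitive operand
    }

    static String withObject(Object o) {
        return "_" + o;          // object operand: o.toString() is invoked
    }

    public static void main(String[] args) {
        System.out.println(withInt(10));
        System.out.println(withObject(new Object()));
        System.out.println(withObject(new Loud()));
    }
}
```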


Output:
_10
_java.lang.Object@2698dd08
in toString
_me


Hope it helps!
Kind regards,
Roel
 
Raul Saavedra
Roel De Nijs wrote:That doesn't matter!

I'm not so sure, because see my own analysis and counting of objects.

If I had used "" instead of "_", indeed I would have produced only 1000 objects, because the concatenation would end up being the same as the intermediate/auxiliary string created from the primitive. So there wouldn't be 2 new Strings created in each iteration ("100" and "_100") but only one ("100"). So at least in my own analysis, using "" vs. using a string with a space as in 5.9 does make all the difference precisely between 1000 and 2000.

Until Jeanne replies, I'm now thinking that the errata might be simply in question 5.9. Either that the answer was 2000, or the statement should have used an empty string and not a space for the correct answer to be 1000. The explanations in Chapter 4 might not be problematic after all.
 
Roel De Nijs
Raul Saavedra wrote:
Roel De Nijs wrote:That doesn't matter!

I'm not so sure, because see my own analysis and counting of objects.

True! But that's totally on another level. You don't want your compiler to act differently based on the content of your strings/objects you are using. That would be very bad and almost make it impossible to develop an application.

Using the same tool as Jeanne mentioned in her blog, I copied the bytecode for both snippets.






So actually just 1 difference: when the StringBuilder is created in the 1st code example the no-arg constructor is used and in the 2nd one the 1-arg constructor (with a String parameter) is used.

Hope it helps!
Kind regards,
Roel
 
Raul Saavedra
Roel De Nijs wrote:True! But that's totally on another level. You don't want your compiler to act differently based on the content of your strings/objects you are using.

In fact it's at runtime that the number of GC-eligible objects might end up being different, which is what the question is about. So we don't even have to think about whether the compiler will ever do things differently with different strings.

Indeed, looking at bytecode has been a rather bad idea all this time, I think, because it only shows the variables that get created, not the objects (the string values) that might become eligible for GC at runtime.

I insist: check my analysis and counting of objects again. The compiler, or whatever it does with bytecode, doesn't need to be checked; we have to see what goes on at runtime, as I did with my explanation using "_".

If we use "" (empty string) in question 5.9, indeed only about ~1000 objects are eligible for GC after the for loop. But if we use " " (or "_" or "abcd" or any other nonempty string as literal for that matter) as question 5.9 originally states, there will be 2000 objects.

So the error is therefore in the " " of the statement, it should have been an "" (empty string) and not a string with a space, for the correct answer to be 1000. The explanations I referred to in Chapter 04 are perfectly fine then.

 
Raul Saavedra
Here's to further illustrate my point about the big difference between having "" vs. " " (or any other nonempty string) at runtime in question 5.9. I wrote slightly different code to make the difference stand out more easily.



After the first for loop in that code runs, there are 1000 String objects referred to by all the positions in array a. Those strings are not yet eligible for GC. There will, however, be 1000 Integer objects hanging loose after that for loop, so those will be eligible for GC at that point (assuming the GC did not run while we were inside the loop, of course).

Now notice that because in the second for loop we have "ABCD" and not "" (empty string), in each iteration of that second for loop we will create a brand new String object that at that point did not yet exist in the String pool. In the first iteration it will be "ABCD1", in the second "ABCD2", and so forth. After the assignment of that brand new string to a[i], the original string value in a[i] (which was just the string representation of i) will then hang loose, becoming eligible for GC. So with the code as above, after the second for loop runs there will be about 2000 objects eligible for GC (1000 Integer and 1000 String objects).

Now, here comes the big difference that "" (empty string) makes in the concatenation.



Notice that effectively no new String object would be created at runtime in the second for loop of this code, because the expression "" + i; will always produce a String that is already in the string pool. The integer value of i gets converted to an intermediate/auxiliary String (as chapter 4 suggests), but that String is already in the String pool (and is in fact referred to by a[i]). Then we concatenate that intermediate result to "", but precisely because that's an empty string, we get the same String again. Finally the result of the expression gets assigned to a[i], but the assignment leaves a[i] referring to exactly the same string it was referring to before. Effectively no new string objects get created, and no string objects go loose in these iterations.

So with "" instead of "ABCD" (or "_" or any other nonempty string), after this second for loop only 1000 objects (the Integer objects) would be eligible for GC.

I don't think checking bytecode is necessary at all to observe any of this. We just need to look at the strings that get generated at runtime and compare them with what we already have in the string pool.

So "" (empty string) vs. " " (a space, or any other nonempty string) would make the difference between having 1000 or 2000 GC-eligible objects in question 5.9. I think there is definitely an error. Either the answer is 2000, or the question should have "" and not " ".

Kind regards,
Raul
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:But if we use " " (or "_" or "abcd" or any other nonempty string literal, for that matter), as question 5.9 originally states, there will be 2000 objects.

Really curious to know which 2000 objects there will be in memory in question 5.9...

We have (watch the space in front of every integer): " 0", " 1", ..., " 999". So that gives me a 1000 objects. I can't think of any other 1000 objects being in memory when executing the code in question 5.9.
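
A minimal sketch of that count, assuming question 5.9 boils down to a loop that executes s = " " + i; a thousand times (the loop shape and the class name CountConcatResults are assumptions, not the book's code). It records every result by identity, so it only counts the resulting String objects, not anything the runtime may allocate internally while building them:

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class CountConcatResults {

    // Counts how many distinct String objects the loop produces.
    static int distinctResults(int iterations) {
        // Identity-based set: two equal strings still count twice if they are different objects.
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        String s = null;
        for (int i = 0; i < iterations; i++) {
            s = " " + i;   // i is not a compile-time constant, so the String is built at runtime
            seen.add(s);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctResults(1000));   // prints 1000
    }
}
```

Note that the set keeps every result reachable, which the original loop does not do; there, each String loses its only reference as soon as s is reassigned.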
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:
Raul Saavedra wrote:But if we use " " (or "_" or "abcd" or any other nonempty string literal, for that matter), as question 5.9 originally states, there will be 2000 objects.

Really curious to know which 2000 objects there will be in memory in question 5.9...

We have (watch the space in front of every integer): " 0", " 1", ..., " 999". So that gives me a 1000 objects. I can't think of any other 1000 objects being in memory when executing the code in question 5.9.


Please check my analysis again, I explained it. There are two strings per iteration that will be hanging loose in question 5.9.
The ones you indicate: " 0", " 1", " 2" (each with a space)
And the intermediate string representation of ints (as suggested by ch 04, like for the sum) that get created and then concatenated to " " in s = " " + i; Those would be "0", "1", "2"... (no spaces)

Please check my analysis again a few posts back, and the extra illustration that I just wrote.
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:The ones you indicate: " 0", " 1", " 2"
And the intermediate string representation of ints (as suggested by ch 04) that get concatenated to " " in s = " " + i; Those would be "0", "1", "2"...

Are you pulling my leg? I stated at least 3 times in this thread that the statement in chapter 4 is WRONG and should be corrected. Because there is no such string representation when concatenating primitives! That's also the reason why in the corrected version there's no mention anymore of "converted to a String".
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:
Raul Saavedra wrote:The ones you indicate: " 0", " 1", " 2"
And the intermediate string representation of ints (as suggested by ch 04) that get concatenated to " " in s = " " + i; Those would be "0", "1", "2"...

Are you pulling my leg?


I'm not pulling your leg at all, Roel. I know you have said it, but I wasn't sure. Now I do think the statement in ch 04 is correct, and that your assessment of that statement as incorrect might be wrong.

Please check my last illustration. I believe you might not have looked at it in detail yet. After really looking at it in detail, tell me if you agree with which objects were or were not in the string pool already, and which hang loose or not. That's what matters; no bytecode is needed for any of this.
 
Raul Saavedra
Ranch Hand
Posts: 38
1
I suggest we let other people comment on what we have discussed so far as well. Let others check the examples and the counting I've provided and see how they see it.

 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:Notice that no new String object would be effectively created at runtime in the second for loop in this code. Because the expression "" + i; will always produce a String that is already in the string pool.

Completely wrong! It's not a String pool, but a String literal (or constant) pool. That's a major difference!

Easily illustrated using your code (with a small addition):


The output of the program:
true true
true true
true true
true true
true true
true true
true true
true true
true true
true true

false true
false true
false true
false true
false true
false true
false true
false true
false true
false true


So after re-assigning the elements of array a, they are still equal to the corresponding elements of array b, but they refer to different objects. That is because they are not in the string literal/constant pool: these strings are neither literals nor constants, they are created at runtime. If you want to add a runtime String to the String literal pool, you can use the intern method of the String class. Disclaimer: note that this method is way beyond the scope of the OCAJP7 exam, I'm just adding it for completeness.
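
For readers following along, here is a minimal sketch of a program that produces output of that shape (three lines per block instead of ten). It is an assumption about what such a test can look like, not the code originally posted; the class name PoolDemo and the helper build are made up for illustration. The first column is a reference comparison with ==, the second a content comparison with equals:

```java
public class PoolDemo {

    // Builds the decimal text of each index at runtime.
    static String[] build(int n) {
        String[] a = new String[n];
        for (int i = 0; i < n; i++) {
            a[i] = "" + i;   // i is not a constant, so this String is created at runtime
        }
        return a;
    }

    public static void main(String[] args) {
        String[] a = build(3);
        String[] b = a.clone();   // b[i] refers to the very same objects as a[i]

        for (int i = 0; i < a.length; i++) {
            System.out.println((a[i] == b[i]) + " " + a[i].equals(b[i]));   // true true
        }
        System.out.println();

        for (int i = 0; i < a.length; i++) {
            a[i] = "" + i;   // equal content, but a different object
            System.out.println((a[i] == b[i]) + " " + a[i].equals(b[i]));   // false true
        }

        // intern() returns the one canonical instance for a given content.
        System.out.println(a[0].intern() == b[0].intern());   // true
    }
}
```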

Hope it helps!
Kind regards,
Roel
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:tell me if you agree with what objects were or were not in the string pool already, and which hang loose or not.

Strings, Literally is a must-read for everyone who is preparing for OCAJP and/or OCPJP certification.
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:
Raul Saavedra wrote:tell me if you agree with what objects were or were not in the string pool already, and which hang loose or not.

Strings, Literally is a must-read for everyone who is preparing for OCAJP and/or OCPJP certification.

Ah that definitely helped a lot, Roel, thanks for the link! Did check it. Did not know about this intern method, and I was under the impression that strings with equal values even created at runtime would not be replicated (as if always this intern method had been called.) Which is why I wrote the previous (now misleading) illustrations the way I did. Now I know better, thanks again.

Well, then we are back to the text in chapter four indeed being misleading. Here's your proposed fix, Roel:

Currently: The previous code can be read as "Add the values of b and c together, and then take the sum and convert it to a String and concatenate it with the String from variable a."
Should be: The previous code can be read as "Add the values of b and c together, and then take the sum and concatenate it with the String from variable a."

I would still find that fix somewhat incomplete, because I still have a question. This is also part of why (besides my misunderstanding of the String pool) I thought the explanation in ch04 could have been correct even after you said three times that it wasn't.

If the expression is s = a + (b + c); then a StringBuilder gets used. So there's an extra object there, this StringBuilder, which does the actual work of creating the character-by-character representation of the sum of the ints b and c and appends those characters (and not an intermediate String object representing the sum) to its internal representation, whatever that is. Once that's all done, the .toString() method of this StringBuilder gets invoked, and one and only one String finally gets created: the end result of the expression a + (b + c). We would count the StringBuilder object as eligible for GC in this case. (Hopefully I got it right at least for this case.)
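
Written out by hand, that description corresponds roughly to the sketch below. It is only an illustration of the idea, assuming the StringBuilder-based translation discussed in this thread; the class and method names are made up, and as far as I know compilers from Java 9 onward use a different mechanism (invokedynamic) by default.

```java
public class ConcatExpansion {

    // The source-level expression.
    static String withOperator(String a, int b, int c) {
        return a + (b + c);
    }

    // Roughly what the expression amounts to when a StringBuilder is used.
    static String withBuilder(String a, int b, int c) {
        StringBuilder sb = new StringBuilder();
        sb.append(a);
        sb.append(b + c);       // append(int) writes the digits straight into the builder
        return sb.toString();   // the only String created here
    }

    public static void main(String[] args) {
        System.out.println(withOperator("Result=", 40, 2));   // Result=42
        System.out.println(withBuilder("Result=", 40, 2));    // Result=42
    }
}
```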

But now, how about the simpler case, when we have just s = " " + i; (let's say i instead of b, just as in question 5.9)? From Jeanne's findings, apparently no StringBuilder gets used, but still someone must create the character-by-character representation of i to build up the resulting String. (Who? Another object? Then we would be back at 2000, so I guess not. Maybe a static method in class String or Integer?) Who is that someone who, in the fixed version of the text in ch04, would "take the sum and concatenate it with the String from variable a"? Who exactly does that work, if not an object?

Kind regards,
Raul





 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:Ah that definitely helped a lot, Roel, thanks for the link! Did check it. Did not know about this intern method, and I was under the impression that strings with equal values even created at runtime would not be replicated (as if always this intern method had been called.) Which is why I wrote the previous (now misleading) illustrations the way I did. Now I know better, thanks again.

Glad to hear it improved your understanding of how strings and the string constant pool work.

Raul Saavedra wrote:Who is that someone in this case that (in the fixed version of text in ch04) would "take the sum and concatenate it with the String from variable a" ?? Who exactly does that work if not an object?

I guess the same person/thing who adds 2 integers or multiplies a double with a long or invokes a method. The JVM!

Kind regards,
Roel
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:
Raul Saavedra wrote:Who is that someone in this case that (in the fixed version of text in ch04) would "take the sum and concatenate it with the String from variable a" ?? Who exactly does that work if not an object?

I guess the same person/thing who adds 2 integers or multiplies a double with a long or invokes a method. The JVM!


OK, I understand that, but then I don't see where the optimization is in using a StringBuilder for a + (b + c). Why didn't the same JVM that always does all of that simply add the two integers b and c and create the char-by-char representation of the sum on its own (as in " " + i;) before the concatenation, saving one object creation (the StringBuilder) in the process?

PS. This is beyond the scope of the errata of course, but I'm trying to understand accurately what the compiler and the VM are doing and why.
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:I don't see where the optimization is in using a StringBuilder for a + (b + c). Why didn't the same JVM that always does all of that simply add the two integers on its own before the concatenation, saving one object creation (the StringBuilder) in the process?

The JVM will still be the guy that invokes the necessary append methods after the compiler has optimized your code using the string concatenation operator. If the compiler doesn't do this optimization, you'll create (in any application) millions of temporary strings (a string is immutable as you know, so each concatenation is a new - probably auxiliary - string). These millions of temporary strings are using (wasting) your precious memory.
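
A small sketch of what "temporary strings" means here, assuming a four-operand concatenation. The naive method models an unoptimized evaluation with String.concat, which is only an assumption used for illustration, not what any compiler actually emits; all names are made up.

```java
public class TemporaryStrings {

    // Models evaluation without the optimization: every step yields a new String.
    static String naive(String a, String b, String c, String d) {
        String ab = a.concat(b);      // temporary, garbage after the next step
        String abc = ab.concat(c);    // temporary, garbage after the next step
        return abc.concat(d);         // the only String we actually wanted
    }

    // One builder collects all the parts; only toString() creates a String.
    static String withBuilder(String a, String b, String c, String d) {
        return new StringBuilder()
                .append(a)
                .append(b)
                .append(c)
                .append(d)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(naive("a", "b", "c", "d"));         // abcd
        System.out.println(withBuilder("a", "b", "c", "d"));   // abcd
    }
}
```

Both methods return the same text; the difference is that the first leaves two intermediate String objects behind, while the second leaves only the builder.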

Kind regards,
Roel
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:
Raul Saavedra wrote:I don't see where the optimization is in using a StringBuilder for a + (b + c). Why didn't the same JVM that always does all of that simply add the two integers on its own before the concatenation, saving one object creation (the StringBuilder) in the process?

The JVM will still be the guy that invokes the necessary append methods after the compiler has optimized your code using the string concatenation operator. If the compiler doesn't do this optimization, you'll create (in any application) millions of temporary strings (a string is immutable as you know, so each concatenation is a new - probably auxiliary - string). These millions of temporary strings are using (wasting) your precious memory.


I understand the general gist, but there is still something I don't see. Apparently the compiler creating an extra StringBuilder is an optimization for this:
String s = a + (b + c);

But using the extra StringBuilder is not an optimization for this, because the JVM in principle does the concatenation at runtime:
String s = a + b;

Somehow that does not fully add up. The way I see it, if creating the StringBuilder is suboptimal for just b, then it would also be suboptimal for (b+c). And on the other hand, if creating the StringBuilder is allegedly an optimization for (b+c), it should also be one for just b.
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:Somehow that does not fully add up. The way I see it, if creating the StringBuilder is suboptimal for just b, then it would also be suboptimal for (b+c). And on the other hand, if creating the StringBuilder is allegedly an optimization for (b+c), it should also be one for just b.

In the bytecode I posted a few posts back, I see both times a StringBuilder being created... And the code was of the format String s = a + b;
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:In the bytecode I posted a few posts back, I see both times a StringBuilder being created... And the code was of the format String s = a + b;


OK, but that would support the 2000 answer again, wouldn't it? And how is that compatible with Jeanne's finding for the simpler case, where only one String gets created and in principle no StringBuilder is involved?

I'm sure I'm still missing something, but I don't see it. I'll check the forums again tomorrow. Thanks for all the help, Roel!

Kind regards,
Raul

 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
  • Likes 1
Raul Saavedra wrote:OK, but that would support the 2000 answer again, wouldn't it?

No! No! No! Because that's a compiler optimization! For the certification exams (like OCAJP7 and OCPJP7) you do not need to know about this optimization performed by the Java compiler (nor the bytecode)!

Raul Saavedra wrote: And how is that compatible with Jeanne's finding for the simpler case, where only one String gets created, in principle no StringBuilder involved?

There is indeed a difference between the following code snippets, although both are string concatenations.

If you only concatenate compile-time constants, no StringBuilder is required to concatenate (that's another compiler optimization). As shown in this snippet (followed by the decompiled code and the resulting bytecode)
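
A way to observe this without reading bytecode, as a minimal sketch: a concatenation of compile-time constants is folded into a single constant by the compiler, so the result is the very same object as the equivalent literal. The class and method names are made up for illustration; the expression "Result=" + 100 is the one from the first post of this thread.

```java
public class ConstantConcat {

    // Both operands are compile-time constants, so the compiler folds the expression.
    static boolean foldedIsSameObjectAsLiteral() {
        String s = "Result=" + 100;   // becomes the constant "Result=100" at compile time
        return s == "Result=100";     // reference comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println(foldedIsSameObjectAsLiteral());   // true
    }
}
```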




If one of them is not a compile time constant, a StringBuilder will be used to concatenate (also a compiler optimization). As shown in this snippet (followed by the decompiled code and the resulting bytecode)




And now the last piece of magic: if, in the previous snippet, you make the variable i a compile-time constant by adding the keyword final, you'll get almost the same decompiled code (and bytecode) as in the first example, with no StringBuilder to perform the concatenation. This illustrates that the Java compiler is a real smart cookie and does a tremendous amount of optimization for us developers.
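
The effect of final can also be checked with a reference comparison, as in this minimal sketch (the class and method names are made up for illustration): with a non-final local the String is built at runtime and is a different object from the literal, while with a final local initialized by a constant the whole expression is a compile-time constant.

```java
public class FinalConcat {

    // i is an ordinary local variable: the concatenation happens at runtime.
    static boolean nonFinalLocal() {
        int i = 100;
        String s = "Result=" + i;
        return s == "Result=100";   // false: s is a new object
    }

    // i is a constant variable: the compiler folds the whole expression.
    static boolean finalLocal() {
        final int i = 100;
        String s = "Result=" + i;
        return s == "Result=100";   // true: same constant
    }

    public static void main(String[] args) {
        System.out.println(nonFinalLocal());   // false
        System.out.println(finalLocal());      // true
    }
}
```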




I hope you now understand that the compiler performs a whole set of optimizations to make sure your code runs as efficiently as possible, in both time and memory usage. And that it's almost impossible for a beginner (and even for an advanced developer with 10 years of experience like me) to know all these optimizations. That's why for the book and the actual certification exams you do not need to know or care about these optimizations. Even in real life these optimizations are mostly unknown to developers, because if you write good performant code, the compiler optimizations will only make your code more performant while using fewer resources (memory). Unless you are a Java Micro Edition developer: then every object really counts if you have just 4K of memory to run a very cool app.

Hope it helps!
Kind regards,
Roel

Disclaimer: this is way beyond the scope of the OCAJP7 exam, it's even beyond the scope of OCPJP7
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:If you only concatenate compile time constants, no StringBuilder is required to concatenate


Ah, that settles it. The choice of examples used by Jeanne was just confusing, and her explanations were misleading.

Jeanne Boyarsky wrote:Because the second example has more to concatenate so Java optimizes it.


That is not the case then; it wasn't because there was more to concatenate. In her example a+(b+c) it doesn't matter at all that there was more than one + operator. What mattered was that the string to the left of the first plus operator is a String variable: immutable as it is, it's still not a constant, and it's not a literal. The StringBuilder gets used there only because of that. If we had had a+b, it would have been the same thing, as you found out. The difference with "" + i not using the StringBuilder is that the first string is a literal, not that there's less to concatenate.

OK, I think I got it. Gee, Jeanne's examples threw me off on two counts: the "" instead of " ", and the actual reason for the StringBuilder being used in one case and not the other.

Roel De Nijs wrote:I hope you now understand the compiler does a whole set of possible optimizations to make sure your code runs as performant as possible, both in time as in memory usage. And that it's almost impossible for a beginner (and even for an advanced developer with 10 years experience like me) to know all these optimizations.


I understand perfectly. Believe it or not, I have more than 5 years of Java experience myself and I'm SCJP 1.4, but that was way back, up until 2004. I haven't worked with Java at all for the last 10 years, so I'm using the recertification and Java 7 to brush up on everything.

Given what you say about compiler optimizations not needed for the exam, I still tend to think question 5.9 ought to be out of the book. Just as easily the question could have used a String variable and not the " " literal to the left of the first plus, and from the book, even corrected with your suggestion, the answer would still seem to be 1000, but in that case it would be 2000. And there's no way a reader could understand why that would be the case unless the book mentions at least that simple optimization done by the compiler.

As an aside, I find the emphasis on so many "The code fails to compile" questions in the cert exam somewhat misdirected or even absurd. In practice, when tricky code fails to compile, the developer finds out immediately, and exactly why and where, by running the compiler. In that sense, it is not so critical whether we can tell if and where tricky code fails to compile just by looking at it. On the job, and for efficient coding, I find it much more important to know, for example, about this particular compiler optimization we have been talking about here.

Thanks for all the help again, Roel! Have a cow!

Kind regards,
Raul
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:Believe it or not I have more than 5 years of java experience myself and I'm SCJP 1.4

I'm SCJP 1.4 as well.

Raul Saavedra wrote:I haven't worked with Java at all for the last 10 years, so I'm using the recertification and Java7 to brush up everything.

Great to hear!

Raul Saavedra wrote:Given what you say about compiler optimizations not needed for the exam, I still tend to think question 5.9 ought to be out of the book. Just as easily the question could have used a String variable and not the " " literal to the left of the first plus, and from the book, even corrected with your suggestion, the answer would still seem to be 1000, but in that case it would be 2000. And there's no way a reader could understand why that would be the case unless the book mentions at least that simple optimization done by the compiler.

But if that was the case, the book would have mentioned about 2000 as a correct answer and would have explained why it's 2000 and not 1000...

Raul Saavedra wrote:I somewhat find the emphasis in so many "The code fails to compile" questions in the Cert exam to be somewhat misdirected or even absurd. Because use-case wise, when a tricky code fails to compile the developer would know it immediately and exactly why when running the compiler. In that sense, it is to some extent not so critical whether by just looking at tricky code we can tell if code fails to compile. On-the-job wise, and for efficient coding, I find it for example much more important to know about this particular compiler optimization we have been talking about here.

I think that's the most often heard criticism of the cert exams. From what I've heard and read here on the forums so far, the OCAJP7 has very few tricky questions. That would be a good adaptation, certainly because everyone uses an IDE these days, which immediately tells you something is wrong. But personally I like these questions, because they are very easy to answer. But that's maybe just because I can spot the error rather quickly.

Happy studying and good luck with the recertification exam!
Kind regards,
Roel
 
Raul Saavedra
Ranch Hand
Posts: 38
1
Roel De Nijs wrote:But if that was the case, the book would have mentioned about 2000 as a correct answer and would have explained why it's 2000 and not 1000...


But then we get back to my former comment: it's unpedagogical and absurd to expect correct answers on nonobvious or exceptional material that was surgically left out of the instruction. The book ought to provide the instruction needed to answer the questions correctly before those questions, not after them.

What I mean is that your proposed errata fix at least removes the inconsistency in the book between the instruction in chapter 4 and question 5.9. But to me, presenting that question given what was taught, even with the proposed fix, is in itself questionable. What was taught was insufficient and somewhat misleading (and I keep repeating this word).
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Raul Saavedra wrote:What I mean, your proposed errata fix at least removes the inconsistency in the book between instruction in chapter 4 and question 5.9. But to me presenting that question given what was taught, even with the proposed fix, is in itself questionable. What was taught was insufficient, somewhat misleading (and I keep repeating this word.)

Again I disagree. In the section Important Facts About Strings and Memory, the study guide clearly mentions the string constant pool and how the compiler processes String literals. If you apply this knowledge in combination with the proposed errata fix, there can't be any doubt about question 5.9 and its correct answer.

If more readers complain about this question in the (near) future, it will definitely be added to the errata overview (and might be updated or maybe even replaced in the next edition). But currently the question still stands, whether you like it or not.

Kind regards,
Roel
 
Raul Saavedra
Ranch Hand
Posts: 38
1
As they say in German, kein Problem! I will check the section you mention about String literals again. Glad that at least the inconsistency between ch. 04 and q5.9 gets removed by the errata fix.

Kind regards,
Raul
 