
Why would you do this?

 
Randall Smith
Greenhorn
Posts: 2
Hi Everyone:

Please excuse my noob post - I'm as dumb as a rock regarding Java...
... but I have seen something similar to the following on multiple
occasions, and I keep asking myself -- WHY?

public static List getMyData() {
    List myDataValues = myApi.getDataValues();
    List newDataValues = new ArrayList();
    Iterator i = myDataValues.iterator();
    while (i.hasNext()) {
        DataValue dv = (DataValue) i.next();
        newDataValues.add(new DataValue(dv.getLabel(), dv.getValue()));
    }
    return newDataValues;
}

The part I have the issue with is the whole duplication process. Why not
just return myDataValues? And no, there is no side-effect to calling the
DataValue methods - it just makes a copy of each DataValue. I have also
seen similar code where some sort of custom "deep copy" is used to make the copy.

Thanks to anyone who takes an interest in this, but please -- don't guess.
If you don't know of a real-world reason for this, please don't try to
invent one. This isn't a trivia contest, or trolling, or an attempt to
start some sort of (programming) religious war. I am desperately trying to
understand this.

Thank you,

Randall
[ June 16, 2008: Message edited by: Bear Bibeault ]
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
Hi Randall,

Welcome to JavaRanch!

This is really a core tenet of modern computer programming. In a nutshell: if you control access to changeable data, then you know it hasn't been changed.

Let's say you know that your class X has a list of DataObjects in a private member variable. You keep that list sorted, and you never add any "null" references. This knowledge will affect how you write the methods in your class. For example, if you know there are no nulls, then you don't ever have to check for nulls. If you know the list is sorted, then you can search it using a fast binary search, rather than the slow linear search the contains() method uses.

Now, what happens if you have a method getDataItems(), and you return that member variable from this method? Well, what's to stop me from saying

List items = x.getDataItems();
items.add(null);
Collections.shuffle(items);

Now your member variable points to a list that (1) isn't sorted, and (2) contains a null entry somewhere in the middle. Your class will be broken. Someone will call getLargestDataItem() or something, and the method will return the wrong data item, or throw a NullPointerException, and debugging is going to be a huge problem, since there's nothing wrong with X at all -- it's the other, outside code that made the mistake.

Now, the alternative is to make a copy of the list (and, if the DataItems are themselves changeable, of the DataItems too). Then the outside code can do whatever horrible things it wants to that list, and you don't care -- because the outside code just has a copy. Your class X still has the original, untainted list.
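To make that concrete, here is a minimal sketch of the idea. The class name SortedItems and its methods are made up for illustration (they stand in for "class X"), and the elements are Strings so that copying the list alone is enough:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for "class X" in the discussion above.
final class SortedItems {
    // Invariant: always sorted ascending, never contains null.
    private final List<String> items = new ArrayList<>();

    void add(String item) {
        if (item == null) {
            throw new IllegalArgumentException("null not allowed");
        }
        int pos = Collections.binarySearch(items, item);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        items.add(pos >= 0 ? pos : -pos - 1, item);
    }

    // Relies on the invariant: in a sorted list the last element is the largest.
    // (Throws IndexOutOfBoundsException if nothing has been added yet.)
    String getLargest() {
        return items.get(items.size() - 1);
    }

    // Defensive copy: the caller gets a separate list. Strings are immutable,
    // so the elements themselves do not need to be copied.
    List<String> getItems() {
        return new ArrayList<>(items);
    }
}

class DefensiveCopyDemo {
    public static void main(String[] args) {
        SortedItems x = new SortedItems();
        x.add("pear");
        x.add("apple");
        x.add("quince");

        List<String> copy = x.getItems();
        copy.add(null);            // only the copy is damaged
        Collections.shuffle(copy);

        System.out.println(x.getLargest()); // prints "quince"
    }
}
```

If getItems() returned the private list itself, the same caller code would break getLargest() without ever touching SortedItems.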
 
Randall Smith
Greenhorn
Posts: 2
Hi Ernest, and thank you for your kind and detailed reply!

I believe you managed to drive the point home to me, although the specifics of your explanation don't apply to the example. By that, I mean

"say you know that your class X has a list of DataObjects in a private member variable"

It doesn't.

Now of course, you couldn't see the entire context that the code runs in, so you wouldn't exactly know - the clue in my example is my showing that myDataValues is a local object to the getMyData() method. It is created at the top of the method, not assigned to any private member, and drops out of use at the end of the method. So my question was "why build a copy of something you're about to throw away"?

Ultimately, I think you taught me something about the Java mindset. I guess this practice of constantly making copies of objects built inside some wrapper method before you pass them back to "that mean, nasty caller that can do whatever horrible things it wants to that list" is a prevalent approach in Java.

I can fully appreciate that there are times where you do need either a 'read only' or 'update' or 'scratch' clone/deep copy of an object for a caller to work with, and possibly send back later for comparison and data storage. It just seems that this copying is done on the off chance "it may be needed".

Anyway, thank you again for your reply. I have been unable to google this practice and find a detailed discussion, nor have I yet found anything on it in print. Your explanation is the best I have seen...

Thank you,

Randall
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
As to the "not a private member" argument, the method we're looking at might be serving as a "facade" for another class in the same package. It might be OK to pass an uncopied list between classes in a single package, since those classes would be written by one team and expected to respect each other's invariants. But when the list is passed out of the package, then it might need to be copied for the reasons given before. All that's happened is that the package, rather than the class, is being used as the unit of encapsulation.
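One way the facade arrangement might look is sketched below. This is a guess at the surrounding context, not the actual code from the question: MyApi, Facade, and the setValue method are assumptions added for illustration, and the sketch assumes myApi hands out a list it still holds internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable value type, modeled on DataValue from the question.
class DataValue {
    private final String label;
    private int value;

    DataValue(String label, int value) {
        this.label = label;
        this.value = value;
    }

    String getLabel() { return label; }
    int getValue() { return value; }
    void setValue(int value) { this.value = value; } // assumed mutator
}

// Package-internal class (assumption): trusted classes in the same package
// receive its internal list directly, with no copy.
class MyApi {
    private final List<DataValue> values = new ArrayList<>();

    MyApi() {
        values.add(new DataValue("a", 1));
        values.add(new DataValue("b", 2));
    }

    List<DataValue> getDataValues() { return values; }
}

// Facade at the package boundary: code outside the package only ever sees copies.
class Facade {
    private static final MyApi myApi = new MyApi();

    public static List<DataValue> getMyData() {
        List<DataValue> copy = new ArrayList<>();
        for (DataValue dv : myApi.getDataValues()) {
            // DataValue is mutable here, so each element is copied as well.
            copy.add(new DataValue(dv.getLabel(), dv.getValue()));
        }
        return copy;
    }
}
```

Under these assumptions the local variable in getMyData() goes out of scope when the method returns, but the list it refers to lives on inside MyApi, which is why returning it uncopied could still expose internal state.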

I must confess I am amused by your attitude towards this kind of defensive programming. People with experience working on large software systems understand why caution is warranted. In describing this I used language that suggests the data corruption would be intentional, but of course it's not usually like that; it's usually perfectly innocent, a misunderstanding about how an interface should be used. But it's still corruption, and the debugging problem is quite real.

The idea of defensive copying is not limited to Java; in fact, it's less prevalent in Java than in some other languages, because many of Java's core classes (String, in particular) are immutable and never need to be copied. The wise C programmer copies strings rather often.
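A small sketch of that difference, using a made-up Account class: the String field can be handed out as-is, while the array, which is mutable, is copied on the way in and on the way out.

```java
// Hypothetical class, for illustration only.
final class Account {
    private final String owner;  // String is immutable: safe to share
    private final int[] scores;  // arrays are mutable: must be copied

    Account(String owner, int[] scores) {
        this.owner = owner;
        this.scores = scores.clone(); // copy in, so later changes by the caller don't leak in
    }

    String getOwner() {
        return owner;                 // no copy needed
    }

    int[] getScores() {
        return scores.clone();        // copy out, so the caller can't alter our array
    }
}
```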
 