
for loop bounds

 
Tim Jeffery
Greenhorn
Posts: 2
I am looking for feedback concerning the bias toward zero-based parameters in for loops.

Consider the following:

- A loop is processing a set number of items

- The logic within the loop uses the number of the item being processed to drive some of the processing (i.e., for the first item in the loop, processing uses the value 1; for the second item in the loop, processing uses the value 2, etc.)

So, the problem:

If I use a loop driven by a zero-based variable, I have code like either this:

for ( int i = 0 ; i <= endValue ; i ++ )
{
/* some processing */
int j = i + 1;
/* some processing that uses j */
}

or code like this:

for ( int i = 0 ; i <= endValue ; )
{
/* some processing */
i++; /* to get i to the appropriate value */
/* some processing that uses the incremented i */
}

The problem with the first is that because i does not represent the value I need to process, I have to create a second variable. The problem with the second is that the increment of i is buried in the body of the loop.

And in both cases, the loop must stop when i is one less than the actual number of items I am processing.

I believe an easier to understand solution is to start the loop variable with the number of the item I am processing so that the loop explains itself:

for ( int i = 1 ; i <= endValue ; i++ )
{
/* all processing with no need to
either create a second variable or
increment i in the body of the loop */
}

And using this approach, I can actually generalize a loop to:

for ( int i = startValue ; i < endValue ; i++ )
{
/* processing */
}

The benefit is that the loop itself becomes self-documenting. Rather than arbitrarily starting every loop variable at zero, a loop is started with the initial value of the data that drives the purpose of the loop.

This is not to say that no loop should have a zero-based variable. If a loop is driven by an array, clearly there is value in using a variable that starts at zero.
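
To make the contrast concrete, here is a minimal sketch. The class name, both methods, and the sample data are hypothetical and only illustrate the two styles: the item-number loop starts at 1 because the item number is what the logic uses, while the array loop stays zero-based because the index is what the logic uses.

```java
import java.util.ArrayList;
import java.util.List;

class ItemNumberLoops {
    // Item-number driven: the loop variable IS the value the logic needs (1, 2, 3, ...).
    static List<String> labelsByItemNumber(int itemCount) {
        List<String> labels = new ArrayList<>();
        for (int itemNumber = 1; itemNumber <= itemCount; itemNumber++) {
            labels.add("Item " + itemNumber);
        }
        return labels;
    }

    // Array driven: the loop variable is an index, so it starts at zero.
    static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(labelsByItemNumber(3));    // [Item 1, Item 2, Item 3]
        System.out.println(sum(new int[] {4, 5, 6})); // 15
    }
}
```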

Thoughts?
 
Steven Bell
Ranch Hand
Posts: 1071
As far as I'm concerned there is no rule that says a for loop has to be zero based. It is fairly common as arrays are zero based and for loops are often used to iterate through arrays.

I fairly often use a for loop like this.
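
A minimal sketch of the idiom being described, with the Iterator declared in the for statement itself. The `names` parameter and the `joinAll` method are hypothetical, and generic types are assumed so that no cast is needed:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorInForLoop {
    static String joinAll(List<String> names) {
        StringBuilder result = new StringBuilder();
        // The iterator is scoped to the loop. The update clause is deliberately
        // empty because it.next() advances the iteration inside the body.
        for (Iterator<String> it = names.iterator(); it.hasNext(); ) {
            String name = it.next();
            result.append(name);
            if (it.hasNext()) {
                result.append(", ");
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinAll(Arrays.asList("Ann", "Bob", "Cy"))); // Ann, Bob, Cy
    }
}
```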


The reason behind this is to lose the reference to the Iterator as soon as it is not needed. I don't think it actually has any real performance advantage, but it is a common coding style where I work, and we do our best to maintain a common coding style.
[ January 12, 2005: Message edited by: Steven Bell ]
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
I suppose it's nice to use a zero-based for loop when possible because it's probably more readable to junior programmers. That is, even a very junior programmer ought to be able to understand a non-zero-based for loop if they take a moment to look carefully - but there's an added chance of error, especially if the programmer is in a hurry (which is not unusual at many companies). So I'd tend to favor a zero-based loop in most cases. However there are certainly some cases where that doesn't make sense. To use Tim's example:

As long as there is at least some processing that uses the raw value of i rather than j, this idiom makes sense. However if the code is like this:

There is no good reason to start i at 0 here, since the value of 0 is never used in processing - only the incremented value j. Much better to just start the value at 1, and skip the j = i + 1 step.
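
As a hedged sketch of that point (the method names and the running-total logic are made up for illustration, and `itemCount` is assumed to be the number of items to process), both loops below use only the 1-based value, so the second form simply starts the loop variable at 1:

```java
class StartAtOne {
    // Zero-based loop where only the shifted value j is ever used.
    static int totalWithShift(int itemCount) {
        int total = 0;
        for (int i = 0; i < itemCount; i++) {
            int j = i + 1;
            total += j;
        }
        return total;
    }

    // Same processing with the loop variable starting at 1: no second variable needed.
    static int totalFromOne(int itemCount) {
        int total = 0;
        for (int i = 1; i <= itemCount; i++) {
            total += i;
        }
        return total;
    }
}
```

Note that the bounds change together: `i < itemCount` when starting at 0 becomes `i <= itemCount` when starting at 1.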

As for this:

I would advise against putting the i++ outside the parentheses of the for statement if possible. It's much more readable to see all the control conditions for the loop in one place, at the beginning. Similarly:

Here it's not quite possible to put the call to it.next() into the for loop parens, so we do the next best thing - put it on the line immediately following the parens. It's easy to see the control structure this way. Don't hide the it.next() somewhere later in the loop where we might overlook it - it's too important.

"I don't think it actually has any real performance advantage"

None, I think. The value of this idiom (aside from it being a widely-adopted standard in most places) is that if you have another iterator nearby, this reduces the chance that you'll have to make up a new variable name like "iter2" or some such. Consider two consecutive loops using another common style for iterators:
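
A sketch of what that can look like. The lists and the `totalLength` method are hypothetical, and generic types are assumed. The second loop deliberately contains a slip: it tests `iter2` but still calls `iter.next()`.

```java
import java.util.Iterator;
import java.util.List;

class ConsecutiveWhileLoops {
    static int totalLength(List<String> first, List<String> second) {
        int total = 0;

        Iterator<String> iter = first.iterator();
        while (iter.hasNext()) {
            String s = iter.next();
            total += s.length();
        }

        // "iter" is still in scope here, so the second iterator needs a new name.
        Iterator<String> iter2 = second.iterator();
        while (iter2.hasNext()) {
            String s = iter.next(); // BUG: should be iter2.next(); compiles anyway
            total += s.length();
        }
        return total;
    }
}
```

With a non-empty second list, the already exhausted `iter` throws `NoSuchElementException` on the first pass through the second loop.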

See, the compiler forced me to change the second iterator's name to iter2, because the original iter is still in scope. OK, mildly annoying, but not too big a deal. Except - I also forgot to change the name in the final iter.next(). And the compiler didn't warn me about that one. This code will fail at runtime. Bugs like this may be very subtle. Using for loops will greatly reduce the chance of this happening:
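
A minimal sketch of the for-loop version, again with hypothetical lists and a hypothetical `totalLength` method, and with generic types assumed:

```java
import java.util.Iterator;
import java.util.List;

class ConsecutiveForLoops {
    static int totalLength(List<String> first, List<String> second) {
        int total = 0;

        for (Iterator<String> iter = first.iterator(); iter.hasNext(); ) {
            String s = iter.next();
            total += s.length();
        }

        // The first "iter" went out of scope with its loop, so the name is free again.
        for (Iterator<String> iter = second.iterator(); iter.hasNext(); ) {
            String s = iter.next();
            total += s.length();
        }
        return total;
    }
}
```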

Here I can be lazy and reuse the name iter for the second loop - and there's no problem. The only time this doesn't work is if you have a second loop inside the first. Usually I deal with that by refactoring the inner loop to a separate method.

Note also that the new enhanced for loop in JDK 5.0 is implemented to be almost exactly equivalent to using an Iterator inside a for loop - except we never need to name the iterator at all. And the casting is usually done for us. The above code will probably look something like this in JDK 5.0:
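
A sketch of two consecutive enhanced for loops, assuming hypothetical `List<String>` arguments and a hypothetical `totalLength` method:

```java
import java.util.List;

class EnhancedForLoops {
    static int totalLength(List<String> first, List<String> second) {
        int total = 0;

        // No iterator variable to name, and no cast: the element type comes from the list.
        for (String s : first) {
            total += s.length();
        }

        // The same element name can be reused because each "s" is scoped to its own loop.
        for (String s : second) {
            total += s.length();
        }
        return total;
    }
}
```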
 
Jeroen Wenting
Ranch Hand
Posts: 5093
I wonder why you think a zero-based loop is specifically easier for beginners than for experienced people?
In fact, it's often the other way around.

Many people started learning programming in BASIC or Pascal, two languages that traditionally default to one-based loops.
Many have big trouble making the switch to the zero-based loop structures advocated by C-style languages (and some never make it; I now work at a company that uses a custom language based on C which defaults to one-based loops).
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
That's true. I was thinking that in Java people really have to get used to zero-based loops, since they show up frequently in arrays, Lists, and elsewhere. Given that they do have to learn that, I think it's probably easier if most other loops they encounter also start with zero, for consistency. If we were designing a new language from scratch, I'd agree that starting with 1 rather than 0 is worth considering, at the very least.
 
Tim Jeffery
Greenhorn
Posts: 2
So to summarize, I am hearing that, if the business algorithm of the loop is driven by a value starting at some value other than zero, it is reasonable for the loop variable to start at this value rather than zero.

This seems reasonable simply because anyone who may have to read the program will almost assuredly be Java-knowledgeable and should understand the rules of the language, while they may not be familiar with the details of the business problem being addressed. If the program constructs are presented in a way that the business algorithm is clear, it will reduce the learning curve for a new programmer.
 
Jeff Bosch
Ranch Hand
Posts: 805
The zero-based concept comes from machine/assembly language programming. The index of an array is the offset from the base. An offset of zero means "this element" or no offset. That's also why you have to declare a data type for pointers in languages that support them: the compiler has to know how many bytes to skip when a program statement increments to the next index value.
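
The offset arithmetic can be sketched as follows. Java does not expose raw addresses, so the base address and element size here are plain numbers chosen only for illustration, and the class and method names are hypothetical:

```java
class ElementOffsets {
    // Address of element `index` = base address + index * size of one element.
    static long elementAddress(long baseAddress, int index, int elementSizeInBytes) {
        return baseAddress + (long) index * elementSizeInBytes;
    }

    public static void main(String[] args) {
        long base = 1000; // hypothetical base address of an array of 4-byte ints
        System.out.println(elementAddress(base, 0, 4)); // 1000: index 0 is the base itself
        System.out.println(elementAddress(base, 3, 4)); // 1012: three elements past the base
    }
}
```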
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Good explanation of offsets. That didn't bother me one bit in assembler, but zero based indexing is still geeky to me, and I've been coding for 26 years. I think Borland Pascal let you declare the number range for an array, say 10..13 if you only needed to store the names of face cards. The only time I felt a strong requirement for zero based array indexing was using an unsigned byte index in Pascal ... some times ya gotta have 256 values, and 0 has to be one of them. I haven't run into that problem with integers. I guess I'm used to all this by now, but I hate to say the "zero-eth" element.
 
Jeff Bosch
Ranch Hand
Posts: 805
Most of my programming experience is in assembler or C. I try not to say "zeroth" element because it makes me sound like I have a lisp. I usually say "the first element" or "index 0". Sounds more impressive, though saying it that way won't get you invited to the best parties...
 
Marcus Laubli
Ranch Hand
Posts: 116
How can we have this type of discussion about for loops without even mentioning the do / while loop?

It sounds like we want to make things simple. We often over-complicate things.

Marcus
 
Jeff Bosch
Ranch Hand
Posts: 805
Different type of loop. The original message concerned the use of "0" in a for loop. A do-while loop has no initialization clause like that; it runs the body once and then tests a condition. Each has its place.
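
A small sketch of the difference, using hypothetical method names: with a limit of 0 the for loop never runs its body, while the do-while loop runs it once before testing.

```java
class LoopEntryChecks {
    // A for loop tests its condition before the first pass.
    static int forPasses(int limit) {
        int passes = 0;
        for (int i = 0; i < limit; i++) {
            passes++;
        }
        return passes;
    }

    // A do-while loop runs its body once before testing anything.
    static int doWhilePasses(int limit) {
        int passes = 0;
        int i = 0;
        do {
            passes++;
            i++;
        } while (i < limit);
        return passes;
    }

    public static void main(String[] args) {
        System.out.println(forPasses(0));     // 0
        System.out.println(doWhilePasses(0)); // 1
    }
}
```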
 
It is sorta covered in the JavaRanch Style Guide.