[Jeanne]: In Java a char is one byte. However, some Unicode characters are encoded in two bytes.

Ummm... I can't really see how this is true. In Java, a char has a range from 0 to 65535, which requires two bytes at a minimum. It's true that if you encode a group of chars as bytes using the most common encoding schemes (ASCII, ISO-8859-1, Cp1252, UTF-8), then the most common English-language characters can be encoded in one byte per character. But that's not an absolute rule, and I think it's dangerously misleading to say that a char is a byte.
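To see the difference between the size of a char itself and the size of its encoded bytes, try something along these lines (the byte counts in the comments are what I'd expect for "café" with those charsets; run it yourself to verify):

    import java.io.UnsupportedEncodingException;

    public class CharSizeDemo {
        public static void main(String[] args) throws UnsupportedEncodingException {
            // A char is always 16 bits (two bytes), regardless of any encoding:
            System.out.println(Character.SIZE);             // 16
            System.out.println((int) Character.MAX_VALUE);  // 65535

            // How many bytes a String takes depends entirely on the charset you encode with:
            String s = "café";
            System.out.println(s.getBytes("ISO-8859-1").length); // 4 - one byte per char
            System.out.println(s.getBytes("UTF-8").length);      // 5 - 'é' needs two bytes
            System.out.println(s.getBytes("UTF-16").length);     // 10 - two bytes per char, plus a BOM
        }
    }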
So, why do they use int rather than char as the parameter type here? (For String.indexOf() as well as various other methods scattered around the standard API.) I think the reason is convenience: int is the "default" type for most expressions, unless something in the expression forces it to be long, float, or double instead. Anytime you perform simple arithmetic, or even just write a plain literal like 1 or 42, Java assumes you mean an int. And if it's expecting a char rather than an int, Java gets pissy and balks until you fix it. I think the decision to define indexOf(int) rather than indexOf(char) is motivated by nothing more than the desire to save users from the mild annoyance of having to cast an int result down to char before passing it in. Which isn't a particularly compelling reason, in my opinion, but it's the best one I can think of.
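For what it's worth, here's the kind of thing I mean (the string "javaranch" is just an example, of course):

    public class IndexOfDemo {
        public static void main(String[] args) {
            String s = "javaranch";

            // indexOf(int) takes a char literal (widened to int) or any int expression, no cast needed:
            System.out.println(s.indexOf('v'));      // 2
            System.out.println(s.indexOf('v' + 0));  // 2 - 'v' + 0 is an int expression
            System.out.println(s.indexOf(118));      // 2 - 118 is the numeric value of 'v'

            // Going in the other direction is where Java balks:
            int offset = 1;
            // char c = 'a' + offset;            // won't compile - 'a' + offset is an int
            char c = (char) ('a' + offset);      // the cast that indexOf(int) spares callers from
            System.out.println(c);               // b
        }
    }

If indexOf() had been declared to take a char, you'd need that sort of cast any time you computed the character to search for.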
So: chars in Java require two bytes. But Java tends to assume that integer computations will be done with four-byte ints, and to accommodate this, indexOf() and other methods accept parameters of int type.
[ November 17, 2006: Message edited by: Jim Yingst ]