This is from the JLS, on Java's UTF representation of a string:
Every character in the string is represented by one, two, or three bytes, according to these rules:
(a) If the character is in the range '\u0001' through '\u007f', then it is represented by one byte.
(b) If the character is '\u0000' or is in the range '\u0080' through '\u07ff', then it is represented by two bytes.
(c) If the character is in the range '\u0800' through '\uffff', then it is represented by three bytes.
So the number of bytes used to represent a character depends on the character's value.
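If you want to check the byte counts yourself, here is a minimal sketch. It assumes the encoding in question is the one produced by DataOutputStream.writeUTF, which writes a 2-byte length prefix followed by the encoded characters. The class name ModifiedUtf8Demo and the helper encodedLength are made up for illustration. Note that writeUTF encodes '\u0000' in two bytes, not one.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ModifiedUtf8Demo {
    // Hypothetical helper: number of bytes writeUTF uses for the characters
    // of s, excluding the 2-byte length prefix that writeUTF emits first.
    static int encodedLength(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected when writing to an in-memory buffer.
            throw new UncheckedIOException(e);
        }
        return buffer.size() - 2;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength("A"));       // one-byte range: prints 1
        System.out.println(encodedLength("\u00e9"));  // two-byte range: prints 2
        System.out.println(encodedLength("\u20ac"));  // three-byte range: prints 3
        System.out.println(encodedLength("\0"));      // the NUL character: prints 2
    }
}
```

The length is measured through an in-memory stream rather than String.getBytes, because standard UTF-8 would encode the NUL character in a single byte.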
HTH