
switch on a string

 
J. Ryan
Greenhorn
Posts: 21
Why is it that Java only allows you to use a switch statement on an int or a char? Some other languages allow you to use a switch on a string, and I was wondering why Java doesn't allow it. Anyone know why? Thanks.
 
Edwin Dalorzo
Ranch Hand
Posts: 961
The Java Language Specification just states that the value of the switch expression must be char, byte, short, int, Character, Byte, Short, Integer, or an enum type, or a compile-time error occurs.

It does not explain why Strings are not accepted in the switch statement. So it is one of those things other languages have and Java simply does not. Other languages have delegates or indexers, and Java does not have those either. And other languages do not offer generics or multithreaded programming, while Java does.

I daresay there may not be a specific reason. But if you need it badly, maybe you can work around the issue using enums.
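A minimal sketch of the enum workaround, assuming the incoming text maps one-to-one onto enum constant names. The Color enum, its constants and the describe method are hypothetical names made up for illustration:

```java
import java.util.Locale;

class EnumSwitchDemo {
    // Hypothetical enum; the constants are invented for this example.
    enum Color { RED, GREEN, BLUE }

    // Converts the incoming text to an enum constant, then switches on the constant.
    static String describe(String name) {
        Color color;
        try {
            color = Color.valueOf(name.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return "unknown";           // the text matches no constant
        }
        switch (color) {
            case RED:   return "stop";
            case GREEN: return "go";
            case BLUE:  return "sky";
            default:    return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("green"));
        System.out.println(describe("purple"));
    }
}
```

The string comparison still happens, it is just moved into Enum.valueOf, so this only pays off when the set of strings is fixed and known in advance.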
 
J. Ryan
Greenhorn
Posts: 21
Thanks!
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Back in ancient days, the integer in the switch was an index into an array of addresses and the processor loaded the address from the array and jumped there. It was very, very fast but not very flexible.

Sometimes we can replace a bunch of string choices by using the string as a key to a Map to retrieve a Command or Strategy or little object that contains the code we want to execute. If that sounds confusing, it may be a bit much for the beginner forum. If it sounds interesting, we'll write up an example.
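Here is one way the Map idea could look. This is only a sketch: the Command interface, the keys and the returned strings are all hypothetical, and the lambdas assume Java 8 or later (older code would use anonymous classes instead):

```java
import java.util.HashMap;
import java.util.Map;

class CommandMapDemo {
    // Hypothetical command interface; each implementation holds the code for one choice.
    interface Command {
        String execute();
    }

    // The string that would have been a case label becomes a map key.
    private static final Map<String, Command> COMMANDS = new HashMap<>();
    static {
        COMMANDS.put("start", () -> "starting");
        COMMANDS.put("stop", () -> "stopping");
    }

    // Looks up the command by key; an absent key plays the role of "default:".
    static String run(String key) {
        Command command = COMMANDS.get(key);
        return command != null ? command.execute() : "unknown command: " + key;
    }

    public static void main(String[] args) {
        System.out.println(run("start"));
        System.out.println(run("dance"));
    }
}
```

Adding a new choice means adding one map entry rather than editing a switch, which is the main attraction of this approach.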

Keep asking interesting questions!
 
Adam Nace
Ranch Hand
Posts: 117
The clue to why a Switch Statement won't take a String can be found in section 7.10 of the virtual machine spec. Also, you should take a look at the language spec (section 14.11), which defines ALL of the types that CAN be used in the switch statement. These types are: char, byte, short, int, Character, Byte, Short, Integer, or an enum type.

When using a switch statement, the code is compiled to a table of target offsets. For example, I quote from the Virtual Machine Spec:




compiles to




Note that this is how it COMPILES. Therefore, the labels of a switch statement must be COMPILE-TIME CONSTANTS, i.e. for any value used as a switch label, javac must be able to create the labels, and it must be guaranteed that those labels will never change value at runtime. For primitive literals (int, byte, short, etc.), this is not a problem, because a literal is always a constant. Non-literal primitives (i.e. cases that are defined by variables) must be declared final and initialized with a constant expression, so that they can be evaluated at compile time.
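To illustrate the constant-label rule, here is a small sketch. The class, method and constant names are hypothetical; the point is only which kinds of variables javac accepts as case labels:

```java
class ConstantLabelDemo {
    // A final field with a constant initializer is a compile-time constant.
    static final int SMALL = 1;

    static String size(int code) {
        // A final local with a constant initializer also qualifies.
        final int LARGE = 2;
        // int huge = 3;   // not final, so "case huge:" would be rejected by the compiler
        switch (code) {
            case SMALL: return "small";
            case LARGE: return "large";
            default:    return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(size(1));
        System.out.println(size(3));
    }
}
```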

Reference Objects have the same restriction, but they have yet a further restriction: They should be immutable objects. Since you are going to be comparing objects using the .equals operator (actually, you're not, but that's the effect you want to produce), it's not the reference that's important, it's the state of the object that is important. So any object used in a switch label must have a state that can be evaluated at compile time, which will never change thereafter.

Ok. So Far, So Good. Except I haven't explained why a String cannot be used, nor why a long or a Long cannot be used, for that matter.

Once again, I quote from the virtual machine specification:


The Java virtual machine's tableswitch and lookupswitch instructions operate only on int data. Because operations on byte, char, or short values are internally promoted to int, a switch whose expression evaluates to one of those types is compiled as though it evaluated to type int. If the chooseNear method had been written using type short, the same Java virtual machine instructions would have been generated as when using type int. Other numeric types must be narrowed to type int for use in a switch.
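If you want to see this for yourself, here is a sketch you can compile and inspect. Only the name chooseNear comes from the quoted passage; the method bodies and the short variant are my own assumptions for illustration:

```java
class ChooseNearDemo {
    // Switch on an int: the type the switch bytecodes operate on directly.
    static int chooseNear(int i) {
        switch (i) {
            case 0:  return 0;
            case 1:  return 1;
            case 2:  return 2;
            default: return -1;
        }
    }

    // Same logic on a short: the value is promoted to int before the switch.
    static int chooseNearShort(short s) {
        switch (s) {
            case 0:  return 0;
            case 1:  return 1;
            case 2:  return 2;
            default: return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseNear(1));
        System.out.println(chooseNearShort((short) 7));
    }
}
```

After compiling, running javap -c ChooseNearDemo prints the bytecode of both methods, so you can compare the switch instructions generated for each.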


Thus, you cannot use a String, because it does not evaluate to an integer (I will cover Longs and Enums in a minute). But then you say to me, "But Adam, couldn't you use the String's hashCode? That's an int!" Quite right, the hashCode is an int, but it is not guaranteed to be unique, which causes two problems:

Problem #1: From the Java language spec, section 14.11:

No two of the case constant expressions associated with a switch statement may have the same value.


We cannot guarantee that two different Strings will evaluate to unique values using hashCode.

Problem #2: A completely different string with the same hashCode would then match the label, causing unexpected behavior. If I'm not mistaken, very early JDKs computed the hashCode of a long String from only a sample of its characters; since Java 1.2 every character contributes, but two different strings can still end up with the same hashCode. See number 7 on this list.
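The collision problem is easy to check. "Aa" and "BB" are a well-known pair of different strings with the same hashCode; the class and method names below are hypothetical:

```java
class HashCollisionDemo {
    // True when two strings share a hash code, whether or not they are equal.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());       // 65 * 31 + 97
        System.out.println("BB".hashCode());       // 66 * 31 + 66
        System.out.println(sameHash("Aa", "BB"));
        System.out.println("Aa".equals("BB"));
    }
}
```

So a switch that dispatched on hashCode alone could not tell these two labels apart; it would also need an equals check after the jump.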

So String is out.

But why Long? It does say something about narrowing, doesn't it? Yup, it does, but that's only for the ARGUMENT of the switch. You can narrow the value you are testing (with an EXPLICIT cast ONLY), but the possible cases must still be ints. A long contains MORE data than an int, so Java will not convert it to an int without being explicitly directed to, whereas with shorts and bytes, there is no loss of data when Java widens their values to ints without being instructed to.
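A short sketch of the explicit narrowing, with hypothetical names. It also shows why the compiler refuses to do this silently: the cast throws away the high-order bits.

```java
class LongSwitchDemo {
    static String classify(long value) {
        // switch (value) { ... }   // rejected in the Java versions discussed in this thread
        switch ((int) value) {       // explicit narrowing: only the low 32 bits survive
            case 0:  return "zero";
            case 1:  return "one";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(1L));
        // 2^32 has all-zero low 32 bits, so it is narrowed to 0.
        System.out.println(classify(4294967296L));
    }
}
```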

So we're down to numeric types that can be implicitly widened to an int by the compiler. So what gives with enums? Well, we know that each element in an Enum is a singleton (it is unique, you cannot have two different but identical copies of the same value of an enum). But what is its int value?!

Well, as far as I can tell, it uses the "ordinal" field for that. The only reference I can find on how this works is the original draft. Chances are, things changed slightly, but all of the suggested implementations did involve the ordinal.

So what is the ordinal? It is a unique integer that identifies the order in which the values were declared in the enum declaration. This is a compile time constant. However, it is possible to change the order in the source code and recompile the enum, causing the ordinal values to change. Hence, the switch statement probably cannot be implemented as you see above. This is where the compiler comes in. All of the changes introduced by JSR 201 (the basic language modifications of 1.5) are handled by the source-to-byte-code compiler, so that the virtual machine never knows that the object was an Enum in the first place. The compiler more likely converts the switch to an if statement. But this is an implementation detail. The fact remains that the Enum DOES satisfy the required properties: Evaluates to a Compile Time Constant Int Value, and has an Immutable State.
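To make the ordinal idea concrete, here is a sketch with a hypothetical Level enum. The second method is a hand-written approximation of dispatching on the ordinal, not a claim about what javac actually emits, and it shows why hard-coded ordinals are fragile:

```java
class EnumOrdinalDemo {
    // Hypothetical enum; ordinals follow declaration order: LOW=0, MEDIUM=1, HIGH=2.
    enum Level { LOW, MEDIUM, HIGH }

    // The normal way: switch on the enum and let the compiler sort out the details.
    static int score(Level level) {
        switch (level) {
            case LOW:    return 1;
            case MEDIUM: return 5;
            case HIGH:   return 10;
            default:     return 0;
        }
    }

    // Hand-rolled equivalent on the int ordinal. Reordering the constants
    // in Level would silently change which branch each value takes.
    static int scoreByOrdinal(Level level) {
        switch (level.ordinal()) {
            case 0:  return 1;
            case 1:  return 5;
            case 2:  return 10;
            default: return 0;
        }
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " " + level.ordinal() + " " + score(level));
        }
    }
}
```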

So there you have it. An explanation of WHAT can be used for switch statements, and why.

- Adam
 
Edwin Dalorzo
Ranch Hand
Posts: 961
Excellent, very enlightening and deep explanation, Adam.

By the way, welcome to Java Ranch
[ July 25, 2006: Message edited by: Edwin Dalorzo ]
 
Adam Nace
Ranch Hand
Posts: 117
Originally posted by Edwin Dalorzo:
Excellent, very enlightening and deep explanation, Adam.

By the way, welcome to Java Ranch


Thank you. I hope to be around for quite a while. I am not new to Java forums, but I am new to Java Ranch, and I think that many of the threads on this forum are particularly interesting, and I always seem to learn something from reading/replying to them.

- Adam
 
J. Ryan
Greenhorn
Posts: 21
Thanks for all the insight!
 
parshu ram
Greenhorn
Posts: 3
I think the switch statement uses the == operator on values of the primitive types, but the same is not possible for comparing object values, and String is an object. BTW, can we use an object in a switch case in C++? Think...
-Ram
 
Edwin Dalorzo
Ranch Hand
Posts: 961
Hi, Parshu

Adam Nace, in one of the posts above, offered a very good explanation, even showing bytecodes. There you can see how the switch statement works behind the scenes.

Regarding your question about C++, the answer is no. Actually, C++ does not have a built-in String type as Java does. It uses char arrays or pointers to chars (or the library class std::string, which a switch will not accept either).

For instance, this code is invalid:



As in Java, C++ accepts only integer (and enumeration) types in the switch expression.

Regards,
Edwin Dalorzo
[ July 25, 2006: Message edited by: Edwin Dalorzo ]
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
That was a very nice exposition, Adam. I'm not criticizing you at all in the next bit, but rather language designers. Letting the byte code implementation influence the language syntax is something that really bugs me. There's no reason a language could not support switch on string. The byte code would be different, maybe more like if-else-if-else. Big deal. I mean even COBOL has this:

Evaluate goes much further into interesting syntax. A few hints:

This lets you code nifty state machines and logic tables and such.

Sure it compiles to something less efficient than a Java switch, but the point again is why should that particular efficiency influence the syntax? Just to make life easy for compiler writers? Hah! They only have to do their job once. The rest of us live with it forever.

end-rant
 
Larry Eisenstein
Greenhorn
Posts: 14
Java needs to switch using a type whose members can all be listed. There's a name for that, but I can't remember.

All int, char, ... values can be listed (maybe a very long list). But things like Strings and floats aren't enumerable, so it is harder for the compiler to switch on those types. It could be added to Java, but they have decided not to do it for whatever reason.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Yes, I was kind of careful to say there is no reason a language can't do switch on string, but didn't say there are plenty of reasons particular language designers chose not to. I happen to think "to make the compiler easier" is not a reason that sways me much.

Another example: REXX has a switch on boolean which is roughly like COBOL's "evaluate true" and nicely expresses a bunch of if-else-if-else stuff.

REXX's designer, Mike Cowlishaw, said the free form syntax made interpreter implementation very difficult, but somebody only has to do it once per platform. His goal was a simple, understandable, enjoyable language, and I spent about 10 years having the most fun I've ever had programming because he got it right.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
[Stan]: Letting the byte code implementation influence the language syntax is something that really bugs me.

I'm not sure that's what happened here. My guess is that the main influences here were (1) they were retaining C-like behavior wherever they didn't have (or think they had) a good reason to change it, (2) they were scrambling to put in a bunch of other features they thought were important, and (3) switch statements just weren't that big a deal to most people. Nowadays few of us advocate using switch statements much in Java. (I contend there are a few rare cases where they offer better performance than the alternatives, but that's of very minor importance.) Most of the time we'd advocate using the strategy pattern and replacing a switch with a HashMap. I don't know how common that idea might have been back when Java was developed (other than that they'd obviously have used Hashtable instead). Regardless, I don't think improving switch was high on anyone's priorities in the early days; if it were, probably the first thing to do would be to change the syntax to remove the need for break - and that would fly directly against the idea of retaining C-like behavior where possible. Lacking a compelling reason for change here, they left switch alone. That's my guess, anyway.
 
Adam Nace
Ranch Hand
Posts: 117
Originally posted by Jim Yingst:
[Stan]: Letting the byte code implementation influence the language syntax is something that really bugs me.

I'm not sure that's what happened here. My guess is that the main influences here were (1) they were retaining C-like behavior wherever they didn't have (or think they had) a good reason to change it, (2) they were scrambling to put in a bunch of other features they thought were important, and (3) switch statements just weren't that big a deal to most people. Nowadays few of us advocate using switch statements much in Java. (I contend there are a few rare cases where they offer better performance than the alternatives, but that's of very minor importance.) Most of the time we'd advocate using the strategy pattern and replacing a switch with a HashMap. I don't know how common that idea might have been back when Java was developed (other than that they'd obviously have used Hashtable instead). Regardless, I don't think improving switch was high on anyone's priorities in the early days; if it were, probably the first thing to do would be to change the syntax to remove the need for break - and that would fly directly against the idea of retaining C-like behavior where possible. Lacking a compelling reason for change here, they left switch alone. That's my guess, anyway.


I definitely want to state my agreement with this. Especially with the strategy/HashMap idea, which I have recommended to many people (though not here) many times.

In my personal opinion, although I find the switch statement a very clean way to handle certain special cases, I rarely use it, and my feelings would not be hurt if switch was not a part of the java language. The switch is really nothing more than a glorified if structure, and furthermore, can often be replaced with object oriented mechanisms to better effect.

I agree that the language should not be defined by the implementation (bad OO :-), and I'm not going to speculate whether or not it has happened. I can see no GOOD reason why they couldn't have made the switch operate using the .equals method, save that it would be less efficient, which I also believe is an implementation detail, not a linguistic one. Having said that, I honestly prefer that it does not, because I feel that the idea of a switch is to operate on a set that can be precisely enumerated at compile-time.

To me, the switch is a STYLISTIC structure, and should not be arbitrarily used. I believe when writing code, when two or more valid alternatives are presented (emphasis on both being valid and roughly equally simple and efficient -> no disrespect to Mr. Ockham), then one should choose the alternative which will make the most sense to a reader who may later come and attempt to decipher the code. "If" should be used when the conditions are not, or are only weakly related. "Switch" should be used when the conditions form an enumerable, mutually exclusive set.

But that's just my opinion.

- Adam
 
Pooja Patole
Ranch Hand
Posts: 35
Hi Adam,
Your explanation was really very helpful. Thanks a lot.

Pooja
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16028
Note that since Java 7 it is possible to use strings in switch statements.
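For completeness, a minimal sketch of a Java 7 string switch. The class, method and label names are hypothetical:

```java
class StringSwitchDemo {
    static String describe(String command) {
        // Labels are matched as if by String.equals, so the match is case-sensitive.
        // A null command throws NullPointerException when the switch is evaluated.
        switch (command) {
            case "start": return "starting";
            case "stop":  return "stopping";
            default:      return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("start"));
        System.out.println(describe("START"));
    }
}
```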
 
fred rosenberger
lowercase baba
Bartender
Posts: 12542
also note that most of this thread was written in 2006. I didn't really parse every comment, but anything said before may not hold true anymore. Six years is an eternity in software.
 