Java String Splitter, substring, StringTokenizer

 
brando brandido
Greenhorn
Posts: 20
Normally, course details are described as follows: ABC 123 : Course Description.

I want to design and implement a Java program named CourseSpliter.java.

The program contains a method that splits and then prints the 3 components of course details such as the following example:

I type the course details like this: "ICS 102 : Introduction to Computing I"

Then the output should be split like this:

Course Name: ICS
Course Number: 102
Course Description: Introduction to Computing I

So, the delimiters are the space and the colon.

After displaying the output, it asks the user whether another input will be given (so apply a do-while loop).

So I believe I have to use the StringTokenizer class and its methods.

The program should work for any course details following the above format (i.e., ABC 123 : Course Description).

Does anyone have any idea? (I don't like Scanner, OK? Hehe. Maybe java.util.StringTokenizer only.)



 
Jules Bach
Ranch Hand
Posts: 71
My Java is basic (I'm still a learner), so this probably isn't the best way to do it, but if you know that the course code, number and description always start at the same place in the string, you could split it with the substring method.

i.e.



This is assuming that the course code and number always have a length of 3
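
A minimal sketch of the fixed-position substring idea described above, not the code originally posted. It assumes the layout is always exactly three characters, a space, three characters, then " : " before the description; the class and method names are made up for illustration.

```java
// Hypothetical sketch of the fixed-position approach.
class FixedPositionSketch {

    // Assumes the exact layout "ABC 123 : Description":
    // name at indexes 0-2, number at indexes 4-6, description from index 10 onwards.
    static String[] splitFixed(String details) {
        String name = details.substring(0, 3);
        String number = details.substring(4, 7);
        String description = details.substring(10);
        return new String[] { name, number, description };
    }

    public static void main(String[] args) {
        String[] parts = splitFixed("ICS 102 : Introduction to Computing I");
        System.out.println("Course Name: " + parts[0]);
        System.out.println("Course Number: " + parts[1]);
        System.out.println("Course Description: " + parts[2]);
    }
}
```

Any extra space or a longer course code shifts every index, so this only works while the format is rigid.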
 
Campbell Ritchie
Marshal
Posts: 56584
Welcome to the Ranch

If you read about StringTokenizer, you will find you ought not to use it in new code. You should try reading about the String class’ methods. If you click on Pattern you will find a brief list of regular expressions. Remember if you need the \ you will probably have to write \\.
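
A small sketch of the backslash point, with made-up class, method names and sample strings: inside a Java string literal, the regex \s+ has to be written "\\s+", and a literal dot has to be written "\\.".

```java
import java.util.Arrays;

// Hypothetical illustration of escaping regex backslashes in Java source.
class RegexEscapeSketch {

    // The regex \s+ (one or more whitespace characters) is written "\\s+" in Java.
    static String[] splitOnWhitespace(String s) {
        return s.split("\\s+");
    }

    // "\\." matches a literal dot; an unescaped "." would match any character.
    static String[] splitOnDot(String s) {
        return s.split("\\.");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnWhitespace("ICS   102")));
        System.out.println(Arrays.toString(splitOnDot("1.2.3")));
        // With an unescaped dot every character is a delimiter, so nothing is left over.
        System.out.println("1.2.3".split(".").length);
    }
}
```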
 
Joanne Neal
Rancher
Posts: 3742
But if Jules' assumption that the lengths of the course ID and number fields are fixed is true, then regex is probably a bit of overkill. Jules' solution looks fine to me.
 
Campbell Ritchie
Marshal
Posts: 56584
Agree, Joanne. But that restricts you forever to course codes three letters long. He does say, “delimiting the space and colon”, which I interpreted as meaning you use the space and colon to split on.
 
Winston Gutkowski
Bartender
Posts: 10575
brando brandido wrote:so i believe i have to use the StringTokenizer class and its methods.
The program should work for any course details following the above format (i.e., ABC 123 : Course Description)

As Campbell said: don't use StringTokenizer.

brando brandido wrote:can anyone have any idea? (i don't like scanner ok? hehe, maybe java.util, stringtokenizer only)

Have a look at String.split(); I suspect you'll find it's exactly what you need.

Oh, and BTW: I totally agree with you about Scanner. Useless...and slow.

Winston
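
A minimal sketch of how String.split() could handle this format, assuming the input always looks like "ABC 123 : Course Description"; the class and method names are hypothetical. It splits once at the first colon (a limit of 2 keeps any later colons inside the description), then splits the left-hand part on whitespace.

```java
// Hypothetical sketch using String.split(); performs no input validation.
class SplitSketch {

    static String[] parse(String details) {
        // Limit 2: split only at the first colon; the regex also swallows whitespace around it.
        String[] halves = details.trim().split("\\s*:\\s*", 2);
        // The left half is "ABC 123": split it on runs of whitespace.
        String[] codeAndNumber = halves[0].split("\\s+");
        return new String[] { codeAndNumber[0], codeAndNumber[1], halves[1] };
    }

    public static void main(String[] args) {
        String[] parts = parse("ICS 102 : Introduction to Computing I");
        System.out.println("Course Name: " + parts[0]);
        System.out.println("Course Number: " + parts[1]);
        System.out.println("Course Description: " + parts[2]);
    }
}
```

Input without a colon, or without a space between code and number, makes this sketch throw an ArrayIndexOutOfBoundsException, so real code would want to check the array lengths first.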
 
brando brandido
Greenhorn
Posts: 20
Thank you for the replies.

This is what I have so far:



The problem with the above code is that it splits the details on every space.
So assuming you input: ICS 102 : Introduction to Computing I

the output becomes this:

ICS
102
Introduction
to
Computing
I

I want the output to be like this:

ICS
102
Introduction to Computing I
 
Campbell Ritchie
Marshal
Posts: 56584
Go back to what I said earlier, and the method I quoted.
 
brando brandido
Greenhorn
Posts: 20
Thank you for all the replies.

I am not familiar with the other String class methods (I'll try to use them in my next program), but for now I'm sticking with StringTokenizer so I can use space and colon to split the course details.

Thanks for the links, Campbell; they helped me a lot.

I finally got the correct code.

Here is my final code:



I delimit on space and colon, and then also use trim.

So even if the input has lots of spaces between words, like this ==> ICS               102           :           Introduction to Computing I
I still get the correct output I wanted.

output:

Course Name : ICS
Course Number : 102
Course Description : Introduction to Computing I


thank you thank you!
cheers!
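
A rough sketch of a StringTokenizer-based version along the lines described above, not the code originally posted. The class and method names are made up, and the do-while input loop is left out. It treats space and colon as delimiters, takes the first two tokens as name and number, and rejoins the remaining tokens as the description.

```java
import java.util.StringTokenizer;

// Hypothetical sketch; assumes at least two tokens are present.
class TokenizerSketch {

    static String[] parse(String details) {
        // Space and colon are both delimiters; a run of them counts as one separator.
        StringTokenizer st = new StringTokenizer(details, " :");
        String name = st.nextToken();
        String number = st.nextToken();
        // The remaining tokens are the words of the description; rejoin them with single spaces.
        StringBuilder description = new StringBuilder();
        while (st.hasMoreTokens()) {
            description.append(st.nextToken()).append(' ');
        }
        return new String[] { name, number, description.toString().trim() };
    }

    public static void main(String[] args) {
        String[] parts = parse("ICS               102           :           Introduction to Computing I");
        System.out.println("Course Name : " + parts[0]);
        System.out.println("Course Number : " + parts[1]);
        System.out.println("Course Description : " + parts[2]);
    }
}
```

One limitation of this sketch: because the colon is a delimiter everywhere, a colon inside the description is dropped, and repeated spaces inside the description collapse to one.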
 
D. Ogranos
Ranch Hand
Posts: 214
Campbell Ritchie wrote:Welcome to the Ranch

If you read about StringTokenizer, you find you ought not to use it in new code. You should try reading about the String class’ methods. If you click on Pattern you will find a brief list of regular expressions. Remember if you need the \ you will probably have to write \\.


Slightly off topic: I've never understood why StringTokenizer should not be used anymore. If you understand its limitations, it works fine and is MUCH faster than regular expressions. So why limit yourself artificially? That said, I agree that for this situation it is not needed. You can get the results by using simple String functionality, namely String.indexOf() and String.substring(). And String.trim() of course.
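
A minimal sketch of the indexOf()/substring()/trim() approach mentioned above, assuming the "ABC 123 : Course Description" format; the class and method names are hypothetical.

```java
// Hypothetical sketch using only indexOf, substring and trim; no input validation.
class IndexOfSketch {

    static String[] parse(String details) {
        String s = details.trim();
        // The first colon separates "ABC 123" from the description.
        int colon = s.indexOf(':');
        String head = s.substring(0, colon).trim();
        String description = s.substring(colon + 1).trim();
        // The first space in the head separates the name from the number.
        int space = head.indexOf(' ');
        String name = head.substring(0, space);
        String number = head.substring(space + 1).trim();
        return new String[] { name, number, description };
    }

    public static void main(String[] args) {
        String[] parts = parse("ICS 102 : Introduction to Computing I");
        System.out.println("Course Name: " + parts[0]);
        System.out.println("Course Number: " + parts[1]);
        System.out.println("Course Description: " + parts[2]);
    }
}
```

If the colon or the space is missing, indexOf() returns -1 and substring() throws a StringIndexOutOfBoundsException, so real code would check for -1 first.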
 
Korhan Rankin
Greenhorn
Posts: 14
Hey,

I'm not a fan of StringTokenizer...

Have a look at the String method "String[] java.lang.String.split(String regex)" as already mentioned in replies.

Your format looks like this:
ICS 102 : Introduction to Computing I

Now, if you want to split it into 3 parts as mentioned, you can't without knowing more. Splitting on the space or ":" character alone won't do it. You have 3 options:
1. Give your code more knowledge: indexes etc. (more code + more error handling)
2. Cut up to the first space, then from that space to the ":" (more code + more error handling)
3. Use a defined separator

For #3, my favorite, you could set a format like this:
ICS : 102 : Introduction to Computing I
Trust me, a separator is the safest way, because the day you get extra characters or spaces where you don't expect them, you will have to touch your code.

Simply add a second ":" character as separator.
String[] values = courseDetails.split(":");

Note: the values array will not contain any ":" anymore. As I see from your System.out calls, you don't want the ":" anyway.

After splitting the values into an array, you should trim all values so that any code that processes these strings stays cleaner. And since modifying the string creates a new string anyway, just go ahead and fill an ArrayList, which is easier to handle than an array.
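
A minimal sketch of the split-trim-collect steps just described, assuming the modified two-colon format "ICS : 102 : Introduction to Computing I" proposed in this post; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split on ":" and collect the trimmed values in a list.
class SeparatorSketch {

    static List<String> parse(String courseDetails) {
        String[] values = courseDetails.split(":");
        List<String> trimmed = new ArrayList<>();
        for (String value : values) {
            // trim() removes the spaces left around each separator.
            trimmed.add(value.trim());
        }
        return trimmed;
    }

    public static void main(String[] args) {
        List<String> values = parse("ICS : 102 : Introduction to Computing I");
        System.out.println("Course Name: " + values.get(0));
        System.out.println("Course Number: " + values.get(1));
        System.out.println("Course Description: " + values.get(2));
    }
}
```

Note that a colon inside the description would produce a fourth element here, since no split limit is given.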


Put the blocks in their own methods for reuse if you want.
Now you can do anything you want with the list of values, knowing that it contains only your trimmed values.

Another thing: instead of
System.out.print("\n");
you can use:
System.out.println("");
The "ln" stands for "line"; it ends the current line after printing the text.

One more thing: instead of
System.out.print("Please enter the course details in the following format");
System.out.print("\n");

you can write:
System.out.print("Please enter the course details in the following format\n");
That reduces your lines of code and makes it easier to read.

In any case, print each new line of text with "System.out.println(..)"; it saves you time and headaches.

cheers
 
Winston Gutkowski
Bartender
Posts: 10575
D. Ogranos wrote:Slightly off topic: I've never understood why StringTokenizer should not be used anymore...

Because, as Campbell mentioned quite a while ago, the documentation says so. Specifically:

"StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split() method of String or the java.util.regex package instead."

Winston
 
D. Ogranos
Ranch Hand
Posts: 214
Winston Gutkowski wrote:
D. Ogranos wrote:Slightly off topic: I've never understood why StringTokenizer should not be used anymore...

Because, as Campbell mentioned quite a while ago, the documentation says so. Specifically:

"StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split() method of String or the java.util.regex package instead."

Winston


Yeah, I know that, but there's no good reason given, from what I've seen, as to WHY its use is discouraged. Regular expressions are more powerful, sure, and you can do things with them that you can't with the StringTokenizer. But there is no fundamental problem with the class, so why should one not use it if the limitations are not a problem?
 
Jeff Verdegan
Bartender
Posts: 6109
D. Ogranos wrote:
Winston Gutkowski wrote:
D. Ogranos wrote:Slightly off topic: I've never understood why StringTokenizer should not be used anymore...

Because, as Campbell mentioned quite a while ago, the documentation says so. Specifically:

"StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split() method of String or the java.util.regex package instead."

Winston


Yeah, I know that, but there's no good reason given, from what I've seen, as to WHY its use is discouraged. Regular expressions are more powerful, sure, and you can do things with them that you can't with the StringTokenizer. But there is no fundamental problem with the class, so why should one not use it if the limitations are not a problem?


If you're not comfortable with regex, and are comfortable with ST and it suits your needs, the only reason I could see not to use it is that it is deprecated and therefore could, theoretically, go away in any future release. I think it's unlikely that it will, so this is not a particularly compelling reason, but it's all I can think of.

And of course, it doesn't address the question of why it was deprecated in the first place.

 
Winston Gutkowski
Bartender
Posts: 10575
D. Ogranos wrote:Yeah, I know that, but there's no good reason given, from what I've seen, as to WHY its use is discouraged. Regular expressions are more powerful, sure, and you can do things with them that you can't with the StringTokenizer. But there is no fundamental problem with the class, so why should one not use it if the limitations are not a problem?

Well, I'd say that it's a bit like putting drum brakes on a new car when it's well known that disks are a lot better. Since, as Jeff pointed out, the class is deprecated, it's unlikely to receive any attention in the future, even if better algorithms come along, so basically you're using a piece of fossilized code. If you feel that you absolutely can't live without it, then go ahead, but I wouldn't be advocating its use to newbies.

Winston
 
Rob Spoor
Sheriff
Posts: 21135
Like Enumeration it's not deprecated.
 
Jeff Verdegan
Bartender
Posts: 6109
Rob Spoor wrote:Like Enumeration it's not deprecated.


Oops. I stand corrected.
 
Winston Gutkowski
Bartender
Posts: 10575
Rob Spoor wrote:Like Enumeration it's not deprecated.

Ooops * 2. Really should check my sources. I still stand by my comments.

Winston
 
Campbell Ritchie
Marshal
Posts: 56584
Korhan Rankin wrote: . . . Simply add a second ":" character as separator. . . .
Not at all. The requirement was for there to be no colon before 102.
 
dennis deems
Ranch Hand
Posts: 808
Winston Gutkowski wrote:
D. Ogranos wrote:Yeah, I know that, but there's no good reason given, from what I've seen, as to WHY its use is discouraged. Regular expressions are more powerful, sure, and you can do things with them that you can't with the StringTokenizer. But there is no fundamental problem with the class, so why should one not use it if the limitations are not a problem?

Well, I'd say that it's a bit like putting drum brakes on a new car when it's well known that disks are a lot better.


No, it's not really like that. It's been demonstrated that the speed of StringTokenizer is superior to both regex and String.split. So if speed happens to figure into one's definition of "better", then clearly the legacy code is better.

Use the right tool for the job. Not the shiniest.
 
Winston Gutkowski
Bartender
Posts: 10575
Dennis Deems wrote:No, it's not really like that. It's been demonstrated that the speed of StringTokenizer is superior to both regex and String.split...

It has? I only heard a claim so far, no references.

Dennis Deems wrote:...So if speed happens to figure into one's definition of "better", then clearly the legacy code is better. Use the right tool for the job. Not the shiniest.

I would say that if you plan on using a class whose use has been actively discouraged by the writers of the language for at least 3 full releases now (I went back to 1.4.2), you should be prepared to defend your decision; otherwise speed better be an overriding factor.

Winston
 
dennis deems
Ranch Hand
Posts: 808
Winston Gutkowski wrote:
Dennis Deems wrote:No, it's not really like that. It's been demonstrated that the speed of StringTokenizer is superior to both regex and String.split...

It has? I only heard a claim so far, no references.

http://www.javamex.com/tutorials/regular_expressions/splitting_tokenisation_performance.shtml
http://www.thatisjava.com/java-essentials/70460/
http://www.coderanch.com/t/326006/java/java/String-split-Vs-StringTokenizer

Winston Gutkowski wrote:I would say that if you plan on using a class whose use has been actively discouraged by the writers of the language for at least 3 full releases now (I went back to 1.4.2), you should be prepared to defend your decision

But this is begging the question. It was asked WHY the use of StringTokenizer is discouraged, and no satisfactory response has been forthcoming.
 
Noam Ingalls
Ranch Hand
Posts: 60

Winston Gutkowski wrote:I would say that if you plan on using a class whose use has been actively discouraged by the writers of the language for at least 3 full releases now (I went back to 1.4.2), you should be prepared to defend your decision

Dennis Deems wrote:But this is begging the question. It was asked WHY the use of StringTokenizer is discouraged, and no satisfactory response has been forthcoming.

Because String.split() does everything StringTokenizer does and more, and the speed difference isn't really noticeable unless you're going through some huge dataset? I mean, you can do more with String.split() than with StringTokenizer if it is wielded correctly. Then again, as one discussion would seem to have it, it's the difference between a Swiss Army knife and a general small penknife. Single-purpose tools are supposed to fit their purpose better than all-purpose solutions.
 
Paul Clapham
Sheriff
Posts: 22839
Dennis Deems wrote:But this is begging the question. It was asked WHY the use of StringTokenizer is discouraged, and no satisfactory response has been forthcoming.


The answer to the question "Why did the designers of Java feature X do it that way" is always "You'll have to ask them." Sometimes if you ask people on forums that question they will willingly confabulate an answer for you, but in practice the answer is whatever information is in the API documentation or the Java language spec.

It seems to me a lot of people take offense against that. Personally, I don't. The people who design the language know more about that sort of thing than I do, mostly, so I'm likely to take their advice instead of the advice of some guy on a forum. (Note: not meant to represent anybody posting in this thread.)
 
Winston Gutkowski
Bartender
Posts: 10575
Dennis Deems wrote:But this is begging the question. It was asked WHY the use of StringTokenizer is discouraged, and no satisfactory response has been forthcoming.

Except in rare cases, it is not the business of documentation to explain why; although I suspect Noam's suggestion is probably close to the mark. And, like Paul, I don't think the writers need to justify their decision to me; I trust them to do what they think best.
But I'll give you a couple of other possible reasons anyway:
1. The class represents a developmental dead-end.
2. The writers plan to deprecate it.

In addition, the docs contain what I consider to be a flaw. In playing around with the class (I've never used it before) I notice that it treats consecutive delimiters as one; so by default it behaves like String.split("\\s+"), NOT String.split("\\s") - the example they give as a comparison.
Furthermore, there doesn't seem to be any way to change this behaviour except by having the object return its delimiters as tokens, which I suspect will slow it down.
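
The difference described above can be checked with a small sketch (the class and method names are made up). It counts what each approach produces for a string containing two consecutive spaces, including the StringTokenizer constructor flag that returns delimiters as tokens.

```java
import java.util.StringTokenizer;

// Hypothetical sketch comparing how consecutive delimiters are treated.
class ConsecutiveDelimiterSketch {

    // Default StringTokenizer: whitespace delimiters, runs of them are skipped as one.
    static int tokenizerCount(String s) {
        return new StringTokenizer(s).countTokens();
    }

    // StringTokenizer with returnDelims = true: every delimiter character is itself a token.
    static int tokenizerCountWithDelims(String s) {
        return new StringTokenizer(s, " ", true).countTokens();
    }

    // String.split with an arbitrary regex.
    static int splitCount(String s, String regex) {
        return s.split(regex).length;
    }

    public static void main(String[] args) {
        String s = "a  b"; // two spaces between a and b
        System.out.println("StringTokenizer: " + tokenizerCount(s));
        System.out.println("StringTokenizer with delimiters: " + tokenizerCountWithDelims(s));
        System.out.println("split(\"\\\\s\"): " + splitCount(s, "\\s"));
        System.out.println("split(\"\\\\s+\"): " + splitCount(s, "\\s+"));
    }
}
```

For "a  b" the default tokenizer and split("\\s+") both yield 2 pieces, while split("\\s") yields 3 because it keeps the empty string between the two spaces.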

Oddly enough, all this discussion got me looking at StreamTokenizer, which is a different ball of wax entirely...

Winston
 