
Difference between StringTokenizer and split

 
Gaurav Ram
Ranch Hand
Posts: 32
Hi

Is there any functional difference between StringTokenizer and the split() method of String?

Thanks
Gaurav
 
Campbell Ritchie
Sheriff
Pie
Posts: 49466
64
Yes.

Apart from the fact that the API documentation discourages using StringTokenizer in new code:
String.split() returns an array (String[]), whereas StringTokenizer returns one token at a time.

Tokenizer is now regarded as legacy code.
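
To make the "array versus one token at a time" point concrete, here is a minimal sketch. The class and method names (TokenAccessDemo, withSplit, withTokenizer) are made up for illustration and are not from any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names are illustrative only.
class TokenAccessDemo {

    // split() hands back every token at once, as a String[].
    static String[] withSplit(String text) {
        return text.split(" ");
    }

    // StringTokenizer is consumed one token at a time, iterator style.
    static List<String> withTokenizer(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, " ");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "alpha beta gamma";
        System.out.println(Arrays.toString(withSplit(text))); // [alpha, beta, gamma]
        System.out.println(withTokenizer(text));              // [alpha, beta, gamma]
    }
}
```

For simple input like this the tokens are the same; only the way you get at them differs.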
 
Mike Simmons
Ranch Hand
Posts: 3090
14
I think the biggest difference is: with a StringTokenizer, the delimiter is just one character long. You supply a list of characters that count as delimiters, but in that list, each character is a single delimiter. With split(), the delimiter is a regular expression, which is something much more powerful (and more complicated to understand). It can be any length. Regular expressions may be harder to understand at first, but when you learn how to use them, they're much more useful.
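
A small sketch of that difference, assuming a made-up input string "a=>b>c" and hypothetical names (DelimiterKindDemo, byCharSet, byRegex). The same delimiter argument "=>" means "either '=' or '>'" to StringTokenizer, but "the two-character sequence =>" to split().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names are illustrative only.
class DelimiterKindDemo {

    // StringTokenizer: every character in "=>" is a delimiter on its own.
    static List<String> byCharSet(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, "=>");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // split(): "=>" is a regular expression matching the whole sequence "=>".
    static String[] byRegex(String text) {
        return text.split("=>");
    }

    public static void main(String[] args) {
        String text = "a=>b>c";
        System.out.println(byCharSet(text));                // [a, b, c]
        System.out.println(Arrays.toString(byRegex(text))); // [a, b>c]
    }
}
```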

Also, if you need to parse empty tokens, e.g. a comma-separated line like

one,,three,,,six

Here the field values are "one", "", "three", "", "" and "six"; the three empty strings are indicated by commas with nothing between them. That's a lot more work with a StringTokenizer. By default it gives you just "one", "three", "six" and skips the empties. You can use the constructor that takes a boolean (returnDelims) to tell the StringTokenizer to return the delimiters as tokens too, but that gets complicated. I'll skip the details. It's much easier to use split(","), which immediately returns {"one", "", "three", "", "", "six"}, exactly right. The short version is: StringTokenizer doesn't handle empty strings well, but split() does.
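
The comma-separated line above can be run through both approaches with a sketch like the following. The class and method names (EmptyFieldDemo, tokenize, split, splitKeepTrailing) are hypothetical. One thing the example adds that is not discussed above: split(",") drops trailing empty strings, and passing a negative limit keeps them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names are illustrative only.
class EmptyFieldDemo {

    // Default StringTokenizer behaviour: runs of delimiters are skipped,
    // so empty fields never show up.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // split(",") keeps empty fields between delimiters,
    // but removes empty strings at the very end of the result.
    static String[] split(String line) {
        return line.split(",");
    }

    // A negative limit keeps trailing empty strings as well.
    static String[] splitKeepTrailing(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "one,,three,,,six";
        System.out.println(tokenize(line));                // [one, three, six]
        System.out.println(Arrays.toString(split(line)));  // [one, , three, , , six]
        System.out.println(split("one,,").length);             // 1
        System.out.println(splitKeepTrailing("one,,").length); // 3
    }
}
```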
 
Gaurav Ram
Ranch Hand
Posts: 32
That's really helpful, thanks a lot for replying.
 
David jonat
Greenhorn
Posts: 1
Really nice, Mike. It's very helpful for me.
 
Harish Sehgal
Greenhorn
Posts: 2
It's really helpful, Mike.
 
Habeeb Hassan Syed
Greenhorn
Posts: 3
Hey that was good.

Sometimes you want split to behave the same as StringTokenizer; then you can use regular expressions to achieve the same effect, like split(",*"), which takes care of a single comma or multiple commas. If you want the delimiter to be a comma as well as a space, you can use split("[ ,]"). Also, if you want to trim your strings and have a comma-separated list, you can use split(" *, *").

Cheers
Habeeb
 
Campbell Ritchie
Sheriff
Pie
Posts: 49466
64
Welcome to the Ranch, Habeeb Hassan Syed.
 
Habeeb Hassan Syed
Greenhorn
Posts: 3
Happy to hear that

Thanks
Habeeb
 
Rob Spoor
Sheriff
Pie
Posts: 20555
57
Habeeb Hassan Syed wrote: split(",*"), which takes care of a single comma or multiple commas.

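I hope you meant ",+", because ",*" matches any number of commas, including none at all. As a result, the empty string matches as well, so the input is split between every pair of characters and you get an array of single characters. On Java 7 and earlier there is also one empty string at the start (after all, any string starts with the empty string); since Java 8, a zero-width match at the very beginning no longer produces that leading empty string. Try it yourself: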
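
A minimal sketch of that experiment follows. The class and method names (CommaQuantifierDemo, splitStar, splitPlus) are made up for illustration, and the exact output of the ",*" case depends on the Java version, so it is not spelled out in the comments.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class CommaQuantifierDemo {

    // ",*" also matches the empty string, so it splits between characters.
    static String[] splitStar(String s) {
        return s.split(",*");
    }

    // ",+" needs at least one comma, so it only splits at real delimiters.
    static String[] splitPlus(String s) {
        return s.split(",+");
    }

    public static void main(String[] args) {
        // No comma anywhere, yet ",*" still breaks the word into pieces.
        System.out.println(Arrays.toString(splitStar("abc")));
        // ",+" leaves the word alone.
        System.out.println(Arrays.toString(splitPlus("abc")));        // [abc]
        // One or more commas in a row count as a single delimiter.
        System.out.println(Arrays.toString(splitPlus("one,,three"))); // [one, three]
    }
}
```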

Habeeb Hassan Syed wrote: If you want the delimiter to be a comma as well as a space, you can use split("[ ,]"). Also, if you want to trim your strings and have a comma-separated list, you can use split(" *, *").

Even better is split("\\s*,\\s*"). The regular expression \s (the \ is escaped to be used in a Java string) means any whitespace character. That is not just spaces but also tabs and line breaks.
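
To compare the two patterns, here is a small sketch with a made-up input that mixes spaces, a tab and a line break around the commas. The names (WhitespaceTrimDemo, spacesOnly, anyWhitespace) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class WhitespaceTrimDemo {

    // " *, *" only swallows plain spaces around the comma.
    static String[] spacesOnly(String s) {
        return s.split(" *, *");
    }

    // "\\s*,\\s*" swallows any whitespace (spaces, tabs, line breaks).
    static String[] anyWhitespace(String s) {
        return s.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String line = "one ,\ttwo,\n three";
        // The tab and the line break survive inside the tokens here.
        System.out.println(Arrays.toString(spacesOnly(line)));
        // Here every token comes out clean.
        System.out.println(Arrays.toString(anyWhitespace(line))); // [one, two, three]
    }
}
```

Neither pattern removes whitespace at the very start or end of the whole line; calling trim() on the input first takes care of that.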
 
Habeeb Hassan Syed
Greenhorn
Posts: 3
Thanks Rob Prime,

What you said is right. I was searching for a particular answer when I found this post, got some cues and just wanted to give some input. I was in a hurry and did not test it.

Thanks for your prompt correction, otherwise it would have been misleading.
 