StringTokenizer in Java

 
Greenhorn
Posts: 2
I want to use the StringTokenizer class, and my code is:

StringTokenizer st = new StringTokenizer(input, ";:.++-=<>=+-/*[]&|! \t\n", true);

I want to break the string (input) at all arithmetic operators, punctuators, spaces, tabs, line feeds, etc.

The problem is that it reads the increment operator (++) as two add (+) operators.

I need help: how do I pass the delimiter string so that it reads the arithmetic operators and the assignment operators separately?
 
Marshal
Posts: 56608
Welcome to the Ranch

Don't use StringTokenizer; it is legacy code. You cannot put ++ into your delimiter string as a unit, because the tokenizer treats every character of that string as a separate delimiter, so it finds each + on its own.
Use String.split.
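
To see the behaviour described above, here is a minimal sketch. The class name TokenizerDemo and the helper tokens are made up for illustration; only StringTokenizer itself comes from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects every token, delimiters included (returnDelims = true).
    static List<String> tokens(String input, String delims) {
        StringTokenizer st = new StringTokenizer(input, delims, true);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Each delimiter character comes back as its own one-character token.
        System.out.println(tokens("i++;", "+;")); // [i, +, +, ;]
    }
}
```

Writing "++" in the delimiter string only lists + twice, which changes nothing.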

What you need is a regular expression. Remember you have to escape a lot of the characters, e.g. + and -, because they are called meta‑characters. You end up with a character class followed by +.
The [] round the whole thing means try any character inside.
The + at the end means take one or more occurrences. Otherwise you will have the same problem with && or || as you have with ++.
The \\+ means + is a meta‑character and you have to escape it with \ but you have to escape the \ making \\
The \\s means all forms of whitespace including space tab newline, etc.
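
A rough sketch of the idea, in case it helps. The delimiter set below is a hypothetical short one chosen for the example, not the regex from this post, and SplitDemo is a made-up name.

```java
import java.util.Arrays;
import java.util.List;

class SplitDemo {
    // Hypothetical delimiter set: + - & | ; : and any whitespace.
    static final String ONE = "[\\+\\-&|;:\\s]"; // exactly one delimiter character
    static final String RUN = ONE + "+";          // one or more in a row

    static List<String> split(String input, String regex) {
        return Arrays.asList(input.split(regex));
    }

    public static void main(String[] args) {
        System.out.println(split("a ++ b && c", RUN)); // [a, b, c]
        // Without the trailing +, an empty string appears between the two + signs.
        System.out.println(split("a++b", ONE));        // [a, , b]
    }
}
```

Note that split throws the delimiters away; it only returns what lies between them.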

Somebody please check: have I got that regex right?
 
Bartender
Posts: 10575
Campbell Ritchie wrote:The \\+ means + is a meta‑character and you have to escape it with \ but you have to escape the \ making \\
The \\s means all forms of whitespace including space tab newline, etc.
Somebody please check: have I got that regex right?

Pretty much, except that you don't need to escape a lot of stuff when it's inside a character class. I'm pretty sure that:
"[-;:.=<>+/*&|!\\[\\]\\s]+"
will do (although I'm not absolutely sure about the '|'). Note the placement of the '-'. As long as it's the first or last character, it won't be interpreted as a "range". The brackets are the exception: in Java an unescaped '[' inside a class opens a nested class, so both '[' and ']' need the backslashes.
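
A quick way to check these points is to ask the regex engine directly. This is only a sketch; CharClassDemo and compiles are hypothetical names, and the small patterns are cut-down examples rather than the full expression.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CharClassDemo {
    // Returns true if the regex compiles, false if Pattern rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("|".matches("[-+|]"));   // true: '|' is literal inside a class
        System.out.println("-".matches("[-+|]"));   // true: a leading '-' is literal
        System.out.println("-".matches("[a-c]"));   // false: here '-' forms the range a..c
        System.out.println(compiles("[!\\[\\]]+")); // true: both brackets escaped
        System.out.println(compiles("[![\\]]+"));   // false: bare '[' opens a nested class
    }
}
```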

@Mehroosha: However, what Campbell described is just the mechanics. You need to be sure that the regex you write covers ALL the patterns you want to split on, and ONLY those patterns. My suggestion would be to list out every single string that you want to be caught by the expression, or write a comprehensive description in English (or your native language) of exactly what you want it to do (and, more importantly, what you DON'T want it to do), before you start writing your regex.

HIH

Winston

PS:
"\\s*[-;:.=<>+/*&|!\\[\\]]{1,2}\\s*"
may be slightly more targeted. I'll leave you to work out why.
You can find out about regular expressions here, or there are several good tutorial sites on the Net.

PS2:
If this is specifically for parsing Java-style source, you'll probably want the '^' operator as well. You also have the issue that several (but not all) operators can be combined with '=', e.g. "+=" is fine, but ";=" isn't.
This is why it's so important to describe the pattern (or the things you want to match) accurately.
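
One way to act on that advice is to spell out the operators explicitly and match tokens with a Matcher instead of splitting. The sketch below assumes a small, deliberately incomplete operator list; OperatorTokenizer, TOKEN and tokenize are hypothetical names, not anything from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorTokenizer {
    // Alternatives are tried left to right, so longer operators come first.
    private static final Pattern TOKEN = Pattern.compile(
          "\\+\\+|--|&&|\\|\\|"                     // two-character operators
        + "|[\\-+*/<>=!^]="                         // operators that may combine with '='
        + "|[\\-+*/<>=!^&|;:.\\[\\]]"               // single-character operators and punctuators
        + "|[^\\-+*/<>=!^&|;:.\\[\\]\\s]+");        // anything else except whitespace

    static List<String> tokenize(String input) {
        Matcher m = TOKEN.matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {       // whitespace matches no alternative and is skipped
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("i++ + j")); // [i, ++, +, j]
        System.out.println(tokenize("a+=b;"));   // [a, +=, b, ;]
        System.out.println(tokenize("x;=1"));    // [x, ;, =, 1]
    }
}
```

Because ';' is not in the "combines with =" group, ";=" comes out as two tokens while "+=" stays together.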
 
Mehroosha Asif
Greenhorn
Posts: 2

Thank you for your replies.

I know very well how to write the right regex, but I want to keep/store the delimiters too. I don't know how to keep the delimiters when using the split method.

While searching, I found (?=X).

It works for a small set of delimiters, like (?=[;:]),

but it doesn't work for as many delimiters as I want.

Please help, it's urgent.
 
Winston Gutkowski
Bartender
Posts: 10575
Mehroosha Asif wrote:I know very well how to write the right regex, but I want to keep/store the delimiters too. I don't know how to keep the delimiters when using the split method.
While searching, I found (?=X).
It works for a small set of delimiters, like (?=[;:]),
but it doesn't work for as many delimiters as I want.

Sure it can. You just need to make sure that the 'X' expression gets exactly what you want.

Mehroosha Asif wrote:Please help, it's urgent.

Sorry chum, but that's your problem, not ours...and we're all volunteers here.

Winston
 
Campbell Ritchie
Marshal
Posts: 56608
Mehroosha Asif wrote:Thank you for your replies. . . .
You're welcome.
And I agree with Winston.
 
author
Sheriff
Posts: 23295
Winston Gutkowski wrote:
Mehroosha Asif wrote:I know very well how to write the right regex, but I want to keep/store the delimiters too. I don't know how to keep the delimiters when using the split method.
While searching, I found (?=X).
It works for a small set of delimiters, like (?=[;:]),
but it doesn't work for as many delimiters as I want.

Sure it can. You just need to make sure that the 'X' expression gets exactly what you want.


Agreed. The X part of the expression is itself a regex (and not just a simple character set), so it can get as complicated as any other regex.
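
For what it's worth, a minimal sketch of split with a lookahead and a lookbehind, so the delimiters survive as tokens. The delimiter class D is a hypothetical short one picked for the example, and KeepDelimiters is a made-up name.

```java
import java.util.Arrays;
import java.util.List;

class KeepDelimiters {
    // Hypothetical delimiter class; extend it to whatever set is needed.
    static final String D = "[\\-+*/=<>;:]";
    // Zero-width boundary just after a delimiter, or just before one.
    static final String BOUNDARY = "(?<=" + D + ")|(?=" + D + ")";

    static List<String> split(String input) {
        return Arrays.asList(input.split(BOUNDARY));
    }

    public static void main(String[] args) {
        System.out.println(split("a+b;c")); // [a, +, b, ;, c]
        // Every delimiter character is still its own token here.
        System.out.println(split("i++"));   // [i, +, +]
    }
}
```

Since the boundary sits around every single delimiter character, "++" still arrives as two tokens with this X; keeping it whole needs an X that describes the multi-character operators explicitly.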

Henry
 