
Split using regex doubt

 
Ranch Hand
Posts: 206
Hi, the output of the following code fragments has puzzled me. Can anyone please shed some light?

1) String str = " apples";
   String[] s = str.split("\\w*");
   for (String i : s)
       System.out.println("Token" + i + "Token");

Output is :
TokenToken
Token Token

2) String str = "apples";
   String[] s = str.split("\\w*");
   for (String i : s)
       System.out.println("Token" + i + "Token");

No output

3) String str = "apples ";
   String[] s = str.split("\\w*");
   for (String i : s)
       System.out.println("Token" + i + "Token");

Output is :
TokenToken
TokenToken
Token Token

I have surrounded the output with the word Token so as to distinguish between a space and an empty string. But I don't get the logic behind this. Also, could whoever knows how this works please point me to a good tutorial on it, or just tell me that I don't need to worry about this for the exam?
 
author
Posts: 23951
Okay, basically, you have three things going on here...

1. The regex, as written, is greedy, so it will always match the whole "apples" when it encounters it.
2. The split always goes from left to right from the current starting point. This means that it can't match "apples" until the start is at the "a". Furthermore, the way this regex is written, it is capable of matching nothing (a zero-length match).
3. The default split, which doesn't limit the number of matches, always deletes any trailing zero-length values.
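The first two rules can be checked by asking a Matcher where `\w*` matches in each of the three strings. This is only a minimal sketch; the class and method names (`DelimiterMatches`, `matches`) are made up for illustration and are not part of the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimiterMatches {
    // Lists every match of \w* as "start-end", scanning left to right.
    // (Hypothetical helper, not from the original thread.)
    static String matches(String input) {
        Matcher m = Pattern.compile("\\w*").matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(m.start()).append('-').append(m.end());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(matches(" apples")); // 0-0 1-7 7-7
        System.out.println(matches("apples"));  // 0-6 6-6
        System.out.println(matches("apples ")); // 0-6 6-6 7-7
    }
}
```

A match such as 0-0 or 7-7 is a zero-length match: the regex matched "nothing" at that position, because the character there (a space, or the end of the string) is not a word character.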

So...

For the first case:

The first split match is a zero-length match at index zero. The second split match is "apples". And the third split match is a zero-length match at the end of "apples". This creates a first value of zero length, a second value of a single space, a third value of zero length, and a fourth value of zero length. However, applying rule #3, the third and fourth values are deleted.

For the second case:

The first split match is apples. And the second split match is zero length right after apples. This creates a first value of zero length, a second value of zero length, and a third value of zero length. However, applying rule #3, all three values are deleted.
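One way to see the three deleted values is to call the two-argument `split` with a negative limit, which, per the `String.split(String, int)` documentation, keeps trailing empty strings. A small sketch (the class name `LimitDemo` and helper `count` are just for illustration):

```java
class LimitDemo {
    // Number of values split returns for the given limit.
    // (Hypothetical helper, not from the original thread.)
    static int count(String input, int limit) {
        return input.split("\\w*", limit).length;
    }

    public static void main(String[] args) {
        // Limit 0 behaves like the one-argument split: trailing empties are dropped.
        System.out.println(count("apples", 0));  // 0
        // A negative limit keeps every value, including the trailing empty ones.
        System.out.println(count("apples", -1)); // 3
    }
}
```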

For the third case:

The first split match is apples. The second split match is the zero length right after apples. And the third split match is the zero length right after the space. This creates a first value of zero length, a second value of zero length, a third value of a single space, and a fourth value of zero length. However, applying rule #3, the fourth value is deleted.
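The three cases above can be traced with a hand-written loop that follows the description: take the text before each delimiter match, add whatever is left over, then drop trailing empty strings. This is a sketch of the behaviour described in this thread, not the JDK source, and `splitAsDescribed` is a hypothetical name. Note that on Java 8 and later the real `String.split` no longer produces a leading empty string for a zero-length match at index zero, so there the first case prints only `Token Token`; the helper below deliberately keeps the older behaviour so that it reproduces the outputs posted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitTrace {
    // Hypothetical helper mimicking the steps described in the thread.
    // It ignores the special case where the regex never matches.
    static List<String> splitAsDescribed(String input, String regex) {
        List<String> values = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int index = 0;
        while (m.find()) {
            // The value is the text between the previous match and this one.
            values.add(input.substring(index, m.start()));
            index = m.end();
        }
        // Whatever follows the last match is the final value.
        values.add(input.substring(index));
        // Rule 3: trailing zero-length values are deleted.
        while (!values.isEmpty() && values.get(values.size() - 1).isEmpty()) {
            values.remove(values.size() - 1);
        }
        return values;
    }

    public static void main(String[] args) {
        for (String input : new String[] {" apples", "apples", "apples "}) {
            System.out.println("Input: \"" + input + "\"");
            for (String value : splitAsDescribed(input, "\\w*")) {
                System.out.println("Token" + value + "Token");
            }
        }
    }
}
```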

[EDIT: Corrected First and Second Case -- sorry]

Henry
[ March 25, 2007: Message edited by: Henry Wong ]
 
Ranch Hand
Posts: 108
Man!!! This did confuse me as well...

http://java.sun.com/docs/books/tutorial/essential/regex/quant.html
(read "Differences Among Greedy, Reluctant, and Possessive Quantifiers")

Greedy quantifiers are considered "greedy" because they force the matcher to read in, or eat, the entire input string prior to attempting the first match. If the first match attempt (the entire input string) fails, the matcher backs off the input string by one character and tries again, repeating the process until a match is found or there are no more characters left to back off from.
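The difference between the three quantifier kinds can be tried out with a short test. This is only a sketch: the helper `firstMatch` and the sample input "xfooxxxxxxfoo" are chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the first match of regex in input, or null if there is none.
    // (Hypothetical helper, not from the original thread.)
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        System.out.println(firstMatch(".*foo", input));  // greedy: xfooxxxxxxfoo
        System.out.println(firstMatch(".*?foo", input)); // reluctant: xfoo
        System.out.println(firstMatch(".*+foo", input)); // possessive: null (never backs off)
        // The regex from this thread is greedy, so it takes the whole word.
        System.out.println(firstMatch("\\w*", "apples")); // apples
    }
}
```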


You can read the whole tutorial at http://java.sun.com/docs/books/tutorial/essential/regex/index.html
 
megha joshi
Ranch Hand
Posts: 206
Thanks for the reply and the tutorial.
I am sorry, but I don't understand how the zero length comes in at the front, before apples, in the second and third cases but not in the first case, in the logic quoted below. It's a bit confusing for me.
Can you please explain?
-------------------------------------------------------------------------
For the first case:

The first split match is a zero length match at index zero. The second split match is "apples". And the third split match is a zero length match at the end of apples. This create a first value of zero length, a second value of a single space, a third value of zero length, and a fourth value of zero length. However, applying rule #3, the third and fourth value are deleted.

For the second case:

The first split match is apples. And the second split match is zero length right after apples. This creates a first value of zero length, a second value of zero length, and a third value of zero length. However, applying rule #3, all three values are deleted.

For the third case:

The first split match is apples. The second split match is the zero length right after apples. And the third split match is the zero length right after the space. This creates a first value of zero length, a second value of zero length, a third value of a single space, and a fourth value of zero length. However, applying rule #3, the fourth value is deleted.
---------------------------------------------------------------------- :roll:
 
Greenhorn
Posts: 23
Remember that the regex you supply to String.split() is for matching delimiters, not tokens. This confused me for a while too, and I don't think the JavaDocs make it clear until you get to the examples.
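A small sketch of that point, using a made-up input "red apples" and hypothetical helper names: a delimiter regex such as `\s+` hands back the words, while a token regex such as `\w+` passed to split hands back what lies between the words. To pull out the tokens themselves with a token regex, a Matcher loop is one option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimitersNotTokens {
    // Collects every match of a token regex (hypothetical helper).
    static List<String> tokens(String input, String tokenRegex) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(tokenRegex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "red apples";
        // The regex describes the delimiter (whitespace): we get the words.
        System.out.println(Arrays.toString(input.split("\\s+"))); // [red, apples]
        // The regex describes the words: split treats them as delimiters,
        // leaving an empty string and the single space between them.
        System.out.println(input.split("\\w+").length);           // 2
        // Matching tokens directly with the same word regex.
        System.out.println(tokens(input, "\\w+"));                // [red, apples]
    }
}
```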
 