
About the String.split() function

 
peter tong
Ranch Hand
Posts: 250
I cannot understand the
public String[] split(String regex, int limit)

method. In the following example,

why does the "bb" between 'a' and 'c' seem to count as only one occurrence, while the "bbbb" between 'c' and 'x' seems to count as three occurrences?



output is:

str1.split("b", -1).length = 7
str[0]=a
str[1]=
str[2]=c
str[3]=
str[4]=
str[5]=
str[6]=x
end
 
Paweł Baczyński
Bartender
Posts: 2087
The string is divided this way:
[a]b[]b[c]b[]b[]b[]b[x]
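The original code did not survive in this thread, so the following is only a sketch of what it probably looked like. The input "abbcbbbbx" is an assumption reconstructed from the diagram above, and the class and method names are made up for illustration.

```java
class SplitOnB {
    // Assumed input, reconstructed from the diagram [a]b[]b[c]b[]b[]b[]b[x].
    static final String INPUT = "abbcbbbbx";

    // Every single "b" is one delimiter; limit -1 keeps all empty strings.
    static String[] parts() {
        return INPUT.split("b", -1);
    }

    public static void main(String[] args) {
        String[] str = parts();
        System.out.println("length = " + str.length);
        for (int i = 0; i < str.length; i++) {
            System.out.println("str[" + i + "]=" + str[i]);
        }
    }
}
```

There are six "b" delimiters in that input, so the string is cut into seven pieces. Two adjacent delimiters have nothing between them, which is why four of the pieces are empty.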
 
peter tong
Ranch Hand
Posts: 250
Why does the function return an empty string between two consecutive 'b' characters?
 
Henry Wong
author
Sheriff
Posts: 23295
peter tong wrote: Why does the function return an empty string between two consecutive 'b' characters?


The delimiter is a "b", and the split() method returns everything between the delimiters. So, what is the value between two consecutive "b" delimiters?

Henry
 
Campbell Ritchie
Marshal
Posts: 56584
And how do you know that it is an empty String? Because you only printed the String itself, you cannot distinguish an empty String (it is in fact empty) from a String consisting only of spaces (or similar).
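One way to make the difference visible is to print each element inside brackets together with its length. This is only a sketch; the helper name describe and the sample input "abbc" are invented for illustration.

```java
class ShowEmpty {
    // Brackets and the length make "" and "   " look different when printed.
    static String describe(String s) {
        return "[" + s + "] length=" + s.length() + " isEmpty=" + s.isEmpty();
    }

    public static void main(String[] args) {
        // Hypothetical sample input with two adjacent delimiters.
        for (String part : "abbc".split("b", -1)) {
            System.out.println(describe(part));
        }
    }
}
```

With this kind of output an empty element shows up as [] with length 0, while a string of spaces would show a non-zero length.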
 
peter tong
Ranch Hand
Posts: 250
I still have a problem with String.split(), as in the following example.



I expected it to return 3 elements, str[0]='06265680,8800.00', str[1]='00496455,5076.72', str[2]='', but in fact it returns

input string = "06265680,8800.00|00496455,5076.72|"
input.split("|", -1).length = 36
str[0]=
str[1]=0
str[2]=6
str[3]=2
str[4]=6
str[5]=5
str[6]=6
str[7]=8
str[8]=0
str[9]=,
str[10]=8
str[11]=8
str[12]=0
str[13]=0
str[14]=.
str[15]=0
str[16]=0
str[17]=|
str[18]=0
str[19]=0
str[20]=4
str[21]=9
str[22]=6
str[23]=4
str[24]=5
str[25]=5
str[26]=,
str[27]=5
str[28]=0
str[29]=7
str[30]=6
str[31]=.
str[32]=7
str[33]=2
str[34]=|
str[35]=
end


Why does it break the string down into individual characters?

If I want it to return str[0]='06265680,8800.00', str[1]='00496455,5076.72', str[2]='', how should I modify the program?
 
Paweł Baczyński
Bartender
Posts: 2087
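It is because the argument to String.split() is a regular expression, and "|" has a special meaning in regular expressions (alternation). An unescaped "|" on its own matches the empty string, so the input is split between every character.
You need to escape this character: split("\\|", -1).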
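As a sketch of the fix, using the input string from the post above: both the escaped form and Pattern.quote treat "|" as a literal character. The class and method names are made up for illustration. The element count for the unescaped call depends on the Java version, so the sketch prints it without stating an expected number.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitOnPipe {
    static final String INPUT = "06265680,8800.00|00496455,5076.72|";

    // "\\|" in Java source is the regex \| which matches a literal pipe.
    static String[] escaped() {
        return INPUT.split("\\|", -1);
    }

    // Pattern.quote builds a regex that matches the given text literally.
    static String[] quoted() {
        return INPUT.split(Pattern.quote("|"), -1);
    }

    public static void main(String[] args) {
        System.out.println("unescaped length = " + INPUT.split("|", -1).length);
        System.out.println("escaped = " + Arrays.toString(escaped()));
        System.out.println("quoted  = " + Arrays.toString(quoted()));
    }
}
```

Pattern.quote is handy when the delimiter comes from a variable and you do not know in advance whether it contains regex metacharacters.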
 
peter tong
Ranch Hand
Posts: 250
Great, thanks for the help.
 
peter tong
Ranch Hand
Posts: 250
Then I tried


The "|" never occurs in the empty input string, but it still returns length = 1. Why?

input string = ""
input.split("\|", 0).length = 1
str[0]=
end


Sorry, I am still not very clear about the String.split() function.
 
Paweł Baczyński
Bartender
Posts: 2087
peter tong wrote: The "|" never occurs in the empty input string, but it still returns length = 1. Why?

Have you read the String.split Javadoc?
It says:
Splits this string around matches of the given regular expression.

The array returned by this method contains each substring of this string that is terminated by another substring that matches the given expression or is terminated by the end of the string. The substrings in the array are in the order in which they occur in this string. If the expression does not match any part of the input then the resulting array has just one element, namely this string.
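The last sentence of that Javadoc excerpt can be checked with a small sketch. The helper name count and the sample inputs "abc" and "a|b||" are invented for illustration; only the empty input comes from the question.

```java
class SplitNoMatch {
    // Number of elements produced when splitting on a literal pipe.
    static int count(String input, int limit) {
        return input.split("\\|", limit).length;
    }

    public static void main(String[] args) {
        // No "|" in the input: the result is the input itself, even when it is "".
        System.out.println(count("", 0));       // 1
        System.out.println(count("abc", 0));    // 1
        // limit 0 drops trailing empty strings, a negative limit keeps them.
        System.out.println(count("a|b||", 0));  // 2
        System.out.println(count("a|b||", -1)); // 4
    }
}
```

So the single element for the empty input is the empty input string itself, not a piece cut off at a delimiter.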
 
Balabo Haeron
Greenhorn
Posts: 8
peter tong wrote: Why does the function return an empty string between two consecutive 'b' characters?


you can specify the number of characters within your delimiter by either using say "b[2]" to denote a delimiting 2 consecutive strings or "b*", i.e. a b followed by any other character.
 
Richard Tookey
Bartender
Posts: 1166
Balabo Haeron wrote:
peter tong wrote: Why does the function return an empty string between two consecutive 'b' characters?


you can specify the number of characters within your delimiter by either using say "b[2]" to denote a delimiting 2 consecutive strings or "b*", i.e. a b followed by any other character.

Since the '[2]' in "b[2]" is a character class containing just the single character '2', this would split on occurrences of "b2" and not, as you say, on occurrences of "bb". To get what you describe you would need "b{2}". The first argument to split() is a regular expression that follows the Java regular expression syntax.
 
Campbell Ritchie
Marshal
Posts: 56584
172
Balabo Haeron wrote: . . . "b[2]" to denote a delimiting 2 consecutive strings
Surely it is b{2}? That does not mean two consecutive strings but two consecutive letters b.
... "b*", i.e. a b followed by any other character.
Surely that means any number of letters b including 0? Not b followed by any other character.
I think you want b+ instead.
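To compare the patterns discussed in the last few posts, here is a small sketch. The input "abbcbbbbx" is an assumption carried over from the diagram earlier in the thread, and the class and method names are made up. The result for "b*" is printed without an expected value, since that pattern can also match the empty string and its output is less obvious.

```java
import java.util.Arrays;

class SplitQuantifiers {
    // Assumed input, taken from the diagram earlier in the thread.
    static final String INPUT = "abbcbbbbx";

    static String[] split(String regex) {
        return INPUT.split(regex, -1);
    }

    public static void main(String[] args) {
        // "b[2]" means 'b' followed by the character '2': no match in this input.
        System.out.println(Arrays.toString(split("b[2]"))); // [abbcbbbbx]
        // "b{2}" means exactly two consecutive 'b' characters.
        System.out.println(Arrays.toString(split("b{2}"))); // [a, c, , x]
        // "b+" means one or more 'b' characters, so each run is one delimiter.
        System.out.println(Arrays.toString(split("b+")));   // [a, c, x]
        // "b*" also matches the empty string, so it splits in extra places.
        System.out.println(Arrays.toString(split("b*")));
    }
}
```

Note that "b{2}" still leaves one empty string, because the run "bbbb" counts as two delimiters with nothing between them. Only "b+" collapses every run into a single delimiter.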
 