Need to tokenize a String, but I need to keep what comes between " and "

 
Renato Bobbio Calogero
Greenhorn
Posts: 18
Dear users,

I wish to tokenize a String which should have the following pattern :

list(groups("A")),list(groups("B")),list(groups("C"))

I need to extract only the values between the " characters.

Is there any smart way to do this without implementing any logic to recognize the Strings after tokenization ?

Thanks a lot!
 
Rob Spoor
Sheriff
Posts: 22821
Let me get this straight. You have a String list(groups("A")),list(groups("B")),list(groups("C")). You want to retrieve A, B and C. Right? Sounds like something a regular expression could easily do, with a capturing group.
 
Renato Bobbio Calogero
Greenhorn
Posts: 18
Yes, you got the point. A regular expression, you say... I have used them in JavaScript but never in Java. Could you give me some hints on which API to use and some best practices for doing that?

Thanks a lot!
 
Renato Bobbio Calogero
Greenhorn
Posts: 18
I tried the following:

but it returns just the string " ... Do you have any suggestion on which regex I should use? The strings between the " characters are variable and do not follow a standard pattern. I just need to get rid of the

list(groups( ...

stuff, getting only the contents and not the keywords, and I would rather not implement any business logic to decide whether the values that populate the list are useless stuff or the contents I need. Do you have any hints? Thanks.
 
Rob Spoor
Sheriff
Posts: 22821
Your regular expression almost works. You just need to make it reluctant instead of greedy. Check out the Javadoc page of java.util.regex.Pattern for more information.
 
Renato Bobbio Calogero
Greenhorn
Posts: 18
I checked your link, but now that I have recognized a pattern in my content strings, I am a bit confused.
I can always assume that my A, B and C from before begin with the string PI.

But I can't see any regex construct that applies to my case: I need to catch any string starting with PI and reject any other.

Any help is appreciated. Thanks a lot!
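
One possible way to express "starts with PI" is to put the literal prefix inside a capturing group, followed by any run of characters that are not a quote. The sketch below is only an illustration: the class name, the method name and the sample values PI_A, other and PI_B are made up, and it assumes the quotes are balanced and that a closing quote is never directly followed by PI.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PiPrefixDemo {
    // A quote, then a captured value that starts with PI and contains no quote, then a quote.
    static final Pattern PI_VALUE = Pattern.compile("\"(PI[^\"]*)\"");

    // Collects every quoted value that begins with PI; other quoted values are skipped.
    static List<String> piValues(String input) {
        List<String> values = new ArrayList<>();
        Matcher matcher = PI_VALUE.matcher(input);
        while (matcher.find()) {
            values.add(matcher.group(1)); // group 1 is the text between the quotes
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical input: the values are invented for illustration only.
        String input = "list(groups(\"PI_A\")),list(groups(\"other\")),list(groups(\"PI_B\"))";
        System.out.println(piValues(input)); // prints [PI_A, PI_B]
    }
}
```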
 
Renato Bobbio Calogero
Greenhorn
Posts: 18
Or maybe something more like this

 
Rob Spoor
Sheriff
Posts: 22821
Let's take your original regular expression. I just now see one flaw: "\"*\"" means zero or more occurrences of " followed by a single ". That's because in regular expressions * is a meta character that applies to the previous entity, in this case the ". It doesn't work like command line wild cards where * means "any character any number of times". A quick (and incorrect) fix: "\".*\"". That dot makes the * bind to that, so the regular expression becomes a single " followed by any character any number of times followed by a single ". That looks more like it. So we test it:
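
The test code from this post did not survive in the archived page. A minimal sketch of what such a test might look like, assuming the input string from the first post (the class and method names are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyQuoteDemo {
    // Returns the first match of the greedy pattern "\".*\"", or null if there is none.
    static String firstGreedyMatch(String input) {
        Matcher matcher = Pattern.compile("\".*\"").matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String input = "list(groups(\"A\")),list(groups(\"B\")),list(groups(\"C\"))";
        // The greedy .* runs from the first quote to the last quote in the input.
        System.out.println(firstGreedyMatch(input));
    }
}
```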
Output: "A")),list(groups("B")),list(groups("C"
Not quite what we want, and the reason is simple: .* is greedy. It takes everything between the first " and the last ". What we want is to take everything between each " and the next ".

There are two ways:
1) make the matching not greedy but reluctant. We do this by appending a simple ? behind the *; see also the Javadoc I pointed you to. The regex becomes "\".*?\"".
2) do not capture everything but only everything that's not a ". We can use a negating character class for that: [^"]. The regex becomes "\"[^\"]*\"".
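
A sketch that runs the two variants side by side, assuming the input string from the first post (the class, constant and helper names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedMatchDemo {
    // Variant 1: reluctant quantifier, stops at the nearest closing quote.
    static final Pattern RELUCTANT = Pattern.compile("\".*?\"");
    // Variant 2: negated character class, cannot run past a quote at all.
    static final Pattern NEGATED = Pattern.compile("\"[^\"]*\"");

    // Collects every full match (group 0), quotes included.
    static List<String> allMatches(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "list(groups(\"A\")),list(groups(\"B\")),list(groups(\"C\"))";
        allMatches(RELUCTANT, input).forEach(System.out::println);
        allMatches(NEGATED, input).forEach(System.out::println);
    }
}
```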

Both now result in this output:
"A"
"B"
"C"

Now all you need to do is use substring to get the values. Another option is use a matching group, by using ( and ). You will then get a group 1 inside the matcher. Group 0 (the entire match, which is what group() returns; group() and group(0) are equivalent) is no longer relevant:
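
The code for this step is also missing from the archived page. A sketch of the capturing-group approach, assuming the negated character class variant and the input from the first post (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CapturingGroupDemo {
    // The parentheses capture only the text between the quotes as group 1.
    static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    static List<String> quotedValues(String input) {
        List<String> values = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            values.add(matcher.group(1)); // group(0) would still include the quotes
        }
        return values;
    }

    public static void main(String[] args) {
        String input = "list(groups(\"A\")),list(groups(\"B\")),list(groups(\"C\"))";
        quotedValues(input).forEach(System.out::println);
    }
}
```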
Output:
A
B
C


Note that it's important to use a loop for find(), and not a single if. That's because you can have multiple matches, and with if you only check the first one. The loop will let you check them all.
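
To make the difference concrete, here is a small sketch that counts the matches seen with a single if and with a while loop. The class and method names are hypothetical, and the pattern is the capturing-group variant discussed above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopDemo {
    static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // A single if sees at most the first match.
    static int countWithIf(String input) {
        Matcher matcher = QUOTED.matcher(input);
        int count = 0;
        if (matcher.find()) {
            count++;
        }
        return count;
    }

    // A while loop keeps calling find() until no further match exists.
    static int countWithWhile(String input) {
        Matcher matcher = QUOTED.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "list(groups(\"A\")),list(groups(\"B\")),list(groups(\"C\"))";
        System.out.println(countWithIf(input));    // prints 1
        System.out.println(countWithWhile(input)); // prints 3
    }
}
```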
 
Renato Bobbio Calogero
Greenhorn
Posts: 18
This is a very elegant way to do it, which is what I wanted to achieve, but I thought there might be something that was more of a "no-brainer", so I did this:

but I really appreciate having learnt how to use java.util.regex, which will be very useful in the future! Thanks a lot!
 
Rob Spoor
Sheriff
Posts: 22821
You're welcome.
 