Regular expressions

 
nikki mateti
Greenhorn
Posts: 15
Hi,

I have a question regarding regular expressions. I need to split a string around the delimiter ";", but not at semicolons that are preceded by a '&'.

For example: abc&;aaa;aaaa;

I should get the tokens
abc&;aaa
aaaa

Could someone tell me how I can do this using the java.util.regex package?

Thanks,
Nikki
 
Ryan McGuire
Ranch Hand
Posts: 1143
You might think that splitting on "[^&];" (using java.util.regex.Pattern.split()) would work, but what would happen when you split "&nikki&nikki" ?
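To make the problem with "[^&];" concrete, here is a minimal sketch using the input from the original question. The class and method names are made up for illustration. Because the character class matches a real character, that character becomes part of the delimiter and is thrown away by split().

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class ConsumingSplitDemo {
    // Splits on "any character except '&', followed by ';'".
    // The character before each ';' is part of the match, so split() discards it.
    static String[] splitConsuming(String input) {
        return Pattern.compile("[^&];").split(input);
    }

    public static void main(String[] args) {
        // Prints [abc&;aa, aaa]: the last letter of each token is lost.
        System.out.println(Arrays.toString(splitConsuming("abc&;aaa;aaaa;")));
    }
}
```

A lookbehind avoids this because it only checks the preceding character without including it in the match.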

What you want to look up is the "zero-width negative lookbehind assertion". The Pattern class javadoc from Sun doesn't explain this too well, IMHO. You may have to Google for it and look at some Perl documentation.

Ryan
[ April 26, 2005: Message edited by: Ryan McGuire ]
 
Steven Bell
Ranch Hand
Posts: 1071
You could split on just ';', keeping the delimiters as tokens (for example with StringTokenizer and returnDelims set to true), so
abc&;aaa;aaaa;
would become
abc&
;
aaa
;
aaaa
;

Then iterate through
if (element.endsWith("&")) {
    element.append(next).append(next);
}

Just throw away any element that equals ";" if it doesn't get appended, and be careful about IndexOutOfBoundsException.
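One way to turn the idea above into runnable code is sketched below. It assumes java.util.StringTokenizer with returnDelims set to true, since that produces exactly the listing shown (String.split would drop the ";" elements). The class and method names are hypothetical, and the merging is done in a single pass instead of appending "next" twice, which sidesteps the index bookkeeping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerMergeDemo {
    // Tokenizes on ';' while keeping the delimiters, then glues a ';'
    // back onto the current piece whenever that piece ends with '&'.
    static List<String> splitUnescaped(String input) {
        StringTokenizer tokens = new StringTokenizer(input, ";", true);
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken();
            if (!token.equals(";")) {
                current.append(token);              // ordinary text
            } else if (current.length() > 0
                    && current.charAt(current.length() - 1) == '&') {
                current.append(token);              // "&;" is not a delimiter
            } else {
                result.add(current.toString());     // real delimiter: close the piece
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            result.add(current.toString());         // text after the last delimiter
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [abc&;aaa, aaaa]
        System.out.println(splitUnescaped("abc&;aaa;aaaa;"));
    }
}
```

Unlike the regex approaches in this thread, this sketch keeps empty pieces between adjacent delimiters, so its output can differ from split() on inputs such as ";;".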
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
[Ryan]: What you want to look up is the "zero-width negative lookbehind assertion".

Agreed, this is probably the best choice.

[Ryan]: The Pattern class javadoc from Sun doesn't explain this too well, IMHO.

I don't think it's explained at all there, other than to tell us the name "zero-width negative lookbehind". Very annoying. I learned lookahead and lookbehind from the excellent Mastering Regular Expressions. Java Regular Expressions is also a good choice. Either is worth the $ to get a good handle on using regular expressions.

(If someone finds a good online link for lookahead and lookbehind, please post it.)

For this particular problem, I believe the expression you want is:

(?!<&);

An expression like (?!<foo) means "there may not be a foo immediately before this".
 
nikki mateti
Greenhorn
Posts: 15
Hi,

Thanks for all the replies.

I tried the regular expression

[Jim]: For this particular problem, I believe the expression you want is:

(?!<&);

but it didn't work. Here is my Java code



The result was
array[0]=dog&
array[1]=monkey
array[2]=donkey

Is there something wrong?

Thanks Nikki
 
David Harkness
Ranch Hand
Posts: 1646
Try "(?<=[^&]);" from the other thread you posted. This is a good example of why not to post the same question multiple times: tracking down all the answers.

Ah, I see "negative" means "not" rather than "minus". Thus "(?<!&);" is clearer.
[ April 22, 2005: Message edited by: David Harkness ]
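As a quick check of the negative lookbehind on the input from the original question, a minimal sketch (class and method names are made up for illustration):

```java
import java.util.Arrays;

class LookbehindSplitDemo {
    // (?<!&) is a zero-width negative lookbehind: the ';' only counts as a
    // delimiter when the character just before it is not '&'.
    static String[] splitUnescaped(String input) {
        return input.split("(?<!&);");
    }

    public static void main(String[] args) {
        // Prints [abc&;aaa, aaaa]
        System.out.println(Arrays.toString(splitUnescaped("abc&;aaa;aaaa;")));
    }
}
```

The trailing ';' does not produce an extra empty token because String.split drops trailing empty strings by default.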
 
nikki mateti
Greenhorn
Posts: 15
Thanks a lot! That worked. Sorry about the multiple posts. Won't do that again.

nikki
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
[David]: Thus "(?<!&);" is clearer.

Yeah, I accidentally swapped the < and !. Oops. The idea was right; the implementation was not. Testing is good...
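For anyone wondering why the swapped version compiled at all: "(?!<&);" is still a valid pattern, but it parses as a negative lookahead for the literal text "<&" followed by ';'. Since the next character has to be ';' for the match to succeed, the lookahead can never fail there, so the pattern behaves like a plain ";". The sketch below shows this. The input string is an assumption reconstructed from the output Nikki reported, and the class and method names are hypothetical.

```java
import java.util.Arrays;

class SwappedLookaroundDemo {
    // Negative lookahead for "<&", then ';'. Effectively the same as ";".
    static String[] splitSwapped(String input) {
        return input.split("(?!<&);");
    }

    // Plain split on every ';' for comparison.
    static String[] splitPlain(String input) {
        return input.split(";");
    }

    public static void main(String[] args) {
        String input = "dog&;monkey;donkey"; // assumed input, not from the thread
        // Both lines print [dog&, monkey, donkey]
        System.out.println(Arrays.toString(splitSwapped(input)));
        System.out.println(Arrays.toString(splitPlain(input)));
    }
}
```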

---

The only problem I see with the regex

(?<=[^&]);

is that it fails if the semicolon is not preceded by any character, i.e. if it's at the beginning of a string.
[ April 22, 2005: Message edited by: Jim Yingst ]
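A small sketch of that edge case, comparing the two patterns on a string that starts with a semicolon (the test string and the names are made up for illustration):

```java
import java.util.Arrays;

class LeadingSemicolonDemo {
    // Positive lookbehind: requires some character other than '&' before ';'.
    static String[] splitPositive(String input) {
        return input.split("(?<=[^&]);");
    }

    // Negative lookbehind: only requires that '&' is not before ';'.
    static String[] splitNegative(String input) {
        return input.split("(?<!&);");
    }

    public static void main(String[] args) {
        String input = ";abc;def";
        // Prints [;abc, def]: the leading ';' has nothing before it, so it is not a delimiter.
        System.out.println(Arrays.toString(splitPositive(input)));
        // Prints [, abc, def]: the leading ';' is a delimiter, giving an empty first token.
        System.out.println(Arrays.toString(splitNegative(input)));
    }
}
```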
 
Alan Moore
Ranch Hand
Posts: 262
[Jim]: (If someone finds a good online link for lookahead and lookbehind, please post it.)

Here's the best one I've seen: http://www.regular-expressions.info/lookaround.html
 