
Regex help

 
harsha balluru
Greenhorn
Posts: 14
Hi Folks

I am stuck on a regex problem. I am trying to find the pattern "[" and replace it with "\[" (for example, [java should become \[java). At the same time, I don't want the operation to affect any strings that contain "[[" (i.e., I do not want [[java to become \[\[java).

To accomplish this, I am doing the following in my program:


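String input = "IWRITE[JAVA";
Pattern p = Pattern.compile("[^\\[](\\[)[^\\[]");
Matcher m = p.matcher(input);
// group(1) is only available after a successful match, so call find() first.
m.find();
// quoteReplacement keeps the backslash literal in the replacement text.
String replace = Matcher.quoteReplacement("\\" + m.group(1));
// replaceAll resets the matcher and replaces each whole match, here "E[J".
String converted = m.replaceAll(replace);
System.out.println(converted);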



With the above code, the output is "IWRIT[AVA". Instead of "[" being replaced by "\[", the whole "E[J" is replaced by "\[". How can I fix this? Any ideas?

Thanks
Harsha
 
harsha balluru
Greenhorn
Posts: 14
A small correction: the output I am seeing is "IWRIT\[AVA".

Thanks
 
Henry Wong
author
Sheriff
Posts: 23283
harsha balluru wrote: "How to fix this?? Any ideas??"


The E and J are being removed because they are part of what matches. The replaceAll() method replaces the entire match with the replacement, not just the part captured by group(1). You have a few choices.

1. You can use zero-length lookaheads (and lookbehinds) to check the character before and after the "[". These will not show up in the actual match and hence won't be replaced.

2. You can also capture the character before and after as separate groups, so that you can retain them when you build the replacement string. They will still be replaced, but your replacement string accounts for that.

Of course, you still have a problem of what happens when the "[" is the first or last character in the string.
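
As a rough illustration of option 2 and of the edge case just mentioned, here is a minimal sketch. The class name CaptureGroupEscape and the helper escapeLoneBracket are hypothetical names chosen for this example, not anything from the original program. The neighbouring characters are captured as groups 1 and 2 and put back through $1 and $2 in the replacement.

```java
import java.util.regex.Pattern;

public class CaptureGroupEscape {
    // Group 1 = character before '[', group 2 = character after '['.
    // Both must be something other than '[' for the pattern to match.
    private static final Pattern LONE_BRACKET =
            Pattern.compile("([^\\[])\\[([^\\[])");

    // Hypothetical helper: escapes a lone '[' while restoring its neighbours.
    static String escapeLoneBracket(String input) {
        // In the replacement, $1 and $2 re-insert the captured neighbours,
        // and "\\\\" (two backslashes at runtime) yields one literal backslash.
        return LONE_BRACKET.matcher(input).replaceAll("$1\\\\[$2");
    }

    public static void main(String[] args) {
        System.out.println(escapeLoneBracket("IWRITE[JAVA"));  // IWRITE\[JAVA
        System.out.println(escapeLoneBracket("IWRITE[[JAVA")); // IWRITE[[JAVA
        // No character precedes '[', so nothing matches and nothing is escaped.
        System.out.println(escapeLoneBracket("[JAVA"));        // [JAVA
    }
}
```

The last call shows the limitation: because the pattern requires a real character on each side, a "[" at the very start or end of the string is left untouched.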

Henry
 
harsha balluru
Greenhorn
Posts: 14
Hi Henry

Thanks for the reply. Coincidentally, I also came across the idea of using lookahead and lookbehind, and that resolved the problem.

I compiled this regex:

(?<!\\[)\\[(?!\\[)

The lookaround concept is great because it also handles the case where the "[" is the first or last character in the string.
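
For anyone finding this thread later, here is a minimal sketch of how that regex might be used with replaceAll(). The class name LookaroundEscape and the helper escapeLoneBracket are hypothetical names for illustration only. Since the lookbehind and lookahead consume no characters, only the "[" itself is replaced.

```java
import java.util.regex.Pattern;

public class LookaroundEscape {
    // Matches a '[' that is neither preceded nor followed by another '['.
    private static final Pattern LONE_BRACKET =
            Pattern.compile("(?<!\\[)\\[(?!\\[)");

    // Hypothetical helper: puts a backslash in front of every lone '['.
    static String escapeLoneBracket(String input) {
        // "\\\\" (two backslashes at runtime) yields one literal backslash
        // in the replacement; the '[' after it is inserted as-is.
        return LONE_BRACKET.matcher(input).replaceAll("\\\\[");
    }

    public static void main(String[] args) {
        System.out.println(escapeLoneBracket("IWRITE[JAVA"));  // IWRITE\[JAVA
        System.out.println(escapeLoneBracket("IWRITE[[JAVA")); // IWRITE[[JAVA
        System.out.println(escapeLoneBracket("[JAVA"));        // \[JAVA
        System.out.println(escapeLoneBracket("JAVA["));        // JAVA\[
    }
}
```

Note that with this pattern, runs of three or more brackets such as "[[[" are also left alone, since every "[" in the run has another "[" next to it.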

Thanks
Harsha
 