Regex to replace special characters

 
Ranch Hand
Posts: 300
Hi,

I need to replace special characters when they occur in certain places. The following method works well except for the first occurrence of the pattern. I have been trying hard to debug it, but am getting nowhere. Could anyone please see what is going wrong here?

The special characters should be replaced only when they occur before the '|' sign. But as you can see in the result, in the first occurrence the replacement is made on both sides of the '|', while in the rest of the text it is done correctly.

My search text sample -


[first nameß|firstnameß] this is my page, please go through it and send me feedback. [cchrp|Chris Harp] my page is open to all
[GünterLöher|Günter Löher] Replace the sp.characters here please!


Current result after replacement (I am trying to escape the HTML so that it displays correctly here, hence the extra whitespace):
[first name&s zlig;|firstname&s zlig;] this is my page, please go through it and send me feedback. [cchrp|Chris Harp] my page is open to all [G&u uml ;nterL&o uml ;her|Günter Löher] Replace the sp.characters here please!


thank you,
Soumya
[ April 27, 2006: Message edited by: soumya ravindranath ]
 
author
Posts: 23942

The special characters should be replaced only when they occur before the '|' sign. But as you can see in the result, in the first occurrence the replacement is made on both sides of the '|', while in the rest of the text it is done correctly.



I believe this problem occurs when there is more than one "|" on a line. The character matches because, although it comes after the nearest "|", it is still *before* the next one.
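
To make that concrete, here is a minimal sketch. The pattern is an assumption for illustration (a special character followed by a lookahead for a later pipe), not necessarily the one in the original method, and the class and method names are made up. The special characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // Assumed pattern: sharp s, u-umlaut or o-umlaut, followed by a pipe
    // anywhere later on the same line. Nothing stops the lookahead from
    // skipping past the closing bracket to reach a pipe further along.
    static final Pattern LOOSE = Pattern.compile("[\u00df\u00fc\u00f6](?=.*?\\|)");

    // Replaces every matched character with '#' so the matches are easy to spot.
    static String mark(String line) {
        return LOOSE.matcher(line).replaceAll("#");
    }

    public static void main(String[] args) {
        String line = "[name\u00df|name\u00df] text [G\u00fcnterL\u00f6her|G\u00fcnter L\u00f6her]";
        System.out.println(mark(line));
    }
}
```

With this input, both special characters in the first pair of brackets are marked, because the one after the first pipe still has the second pipe ahead of it. In the last pair only the part before the pipe is marked, since no further pipe follows.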

Henry
 
soumya ravindranath
Ranch Hand
Posts: 300
Hi,

That makes perfect sense! I am trying to change this greedy match to a non-greedy one (though I expected .*? to make the match non-greedy in the first place), but to no avail. Any suggestions on how I can improve this regex so that it behaves the way I need?
Thanks,
Soumya.
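
A note on why switching between greedy and non-greedy does not help here: inside a lookahead, the quantifier only changes how the engine searches for the pipe, not whether it finds one. The sketch below uses a made-up input and a hypothetical helper, countMatches, to compare the two forms with a lookahead that is not allowed to cross a closing bracket.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyLookaheadDemo {
    // Counts how many times the regex matches in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String input = "[x|x] y [x|z]";
        // Lazy and greedy lookaheads succeed at exactly the same positions.
        System.out.println(countMatches("x(?=.*?\\|)", input)); // 3
        System.out.println(countMatches("x(?=.*\\|)", input));  // 3
        // Excluding ']' stops the search at the end of the current brackets.
        System.out.println(countMatches("x(?=[^\\]]*\\|)", input)); // 2
    }
}
```

The x after the first pipe is matched by both the lazy and the greedy version because a later pipe exists. Only the version that cannot skip over ']' leaves it alone.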
 
Ranch Hand
Posts: 262
Try this:

BTW, regex questions really belong in the Java in General: Intermediate forum.
 
soumya ravindranath
Ranch Hand
Posts: 300
Thank you very much, that works for my example.
Yes, I was about to post it in the Java forum, but then I saw another regex question here and changed my mind...

Also, I still don't quite understand how the regex is written. Could you please explain what exactly the second group in the pattern does?

thanks in advance,
soumya
[ April 28, 2006: Message edited by: soumya ravindranath ]
 
Alan Moore
Ranch Hand
Posts: 262
Sure. If you aren't familiar with negated character classes, read this. The character class ensures that the lookahead doesn't go beyond the end of the current set of brackets to look for a pipe. Come to think of it, you don't even need to put the pipe in the character class. But if there are any special characters that aren't inside brackets, and you don't want to replace them, you should include the left bracket in the character class as well:
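
A sketch of what such a pattern might look like is below. It is an illustration under assumptions, not the code originally posted: the class name, the entity map and the set of special characters are made up for the example, and the characters are written as Unicode escapes so the source does not depend on file encoding.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketLabelEscaper {
    // Assumed mapping from special character to HTML entity (illustration only).
    private static final Map<String, String> ENTITIES = Map.of(
            "\u00df", "&szlig;",   // sharp s
            "\u00fc", "&uuml;",    // u-umlaut
            "\u00f6", "&ouml;");   // o-umlaut

    // Group 1 captures one special character. The lookahead then allows any
    // run of characters that are neither '[' nor ']' before the pipe, so the
    // search for the pipe can never leave the current pair of brackets.
    private static final Pattern BEFORE_PIPE =
            Pattern.compile("([\u00df\u00fc\u00f6])(?=[^\\[\\]]*\\|)");

    static String escape(String text) {
        Matcher m = BEFORE_PIPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(ENTITIES.get(m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape(
                "[first name\u00df|firstname\u00df] text [G\u00fcnterL\u00f6her|G\u00fcnter L\u00f6her]"));
    }
}
```

For that sample, only the characters before each pipe become entities. The ones after the pipe stay as they are because the lookahead hits ']' before it can reach another pipe, and special characters outside brackets are left alone because the lookahead hits '[' or the end of the text first.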
 
soumya ravindranath
Ranch Hand
Posts: 300
Thanks, Alan!
 