
Regular expression to strip unwanted characters

 
Hafeez Pallikonda Khader
Greenhorn
Posts: 9
The code below gives me the wrong result. Basically, I want only a-z and _ as the starting character.


 
Henry Wong
author
Posts: 23951

Hafeez Pallikonda Khader wrote: The code below gives me the wrong result. Basically, I want only a-z and _ as the starting character.



It would help us a bit if you gave us more details -- probably starting with an example of what the right output looks like.

Henry
 
Richard Tookey
Bartender
Posts: 1166
You say what the output is, but not exactly what you expect for that example, and there is ambiguity in your specification. Do you mean you want to get rid of leading characters if they are not in your valid set? If so, then I would expect your desired output to be "St #", but this does not make sense when looking at your regex. Could you say exactly what you expect as the output for the example " #19 98St # "?
 
Ulf Dittmer
Rancher
Posts: 43081

output is 1998St which is wrong


What is the correct output, then? A regexp like "^[^a-z_]+" would remove all leading characters that are not letters or _ ; that's how I interpret "i want only a-z and _ as the starting character".

Edit: .... which is just about what Richard said.
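
A minimal sketch of what the anchored pattern does to the sample input quoted in this thread (" #19 98St # "). The class and method names are made up for illustration. Note that [^a-z_] also treats uppercase letters as unwanted, so the second method, which widens the class to A-Z, rests on an assumption about what is actually intended.

```java
class LeadingStrip {
    // Removes only the leading run of characters outside a-z and _.
    // The ^ anchors the match at the start of the input.
    static String stripLeadingLowercaseOnly(String s) {
        return s.replaceAll("^[^a-z_]+", "");
    }

    // Assumption: uppercase letters should count as letters too.
    static String stripLeadingAnyCase(String s) {
        return s.replaceAll("^[^A-Za-z_]+", "");
    }

    public static void main(String[] args) {
        String input = " #19 98St # ";
        System.out.println("[" + stripLeadingLowercaseOnly(input) + "]"); // prints [t # ]
        System.out.println("[" + stripLeadingAnyCase(input) + "]");       // prints [St # ]
    }
}
```

Either way, the trailing " # " stays, because the pattern can only match at the beginning of the string.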
 
Hafeez Pallikonda Khader
Greenhorn
Posts: 9
I expect the output to be "St"

Surely 1998 does not fall within a-z and is not an _.
 
Ulf Dittmer
Rancher
Posts: 43081
Then the regexp I mentioned does that. If you want to remove the special characters everywhere (and not just as starting characters, as you said initially), remove the first "^" from the regexp; it causes the regexp to match only at the beginning of the string.
 
Hafeez Pallikonda Khader
Greenhorn
Posts: 9
It's not about special characters at the start. Please look at my edited code.
 
Richard Tookey
Bartender
Posts: 1166

Hafeez Pallikonda Khader wrote: It's not about special characters at the start. Please look at my edited code.



If you just want "St" as a result, your regex is very, very wrong. I can't work out from what you have posted what the general specification is! You seem to want to keep only alpha characters and the underscore, in which case the regex should simply be "[^a-z_]+", but the lack of a decent specification means I'm just guessing!
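
As a rough sketch of that guess, here is the unanchored pattern applied to the thread's sample input. The class and method names are hypothetical. Because the literal class a-z excludes uppercase letters, the second method (which also keeps A-Z) is an assumption made so that the capital S in the sample survives.

```java
class StripEverywhere {
    // Removes every run of characters outside a-z and _, wherever it occurs.
    static String keepLowercaseAndUnderscore(String s) {
        return s.replaceAll("[^a-z_]+", "");
    }

    // Assumption: "alpha characters" includes uppercase letters.
    static String keepLettersAndUnderscore(String s) {
        return s.replaceAll("[^A-Za-z_]+", "");
    }

    public static void main(String[] args) {
        String input = " #19 98St # ";
        System.out.println(keepLowercaseAndUnderscore(input)); // prints t
        System.out.println(keepLettersAndUnderscore(input));   // prints St
    }
}
```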
 
Ulf Dittmer
Rancher
Posts: 43081
Please don't edit your previous posts like that. Now all the following posts don't make sense any more. If you have new code, just put it into a new post.
 
Hafeez Pallikonda Khader
Greenhorn
Posts: 9
Sure. I'll open a new post with the code
 
Richard Tookey
Bartender
Posts: 1166

Hafeez Pallikonda Khader wrote:Sure. I'll open a new post with the code



No !!! Just add a response containing the new code! And please read again ALL the previous responses.
 
Henry Wong
author
Posts: 23951

I think there is a misunderstanding of regular expressions here. The regex defines what to match, and the replacement string defines what to replace each match with. Neither defines what the result should look like.

This means that if the first character is not a-z, it will be replaced. If the second letter is also not a-z, that part of the regex doesn't apply. The ^ means the beginning of line of the input -- it is not the beginning of line of the output.

Henry
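
A small sketch of that point. The pattern and input here are illustrative only and are not the original poster's code: a single-character class anchored with ^ can match only at position 0 of the input, so once the first character is removed, the next one is left alone even though it is also outside a-z.

```java
class AnchorDemo {
    // Replaces one character at the very start of the input, if it is not a-z or _.
    static String dropOneLeadingInvalid(String s) {
        return s.replaceAll("^[^a-z_]", "");
    }

    public static void main(String[] args) {
        // '1' is at the start of the input and gets removed; '9' is not at the
        // start of the input, so ^ cannot match there and it stays.
        System.out.println(dropOneLeadingInvalid("19st")); // prints 9st
    }
}
```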
 
Ulf Dittmer
Rancher
Posts: 43081
Posted on behalf of Hafeez:

The code below gives me the wrong result. Basically, after the replaceAll operation I want:

  • letters(a-z) and Underscore(_) as the starting character
  • Following that, there can be letters(a-z), numbers(0-9), Underscore(_), Hyphen(-) and period(.)


    Richard Tookey
    Bartender
    Posts: 1166

    Hafeez Pallikonda Khader wrote:
    The code below gives me the wrong result. Basically, after the replaceAll operation I want:

  • letters(a-z) and Underscore(_) as the starting character
  • Following that, there can be letters(a-z), numbers(0-9), Underscore(_), Hyphen(-) and period(.)


    Though I can't be certain, because I'm still not sure what the OP wants, I think replaceAll() is the wrong method to use. I think find() should be used with the regex "([a-z_][a-z0-9_.-]*)", with group(1) giving the desired result.

    (UD: edited to make clear that the quote is from Hafeez, not from me)
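
A minimal sketch of the find() approach, under a few assumptions: the class and method names are invented, the CASE_INSENSITIVE flag is added so that the capital S in the thread's sample input counts as a letter, and an empty string is returned when nothing matches. The hyphen is placed last in the character class so that it is taken literally; written between _ and . it would be parsed as a range running from _ down to ., which java.util.regex rejects.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstValidName {
    // First a letter or underscore, then any number of letters, digits,
    // underscores, periods, or hyphens.
    private static final Pattern NAME =
            Pattern.compile("([a-z_][a-z0-9_.-]*)", Pattern.CASE_INSENSITIVE);

    // Returns the first substring that fits the pattern, or "" if there is none.
    static String firstName(String input) {
        Matcher m = NAME.matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(firstName(" #19 98St # ")); // prints St
    }
}
```

Unlike replaceAll(), this keeps only the first matching run; anything valid that appears after an invalid character is dropped.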
     
    Henry Wong
    author
    Posts: 23951

    Richard Tookey wrote:
    Though I can't be certain, because I'm still not sure what the OP wants, I think replaceAll() is the wrong method to use. I think find() should be used with the regex "([a-z_][a-z0-9_.-]*)", with group(1) giving the desired result.



    Maybe we are speculating on what the OP wants differently, but I think the original code should work. It just needs a slight change to the regex. Here...



    Henry
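
For illustration only, and not the snippet Henry posted: one way replaceAll() could satisfy the two rules Hafeez listed is to apply it twice. The class and method names are hypothetical, and accepting uppercase letters is an assumption based on the expected output "St".

```java
class NameFilter {
    static String filter(String s) {
        // Step 1: drop every character that is not allowed anywhere in the name.
        String allowedOnly = s.replaceAll("[^A-Za-z0-9_.-]", "");
        // Step 2: drop leading characters that are not allowed to start the name.
        return allowedOnly.replaceAll("^[^A-Za-z_]+", "");
    }

    public static void main(String[] args) {
        System.out.println(filter(" #19 98St # ")); // prints St
    }
}
```

The order matters in this sketch: stripping the leading characters first would leave digits from later in the input free to end up at the front once the other invalid characters are removed.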
     
    Richard Tookey
    Bartender
    Posts: 1166

    Henry Wong wrote:
    Maybe we are speculating on what the OP wants differently, but I think the original code should work. It just needs a slight change to the regex. Here...



    Henry



    Possibly! Without a decent specification or several examples of input and desired output I don't know what the OP wants.
     
    Hafeez Pallikonda Khader
    Greenhorn
    Posts: 9
    Works perfectly. Thanks a lot


     
    Hafeez Pallikonda Khader
    Greenhorn
    Posts: 9
    FYI, I'm doing this to filter invalid XML element names.
     