
Regex matcher question

 
steve kelly
Ranch Hand
Posts: 49
Hello, I have an issue where I am almost there but not quite. I have the code below


which outputs
john
do

but I want it to handle any and all special characters (like "!@#$%^&*()_+ etc.), so I'd like to see the output below from the original string.

"john"
"do"

How would I do this?
 
Mike. J. Thompson
Bartender
Posts: 689
17
Your code is matching against "[\w]", which is a character class containing word characters. It's equivalent to [a-zA-Z_0-9].

When you say you want to match special characters too, I assume you mean all characters except whitespace. If so, this can be done with the following character class: "[\S]".

Note that is a capital letter 'S'.

Edit to add: that character class is equivalent to [^\s] (lower case 's'), which is arguably clearer, if any regular expression can ever be said to be clear.

The Oracle tutorial will explain more: http://docs.oracle.com/javase/tutorial/essential/regex/index.html
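To illustrate the suggestion above, here is a minimal sketch that collects runs of non-whitespace characters with a Matcher. The original code and input string were not shown in the thread, so the class name, the method name, and the sample input are assumptions, and the pattern uses `\S+` (one or more non-whitespace characters) rather than a single-character class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not the original poster's code.
class NonWhitespaceTokens {
    // \S+ matches one or more consecutive non-whitespace characters,
    // so quotes and other punctuation stay attached to each token.
    private static final Pattern TOKEN = Pattern.compile("\\S+");

    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample input: "john" "do" (quotes included in the string).
        for (String token : tokens("\"john\" \"do\"")) {
            System.out.println(token);
        }
    }
}
```

With the assumed input, each printed token keeps its surrounding quotes, whereas a `\w`-based pattern would drop them.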
 
Jim Venolia
Ranch Hand
Posts: 312
2
Chrome Linux VI Editor
If you really want to grok regular expressions get Jeffrey Friedl's book Mastering Regular Expressions. It's even got a chapter on Java's regex engine.
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows

First of all, it looks like you want to deal with nested quotes. In my opinion, regular expressions are not well suited to that. Handling one level of nesting can get complicated, two levels can get really complicated, and an unlimited number of levels is likely impossible to handle.

Second, how should the quotes be interpreted? For example, is the second quote a nested quote, or does it close the first quote? How about the third quote, or the fourth? Before you can create the regex for it, you probably need to define the format more precisely.

Henry
 
steve kelly
Ranch Hand
Posts: 49
Maybe regex is not what I want. Basically I can receive a string " "John" "Doe"" or "@John Doe#" in my program. The only thing I know is that a blank space will always separate them.
I want these two separate strings broken up into an array. So the examples above would look like this:
"John"
"Doe"

and...

@John
Doe#
 
Mike. J. Thompson
Bartender
Posts: 689
17
If you can guarantee that format then you can do the following:

1) Split the string on the space, resulting in an array containing the two parts.

2) Remove the first character from the first String.

3) Remove the last character from the second String.

You will need to validate that the string is in the correct form, though, for example by ensuring that the array contains exactly two strings and that the two characters you're removing are quotes.
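The steps above could be sketched roughly as follows. This assumes the received string really does include the enclosing double quotes and exactly one separating space; the class and method names are hypothetical.

```java
// Hypothetical helper class illustrating the split-and-trim steps.
class WrappedPairSplitter {
    static String[] splitWrappedPair(String input) {
        // Step 1: split on the single separating space.
        String[] parts = input.split(" ");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected exactly two parts: " + input);
        }
        String first = parts[0];
        String second = parts[1];

        // Validate that the characters about to be removed are quotes.
        if (first.isEmpty() || first.charAt(0) != '"'
                || second.isEmpty() || second.charAt(second.length() - 1) != '"') {
            throw new IllegalArgumentException("Expected enclosing quotes: " + input);
        }

        // Step 2: drop the first character of the first part.
        // Step 3: drop the last character of the second part.
        return new String[] {
            first.substring(1),
            second.substring(0, second.length() - 1)
        };
    }

    public static void main(String[] args) {
        // Assumed inputs, with the enclosing quotes as part of the string.
        for (String s : splitWrappedPair("\"\"John\" \"Doe\"\"")) {
            System.out.println(s);
        }
        for (String s : splitWrappedPair("\"@John Doe#\"")) {
            System.out.println(s);
        }
    }
}
```

Throwing on malformed input is only one option; returning an empty array or an Optional may suit the calling code better.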
 