
Accepting single & double quotes in java regex pattern

 
Rithanya Laxmi
Ranch Hand
Posts: 203
Hi,

I need to build a regex that accepts special characters such as ' (single quote) and " (double quote),
so that when I give "Anton'y" it passes, and similarly for a double quote. How can I build this pattern in Java?
Please clarify.

Thanks.
 
Andrea Binello
Ranch Hand
Posts: 62
Rithanya Laxmi wrote: I need to build a regex that accepts special characters such as ' (single quote) and " (double quote),
so that when I give "Anton'y" it passes, and similarly for a double quote. How can I build this pattern in Java?
Please clarify.

First, you should clarify what the regex is supposed to do; specifically, what exactly it should "match".
Single quotes and double quotes are not special characters in a regex (however, double quotes must be escaped in Java string literals).

For example, the following code finds a text in double quotes, where the text can contain single quotes:
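The listing that belonged here is not preserved in this copy of the thread. A minimal sketch that is consistent with the output shown below, and with the pattern and literal string quoted later in the thread, might look like this. The class and method names are assumptions made for illustration, not the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedTextFinder {
    // Assumed pattern: one or more letters or single quotes, enclosed in double quotes.
    // The double quotes are escaped only because this is a Java string literal.
    private static final Pattern QUOTED = Pattern.compile("\"[a-zA-Z']+\"");

    // Returns every double-quoted run found in the input, quotes included.
    static List<String> findQuoted(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = " \"Anton'y\" \"Andrew's\" ";
        for (String match : findQuoted(text)) {
            System.out.println("Found: " + match);
        }
    }
}
```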

Outputs:

Found: "Anton'y"
Found: "Andrew's"
 
Winston Gutkowski
Bartender
Posts: 10575
Rithanya Laxmi wrote: I need to build a regex that accepts special characters such as ' (single quote) and " (double quote),
so that when I give "Anton'y" it passes, and similarly for a double quote. How can I build this pattern in Java?

Simple answer: With difficulty.

Quoted strings are not the sort of thing that regexes are very good at because they involve embedded logic. There are also several other wrinkles that make them difficult to parse as a regex pattern:
  • They work in pairs.
  • They can be embedded.
  • They can be escaped.
  • (in your case) Their types are interchangeable.

It's possible that you may be able to come up with a pattern, but it's likely to be nasty.

    One possibility: Choose one particular quote - let's say " - and replace all OTHER quotes in both your source and target strings with it; and then run your pattern match. Then you'd be matching (from your example)
    "Anton"y"
    with something like:
    Say "hello" to "Anton"y"
    which would then be a straight match.

    I suspect it's only a 90% solution, but it's where I'd start.
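A rough sketch of that normalisation idea, assuming only ' and " need to be treated as interchangeable and that a plain substring check counts as the "straight match". The class and method names are hypothetical, and escaped quotes are deliberately ignored, which is part of why this is only a partial solution.

```java
public class QuoteNormalizer {
    // Replace every single quote with a double quote so both kinds compare equal.
    static String normalize(String s) {
        return s.replace('\'', '"');
    }

    // True if the target occurs in the source once quote types are ignored.
    static boolean containsIgnoringQuoteType(String source, String target) {
        return normalize(source).contains(normalize(target));
    }

    public static void main(String[] args) {
        String source = "Say 'hello' to \"Anton'y\"";
        String target = "\"Anton'y\"";
        System.out.println(normalize(target));   // the target after normalisation
        System.out.println(containsIgnoringQuoteType(source, target));
    }
}
```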

    Winston
     
    Campbell Ritchie
    Marshal
    Posts: 56546
    Winston Gutkowski wrote: . . . because they involve embedded logic. . . .
    In which case, they might not conform to any regular grammar and might be impossible to parse with a regex.
     
    Ulf Dittmer
    Rancher
    Posts: 42972
    I interpret the double quotes in Rithanya's post merely as delimiters for the purposes of this post, not as part of the pattern to match. So I think the answers given so far make the problem more complicated than it is; maybe Rithanya can clarify.
     
    Winston Gutkowski
    Bartender
    Posts: 10575
    Ulf Dittmer wrote:I interpret the double quotes in Rithanya's post merely as delimiters for the purposes of this post...

    You may be right, although I submit that my suggestion would still work even if you are.

    @Rithanya: So, what's the answer? Is the "Anton'y" in your OP actually
    "Anton'y"
    or is it:
    Anton'y
    ?

    I (and I suspect Andrea) have been assuming the first, but if it's the 2nd then Ulf is absolutely right, and the solution is much simpler.
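For the simpler reading, where the quotes are just ordinary characters that the input is allowed to contain, a sketch could be as small as adding both quote characters to a character class. The set of allowed letters here (a-z, A-Z) is an assumption, since the original post does not say what else should be accepted; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class QuoteTolerantValidator {
    // Assumed rule: one or more letters, single quotes, or double quotes.
    // Neither quote is special in a regex; \" is only Java string-literal escaping.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z'\"]+");

    // True only if the whole input consists of allowed characters.
    static boolean accepts(String input) {
        return ALLOWED.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("Anton'y"));
        System.out.println(accepts("Anton\"y"));
        System.out.println(accepts("Anton#y"));
    }
}
```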

    Winston
     
    Andrea Binello
    Ranch Hand
    Posts: 62
    Keeping the same literal string: " \"Anton'y\" \"Andrew's\" "

    If you want to match/extract only the content between the double quotes (excluding the quotes themselves), you have at least two options:

    1) Use a "group":

    Pattern.compile("\"([a-zA-Z']+)\"")

    note the ( ) around [a-zA-Z']+ . This creates a group numbered 1.

    and then get the match using matcher.group(1) (instead of group())

    2) Use the so-called positive "lookbehind" and "lookahead" (which by definition are non-capturing). The regex is slightly more complicated:

    Pattern.compile("(?<=\")[a-zA-Z']+(?=\")")

    and then get the match again with group().


    Both techniques give you the output:

    Found: Anton'y
    Found: Andrew's
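Put together as a runnable sketch, using the two patterns and the literal string from this post. The class and method names are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedContentExtractor {
    // Option 1: the quotes are part of the match, group 1 holds the content.
    private static final Pattern WITH_GROUP = Pattern.compile("\"([a-zA-Z']+)\"");
    // Option 2: the quotes are only asserted by lookbehind/lookahead, not consumed.
    private static final Pattern WITH_LOOKAROUND = Pattern.compile("(?<=\")[a-zA-Z']+(?=\")");

    static List<String> extractWithGroup(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = WITH_GROUP.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group(1)); // content without the surrounding quotes
        }
        return found;
    }

    static List<String> extractWithLookaround(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = WITH_LOOKAROUND.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group()); // the whole match is already the content
        }
        return found;
    }

    public static void main(String[] args) {
        String text = " \"Anton'y\" \"Andrew's\" ";
        for (String s : extractWithGroup(text)) {
            System.out.println("Found: " + s);
        }
        for (String s : extractWithLookaround(text)) {
            System.out.println("Found: " + s);
        }
    }
}
```

One difference worth knowing: because the lookaround version does not consume the quotes, a closing quote can also serve as the opening quote of the next match. On an input like "a"b"c" the group version finds a and c, while the lookaround version also finds b.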
     