
PatternSyntaxException with * in regex

 
Ranch Hand
Posts: 32
I need to search for a string in a data file, based on a user-supplied regular expression. I am using the Pattern class for this as:
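(The snippet itself is missing from this copy of the thread. What follows is only a minimal sketch of the kind of code described, assuming the data file has already been read into a list of lines. The class name `DataFileSearch` and the method `matchingLines` are hypothetical, not the original code.)

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DataFileSearch {
    // Returns the lines in which the user-supplied regex matches somewhere.
    // Pattern.compile throws PatternSyntaxException if the regex is malformed.
    static List<String> matchingLines(List<String> lines, String userSuppliedRegex) {
        Pattern pattern = Pattern.compile(userSuppliedRegex);
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of the data file (assumed sample data).
        List<String> lines = Arrays.asList("foo bar", "baz", "xfoo");
        System.out.println(matchingLines(lines, "foo"));
    }
}
```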


When userSuppliedRegex is set to * or a string beginning with *, such as *foo, I get this exception:


Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*



How do I specify a regex that begins with a metacharacter, such as * or *foo or .bar?

From the Pattern javadoc, I see:


Perl is forgiving about malformed matching constructs, as in the expression *a, as well as dangling brackets, as in the expression abc], and treats them as literals. This class also accepts dangling brackets but is strict about dangling metacharacters like +, ? and *, and will throw a PatternSyntaxException if it encounters them.

 
Rancher
Posts: 274
C++ Debian VI Editor

Originally posted by Rolf Johansson:
I need to search for a string in a data file, based on a user-supplied regular expression.
(...)
How do I specify a regex that begins with a metacharacter, such as * or *foo or .bar?



Since you have researched it, you know that the exception is thrown for a reason. What do you expect from "specifying a regex that begins with a metacharacter"? (BTW, I don't see anything wrong with ".bar", but REs that begin with '*' are meaningless because the '*' has nothing to quantify - hope you see that.)

If you want to guard your app against arbitrary user input, that is a laudable goal. It involves checking the user input. You could catch the raised exception and tell the user to re-enter a valid RE. Handling I/O is very difficult.
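A minimal sketch of the catch-and-report approach, assuming the input arrives as a plain String. The class name `RegexInput` and the helper `tryCompile` are made up for illustration; `getDescription()` and `getIndex()` are standard methods of PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexInput {
    // Returns the compiled pattern, or null if the input is not a valid regex.
    static Pattern tryCompile(String userInput) {
        try {
            return Pattern.compile(userInput);
        } catch (PatternSyntaxException e) {
            // Tell the user what was wrong and where, so they can re-enter it.
            System.err.println("Invalid regular expression: " + e.getDescription()
                    + " (at index " + e.getIndex() + ")");
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("*foo") == null);  // true: rejected
        System.out.println(tryCompile(".bar") != null);  // true: accepted
    }
}
```

A real program would loop, prompting again until `tryCompile` returns a non-null pattern.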

- Anand
 
Rancher
Posts: 42975
76
You escape it with a backslash: \* Note that in Java strings you need to use two backslashes, because the first one is 'eaten' by the Java string escape rules. So it'd be: \\*
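A short sketch of the escaping in source code. The class and method names are hypothetical. `Pattern.quote` is not mentioned above; it is a standard java.util.regex method that treats an entire string as literal text, which may be handy when none of the input should be interpreted as regex syntax.

```java
import java.util.regex.Pattern;

class LiteralStar {
    // The Java literal "\\*foo" holds the regex \*foo: a literal star, then foo.
    static final Pattern ESCAPED = Pattern.compile("\\*foo");

    // True if text contains the given string, with every character taken literally.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(ESCAPED.matcher("a*foo").find());  // true
        System.out.println(ESCAPED.matcher("afoo").find());   // false
        System.out.println(containsLiteral("a*foo", "*foo")); // true
    }
}
```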
 
Author
Ranch Hand
Posts: 959
1

Originally posted by Rolf Johansson:
How do I specify a regex that begins with a metacharacter, such as * or *foo or .bar?



I think you may be confused about what quantifiers like '*', '+', and '?' do. They control how many times the thing that comes before them may occur. So if nothing comes before the quantifier, the regular expression is illegal and a PatternSyntaxException should be expected.

'*foo' might make sense in DOS or a Unix shell, but it doesn't make sense as a regular expression. A roughly equivalent regular expression might be '.*foo' or '^.*foo'. (In both cases, the '*' quantifies the '.' that precedes it.)

If you want '*' as a literal, not as a quantifier, then escape it with a backslash as Mr. Dittmer suggests.
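To make the quantifier point concrete, here is a small sketch. The class `QuantifierDemo` and its two helpers are hypothetical names; they only wrap the standard `Matcher.matches()` (whole string) and `Matcher.find()` (anywhere in the string).

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // True if the regex matches the entire text.
    static boolean wholeMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    // True if the regex matches somewhere inside the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // In "fo*" the '*' quantifies the 'o': an f followed by zero or more o's.
        System.out.println(wholeMatch("fo*", "f"));        // true
        System.out.println(wholeMatch("fo*", "fooo"));     // true

        // ".*foo" is the rough regex counterpart of the shell wildcard *foo.
        System.out.println(wholeMatch(".*foo", "xyzfoo")); // true
        System.out.println(wholeMatch(".*foo", "xyzbar")); // false

        // find() looks for a match anywhere, so trailing text is allowed.
        System.out.println(found("^.*foo", "xyzfoobar"));  // true
    }
}
```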
 
Anand Hariharan
Rancher
Posts: 274
C++ Debian VI Editor

Originally posted by Ulf Dittmer:
You escape it with a backslash: \* Note that in Java strings you need to use two backslashes, because the first one is 'eaten' by the Java string escape rules. So it'd be: \\*



To the OP -

Note that Ulf said ".. in Java strings ...". By that he meant string literals within Java source code. So, if you read the user's input from the console or elsewhere, you don't need the double backslash.
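A small sketch of the difference, with hypothetical names. The string literals below merely stand in for what a program would receive at run time: the first holds the five characters a user would type as \*foo, the second holds the six characters of \\*foo.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // True if the regex, exactly as received at run time, matches somewhere in text.
    static boolean found(String regexAsTyped, String text) {
        return Pattern.compile(regexAsTyped).matcher(text).find();
    }

    public static void main(String[] args) {
        // Source literal "\\*foo" is the 5-character string \*foo,
        // which is exactly what a user would type at the console.
        String typedOnce = "\\*foo";
        System.out.println(typedOnce.length());          // 5
        System.out.println(found(typedOnce, "a*foo"));   // true
        System.out.println(found(typedOnce, "afoo"));    // false

        // If the user types two backslashes, the regex is \\*foo:
        // zero or more literal backslashes followed by foo.
        String typedTwice = "\\\\*foo";
        System.out.println(typedTwice.length());         // 6
        System.out.println(found(typedTwice, "afoo"));   // true
    }
}
```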

- Anand
 