PatternSyntaxException with * in regex

 
Ranch Hand
Posts: 32
I need to search for a string in a data file, based on a user-supplied regular expression. I am using the Pattern class for this as:


When userSuppliedRegex is set to * or a string beginning with *, such as *foo, I get this exception:


Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*
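For readers who want to reproduce this, here is a minimal sketch. The original snippet is not shown in this thread, so the class name DanglingStarDemo and the helper compiles are hypothetical; only Pattern.compile and PatternSyntaxException come from the standard library.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class DanglingStarDemo {
    // Hypothetical helper: reports whether the given string is a valid regex.
    static boolean compiles(String userSuppliedRegex) {
        try {
            Pattern.compile(userSuppliedRegex);
            return true;
        } catch (PatternSyntaxException e) {
            // getDescription() gives the reason, getIndex() the offending position.
            System.err.println(e.getDescription() + " at index " + e.getIndex());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("*"));     // quantifier with nothing before it
        System.out.println(compiles("*foo"));  // same problem at index 0
        System.out.println(compiles(".bar"));  // '.' is not a quantifier, so this is fine
    }
}
```

Only the first two inputs should be rejected, which matches the behaviour described in the Pattern javadoc quoted below.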



How do I specify a regex that begins with a metacharacter, such as * or *foo or .bar?

From the Pattern javadoc, I see:


Perl is forgiving about malformed matching constructs, as in the expression *a, as well as dangling brackets, as in the expression abc], and treats them as literals. This class also accepts dangling brackets but is strict about dangling metacharacters like +, ? and *, and will throw a PatternSyntaxException if it encounters them.

 
Rancher
Posts: 274
C++ Debian VI Editor

Originally posted by Rolf Johansson:
I need to search for a string in a data file, based on a user-supplied regular expression.
(...)
How do I specify a regex that begins with a metacharacter, such as * or *foo or .bar?



Since you have researched it, you know that the exception is thrown for a reason. What do you expect to achieve by "specifying a regex that begins with a metacharacter"? (BTW, I don't see anything wrong with ".bar", but REs that begin with '*' are meaningless because the '*' has nothing to quantify - hope you see that.)

If you want to guard your app against arbitrary user input, that is a laudable goal. It involves validating the input: you could catch the PatternSyntaxException and ask the user to re-enter a valid RE. Handling user input robustly is difficult.
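A rough sketch of that catch-and-retry idea, under the assumption that the input arrives line by line through a Scanner. RegexPrompt and readPattern are hypothetical names chosen for illustration, and main feeds in a fixed string to stand in for the console.

```java
import java.util.Scanner;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexPrompt {
    // Hypothetical helper: keeps reading lines until one compiles as a regex.
    // Returns null if the input ends before a valid expression is entered.
    static Pattern readPattern(Scanner in) {
        while (in.hasNextLine()) {
            String candidate = in.nextLine();
            try {
                return Pattern.compile(candidate);
            } catch (PatternSyntaxException e) {
                System.err.println("Invalid regular expression: " + e.getDescription());
                System.err.println("Please enter a valid one.");
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulated input: first line is rejected, second is accepted.
        // Swap in new Scanner(System.in) to read from a real console.
        Pattern p = readPattern(new Scanner("*foo\n.*foo\n"));
        System.out.println(p == null ? "No valid pattern entered" : "Using pattern: " + p.pattern());
    }
}
```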

- Anand
 
Rancher
Posts: 42975
You escape it with a backslash: \* Note that in Java strings you need to use two backslashes, because the first one is 'eaten' by the Java string escape rules. So it'd be: \\*
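As a small illustration of the escaping described above, the sketch below searches for a literal *foo. The class and method names are made up for the example. Pattern.quote is not mentioned in this thread; it is a standard-library alternative that escapes an entire string, which may be handy when every metacharacter should be treated literally.

```java
import java.util.regex.Pattern;

class LiteralStarDemo {
    // Hand-escaped: the source literal "\\*foo" is the regex \*foo.
    static boolean containsEscaped(String text) {
        return Pattern.compile("\\*foo").matcher(text).find();
    }

    // Alternative: let Pattern.quote escape the whole literal for us.
    static boolean containsQuoted(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsEscaped("a*foo"));          // text contains a literal *foo
        System.out.println(containsEscaped("afoo"));           // no asterisk in the text
        System.out.println(containsQuoted("a*foo", "*foo"));   // same search via Pattern.quote
    }
}
```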
 
Author
Ranch Hand
Posts: 964

Originally posted by Rolf Johansson:
How do I specify a regex that begins with a metacharacter, such as * or *foo or .bar?



I think you may be confused about what quantifiers like '*', '+', and '?' do. They control how many times the thing that comes before them may occur. So if nothing comes before a quantifier, the regular expression is illegal and a PatternSyntaxException should be expected.

'*foo' might make sense in DOS or a unix shell, but it doesn't make sense as a regular expression. A roughly equivalent regular expression might be '.*foo' or '^.*foo'. (In both cases, the '*' quantifies the '.' that precedes it.)
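To make the difference concrete, here is a short sketch of how '.*foo' behaves. WildcardDemo and its two helpers are hypothetical names; the calls are the standard Pattern.matches (whole-input match) and Matcher.find (match anywhere).

```java
import java.util.regex.Pattern;

class WildcardDemo {
    // True only if the regex matches the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if the regex matches some part of the input.
    static boolean findsIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole(".*foo", "barfoo")); // ".*" absorbs "bar"
        System.out.println(matchesWhole(".*foo", "foo"));    // ".*" may match nothing
        System.out.println(matchesWhole(".*foo", "foobar")); // whole input must end in "foo"
        System.out.println(findsIn("^.*foo", "foobar"));     // find() only needs a prefix here
    }
}
```

With matches(), '.*foo' behaves like the shell's *foo (anything ending in foo); with find(), it succeeds as soon as foo appears anywhere in the input.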

If you want '*' as a literal, not as a quantifier, then escape it with a backslash as Mr. Dittmer suggests.
 
Anand Hariharan
Rancher
Posts: 274
C++ Debian VI Editor

Originally posted by Ulf Dittmer:
You escape it with a backslash: \* Note that in Java strings you need to use two backslashes, because the first one is 'eaten' by the Java string escape rules. So it'd be: \\*



To the OP -

Note that Ulf said ".. in Java strings ...". By that he meant string literals within Java source code. So if you read the user's input from the console or elsewhere, you don't need the double backslash.
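A quick way to convince yourself of this, as a sketch: the hypothetical helper typedByUser below builds the five characters a user would type (\*foo) without using a backslash escape, and the result turns out to be the same string as the source literal "\\*foo".

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Hypothetical stand-in for runtime input: the characters \ * f o o.
    // (char) 92 is the backslash, built without any source-level escaping.
    static String typedByUser() {
        return String.valueOf((char) 92) + "*foo";
    }

    public static void main(String[] args) {
        String typed = typedByUser();
        String literal = "\\*foo"; // two backslashes in source, one in the actual string

        System.out.println(typed.length());        // 5 characters, one of them a backslash
        System.out.println(typed.equals(literal)); // same string either way
        System.out.println(Pattern.compile(typed).matcher("a*foo").find());
    }
}
```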

- Anand
 