
regexp

 
Max Rahder
Ranch Hand
Posts: 177
My goal is to come up with a regex pattern that allows any string beginning with "B-" to match. I.e., the strings "B-17" and "B-hithere" should both match.

I'm in an environment where I must use the Jakarta Regexp library. Regexp can be found at the Jakarta Regexp home page.

I can't figure out how to code the regex pattern.

Here's a simple example. I thought this would be a literal pattern. I.e., the only thing that should match the pattern "B-aa" should be the string "B-aa" itself; but other things match.



Why are so many strings matching? How do I get it to treat "B-" as a literal?

Thanks for helping!
 
marc weber
Sheriff
Posts: 11343
I'm not experienced with regex (much less with Jakarta), but I'm guessing it would be...

"[Bb]-.*"

That is, match Strings starting with one uppercase or lowercase "B", followed by a hyphen, followed by zero or more (denoted by the asterisk) of any character (denoted by the period).
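Here's a quick sketch of how that pattern behaves with java.util.regex (the package linked below), not with Jakarta Regexp. Note that Pattern.matches requires the whole string to match, which is not necessarily how another library's match method works. The class and helper names are made up for illustration.

```java
import java.util.regex.Pattern;

class PrefixPatternDemo {
    // Hypothetical helper: true only if the ENTIRE input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // [Bb] = one upper- or lowercase B, '-' = literal hyphen, .* = anything after it
        String regex = "[Bb]-.*";
        String[] inputs = {"B-17", "b-hithere", "B-", "Baa", "C-17"};
        for (String s : inputs) {
            System.out.println(s + " -> " + wholeMatch(regex, s));
        }
    }
}
```

With whole-string matching, the first three inputs should match and the last two should not.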


See http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html
[ June 20, 2005: Message edited by: marc weber ]
 
Max Rahder
Ranch Hand
Posts: 177
Nope. Here are the results of running your guess:



Having "rumplestilskinB-aa" be a match is still not the behavior I want.

I have made lots of guesses myself, but none has explained what appears to be a fundamental misunderstanding on my part. (I suspect the hyphen is the problem. In some contexts a hyphen is a range operator. I'm afraid my pattern somehow is matching anything in the range "B" through "aa". The problem with that theory is that if it's true, the "-" shouldn't be required in the matching string, so "Baa" should test "true" also, but it doesn't. But I still suspect the hyphen.)
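One way to check the hyphen theory is a small experiment. The sketch below uses java.util.regex as a stand-in, on the assumption that Jakarta Regexp treats the hyphen the same way: as a range operator only inside square brackets, and as a plain literal everywhere else. Matcher.find() searches anywhere in the input. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class HyphenDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Outside brackets the hyphen is a literal character.
        System.out.println(found("B-aa", "B-aa"));               // literal text present
        System.out.println(found("B-aa", "Baa"));                // no hyphen in input
        System.out.println(found("B-aa", "rumplestilskinB-aa")); // literal text found mid-string

        // Inside brackets the hyphen defines a range: 'B' (66) through 'a' (97).
        System.out.println(found("[B-a]", "Z")); // 'Z' (90) falls inside the range
        System.out.println(found("[B-a]", "-")); // '-' (45) falls outside the range
    }
}
```

If that assumption holds, the hyphen is not the culprit: "Baa" fails because the literal "-" is missing, and the long string passes only because the search is not tied to the start of the input.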

I need the help of someone who knows either regex in general or Jakarta regexp in particular.
 
marc weber
Sheriff
Posts: 11343
The difference is in the behavior of the RE.match method. If you want to start testing for a match at the beginning of the argument String, then use the boundary matcher ^...

"^[Bb]-.*"

Or if you want a simple literal without case variation, just use...

"^B-.*"

I've tested this using the org.apache.regexp package and it works as expected on your examples...
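For anyone without the Jakarta jar handy, here is a rough equivalent of that kind of test using java.util.regex. It assumes Matcher.find() is a fair stand-in for RE.match, since both look for a match anywhere in the input rather than only at the start. The class and helper names are made up for illustration.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String[] patterns = {"B-.*", "^B-.*"};
        String[] inputs = {"B-17", "B-hithere", "rumplestilskinB-aa", "Baa"};
        for (String p : patterns) {
            for (String s : inputs) {
                // Without ^ the search may start mid-string; with ^ it is pinned to index 0.
                System.out.println(p + " on " + s + " -> " + found(p, s));
            }
        }
    }
}
```

Without the anchor, "rumplestilskinB-aa" is reported as a match because "B-aa" occurs inside it. With ^, only strings that actually begin with "B-" pass.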

[ June 20, 2005: Message edited by: marc weber ]
 
marc weber
Sheriff
Posts: 11343
Actually, when I said "I'm not experienced with regex," I should have said "I'm not (very) experienced with regex in Java."

But I have been using a form of regex for years with a product called Hyper.Ink, which is an application that converts rich text source material into Lotus Notes documents. Basically, we use regexes to define "link rules" so that Hyper.Ink can convert specific text patterns into hyperlinks among the databases.
 