Help me understand Regex quantifiers

 
Oceana Wickramasinghe
Ranch Hand
Posts: 77
Hello guys, I've been struggling for days to figure out how quantifiers work. I used several sources, including Kathy Sierra's book, but none of them explains quantifiers in a way that I can understand. There are several things I don't understand about quantifiers, mainly to do with the way the explanations are worded.

For example, * means zero or more occurrences. What exactly does "zero or more occurrences" mean? Can someone simplify this statement for me?

Secondly, I want to know exactly how the mechanism works. If I match the string "nnnnnnuuuuuuuuulllll" against the pattern "nu" followed by each quantifier, I get

"nuuuuuuuuu" with +
"nnnnnnuuuuuuuuu" with *
and
"nnnnnnu" with ?

What exactly happens in the background when I execute this? I want someone to explain step by step why each quantifier behaves the way it does. Why am I getting multiple "n"s and "u"s when what I want it to search for is "nu"? It's as if quantifiers break down the pattern and treat each character as a different pattern.

Thanks in advance.
 
Jeff Verdegan
Bartender
Posts: 6109
Oceana Wickramasinghe wrote: Hello guys, I've been struggling for days to figure out how quantifiers work. I used several sources, including Kathy Sierra's book, but none of them explains quantifiers in a way that I can understand. There are several things I don't understand about quantifiers, mainly to do with the way the explanations are worded.

For example, * means zero or more occurrences. What exactly does "zero or more occurrences" mean? Can someone simplify this statement for me?


It's hard to make that any simpler or clearer. "Zero or more" means "zero or more". That is, ">= 0 instances of the thing we're matching." So if we have "X*", then no Xes at all (the empty string), "X", and "XXXXXXXXXX" will all match.
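
To make "zero or more" concrete, here is a minimal sketch using java.util.regex. The class name ZeroOrMoreDemo and the helper isAllX are made up for illustration; Pattern.matches requires the whole input to match the regex.

```java
import java.util.regex.Pattern;

public class ZeroOrMoreDemo {
    // Hypothetical helper: true if the ENTIRE input is zero or more 'X' characters.
    static boolean isAllX(String input) {
        return Pattern.matches("X*", input);
    }

    public static void main(String[] args) {
        System.out.println(isAllX(""));           // true: zero Xes is allowed
        System.out.println(isAllX("X"));          // true: one X
        System.out.println(isAllX("XXXXXXXXXX")); // true: many Xes
        System.out.println(isAllX("XY"));         // false: 'Y' is not covered by X*
    }
}
```

The empty string matching is the "zero" part. Swapping "X*" for "X+" (one or more) would make the first call return false.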

Secondly, I want to know exactly how the mechanism works.


That's a combination of two things:

1. Implementation-dependent stuff.
2. Stuff that is way beyond what can be explained in a forum.

The source code for all the Java core API classes is available in the src.zip file that comes with the JDK download. You can look there. Or google for something like "regular expression specification".

What exactly happens in the background when I execute this? I want someone to explain step by step why each quantifier behaves the way it does.


Roughly--VERY roughly--it looks at each character and asks, "Can what I've matched so far, plus this next character, match the current part of the regex? If not, then if I back up a character, can what I've matched so far match the current part of the regex while the next character matches the next part of the regex?"
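
One thing that may clear up the "multiple n's and u's" confusion: a quantifier applies only to the element immediately before it. So "nu+" means one "n" followed by one or more "u", not one or more "nu". To repeat the pair you need a group, "(nu)+". Below is a rough sketch for experimenting with this; the class QuantifierDemo and the helper findAll are invented for the example, and it assumes the matches are collected with Matcher.find(), which may not be how the original test was run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: collects every successive match that find() reports.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "nnnnnnuuuuuuuuulllll";
        // Each quantifier below binds to the 'u' only, never to "nu" as a unit.
        System.out.println("nu+   -> " + findAll("nu+", input));
        System.out.println("nu*   -> " + findAll("nu*", input));
        System.out.println("nu?   -> " + findAll("nu?", input));
        // A group makes the quantifier apply to the whole pair.
        System.out.println("(nu)+ -> " + findAll("(nu)+", "nununu"));
    }
}
```

With "nu*" and "nu?", an "n" followed by no "u" at all is a valid match, so every "n" in the input produces its own match, and only the last one picks up any "u". With "nu+", only the last "n" qualifies, because it is the only one followed by at least one "u". If those separate matches are printed one after another with no separator, the output would look like the strings in the question.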
 
Darryl Burke
Bartender
Posts: 5167
I've found this tutorial very useful for gaining some understanding of Regex syntax: http://www.regular-expressions.info/
 