quantifiers.......

 
Ranch Hand
Posts: 362
Hi ranchers...

Can you explain the difference between greedy and reluctant quantifiers, with an example?


 
Ranch Hand
Posts: 952


Say the regex is ".*\\d" and the source is (with the index of each character underneath):

1a2b3c4d5
012345678



1 -> ends with digit
a2 -> ends with digit
b3 -> ends with digit
c4 -> ends with digit
d5 -> also ends with digit

You might expect these to be the matches, but that is not what you get.



A regex engine works from left to right and consumes characters as it goes, so you might expect it to find 1, a2, b3, c4, d5 one after another.

But here we used .*, which is a greedy quantifier. It does not stop at the first digit it could end on.
It first consumes as much of the source as it can, which here is the entire string. Nothing is then left for \\d, so the engine backs off (backtracks) from the right, one character at a time, until \\d can match. That is why it finds the rightmost possible match.

Here for source: 1a2b3c4d5

when the engine sees ".*" ----> it consumes the entire source
engine has read: 1a2b3c4d5

Then it backs off from the right, one character at a time.
1a2b3c4d5
........<---

The first character it gives back is 5, and 5 matches "\\d".
So it has a match for ".*\\d".

So it prints found: 1a2b3c4d5
and the index is 0, as 1a2b3c4d5 starts at index 0.
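
The test program for this post did not survive in the thread, so here is a minimal sketch of the greedy case. It assumes the usual java.util.regex Pattern/Matcher loop with find(); the class name GreedyQuantifierDemo and the helper findAll are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyQuantifierDemo {

    // Hypothetical helper: collects every match as "startIndex match",
    // in the order find() returns them.
    static List<String> findAll(String regex, String source) {
        Matcher m = Pattern.compile(regex).matcher(source);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.start() + " " + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Greedy: .* takes the whole source, then gives back characters
        // from the right until \d can match the last digit.
        for (String s : findAll(".*\\d", "1a2b3c4d5")) {
            System.out.println(s); // prints: 0 1a2b3c4d5
        }
    }
}
```

There is only one match, because after the first find() the whole source has already been consumed.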

The reluctant case is described in the next post.

[ December 11, 2008: Message edited by: Punit Singh ]

 
Punit Singh
Ranch Hand
Posts: 952


Now we use ".*?\\d". Here *? is reluctant: the engine moves from left to right, taking one more character at a time, and matches "zero or more occurrences of any character" followed by a digit. It stops at the first digit it can end on.

So for source: 1a2b3c4d5
1) At index 0 it sees "1": that is "zero occurrences of any character" followed by the digit 1.
So it finds: 1 at index 0.
2) The engine moves on to "a" at index 1. That is not a digit, so .*? takes "a" and the engine reads on to "2".
Now it has "a2": "zero or more" here means the one character 'a', followed by the digit 2. So it finds: a2 at index 1.
3) In the same way it goes on to find b3 at index 3, c4 at index 5 and d5 at index 7.
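
The steps above can be tried out with a small sketch like the following. It assumes the usual java.util.regex Pattern/Matcher loop with find(); the class name ReluctantQuantifierDemo and the helper findAll are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantQuantifierDemo {

    // Hypothetical helper: collects every match as "startIndex match",
    // in the order find() returns them.
    static List<String> findAll(String regex, String source) {
        Matcher m = Pattern.compile(regex).matcher(source);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.start() + " " + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Reluctant: .*? takes as few characters as it can, so each
        // match ends at the first digit the engine reaches.
        for (String s : findAll(".*?\\d", "1a2b3c4d5")) {
            System.out.println(s);
        }
        // prints:
        // 0 1
        // 1 a2
        // 3 b3
        // 5 c4
        // 7 d5
    }
}
```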


So the engine works differently in the greedy and the reluctant case.
Greedy tries to match as much as possible, so it finds bigger matches.
Reluctant tries to match as little as possible.
[ December 11, 2008: Message edited by: Punit Singh ]
 