
Question About Regex Chapter-6 K&B

 
Been Zaidi
Greenhorn
Posts: 8
Hi.
I am reading Chapter 6 of the Kathy and Bert book (K&B), on the topic of quantifiers. The book gives an example that uses
a regular expression to find all file names starting with proj1. Here is the source text:


"proj3.txt,proj1sched.pdf,proj1,proj2,proj1.java"


The regular expression given to find these names is



The book states that the key part of the expression is matching zero or more occurrences of any character that is not a comma.
The expression doesn't contain a character token like \w, so how can it find zero or more occurrences of characters that are not a comma?
Can someone please elaborate?

Best Regards,
 
Henry Wong
author
Marshal
Posts: 21419
Been Zaidi wrote: Hi.
I am reading Chapter 6 of the Kathy and Bert book (K&B), on the topic of quantifiers. The book gives an example that uses
a regular expression to find all file names starting with proj1. Here is the source text:


"proj3.txt,proj1sched.pdf,proj1,proj2,proj1.java"


The regular expression given to find these names is



The book states that the key part of the expression is matching zero or more occurrences of any character that is not a comma.
The expression doesn't contain a character token like \w, so how can it find zero or more occurrences of characters that are not a comma?
Can someone please elaborate?


Best Regards,



Hint: Are there any word characters that are also not "not a comma"?

 
Philip Thamaravelil
Ranch Hand
Posts: 99
Been,
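You don't need to specify \w. You match the rest of the name by matching "not a comma",

so when the expression reaches the "," it stops matching.


* - Zero or more instances of the preceding element (a character, a character class, or a group).

() - Groups the enclosed expression and stores the text it matched.

[ ] - A character class: matches any one of the listed characters. A leading ^, as in [^,], negates the class, so it matches any one character that is not listed.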
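
To see these three pieces working together, here is a minimal sketch. It assumes the book's pattern is proj1([^,])*, which is what the description above suggests; the class and method names (Proj1Finder, findAll) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Proj1Finder {
    // Assumed pattern: the literal "proj1", then zero or more characters
    // that are not a comma. The parentheses make [^,] capturing group 1.
    static final Pattern PROJ1 = Pattern.compile("proj1([^,])*");

    // Collects every match of the pattern in the input, left to right.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = PROJ1.matcher(input);
        while (m.find()) {
            found.add(m.group()); // group() with no argument is the whole match
        }
        return found;
    }

    public static void main(String[] args) {
        String files = "proj3.txt,proj1sched.pdf,proj1,proj2,proj1.java";
        // prints [proj1sched.pdf, proj1, proj1.java]
        System.out.println(findAll(files));
    }
}
```

Each match runs up to, but not including, the next comma, because [^,] accepts letters, digits, and dots alike and rejects only the comma. The bare "proj1" entry still matches because * allows zero repetitions.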

Make sense?

Cheers,
Philip
 
Been Zaidi
Greenhorn
Posts: 8
Dear Philip,

Your post was very helpful. So the key is this expression; please correct me if I am wrong. When we write



it means the character can be anything except a comma: a digit, a letter, or anything else, as long as it is not a
comma. And * means zero or more occurrences.

Thanks,
Been
 
Philip Thamaravelil
Ranch Hand
Posts: 99
Been Zaidi wrote:



it means the character can be anything except a comma: a digit, a letter, or anything else, as long as it is not a
comma. And * means zero or more occurrences.


Terrific! Glad it helped. You are correct. Note that this expression also stores the matched value for later access; if you only need to match, without capturing, this works as well:
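
A minimal sketch of the difference, assuming the capturing form is proj1([^,])* and the match-only form is proj1[^,]* (both patterns are assumptions based on the discussion above; GroupDemo and its methods are hypothetical names):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns the first overall match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns how many capturing groups the pattern defines (group 0 is not counted).
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String files = "proj3.txt,proj1sched.pdf,proj1,proj2,proj1.java";
        String withParens = "proj1([^,])*"; // assumed capturing form
        String withoutParens = "proj1[^,]*"; // assumed match-only form

        // Both patterns find the same text.
        System.out.println(firstMatch(withParens, files));    // prints proj1sched.pdf
        System.out.println(firstMatch(withoutParens, files)); // prints proj1sched.pdf

        // Only the parenthesized pattern defines a capturing group.
        System.out.println(groupCount(withParens));    // prints 1
        System.out.println(groupCount(withoutParens)); // prints 0
    }
}
```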






 
Been Zaidi
Greenhorn
Posts: 8
Dear Philip,

Thanks. I have one last point of confusion. You said that with (), the expression stores the matching value. Could you elaborate on that a little,
comparing the expression with parentheses and without?

Thanks a ton,
Been
 
Henry Wong
author
Marshal
Posts: 21419
Been Zaidi wrote: Thanks. I have one last point of confusion. You said that with (), the expression stores the matching value. Could you elaborate on that a little,
comparing the expression with parentheses and without?


Parens define groups in a regular expression -- and you can fetch the sub-match (the match within the parens) by calling the group() method. Group zero is the entire matched string, while group 1, group 2, etc., are determined by the parens.

Having said that, groups that are followed by a quantifier don't work well IMO. In this example, there is only one subgroup (group 1), and it will hold only the last match (the character right before the comma). There isn't really a good way to get all the sub-matches using groups that are followed by quantifiers.
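
A small sketch of that behavior, again assuming the pattern under discussion is proj1([^,])*. The variant with the star moved inside the parens, proj1([^,]*), is not from this thread. It is shown only for contrast: it captures the whole tail as one string rather than each repetition. QuantifiedGroupDemo and groups are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifiedGroupDemo {
    // Returns groups 0..n of the first match, or null if nothing matches.
    // A group that never took part in the match is reported as null.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String files = "proj1sched.pdf,proj2";

        // Quantifier outside the group: group 1 keeps only the last repetition.
        String[] outside = groups("proj1([^,])*", files);
        System.out.println(outside[0]); // prints proj1sched.pdf
        System.out.println(outside[1]); // prints f

        // Quantifier inside the group: group 1 holds everything after "proj1".
        String[] inside = groups("proj1([^,]*)", files);
        System.out.println(inside[0]); // prints proj1sched.pdf
        System.out.println(inside[1]); // prints sched.pdf
    }
}
```

When the quantified group matches zero times, as in "proj1,proj2", group(1) returns null for the first pattern, so callers that read it should check for null.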

Henry

 
Been Zaidi
Greenhorn
Posts: 8
Dear Henry,

Thanks for the clarification. It helped me understand the concept.

Thanks,
Been
 