AND operation in java RegEX and Pattern

 
Ranch Hand
Posts: 91
Hi All,

When we use Pattern and Matcher to check whether a string complies with a regular expression, how do we express an AND condition?

For example, let's say I have a really big string. To find out whether that string contains any one of the words 'DOG', 'CAT' or 'ELEPHANT', I can use the logical OR (|) symbol. But to find out whether that string contains all three words, 'DOG' and 'CAT' and 'ELEPHANT', how do we construct a pattern?
 
Ranch Hand
Posts: 423
Hi,

Regex has no explicit AND operator.
But one could say that regex has an implicit AND operator.
The regex pattern "ABC" means: a string matches the pattern "ABC" if it has A as the first letter AND B as the second letter AND C as the third letter.

Remember that regex is an engine that operates on sequences of characters/words, not on sets of characters/words,
so the position of a word or a character in a string is significant, and regex is probably not the best tool for checking this kind of condition.

To check whether a string contains both words 'DOG' and 'CAT', we can use this pattern:
(DOG.*CAT)|(CAT.*DOG)
Here we have 2 words and 2 possible orderings of those words in a string.

For three words 'DOG', 'CAT' and 'ELE' we have 6 possible orderings, so our regex pattern must be:
(DOG.*CAT.*ELE)|(DOG.*ELE.*CAT)|(CAT.*DOG.*ELE)|(CAT.*ELE.*DOG)|(ELE.*CAT.*DOG)|(ELE.*DOG.*CAT)
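A minimal sketch of how the six-way alternation above could be used from Java. The class and method names (AlternationAnd, hasAllThree) are made up for illustration. It assumes the input has no line breaks, because '.' does not match line terminators unless Pattern.DOTALL is set.

```java
import java.util.regex.Pattern;

final class AlternationAnd {
    // Pattern taken from the post: one alternative per ordering of the three words.
    static final Pattern ALL_THREE = Pattern.compile(
            "(DOG.*CAT.*ELE)|(DOG.*ELE.*CAT)|(CAT.*DOG.*ELE)"
            + "|(CAT.*ELE.*DOG)|(ELE.*CAT.*DOG)|(ELE.*DOG.*CAT)");

    // Hypothetical helper name, not from the thread.
    static boolean hasAllThree(String text) {
        // find() searches anywhere in the input; matches() would
        // require the whole input to match the pattern.
        return ALL_THREE.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasAllThree("an ELEPHANT, a CAT and a DOG")); // true
        System.out.println(hasAllThree("a CAT and a DOG"));              // false
    }
}
```

The number of alternatives grows as n! with the number of words, which is why this form stops being practical quickly.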

It is easier and much faster to check this condition this way:
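The code that originally followed is missing from this copy of the thread. A plausible sketch, assuming the idea was plain substring checks with String.contains (the class and method names here are hypothetical):

```java
final class ContainsAll {
    // Returns true only if every word occurs somewhere in text, in any order.
    static boolean containsAll(String text, String... words) {
        for (String word : words) {
            if (!text.contains(word)) {
                return false; // stop at the first missing word
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String text = "ELEPHANT then DOG then CAT";
        System.out.println(containsAll(text, "DOG", "CAT", "ELEPHANT"));               // true
        System.out.println(containsAll("DOG and CAT only", "DOG", "CAT", "ELEPHANT")); // false
    }
}
```

No pattern has to be compiled, and the order of the words does not matter.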



Edit:
On second thought, you could use positive lookaheads to check this condition;
here is an example:
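The original example is missing from this copy of the thread. The sketch below reuses the pattern quoted later in the discussion, "^(?=.*CAT)(?=.*ELEPHANT)(?=.*DOG).*". The class name, the method name and the Pattern.DOTALL flag are additions for illustration. DOTALL is an assumption that the "really big string" may contain line breaks, which '.' would otherwise not match.

```java
import java.util.regex.Pattern;

final class LookaheadAnd {
    // Each (?=.*WORD) asserts that WORD occurs somewhere after the current
    // position without consuming input, so the three checks act as an AND.
    static final Pattern ALL_THREE = Pattern.compile(
            "^(?=.*CAT)(?=.*ELEPHANT)(?=.*DOG).*", Pattern.DOTALL);

    // Hypothetical helper name, not from the thread.
    static boolean hasAllThree(String text) {
        return ALL_THREE.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasAllThree("DOG\nCAT\nELEPHANT")); // true
        System.out.println(hasAllThree("DOG and CAT"));        // false
    }
}
```

Adding a fourth word means adding one more lookahead, rather than multiplying the number of alternatives.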

 
author
Posts: 23868

Ireneusz Kordal wrote:
On second thought, you could use positive lookaheads to check this condition,



Agree++. Positive lookaheads are the way to implement ANDs in a regex.

However, at that point you are no longer in easy, beginner-level regex territory.

Henry
 
Karthick Dharani Vidhya
Ranch Hand
Posts: 91
Thanks a lot.

In my requirement the order is always the same: CAT comes first, DOG somewhere in the middle, and ELE after that, so I used ".*CAT.*DOG.*ELE.*". But that is not going to work if the order changes.

Thanks again, because with the positive lookahead suggested above I no longer have to worry about the order.

One more question: why do we have ^ in "^(?=.*CAT)(?=.*ELEPHANT)(?=.*DOG).*"? I know that ^ indicates the beginning of a line, but I could not identify its purpose here.

Note: even the OR-based pattern that you mentioned above can be simplified. I hope that is clear now that positive lookahead has come up.
 
Ireneusz Kordal
Ranch Hand
Posts: 423

Karthick Dharani Vidhya wrote:
One more question: why do we have ^ in "^(?=.*CAT)(?=.*ELEPHANT)(?=.*DOG).*"? I know that ^ indicates the beginning of a line, but I could not identify its purpose here.


Yes, you are right: ^ (beginning of line) is not required here.
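A small sketch of why the anchor is redundant, assuming the pattern is applied with Matcher.matches(). That method only succeeds when the whole input matches from the first character, so a leading ^ adds nothing. The class and method names are hypothetical. With find() the anchor would matter, since it stops the engine from retrying the lookaheads at every later start position when the input does not match.

```java
import java.util.regex.Pattern;

final class AnchorCheck {
    static final Pattern WITH_ANCHOR =
            Pattern.compile("^(?=.*CAT)(?=.*ELEPHANT)(?=.*DOG).*");
    static final Pattern WITHOUT_ANCHOR =
            Pattern.compile("(?=.*CAT)(?=.*ELEPHANT)(?=.*DOG).*");

    // True when both patterns give the same answer under matches().
    static boolean sameResult(String text) {
        return WITH_ANCHOR.matcher(text).matches()
                == WITHOUT_ANCHOR.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(sameResult("a DOG, a CAT, an ELEPHANT")); // true
        System.out.println(sameResult("only a DOG"));                // true
    }
}
```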

But if you are worried about performance (who isn't?), look at this simple test:
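The test code and its results are missing from this copy of the thread, and they are not reconstructed here. As a rough stand-in, the sketch below times a String.contains check against the lookahead pattern on a large input. Everything in it (class name, helper names, input shape, input size) is an assumption for illustration. A single timed run like this is only indicative, since it ignores JIT warm-up.

```java
import java.util.regex.Pattern;

final class AndCheckTiming {
    static final Pattern LOOKAHEAD = Pattern.compile(
            "(?=.*CAT)(?=.*ELEPHANT)(?=.*DOG).*", Pattern.DOTALL);

    static boolean byContains(String s) {
        return s.contains("CAT") && s.contains("ELEPHANT") && s.contains("DOG");
    }

    static boolean byRegex(String s) {
        return LOOKAHEAD.matcher(s).matches();
    }

    // Hypothetical input: filler characters with the three words at the end.
    static String sample(int fillerLength) {
        StringBuilder sb = new StringBuilder(fillerLength + 32);
        for (int i = 0; i < fillerLength; i++) {
            sb.append('x');
        }
        return sb.append(" DOG CAT ELEPHANT").toString();
    }

    public static void main(String[] args) {
        String text = sample(1_000_000);
        long t0 = System.nanoTime();
        boolean a = byContains(text);
        long t1 = System.nanoTime();
        boolean b = byRegex(text);
        long t2 = System.nanoTime();
        System.out.println("contains: " + a + " in " + (t1 - t0) / 1_000 + " us");
        System.out.println("regex:    " + b + " in " + (t2 - t1) / 1_000 + " us");
    }
}
```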


Results:

Look at test 4 ... do you still prefer to use the regex engine to check this simple condition on huge strings?
 