How to write a single regex that includes one text but excludes another?

 
pkinuk Buler
Ranch Hand
Posts: 63
Hi all,

I've gone through a lot of examples on JavaRanch and other websites, but I still can't write a single regex that requires one text and excludes another.

I have an example: haha.hello.common.exceptions.SchemaValidationException: Validated XML message - message invalid. [Error Code cegst01_350]\r\n\n\tat haha.hello.common.util.XMLUtility.validateMessage(XMLUtility.java:257)\n\tat haha.hello.mama.papa.release2.socketxml.readers.InputReaderBaseImpl.a(InputReaderBaseImpl.java:7)\n\tat haha.hello.mama.papa.release2.socketxml.readers.InputReaderBaseImpl.<init>(InputReaderBaseImpl.java:1)\n\tat haha.hello.mama.papa.release2.socketxml.readers.InputReaderWithSSODataImpl.<init>(InputReaderWithSSODataImpl.java:11)\n\tat haha.hello.mama.papa.release2.socketxml.readers.GetRolesReaderImpl.<init>(GetRolesReaderImpl.java:4)\n\tat haha.hello.mama.papa.release2.socketxml.CoreInputReaderFactoryImpl.newInputReader(CoreInputReaderFactoryImpl.java:85)\n\tat haha.hello.mama.papa.release2.socketxml.CoreInputReaderFactoryImpl.newInputReader(CoreInputReaderFactoryImpl.java:14)\n\tat haha.hello.mama.papa.release2.socketxml.CoreInputReaderFactoryImpl.newInputReader(CoreInputReaderFactoryImpl.java:52)\n\tat haha.hello.mama.papa.socket.protocols.NewlineDelimitedAPIBridge.process(NewlineDelimitedAPIBridge.java:108)\n\tat haha.hello.mama.papa.socket.protocols.NewlineDelimitedAPIBridge.process(NewlineDelimitedAPIBridge.java:20)\n\tat haha.hello.mama.papa.socket.SocketConnection.d(SocketConnection.java:29)\n\tat haha.hello.mama.papa.socket.SocketConnection.run(SocketConnection.java:172)\n\tat haha.hello.mama.papa.socket.SocketThreadManager$SocketThread.run(SocketThreadManager.java:7)\nCaused by: org.xml.sax.SAXParseException: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '1' for type 'NonEmptyString'.\n\tat com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)\n\tat com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:131)\n\tat 
com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:384)\n\tat com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:318)\n\tat com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(XMLSchemaValidator.java:410)\n\tat com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.reportSchemaError(XMLSchemaValidator.java:3165)\n\tat com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.elementLocallyValidType(XMLSchemaValidator.java:3068)\n\tat com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.processElementContent(XMLSchemaValidator.java:2978)\n\tat com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleEndElement(XMLSchemaValidator.java:2121)\n\tat com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.endElement(XMLSchemaValidator.java:791)\n\tat com.sun.org.apache.xerces.internal.jaxp.validation.DOMValidatorHelper.finishNode(DOMValidatorHelper.java:338)\n\tat com.sun.org.apache.xerces.internal.jaxp.validation.DOMValidatorHelper.validate(DOMValidatorHelper.java:243)\n\tat com.sun.org.apache.xerces.internal.jaxp.validation.DOMValidatorHelper.validate(DOMValidatorHelper.java:186)\n\tat com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorImpl.validate(ValidatorImpl.java:100)\n\tat javax.xml.validation.Validator.validate(Validator.java:127)\n\tat haha.hello.common.util.XMLUtility.validateMessage(XMLUtility.java:8)

The string is rather long, but it is a log message. What I plan to do in one regex is:
1. Check if the text contains
2. Make sure the text doesn't contain

Matcher.find() should return true only if the text fulfills both conditions.

Can anyone help me to write the regex?

Thank you in advance
 
Henry Wong
author
Marshal
Posts: 21514
The easiest way to do this is to attach a negative look-ahead (generally at the beginning of the regex) to the regex that searches for the item you want. With the negative look-ahead in place, the regex always fails if it finds the component you don't want.

Look-aheads are a fairly advanced feature, so I suggest that you start there, but don't be surprised if you run into trouble and have to backtrack to learn other parts of regex first.

Henry
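The look-ahead idea described above can be sketched roughly as follows. This is only an illustration under assumptions: the class name, the method name, and the two short search words are hypothetical stand-ins, not the texts from the original log message.

```java
import java.util.regex.Pattern;

class LookaheadSketch {
    // Hypothetical short stand-ins: the first word must appear, the second must not.
    // ^(?!.*NonEmptyString)        fails at the start if the unwanted word occurs later;
    // .*SchemaValidationException  then requires the wanted word.
    static final Pattern WANTED_ONLY =
            Pattern.compile("^(?!.*NonEmptyString).*SchemaValidationException");

    static boolean wantedOnly(String text) {
        return WANTED_ONLY.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(wantedOnly("x.SchemaValidationException: message invalid"));          // true
        System.out.println(wantedOnly("x.SchemaValidationException: bad type 'NonEmptyString'")); // false
        System.out.println(wantedOnly("no exception here"));                                      // false
    }
}
```

As written, `.` does not cross line breaks, so this sketch only inspects the first line of the input; a multi-line log message would need `Pattern.DOTALL`.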

 
pkinuk Buler
Ranch Hand
Posts: 63
Hi Henry, thank you for your quick reply. I applied a negative look-ahead, but it did not work: Matcher.find() still returned true. Here is what I tried.

The regex I used is:
SchemaValidationException(?(?!\QCaused by: org.xml.sax.SAXParseException: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '1' for type 'NonEmptyString'.\E).)*(?:\s)*)*


However, Matcher.find() returned true, and Matcher.group() returned:
SchemaValidationException: Validated XML message - message invalid. [Error Code cegst01_350]\r\n\n\tat com.qxlva.common.util.XMLUtility.validateMessage(XMLUtility.java:257)\n\tat com.qxlva.nhs.api.release2.socketxml.readers.InputReaderBaseImpl.a(InputReaderBaseImpl.java:7)\n\tat com.qxlva.nhs.api.release2.socketxml.readers.InputReaderBaseImpl.<init>(InputReaderBaseImpl.java:1)\n\tat com.qxlva.nhs.api.release2.socketxml.readers.InputReaderWithSSODataImpl.<init>(InputReaderWithSSODataImpl.java:11)\n\tat com.qxlva.nhs.api.release2.socketxml.readers.GetRolesReaderImpl.<init>(GetRolesReaderImpl.java:4)\n\tat com.qxlva.nhs.api.release2.socketxml.CoreInputReaderFactoryImpl.newInputReader(CoreInputReaderFactoryImpl.java:85)\n\tat com.qxlva.nhs.api.release2.socketxml.CoreInputReaderFactoryImpl.newInputReader(CoreInputReaderFactoryImpl.java:14)\n\tat com.qxlva.nhs.api.release2.socketxml.CoreInputReaderFactoryImpl.newInputReader(CoreInputReaderFactoryImpl.java:52)\n\tat com.qxlva.nhs.api.socket.protocols.NewlineDelimitedAPIBridge.process(NewlineDelimitedAPIBridge.java:108)\n\tat com.qxlva.nhs.api.socket.protocols.NewlineDelimitedAPIBridge.process(NewlineDelimitedAPIBridge.java:20)\n\tat com.qxlva.nhs.api.socket.SocketConnection.d(SocketConnection.java:29)\n\tat com.qxlva.nhs.api.socket.SocketConnection.run(SocketConnection.java:172)\n\tat com.qxlva.nhs.api.socket.SocketThreadManager$SocketThread.run(SocketThreadManager.java:7)\n


The Java code is:


Could anybody tell me how to fix it?

Thank you in advance
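One possible explanation for the result above: a pattern of the general form `A(?:(?!B).)*` does not require B to be absent. It simply stops consuming just before B, so find() succeeds whenever A is present. The group() output shown above, which ends right before "Caused by", is consistent with that. A minimal sketch, using hypothetical one-letter stand-ins "A" and "B" and hypothetical class and method names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemperedDotSketch {
    // "A" and "B" are hypothetical stand-ins for the wanted and unwanted text.
    // TEMPERED consumes characters after A as long as B does not start at that position.
    static final Pattern TEMPERED = Pattern.compile("A(?:(?!B).)*", Pattern.DOTALL);
    // ANCHORED rejects the whole input up front if B occurs anywhere in it.
    static final Pattern ANCHORED = Pattern.compile("^(?!.*B).*A", Pattern.DOTALL);

    // Returns the text matched by TEMPERED, or null when there is no match.
    static String temperedGroup(String text) {
        Matcher m = TEMPERED.matcher(text);
        return m.find() ? m.group() : null;
    }

    static boolean anchoredFind(String text) {
        return ANCHORED.matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "xxAyy\nzzBww";
        System.out.println(temperedGroup(text) != null); // true: the match just stops before B
        System.out.println(anchoredFind(text));          // false: B is present
        System.out.println(anchoredFind("xxAyy"));       // true: A without B
    }
}
```

Moving the negative look-ahead to the start of the pattern and letting it scan the whole input is what turns "stop before B" into "fail if B is present".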
 
Winston Gutkowski
Bartender
Posts: 10527
pkinuk Buler wrote: Can anyone help me to write the regex?

Yes. DON'T.

Regexes were originally designed for string patterns contained in a single line and, although they've been extended to handle multi-line matches, I'd suggest that whatever regex you come up with is likely to be unwieldy.

It seems to me that you are searching for two patterns in a multi-line block, so my pseudo-code would look something like:

but I'm quite sure there are other solutions.

Winston

[Edit] @pkinuk: Above is not quite right. You shouldn't read a new line in the outer loop if a match on the 1st string was already found; I leave it to you to correct.
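A line-by-line scan along the lines suggested above might look roughly like this. It is a sketch under assumptions, not the pseudo-code from the post: the class name, method name, and sample strings are hypothetical, and it assumes each search text fits on a single line.

```java
class LineScanSketch {
    // Returns true when some line contains 'wanted' and no line contains 'unwanted'.
    static boolean containsButNot(String block, String wanted, String unwanted) {
        boolean foundWanted = false;
        for (String line : block.split("\\R")) { // \R matches any line break (Java 8+)
            if (line.contains(unwanted)) {
                return false;                    // the excluded text rules the block out
            }
            if (line.contains(wanted)) {
                foundWanted = true;              // keep scanning in case 'unwanted' follows
            }
        }
        return foundWanted;
    }

    public static void main(String[] args) {
        String wanted = "SchemaValidationException";
        String unwanted = "Caused by: org.xml.sax.SAXParseException";
        String block = "x.SchemaValidationException: invalid\n\tat x.Util.validate(Util.java:257)";

        System.out.println(containsButNot(block, wanted, unwanted));
        System.out.println(containsButNot(block + "\n" + unwanted + ": details", wanted, unwanted));
    }
}
```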
 
Victor M. Pereira
Ranch Hand
Posts: 50
I'm not sure if I fully understand what you want, but it seems that your regex is missing a couple of things.

First, the regex you wrote is telling me that the String must begin with "SchemaValidationException" or it must not have Caused By ...

I believe that you are seeking something like: ([a-Z]|[0-9])*(SchemaValidationException|!(Caused by: org.xml.sax.SAXParseException: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '1' for type 'NonEmptyString'.))+([a-Z]|[0-9])*

I recommend trimming the String, since spaces complicate the regex and one missing space might ruin your method.

That is how I would do it with a regex; however, from your test case it seems that indexOf would be easier to apply.
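For comparison, the indexOf approach mentioned above can be sketched without any regex. The class and method names here are hypothetical. Note also that, as far as java.util.regex is concerned, a range such as `[a-Z]` is rejected as an illegal character range and `!` on its own is just a literal character, so the regex suggested above would need reworking before it could be tried.

```java
class IndexOfSketch {
    // True when 'wanted' occurs somewhere in the text and 'unwanted' occurs nowhere.
    static boolean containsButNot(String text, String wanted, String unwanted) {
        return text.indexOf(wanted) >= 0 && text.indexOf(unwanted) < 0;
    }

    public static void main(String[] args) {
        String wanted = "SchemaValidationException";
        String unwanted = "NonEmptyString";

        System.out.println(containsButNot("x.SchemaValidationException: invalid", wanted, unwanted));
        System.out.println(containsButNot("x.SchemaValidationException: type 'NonEmptyString'", wanted, unwanted));
    }
}
```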

 
pkinuk Buler
Ranch Hand
Posts: 63
Winston, thank you for your reply. I wish I could use your pseudo-code to check the string, but it is one of our client's requirements that we use a single regex to check the following conditions:

1. Check if the string contains 'A'
2. If the above condition is true, check that the string doesn't contain 'B'

 
pkinuk Buler
Ranch Hand
Posts: 63
Victor, thank you for your reply. As I said, I have to use a single regex to perform the following checks:

1. Check if the string contains A
2. If the string contains A, check that it doesn't contain B

I've simplified the example. Hopefully someone can fix the problem.

Thank you in advance

 
Jeff Verdegan
Bartender
Posts: 6109
It took me about a minute on Google to find exactly what you seem to be looking for. It was in the first link that was returned.



The above will return true if the string contains X but not Y.

Is that what you were asking about, or have I misunderstood something?
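One common way to write a "contains X but not Y" check as a single regex, which may or may not be the pattern referred to above, is `^(?!.*Y).*X`. The sketch below applies it to a multi-line message, assuming `Pattern.DOTALL` so that `.` crosses line breaks and `Pattern.quote` so that the search texts are treated as literals. The class name, method name, and shortened sample strings are hypothetical.

```java
import java.util.regex.Pattern;

class QuotedLookaheadSketch {
    // Builds ^(?!.*Y).*X with both texts quoted as literals.
    // DOTALL lets '.' match line terminators, so the whole message is inspected.
    static Pattern containsButNot(String x, String y) {
        return Pattern.compile(
                "^(?!.*" + Pattern.quote(y) + ").*" + Pattern.quote(x),
                Pattern.DOTALL);
    }

    public static void main(String[] args) {
        String x = "SchemaValidationException";
        // Shortened form of the unwanted text from the log message.
        String y = "Caused by: org.xml.sax.SAXParseException: cvc-minLength-valid";
        Pattern p = containsButNot(x, y);

        String onlyX = "haha.hello.common.exceptions.SchemaValidationException: invalid\n"
                + "\tat haha.hello.common.util.XMLUtility.validateMessage(XMLUtility.java:257)";
        String both = onlyX + "\n" + y + ": Value ''";

        System.out.println(p.matcher(onlyX).find()); // true
        System.out.println(p.matcher(both).find());  // false
    }
}
```

Quoting matters here because the real search texts contain characters such as `.` and `'` that should not be interpreted as regex syntax.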
 