Regular Expression
Hi,

I would appreciate your help with a regular expression for the two files below, whose names contain the keywords RES and UDR. The requirement is to pick only the XML files from a directory whose names contain the keyword RES or UDR (or both). I am trying to find a regular expression that matches these criteria.


QM_2017121114173526400_WI90_KM_UDR.xml

QM_2017022019000817400_WI90_KM_RES.xml

While a regex would work, why not check filename.contains("_UDR") or filename.contains("_RES")? If you keep the list of strings to search for in an array, the approach scales easily.
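A minimal sketch of that substring approach in plain Java, for readers who can run Java code. The class name, method name, and keyword array are illustrative assumptions. Since the sample names end in _UDR.xml and _RES.xml, the sketch searches for "_UDR" and "_RES" without a trailing underscore.

```java
class KeywordFilter {
    // Keywords to look for; extend this array to support more file types.
    static final String[] KEYWORDS = {"_UDR", "_RES"};

    // Returns true for XML files whose name contains at least one keyword.
    static boolean isWanted(String fileName) {
        if (!fileName.endsWith(".xml")) {
            return false;
        }
        for (String keyword : KEYWORDS) {
            if (fileName.contains(keyword)) {
                return true;
            }
        }
        return false;
    }
}
```

Adding a new file type then only means adding one entry to the array.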
Hi,

Well, I need a regex since this will be used in an MQ MFT monitor to pick the files from a directory, and it's not a Java program.
If UDR/RES appear at the end of the file name:
".*_UDR\.xml"
".*_RES\.xml"


If UDR/RES can appear anywhere:
".*UDR.*"
".*RES.*"


I would need refinement of your requirements if this is not what you had in mind.
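To compare the two variants, here is a small sketch using java.util.regex. It assumes the monitor matches the pattern against the whole file name, which is how Matcher.matches() behaves; each pair of patterns is folded into one alternation purely for brevity, and the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class SuffixVsAnywhere {
    // Keyword must directly precede the .xml extension.
    static final Pattern AT_END = Pattern.compile(".*_(?:UDR|RES)\\.xml");

    // Keyword may appear anywhere in the name.
    static final Pattern ANYWHERE = Pattern.compile(".*(?:UDR|RES).*");

    static boolean matchesAtEnd(String fileName) {
        return AT_END.matcher(fileName).matches();
    }

    static boolean matchesAnywhere(String fileName) {
        return ANYWHERE.matcher(fileName).matches();
    }
}
```

Note that the "anywhere" form also accepts names where the letters occur inside a longer word and names that are not XML files, for example PRESS.txt.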
The RES and UDR keyword can appear anywhere.

Here is the thing. I have a directory, say F:\csdata\MessageBus\Maydown_Yarn\UploadQM, where two types of files would be present (say QM_2017022019000817400_WI90_KM_RES.xml and QM_2017121114173526400_WI90_KM_UDR.xml).

I have MQ MFT (a file transfer product from IBM MQ). The MQ MFT resource monitor can pick files from the monitored directory using either a wildcard or a Java regular expression. (Please see the attached screenshot.)

I am looking for a regular expression that I can write in the "Match Pattern" column to instruct the monitor to pick files having either the "RES" or "UDR" keyword.



MQ_MFT.png
In that case, something like

.*_UDR\.xml$|.*_RES\.xml$

should work. The syntax might differ a bit, but maybe not.
I would use ^.*(?:RES|UDR).*+$, if RES or UDR can appear truly anywhere in the file name. This causes only one quantifier to backtrack, giving you linear performance rather than quadratic.
Thanks Stephan & Tim. I tried both options and they worked. I will use ^.*(?:RES|UDR).*+$ since the keywords RES and UDR can appear anywhere in the file name.
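For anyone who wants to try the chosen pattern outside the monitor, a rough Java sketch follows. It assumes whole-name matching; the class name, method name, and the third file name in the usage note are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class MonitorPatternDemo {
    // The pattern from the thread, compiled once and reused.
    static final Pattern MATCH_PATTERN = Pattern.compile("^.*(?:RES|UDR).*+$");

    // Returns the names that the pattern accepts, in their original order.
    static List<String> select(List<String> fileNames) {
        List<String> picked = new ArrayList<>();
        for (String name : fileNames) {
            if (MATCH_PATTERN.matcher(name).matches()) {
                picked.add(name);
            }
        }
        return picked;
    }
}
```

Given the two sample names plus a hypothetical QM_2017022019000817400_WI90_KM_LOG.xml, select returns only the two sample names. The pattern does not check the .xml extension, so that restriction would have to come from elsewhere.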
Do you understand it? Do you know what all the components of this regex do and why they were chosen?
Question... why do you need the anchors? If you start and end your expression with ".*", do those actually matter?
Yes.

^.*(?:RES|UDR).*+$

^.* - Indicates the file name can start with any characters (except line breaks), zero or more times

(?:RES|UDR) - A non-capturing group matching the keyword RES "OR" UDR exactly once

.* - Any characters (except line breaks) following the keyword, zero or more times

+  - matches the preceding character one or more times

^ and $ - Indicates  whatever is in between them must cover the entire line end-to-end

fred rosenberger wrote: Question... why do you need the anchors? If you start and end your expression with ".*", do those actually matter?


It more accurately represents what you want to look for. Also, you can use it to find multiple filenames if you use it on a list of filenames separated by line separators.
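A sketch of that multi-line use in Java: with Pattern.MULTILINE, ^ and $ match at line boundaries, so one scan over a newline-separated listing returns each matching file name. The class name, method name, and sample listing are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiLineDemo {
    // MULTILINE makes ^ and $ match at the start and end of every line.
    static final Pattern LINE_PATTERN =
            Pattern.compile("^.*(?:RES|UDR).*+$", Pattern.MULTILINE);

    // Collects every line of the listing that contains RES or UDR.
    static List<String> findAll(String listing) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = LINE_PATTERN.matcher(listing);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }
}
```

For the listing "a_UDR.xml\nb_LOG.xml\nc_RES.xml" this yields a_UDR.xml and c_RES.xml.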

Varun Thonse Rao wrote:Yes.

+  - matches the preceding character one or more times


Why would I want to do that if I already used the * quantifier?

No, .*+ means any character, zero or more times, as much as possible, without backtracking. The + symbol makes the quantifier possessive.
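The difference between greedy and possessive is easiest to see on a tiny input. The sketch below is illustrative only; the class and method names are made up. The greedy .* gives back a character so the trailing c can match, while the possessive .*+ keeps everything it consumed and the match fails.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Greedy: .* first takes the whole input, then backtracks one step for 'c'.
    static boolean greedy(String input) {
        return Pattern.matches(".*c", input);
    }

    // Possessive: .*+ takes the whole input and never gives anything back.
    static boolean possessive(String input) {
        return Pattern.matches(".*+c", input);
    }

    // The pattern from the thread: only $ follows .*+, so nothing is lost.
    static boolean threadPattern(String input) {
        return Pattern.matches("^.*(?:RES|UDR).*+$", input);
    }
}
```

Here greedy("abc") is true and possessive("abc") is false. In the thread's pattern the possessive quantifier is safe because nothing after it needs characters handed back.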