RegEx question on how to match lines that DO NOT meet a specific pattern?

 
Michael D Sims
Ranch Hand
Posts: 113
I couldn't find a forum here that specifically addresses RegEx questions, so I'm posting here.

I am reading in a file that has over 70,000 lines of text. When I search for a specific END OF LINE pattern, which looks like this:

(\t\d{1,4}\t\d{1,9})$

I get FEWER than 70,000 hits. So I need to find the lines that do NOT match that search pattern, and I have not been able to figure out how so far.

I could do this in Java, or I could use the find tool in TextWrangler, which supports a RegEx syntax very similar to Perl's.

Thank you.

 
Darryl Burke
Bartender
Posts: 5167
Off the top of my head:

Can you provide a few samples of input lines and indicate which you would expect to match the pattern?
 
Carey Brown
Saloon Keeper
Posts: 3329
Darryl Burke wrote: Off the top of my head:

Can you provide a few samples of input lines and indicate which you would expect to match the pattern?

I have never seen the caret (^) used in this manner, as a logical NOT. The only two uses I can find for it are as a logical NOT inside a character class (e.g. [^a-z]) and as an anchor to the beginning of the input (e.g. ^.*$). Can you point me at any references for this?
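
To make the two meanings of the caret concrete, here is a small sketch. The class and method names are made up for illustration; only `java.util.regex.Pattern` and `Matcher.find()` come from the standard library.

```java
import java.util.regex.Pattern;

public class CaretDemo {
    // Inside a character class, ^ negates the class: any single char that is NOT a-z.
    static final Pattern NOT_LOWERCASE = Pattern.compile("[^a-z]");

    // Outside a character class, ^ anchors the match to the start of the input.
    static final Pattern STARTS_WITH_ABC = Pattern.compile("^abc");

    public static boolean containsNonLowercase(String s) {
        return NOT_LOWERCASE.matcher(s).find();
    }

    public static boolean startsWithAbc(String s) {
        return STARTS_WITH_ABC.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsNonLowercase("hello"));   // false
        System.out.println(containsNonLowercase("hello1"));  // true
        System.out.println(startsWithAbc("abcdef"));         // true
        System.out.println(startsWithAbc("xabc"));           // false
    }
}
```

In neither position does the caret negate a whole pattern, which is the point being made here.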
 
Carey Brown
Saloon Keeper
If you want to find lines that do not match a pattern, you could do this in Java like so:
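
A minimal sketch of this approach, not necessarily the code originally posted: test each line against the pattern from the question and keep the lines where `find()` returns false. The class name, method name, and sample lines are made up for illustration, and reading the real file is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class NonMatchingLines {
    // The end-of-line pattern from the original question.
    static final Pattern END_OF_LINE = Pattern.compile("(\\t\\d{1,4}\\t\\d{1,9})$");

    // Returns the lines in which the pattern is NOT found.
    public static List<String> nonMatching(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            if (!END_OF_LINE.matcher(line).find()) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the real input would come from the file.
        List<String> sample = Arrays.asList(
                "alpha\t12\t345",    // ends with the pattern
                "beta\t12345\t6",    // first number has five digits, so no match
                "gamma 12 345");     // spaces instead of tabs, so no match
        for (String line : nonMatching(sample)) {
            System.out.println(line);
        }
    }
}
```

The regular expression itself is unchanged; only the result of the test is inverted.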
 
Darryl Burke
Bartender
Carey, you're right. The regex I posted would attempt to match the entire line and wouldn't solve the problem.
 
fred rosenberger
lowercase baba
Bartender
Posts: 12565
if you use grep, the -v option says "return lines that do NOT match this pattern"
 
Carey Brown
Saloon Keeper
fred rosenberger wrote: if you use grep, the -v option says "return lines that do NOT match this pattern"


Output

So, it may work for grep but doesn't work for Java.

Edit: Hmmm, see what you mean about grep, -v inverts the match but does not change the actual regular expression.
 
fred rosenberger
lowercase baba
Bartender
Carey Brown wrote: Edit: Hmmm, see what you mean about grep, -v inverts the match but does not change the actual regular expression.

yes...so

grep fred myfile.txt

will return every line in the file that has "fred" in it anywhere

grep -v fred myfile.txt

will return the lines that do NOT contain "fred". They are effectively the inverse.
 
Carey Brown
Saloon Keeper
So, the Java equivalent of "grep -v" would be this...
Carey Brown wrote: If you want to find lines that do not match a pattern you could do this in java like so:
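
For comparison with `grep -v`, the same inversion can also be sketched with streams, assuming Java 8 or later. `Pattern.asPredicate()` gives a predicate that behaves like `find()`, and `negate()` flips it. The `grepV` helper and the sample lines are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class GrepV {
    // Keeps only the lines in which the regex is NOT found, like "grep -v".
    public static List<String> grepV(String regex, List<String> lines) {
        Predicate<String> found = Pattern.compile(regex).asPredicate();
        return lines.stream()
                .filter(found.negate())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("fred was here", "nobody home", "alfred too");
        // "alfred too" contains "fred", so only "nobody home" is left.
        System.out.println(grepV("fred", lines)); // [nobody home]
    }
}
```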
 
Michael D Sims
Ranch Hand
Carey Brown wrote: If you want to find lines that do not match a pattern you could do this in java like so:

Now this is brilliant! Why didn't I think of this? ... I think I assumed that using pattern matchers and the subsequent

while(m.find())

loop to manipulate my matches was strictly a positive search feature. I didn't consider simply negating m.find() to find the lines that don't match. I guess I assumed Java only presented the matching lines inside the loop, with no option for the lines that didn't match. But when I think about it, if each line is tested on its own, the coder can just as easily see the ones that don't match. I feel like an idiot for not even considering it.

Thank you!

Mike
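
The difference described above can be sketched as follows. One matcher run over the whole text with `while (m.find())` only ever visits the matches, while testing each line separately lets the result be negated. The class name, method names, and sample text are assumptions for illustration; `Pattern.MULTILINE` is used so that `$` also matches at the end of each line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVersusNotFind {
    // Pattern from the question; MULTILINE makes $ match before each line terminator.
    static final Pattern P = Pattern.compile("\\t\\d{1,4}\\t\\d{1,9}$", Pattern.MULTILINE);

    // One matcher over the whole text: the loop body only runs for matches.
    public static int countMatches(String text) {
        Matcher m = P.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // One test per line: lines where find() is false are collected.
    public static List<String> linesWithoutMatch(String text) {
        List<String> out = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (!P.matcher(line).find()) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "a\t1\t2\nb 1 2\nc\t3\t4";
        System.out.println(countMatches(text));      // 2
        System.out.println(linesWithoutMatch(text)); // [b 1 2]
    }
}
```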
 