
Help with Regular Expressions

 
Robert Raps
Ranch Hand
Posts: 30
I have this string to search: "<<23423gd=,4><<233w,4234>"
I need to select every entry between << and >.
I tried the following pattern: "<<([.*&&[^<]])>"
But it fails.

 
Steve Luke
Bartender
Posts: 4181
Have you tested the simple cases? For example, my first test was this: <<(.*)>. That didn't work because it found 23423gd=,4><<233w,4234 as one group. So I made it reluctant: <<(.*?)>. It seemed to work on your posted example.
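
For anyone following along, here is a minimal Java sketch of that greedy-versus-reluctant comparison. The class and method names (GreedyVsReluctant, groups) are made up for illustration; only the two patterns and the sample string come from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Collects capture group 1 of every match of the regex in the input.
    // (Hypothetical helper, not part of any library.)
    static List<String> groups(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "<<23423gd=,4><<233w,4234>";
        // Greedy: .* runs to the last '>' in the string, so there is one big group.
        System.out.println(groups("<<(.*)>", input));
        // Reluctant: .*? stops at the first '>' it can, so each entry is its own group.
        System.out.println(groups("<<(.*?)>", input));
    }
}
```

The greedy version reports a single group, while the reluctant one reports the two entries separately.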
 
Winston Gutkowski
Bartender
Posts: 10575
Robert Raps wrote: I tried the following pattern: "<<([.*&&[^<]])>"

Steve's pattern is probably what I would have chosen too; but what you tried suggests that you also want to eliminate the situation where you encounter three "<"s in a row, so you could try: <<([^<>]*)>.

Note that this will return an empty group if it encounters <<>. It also won't work if you can have tags embedded inside <<...>.
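
To make those two caveats concrete, here is a rough Java sketch. The names NegatedClassDemo and groups are invented for the example, and the nested input "<<a<<b>>" is an assumed test case, not something from the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {
    // The pattern from this post: "<<", then any run of characters
    // that are neither '<' nor '>', then a closing '>'.
    static final Pattern ENTRY = Pattern.compile("<<([^<>]*)>");

    // Hypothetical helper: returns group 1 of every match.
    static List<String> groups(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(groups("<<23423gd=,4><<233w,4234>")); // the two entries
        System.out.println(groups("<<>"));      // one match whose group is empty
        System.out.println(groups("<<a<<b>>")); // only the innermost "b" is found
    }
}
```

With the nested input, the outer entry is silently skipped, which is the kind of surprise that makes regexes awkward for anything tag-like.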

Basically, regexes are NOT suited for tag parsing unless what you need is very simple.

Winston
 
Henry Wong
author
Sheriff
Posts: 23295
Robert Raps wrote: I have this string to search: "<<23423gd=,4><<233w,4234>"
I need to select every entry between << and >.
I tried the following pattern: "<<([.*&&[^<]])>"



Also, from the post, it looks like the OP is under the impression that quantifiers work inside character classes, which of course they do not. How could they? A character class matches a single character, so there is no such thing as "zero or more" inside it. Within the brackets, "." and "*" are just literal characters.
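
A quick way to check this in Java is sketched below. The class name CharClassDemo is made up; the calls are plain java.util.regex. In Java, && inside a class is intersection, so the OP's class works out to "a literal '.' or '*' that is also not '<'", i.e. exactly one '.' or '*'.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // The OP's original pattern, unchanged.
    static final Pattern ORIGINAL = Pattern.compile("<<([.*&&[^<]])>");

    public static void main(String[] args) {
        // [.*] matches exactly one character, either '.' or '*'.
        System.out.println(Pattern.matches("[.*]", "*"));  // true
        System.out.println(Pattern.matches("[.*]", "ab")); // false
        System.out.println(Pattern.matches("[.*]", "**")); // false: two characters

        // So the original pattern only matches things like "<<*>" or "<<.>".
        System.out.println(ORIGINAL.matcher("<<*>").matches());                    // true
        System.out.println(ORIGINAL.matcher("<<23423gd=,4><<233w,4234>").find()); // false
    }
}
```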

Henry
 
Ramesh Pramuditha Rathnayake
Ranch Hand
Posts: 178


I suggest this.
Since Robert wants every entry between << and >, I don't think <<([^<>]*)> is a good fit: it finds no match in "<<234<23gd=,4>", even though "234<23gd=,4" is between << and >.
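
The claim about <<([^<>]*)> can be checked with a short Java sketch. Note the assumptions: the class name StrayBracketDemo is invented, and the comparison uses the reluctant pattern <<(.*?)> mentioned earlier in the thread as a stand-in, not necessarily the pattern suggested in this post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StrayBracketDemo {
    // Returns the first group-1 match of the regex in the input, or null if none.
    // (Hypothetical helper for this example only.)
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "<<234<23gd=,4>";
        // The negated class refuses to cross the stray '<', so nothing matches.
        System.out.println(firstGroup("<<([^<>]*)>", input)); // null
        // The reluctant pattern accepts any character up to the first '>'.
        System.out.println(firstGroup("<<(.*?)>", input));    // 234<23gd=,4
    }
}
```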
 
Winston Gutkowski
Bartender
Posts: 10575
Ramesh Pramuditha Rathnayake wrote: Since Robert wants every entry between << and >, I don't think <<([^<>]*)> is a good fit: it finds no match in "<<234<23gd=,4>", even though "234<23gd=,4" is between << and >.

Well if it's in HTML and it's a logical expression, then I would expect the generator to use the proper entity ('&lt;').

So, how would you propose to deal with "<<234>23gd=,4>" (if such an animal exists)?

This is just one of the reasons why, as I said before, regexes are not suited to parsing markup languages.

Winston
 
Ramesh Pramuditha Rathnayake
Ranch Hand
Posts: 178
Yes, that's true. There should be 2 answers:
234>23gd=,4
234


But I don't think regex supports that, and I don't know a way to extract both.

And for the main question too, there are 3 answers:
23423gd=,4><<233w,4234
23423gd=,4
233w,4234


Like Winston, I don't think regexes are suited for this.
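
For what it's worth, here is one way to list every substring that sits between some << and some later >, without using a regex at all. This is only a sketch under my own reading of "every entry"; the names AllBetween and allBetween are invented, and it is not the approach from any post in this thread.

```java
import java.util.ArrayList;
import java.util.List;

class AllBetween {
    // For each "<<" in the input, pair it with every '>' that comes after it
    // and collect the text in between.
    static List<String> allBetween(String input) {
        List<String> found = new ArrayList<>();
        for (int open = input.indexOf("<<"); open >= 0; open = input.indexOf("<<", open + 1)) {
            int start = open + 2; // first character after the "<<"
            for (int close = input.indexOf('>', start); close >= 0; close = input.indexOf('>', close + 1)) {
                found.add(input.substring(start, close));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(allBetween("<<234>23gd=,4>"));            // 2 answers
        System.out.println(allBetween("<<23423gd=,4><<233w,4234>")); // 3 answers
    }
}
```

Two nested indexOf loops are enough here because a single regex scan reports non-overlapping matches, whereas these answers overlap each other.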
 
Ramesh Pramuditha Rathnayake
Ranch Hand
Posts: 178
Although I said there was no way, I found one!


This gives 4 answers!
 