Please help me interpret this pattern

 
Jacob Sonia
Ranch Hand
Posts: 183
("<a\\b[^>]*href=\"[^>]*>(.*?)</a>")

I understand that * means 0 or more, [^] means negation, and ? means 0 or 1, but I still cannot understand the whole pattern.

Thanks,
See
 
Wouter Oet
Saloon Keeper
Posts: 2700
IntelliJ IDE Opera
Here you can look up what the other characters mean. Just from scanning it, it appears to match an HTML link tag such as <a href="something">Something else</a>
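
A quick way to check that reading is to run the pattern from the question against a sample link. This is only a minimal sketch: the class name LinkPatternDemo and the helper containsLink are made up for illustration, and the pattern is copied unchanged from the original post.

```java
import java.util.regex.Pattern;

class LinkPatternDemo {
    // The pattern exactly as posted in the question (Java string escapes included).
    static final Pattern LINK = Pattern.compile("<a\\b[^>]*href=\"[^>]*>(.*?)</a>");

    // Hypothetical helper: true if the text contains something the pattern matches.
    static boolean containsLink(String html) {
        return LINK.matcher(html).find();
    }

    public static void main(String[] args) {
        // An anchor tag with an href attribute is matched.
        System.out.println(containsLink("<a href=\"something\">Something else</a>")); // true
        // Text without an <a ...> tag is not.
        System.out.println(containsLink("<p>no link here</p>")); // false
    }
}
```

Note that find() looks for the pattern anywhere in the input, whereas matches() would require the whole string to be a single link.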
 
Campbell Ritchie
Sheriff
Pie
Posts: 49451
64
Not too hot on regexes myself, but I think
Are you short of a second double-quote? You open quotes after href, but I can't see a closing quote.

That's what I can make of it. Let's see whether Rob has managed a better and quicker answer
 
Campbell Ritchie
Sheriff
Pie
Posts: 49451
64
Wouter Oet wrote: Here you can look up . . .
Good idea. Another place to look is the Pattern class.
 
Rob Spoor
Sheriff
Pie
Posts: 20552
57
Chrome Eclipse IDE Java Windows
Campbell Ritchie wrote:Let's see whether Rob has managed a better and quicker answer

I would have split [^>]*href= into two separate parts, and would have explained the parentheses a bit more, but other than that it's a good explanation. And yes, a closing \" is missing.

The parentheses around .*? are most likely used as a capturing group, which lets the user retrieve the hyperlink's label from the match.
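
To illustrate the capturing group, here is a rough sketch that pulls the labels out with Matcher.group(1). The class LinkLabelExtractor and the method labels are invented for this example. The pattern is a variant of the posted one with the closing quote added, which is an assumption about what the original author intended. The posted version still matches simple links, because its [^>]* after href=" also swallows the closing quote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkLabelExtractor {
    // Variant of the posted pattern with a closing quote (assumed intent), piece by piece:
    //   <a\b      literal "<a" followed by a word boundary, so "<abbr" does not match
    //   [^>]*     any other attributes before href
    //   href="    literal href="
    //   [^">]*    the URL, up to the closing quote
    //   "         the closing quote that was missing
    //   [^>]*>    any remaining attributes, then the end of the opening tag
    //   (.*?)     group 1: the label; reluctant, so it stops at the first </a>
    //   </a>      the closing tag
    static final Pattern LINK =
            Pattern.compile("<a\\b[^>]*href=\"[^\">]*\"[^>]*>(.*?)</a>");

    // Hypothetical helper: collects the text captured by group 1 for every link found.
    static List<String> labels(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<a href=\"a.html\">First</a> and "
                + "<a class=\"x\" href=\"b.html\">Second</a>";
        System.out.println(labels(html)); // [First, Second]
    }
}
```

Because . does not match line terminators by default, a label that spans several lines would need Pattern.DOTALL, and for real-world HTML a proper parser is usually the safer choice.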
 