Pattern matching problem

 
Greenhorn
Posts: 1
Hi there,

I have a string: "<a><b>qwer qwer</b></a><b>zxcv zcv</b>"
I want the output to be as follows:

<b>qwer qwer</b>
<b>zxcv zcv</b>


I tried the following, but the output I get is <b>qwer qwer</b></a><b>zxcv zcv</b>

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String newLine = System.getProperty("line.separator");
String input = "<a><b>qwer qwer</b></a><b>zxcv zcv</b>";
String output = "";
String regex = "<b>.*</b>"; // greedy: this is the pattern that gives the unwanted output
Pattern p1 = Pattern.compile(regex);
Matcher m1 = p1.matcher(input);
// collect every match, one per line
while (m1.find())
{
    output += m1.group() + newLine;
}

//System.out.println("input = " + input);
System.out.println("output = " + output);


I think that because I am using .*, the matcher keeps consuming the string and doesn't stop at the first </b> it finds.

Can anyone suggest how to fix this?

Thanks a lot in advance.
 
Ranch Hand
Posts: 188

pats shah wrote:
String regex = "<b>.*</b>"; // try "<b>([a-z]*|\\s*)*</b>"

 
Bartender
Posts: 4179
The problem is that the part of your regex that matches '0 or more characters' is too greedy: it takes everything it can get its hands on without breaking the match, which includes the intermediate tags. So this part of the regex:

.*

matches all of this text:

qwer qwer</b></a><b>zxcv zcv

Look at the Pattern javadocs to find a way to make it more reluctant to consume characters (i.e., don't consume those characters if they can be used by another part of the pattern).
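To see the greedy behaviour for yourself, here is a minimal sketch. It wraps the .* in a capturing group so you can print exactly what that part consumed. The class name GreedyDemo and the method greedyCapture are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: returns the text consumed by the greedy ".*"
    // (capturing group 1), or null if the pattern does not match at all.
    static String greedyCapture(String input) {
        Matcher m = Pattern.compile("<b>(.*)</b>").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>qwer qwer</b></a><b>zxcv zcv</b>";
        // The greedy quantifier runs to the end of the input and backtracks
        // only as far as the LAST "</b>", so the middle tags are swallowed.
        System.out.println(greedyCapture(input));
        // prints: qwer qwer</b></a><b>zxcv zcv
    }
}
```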
 
Ranch Hand
Posts: 385
Use a pattern like this: "<b>.*?</b>"
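As a rough sketch of how that pattern plugs into the original loop: the ? after .* makes the quantifier reluctant, so each match stops at the first </b> it can reach. The class name ReluctantDemo and the method findBoldTags are illustrative names only, and collecting the matches in a List is my own choice, not something from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReluctantDemo {
    // Hypothetical helper: collects every "<b>...</b>" match found with the
    // reluctant quantifier ".*?", in the order they appear in the input.
    static List<String> findBoldTags(String input) {
        Matcher m = Pattern.compile("<b>.*?</b>").matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "<a><b>qwer qwer</b></a><b>zxcv zcv</b>";
        for (String tag : findBoldTags(input)) {
            System.out.println(tag);
        }
        // prints:
        // <b>qwer qwer</b>
        // <b>zxcv zcv</b>
    }
}
```

Note that . does not match line terminators by default, so if the text between the tags can span several lines you would probably need to compile the pattern with Pattern.DOTALL. This approach also assumes the <b> tags are not nested inside each other.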
 
Rahul P Kumar
Ranch Hand
Posts: 188

Siva Masilamani wrote:use the pattern like this "<b>.*?</b>"



Thanks! That was revealing.
 
Marshal
Posts: 79178
And welcome to JavaRanch, Pats Shah
 