
Pattern matching problem

 
Rahul Ba
Ranch Hand
Posts: 206
I am trying to retrieve the contents of each body tag, and of the file tag that follows it, if any.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "<body>1</body><file>myFile1</file><body>2</body><body>3</body><file>myFile2</file>";
Pattern pattern = Pattern.compile("<body>(.*?)</body><file>(.*?)</file>");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    System.out.println("BValue:" + matcher.group(1));
    System.out.println("FValue:" + matcher.group(2));
}

This is the output I get:
BValue:1
FValue:myFile1
BValue:2</body><body>3
FValue:myFile2

As you can see, the second BValue is wrong. I want the output shown below. I know the input does not follow a regular body/file pattern, but can this output still be achieved?

BValue:1
FValue:myFile1
BValue:2
BValue:3
FValue:myFile2

 
Sebastian Janisch
Ranch Hand
Posts: 1183
I am not a fan of engaging the heavy regex engine when plain string positions would do just fine, as they would in your case.

You could rewrite your program using str.indexOf("<body>") etc. and loop over all occurrences.
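
A minimal sketch of that indexOf approach, assuming the tags are never nested and always closed. The class name IndexOfDemo and the method extract are made up for illustration; only String.indexOf and String.substring are used.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfDemo {
    // Hypothetical helper: walks the input left to right and reports each
    // <body> or <file> element in the order it appears.
    static List<String> extract(String input) {
        List<String> lines = new ArrayList<>();
        int pos = 0;
        while (true) {
            int body = input.indexOf("<body>", pos);
            int file = input.indexOf("<file>", pos);
            if (body < 0 && file < 0) {
                break; // no more tags
            }
            // Pick whichever opening tag comes first.
            boolean isBody = body >= 0 && (file < 0 || body < file);
            String open = isBody ? "<body>" : "<file>";
            String close = isBody ? "</body>" : "</file>";
            int start = (isBody ? body : file) + open.length();
            int end = input.indexOf(close, start);
            if (end < 0) {
                break; // assumption: stop quietly on an unterminated tag
            }
            lines.add((isBody ? "BValue:" : "FValue:") + input.substring(start, end));
            pos = end + close.length();
        }
        return lines;
    }

    public static void main(String[] args) {
        String str = "<body>1</body><file>myFile1</file><body>2</body><body>3</body><file>myFile2</file>";
        extract(str).forEach(System.out::println);
    }
}
```

For the string in the question this prints BValue:1, FValue:myFile1, BValue:2, BValue:3, FValue:myFile2, one per line.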
 
Rob Spoor
Sheriff
Posts: 20550
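The problem is that your regular expression requires a <file> to come directly after every <body>. That's why the non-greedy .*? matches "2</body><body>3": it keeps expanding until it reaches a </body> that is immediately followed by <file>. To prevent this, make the <file> part optional.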
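
One way to write that, as a sketch rather than the only possible pattern: wrap the <file> part in an optional non-capturing group, (?:...)?, and check whether group(2) is null before printing it. The class name OptionalFileDemo and the method extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalFileDemo {
    // The <file> element sits inside an optional non-capturing group,
    // so a <body> with no <file> after it still matches on its own.
    private static final Pattern PATTERN =
            Pattern.compile("<body>(.*?)</body>(?:<file>(.*?)</file>)?");

    // Hypothetical helper: collects the output lines instead of printing them.
    static List<String> extract(String input) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {
            lines.add("BValue:" + matcher.group(1));
            // group(2) is null when the optional <file> part did not take part in the match
            if (matcher.group(2) != null) {
                lines.add("FValue:" + matcher.group(2));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String str = "<body>1</body><file>myFile1</file><body>2</body><body>3</body><file>myFile2</file>";
        extract(str).forEach(System.out::println);
    }
}
```

With the string from the question this prints BValue:1, FValue:myFile1, BValue:2, BValue:3, FValue:myFile2, one per line, which is the output asked for.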
 