
extract html tags from....

 
kaparapu madhu
Greenhorn
Posts: 4
I need to extract only the tags from a line of HTML.

Examples:
1) <li>regular expression
should give : <li>

2) <li>regular expression</li>
should give : <li></li>

3) regular<li>expression
should give : <li>

4) regular expression</li>
should give : </li>

I want to do this with regular expressions.

I tried the following expression, but in vain:
String finalline = line.trim().replaceAll("(>.[^<>]*< ", "><");

Please help me with this.
 
Dave Wingate
Ranch Hand
Posts: 262
Maybe instead of using replaceAll(..), you could parse the string using a pattern matcher.

Try using Sun's regex test harness.

After you've got the code compiled, edit the regex.txt file so that it contains these two lines:
<.*?>
<li>sometext i don't want</li>

I think this will get you started on what you want to do.
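
For what it's worth, here is a rough sketch of how the matcher approach might look in code once the pattern behaves in the harness. The class name TagExtractor and the method extractTags are made up for illustration; the only library calls are the standard java.util.regex Pattern and Matcher. It finds every match of <.*?> and concatenates them, which should cover the four examples in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not from any library.
class TagExtractor {
    // Reluctant quantifier: match from '<' up to the nearest '>'.
    private static final Pattern TAG = Pattern.compile("<.*?>");

    // Returns all tags found in the line, concatenated in order of appearance.
    static String extractTags(String line) {
        StringBuilder tags = new StringBuilder();
        Matcher matcher = TAG.matcher(line);
        while (matcher.find()) {
            tags.append(matcher.group());
        }
        return tags.toString();
    }

    public static void main(String[] args) {
        // The four sample inputs from the original question.
        String[] samples = {
            "<li>regular expression",
            "<li>regular expression</li>",
            "regular<li>expression",
            "regular expression</li>"
        };
        for (String sample : samples) {
            System.out.println(extractTags(sample.trim()));
        }
    }
}
```

Keep in mind that <.*?> is a simplification: it will cut a tag short if an attribute value contains a '>' character, and it knows nothing about comments or script blocks, so it is probably only suitable for simple input like the examples above.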
 
Ilja Preuss
author
Sheriff
Posts: 14112
Duplicated at http://www.coderanch.com/t/376951/java/java/Regular-Expression - please continue discussion there.
 