You don't say whether your <td> elements can contain other tags, whether a <td> can contain a newline, or whether more than one <td> can appear on a line, so take the following as a starting point.
import java.util.regex.*;

public class Test20040913 {
    public static void main(String[] args) {
        // [^<]* is a negated character class, so it matches newlines as well;
        // no flag is needed (Pattern.MULTILINE only changes ^ and $, which are not used here).
        Pattern pattern = Pattern.compile("<td>([^<]*)</td>");

        String lines = "some rubbish <td>value 1</td> some further \nrubbish <td>value \n2</td> and more again";
        Matcher matcher = pattern.matcher(lines);

        // Each call to find() resumes where the previous match ended.
        while (matcher.find()) {
            System.out.println(" Value found at " + matcher.start(1) + " with value [" + matcher.group(1) + "]");
        }
    }
}
This still makes several strong assumptions about your requirements: it expects a bare, lower-case <td> with no attributes, and cell content with no nested tags. An effective regular expression can only be written against well-specified requirements; without a good specification one is only guessing.
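To show how much the pattern changes once the requirements are pinned down, here is a minimal sketch under a different set of guesses: the <td> may carry attributes, tag names may be in any case, and the cell may span lines and contain other tags, but tables are never nested and no attribute value contains a '>'. The names TdExtractor and extractCells are invented for this example and are not part of your code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TdExtractor {
    // Assumed requirements (guesses, not taken from the question):
    //  - <td> may have attributes and may be written in any case
    //  - cell content may span lines and contain other tags
    //  - tables are not nested and attribute values contain no '>'
    private static final Pattern CELL = Pattern.compile(
            "<td\\b[^>]*>(.*?)</td\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the raw text between each <td ...> and the nearest following </td>.
    public static List<String> extractCells(String html) {
        List<String> cells = new ArrayList<>();
        Matcher matcher = CELL.matcher(html);
        while (matcher.find()) {
            cells.add(matcher.group(1));
        }
        return cells;
    }

    public static void main(String[] args) {
        String html = "x <TD class=\"a\">value <b>1</b></TD> y\n<td>value \n2</td> z";
        for (String cell : extractCells(html)) {
            System.out.println("[" + cell + "]");
        }
    }
}
```

The reluctant (.*?) together with Pattern.DOTALL lets a cell run across lines and stop at the first closing tag. That is exactly why it breaks on nested tables, which is one more requirement you would have to state up front.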
You are reaching the point where you might do better to use a tolerant XML parser to generate a DOM document and read the cell values from that. A Google search should turn up some candidates.