Please Help - Problem in Regular Expressions

Posts: 4
Hi everybody, I need your help correcting my code. I am not an expert in regular expressions. I want to open a URL, say shopping.com, and display only the tables in that web page.
My code below does not display the tables in the web page.
Can anybody help me correct the code? I need to submit it as soon as possible, so any help would be really appreciated.
Thanks a lot in advance.
My code is below:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConnectionTest {
    public static void main(String[] args) {
        try {
            // Open the page and read it as text
            URL yahoo = new URL("http://www.shopping.com");
            URLConnection yahooConnection = yahoo.openConnection();
            BufferedReader dis = new BufferedReader(
                    new InputStreamReader(yahooConnection.getInputStream()));

            String inputLine;

            // Everything from "<table" up to the nearest "</table>"
            Pattern regexp = Pattern.compile("<table(.*?)</table>", Pattern.DOTALL);

            // Try the pattern against each line as it is read
            while ((inputLine = dis.readLine()) != null) {
                Matcher matcher = regexp.matcher(inputLine);
                if (matcher.find()) {
                    System.out.println(matcher.group()); // print the matched table
                }
            }
            dis.close();
        } catch (MalformedURLException me) {
            System.out.println("MalformedURLException: " + me);
        } catch (IOException ioe) {
            System.out.println("IOException: " + ioe);
        }
    }
}

It just comes out of the loop without displaying anything. Can anybody please help me out?
Waiting for your guidance and help with regular expressions, please.
Posts: 57437
Welcome to the Ranch.

This is the Java Tutorial about regular expressions.

Have a close look at the .*? bit of your regex. All three of those characters are meta-characters: . means any character except a line terminator (unless Pattern.DOTALL is set, as it is in your code), * means any number of times including 0, and ? on its own means 0 or 1. Directly after another quantifier such as *, though, ? makes that quantifier reluctant, so .*? matches as few characters as possible.
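To see what the ? after the * does in practice, here is a small sketch that runs the same pattern with and without it. The class name QuantifierDemo, the helper findAll and the sample HTML string are made up for illustration; only the java.util.regex calls are real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class QuantifierDemo {

    // Collects every match of the regex in the input.
    // DOTALL lets '.' match line terminators as well.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample input with two tables.
        String html = "<table>a</table><p>x</p><table>b</table>";

        // Greedy: .* runs to the LAST </table>, so there is one big match.
        System.out.println(findAll("<table(.*)</table>", html));

        // Reluctant: .*? stops at the FIRST </table>, so each table matches separately.
        System.out.println(findAll("<table(.*?)</table>", html));
    }
}
```

With the greedy form the two tables are swallowed into a single match; with the reluctant form they come out one by one, which is probably what you want here.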

Not sure exactly what you need, but I think you will find the tutorial helpful. Take some time over it; regular expressions are by no means easy.
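One thing worth checking, going only by the posted code: the pattern is tried against one line at a time, but a table in a real page usually spans many lines, so no single line contains both <table and </table>. A minimal sketch of the alternative is to read the whole page into one String first and then call find() repeatedly. The names TableExtractor, readAll and extractTables are hypothetical, the UTF-8 charset is an assumption about the page, and CASE_INSENSITIVE is added on the assumption that the page may use <TABLE>.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the original poster's class.
class TableExtractor {

    // Reluctant match from "<table" to the nearest "</table>", across lines.
    private static final Pattern TABLE = Pattern.compile(
            "<table\\b.*?</table>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Reads everything from the reader into a single String.
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Returns every table found in the page text, in document order.
    static List<String> extractTables(String html) {
        List<String> tables = new ArrayList<>();
        Matcher m = TABLE.matcher(html);
        while (m.find()) {          // loop, so that ALL tables are found
            tables.add(m.group());
        }
        return tables;
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.shopping.com");
        // Assumption: the page is UTF-8 encoded.
        try (Reader in = new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)) {
            for (String table : extractTables(readAll(in))) {
                System.out.println(table);
            }
        }
    }
}
```

Note that a regex like this cuts nested tables off at the first inner </table>; if the page nests tables, an HTML parser is likely to be the more robust choice.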
[ April 24, 2008: Message edited by: Campbell Ritchie ]