
help with search() method

 
Willie Tsang
I have an assignment that I'm stuck on.
Design a program that finds all hyperlinks of the form <a href="link">link text</a> at the address http://java.sun.com/index.html

I have already read the page and written its contents to a file. Where I am stuck is how to read the contents of that file, recognize which parts are hyperlinks, and return the URL of each one. Any suggestions?
 
Rob Spoor
There are two ways I can think of:
1) Use a regular expression. The basic format for a hyperlink is <a XXX href="YYY" ZZZ>AAA</a>; XXX and ZZZ are any number of attributes, YYY is the URL and AAA is the link text. It's not that hard to write a regular expression for that using the Javadoc of java.util.regex.Pattern. There are two things to consider:
a) You need to use reluctant quantifiers (such as *? instead of *), otherwise a single match can run from the first <a to the last </a> in the file.
b) The URL does not need to be enclosed in double quotes; it can also be enclosed in single quotes, or not enclosed in quotes at all.
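
A minimal sketch of option 1, to make the two considerations concrete. The class name RegexLinkFinder, the Link holder and the example.com sample input are made up for illustration; only java.util.regex is used. It assumes the whole file has already been read into a String, and it will be fooled by unusual markup (for example a ">" inside an attribute value), so treat it as a starting point rather than a complete solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLinkFinder {

    /** Hypothetical holder for one hyperlink: the href value and the link text. */
    static final class Link {
        final String url;
        final String text;
        Link(String url, String text) { this.url = url; this.text = text; }
    }

    // <a, optional other attributes, href=, then the value in double quotes,
    // single quotes, or unquoted, any remaining attributes, then the link text.
    // [^>]*? and .*? are reluctant, so a match stops at the first href and
    // the first </a> instead of swallowing several links at once.
    private static final Pattern LINK = Pattern.compile(
            "<a\\s+(?:[^>]*?\\s)?href\\s*=\\s*"
            + "(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))"
            + "[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<Link> findLinks(String html) {
        List<Link> links = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            // Exactly one of groups 1-3 is non-null, depending on the quoting style.
            String url = m.group(1) != null ? m.group(1)
                       : m.group(2) != null ? m.group(2)
                       : m.group(3);
            links.add(new Link(url, m.group(4)));
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample input only; in the assignment this would be the saved file's contents.
        String html = "<p><a href=\"http://example.com/\">Example</a> "
                    + "<a class='nav' href='docs.html'>Docs</a></p>";
        for (Link link : findLinks(html)) {
            System.out.println(link.url + " -> " + link.text);
        }
    }
}
```

The three alternatives inside the href group cover point b); CASE_INSENSITIVE lets it match <A HREF=...> as well, and DOTALL allows link text that spans several lines.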

2) Use an HTML parser to get hold of all the links.
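
As a rough illustration of option 2, here is a sketch that uses the HTML parser that ships with the JDK (javax.swing.text.html.parser.ParserDelegator). That choice is an assumption on my part, as are the class name ParserLinkFinder and the sample input; the JDK parser is old and fairly lenient, so a dedicated third-party parser library may cope better with modern pages.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class ParserLinkFinder {

    /** Collects the href value of every <a> start tag read from the given reader. */
    static List<String> findLinks(Reader html) throws IOException {
        final List<String> urls = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    // Anchors without an href (e.g. <a name="top">) are skipped.
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        urls.add(href.toString());
                    }
                }
            }
        };
        // true = ignore any charset declared inside the document itself.
        new ParserDelegator().parse(html, callback, true);
        return urls;
    }

    /** Convenience overload for HTML that is already held in a String. */
    static List<String> findLinks(String html) {
        try {
            return findLinks(new StringReader(html));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample input only; for the assignment, pass a FileReader for the saved file.
        String html = "<html><body><a href=\"http://example.com/\">Example</a>"
                    + "<a href='docs.html'>Docs</a></body></html>";
        for (String url : findLinks(html)) {
            System.out.println(url);
        }
    }
}
```

With a parser, the quoting styles and extra attributes are handled for you, which is the main advantage over the regular expression approach.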
 