Originally posted by Alexandre Bairos:
Hi!
I need to get hyperlinks from html documents through JSPs located in external web servers, in order to monitor modifications in web content.
If I understand you correctly, all you want to do is fire an HTTP GET request at those servers and scan the returned HTML for hyperlinks. Right? Look into java.net.URL.getContent() for your requests. To extract the URLs, a simple string scan (java.lang.String.indexOf()) might be adequate.
This will work irrespective of the type of source (flat HTML, JSP, ASP, PHP, whatever), because all you ever see is the HTML the server sends back. However, database-driven web pages are virtually impossible to check thoroughly. If the links you want to scan may come from the database, you're stuck.
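To make that concrete, here is a rough sketch of the fetch-and-scan idea, not a finished tool. The class name LinkScanner and the methods fetch and extractLinks are made up for illustration. I use URL.openStream() rather than getContent() because it hands back the raw stream directly, and I assume the page is UTF-8 and that links are written as lowercase, double-quoted href="..." attributes. Anything else (single quotes, uppercase HREF, unquoted values) would be missed by such a simple scan.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for illustration only.
class LinkScanner {

    // Fetch the page with a plain HTTP GET and return the body as text.
    // Assumption: the server sends UTF-8.
    static String fetch(String address) throws IOException {
        URL url = URI.create(address).toURL();
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    // Collect href values using nothing but String.indexOf().
    // Assumption: attributes look exactly like href="..." (lowercase, double quotes).
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        String marker = "href=\"";
        int pos = 0;
        while ((pos = html.indexOf(marker, pos)) != -1) {
            int start = pos + marker.length();
            int end = html.indexOf('"', start);
            if (end == -1) {
                break; // unterminated attribute, stop scanning
            }
            links.add(html.substring(start, end));
            pos = end + 1;
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java LinkScanner <url>");
            return;
        }
        for (String link : extractLinks(fetch(args[0]))) {
            System.out.println(link);
        }
    }
}
```

To monitor changes, you could store the list from one run and compare it with the list from the next. If the pages turn out to be messier than this scan can handle, a real HTML parser is probably the better route.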
- Peter