
Web Spider Java

 
Isaac Ferguson
Ranch Hand
Hi

I want to write a spider that detects whether a web site includes a script in the head of its pages. I also need to test it. I have one server, but I don't know how to set up several URLs with different IPs for it.

I have never done this before. Could someone point me in the right direction?

Regards
Isaac
 
Henry Wong
author
Sheriff
Many people like to use the Apache HttpClient library (http://hc.apache.org/httpclient-3.x/) to scrape URLs programmatically. Personally, I just use the java.net.URL class, which works fine and is built into the core library.
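
A minimal sketch of the java.net.URL approach, for illustration only. The class name HeadScriptDetector and its two methods are made up for this example, the page is assumed to be UTF-8, and a regular expression stands in for a real HTML parser, so treat it as a starting point rather than a finished spider.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
public class HeadScriptDetector {

    // Captures everything between <head ...> and </head>, ignoring case.
    private static final Pattern HEAD = Pattern.compile(
            "<head\\b[^>]*>(.*?)</head\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches the start of an opening <script> tag, with or without attributes.
    private static final Pattern SCRIPT = Pattern.compile(
            "<script\\b", Pattern.CASE_INSENSITIVE);

    /** Returns true if the HTML text has a script tag inside its head element. */
    public static boolean hasScriptInHead(String html) {
        Matcher head = HEAD.matcher(html);
        return head.find() && SCRIPT.matcher(head.group(1)).find();
    }

    /** Downloads the page at the given address as text (assumes UTF-8). */
    public static String fetch(String address) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new URL(address).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HeadScriptDetector <url>");
            return;
        }
        String html = fetch(args[0]);
        System.out.println(hasScriptInHead(html)
                ? "script found in head"
                : "no script in head");
    }
}
```

Keeping the check separate from the download means hasScriptInHead can be tried on plain strings without any server. The regular expression also counts a script tag that sits inside an HTML comment in the head, which a real parser would not.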

Henry

 
Ulf Dittmer
Rancher
For anything that needs to process (as opposed to just downloading) web pages, I recommend the HtmlUnit library.