
web spider setting HTTP User-Agent

 
Kevin Nilson
Greenhorn
Posts: 6
I am writing a web robot, and I need it to identify itself as a robot rather than a regular web browser. This should be done by setting the User-Agent, but I can't figure out how to do it.
Below is code similar to what I am doing. In the web log, the user agent shows up as 'HTTP/1.1'. I want it to say something like 'Java Program, not browser'.
URL url = new URL("http://www.yahoo.com");
URLConnection connection = url.openConnection();
connection.setDoOutput(true);
BufferedReader in = new BufferedReader(
    new InputStreamReader(connection.getInputStream()));
String inputLine;
while ((inputLine = in.readLine()) != null)
    System.out.println(inputLine);
in.close();

Thanks
Kevin Nilson
 
Mike Janger
Greenhorn
Posts: 1
Try setting the http.agent system property before you open the connection:
Properties props = new Properties(System.getProperties());
props.put("http.agent", "Kevin's non-browser Robot");
System.setProperties(props);
Mike Janger
Web Developer
Meridian Enterprises Corporation
[ November 11, 2002: Message edited by: Mike Janger ]
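
As an alternative to the system property, the User-Agent header can also be set on a single connection with URLConnection.setRequestProperty, which leaves other connections in the same JVM untouched. The following is a minimal sketch, not something either poster confirmed: the class name RobotFetcher and the helper openAsRobot are hypothetical names chosen for illustration, and the URL and agent string are taken from the original question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class RobotFetcher {

    // Hypothetical helper: opens a connection and sets the User-Agent header on it.
    // openConnection() does not contact the server yet, so request headers
    // can still be changed at this point.
    static URLConnection openAsRobot(String address, String userAgent) throws IOException {
        URLConnection connection = new URL(address).openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        return connection;
    }

    public static void main(String[] args) throws IOException {
        URLConnection connection =
                openAsRobot("http://www.yahoo.com", "Java Program, not browser");

        // getInputStream() is what actually sends the request, including the header set above.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                System.out.println(inputLine);
            }
        }
    }
}
```

The header has to be set before the first call that connects (such as getInputStream()); setRequestProperty throws IllegalStateException once the connection is already open. With the http.agent property approach, the JDK documentation notes that "Java/<version>" is appended to the supplied string, so the per-connection header may be preferable if the exact value matters.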