
How to capture response sent from a website

 
Nelo Angelo
Ranch Hand
Posts: 44
Chrome Eclipse IDE Java
Hello everyone,

I tried searching for this topic on the internet but couldn't find a proper solution. I want to open a page through my Servlet and capture the response received from that server, so that I can alter or retrieve the data in it. I am not talking about filtering the data sent from my own servlet, but about getting the response from other websites.

Any suggested reading would be greatly appreciated.

Basically, I was trying to extract all the hrefs present on a page through file I/O. But that can be a tedious process, as I had to download the pages beforehand. Is there any other way to do this? Please advise.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13064
6
Sure, look at the java.net package, specifically the URLConnection and HttpURLConnection classes.
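As a minimal sketch of what that looks like, here is one way to pull down a page's HTML with HttpURLConnection. The class name and the example URL are just for illustration:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class PageFetcher {

    // Fetch the raw HTML of a page and return it as a single String.
    public static String fetch(String address) throws Exception {
        URL url = new URL(address);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        // Example URL chosen for illustration only.
        System.out.println(fetch("https://example.com"));
    }
}
```

From a servlet you would call something like fetch(...) inside doGet and then process the returned String instead of printing it.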

The concept used to be called "Screen scraping" - back when terminal displays were simple. Now I think "web crawler" is what you want.

Of course, once you have the initial HTML text you still have a lot of work to do to extract all the linked resources.
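For instance, a rough first pass at pulling href attributes out of the fetched HTML could look like this. A naive regex works for simple pages, though messy real-world HTML usually calls for a proper parser:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {

    // Naive pattern for href="..." attributes; case-insensitive so it
    // also matches HREF="...". Not robust against single quotes or
    // unquoted attribute values.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Return every href value found in the given HTML, in order.
    public static List<String> extractHrefs(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }
}
```

Feeding it `<a href="a.html">x</a><A HREF="b.html">y</A>` yields the list [a.html, b.html].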

Bill
 