
retrieve the HTML page of any URL without using java.net.URL

I want to develop a simple class that can fetch the HTML contents of a URL without using the java.net.URL or java.net.URLConnection classes. Any suggestions would be highly appreciated. Thanks in advance.
Try Apache's HttpClient. But it may use URL in the background, I'm not sure about that.

salman khalid wrote:I want to develop a simple class that can fetch the HTML contents of a URL without using the java.net.URL or java.net.URLConnection classes.



Why?

Anyway the way to do that would be to write code which does this:

  • Extracts the name of the host from the URL
  • Extracts the port from the URL
  • Connects to that host on that port using a Socket
  • Using the HTTP protocol, sends a GET request to the host
  • Receives the response from the host and interprets it according to the HTTP protocol


You may find your requirement for "a simple class" conflicts with what actually has to be done. That's why I asked why you want to do this.
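The five steps above can be sketched roughly as follows. This is only a minimal illustration, not a complete HTTP client: the class name SocketPageFetcher and its methods are made up for this example, the URL is split by hand with plain string operations, and it assumes plain http (no HTTPS), no redirects, and a server that does not answer with a chunked body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: fetches a page over plain HTTP using only java.net.Socket. */
final class SocketPageFetcher {

    /** Host, port and path pulled out of a URL string by hand. */
    static final class Target {
        final String host;
        final int port;
        final String path;

        Target(String host, int port, String path) {
            this.host = host;
            this.port = port;
            this.path = path;
        }
    }

    /** Steps 1 and 2: extract the host and port (and the path needed for the request). */
    static Target parse(String url) {
        String rest = url.trim();
        if (rest.startsWith("http://")) {
            rest = rest.substring("http://".length());
        } else if (rest.contains("://")) {
            throw new IllegalArgumentException("Only plain http is handled in this sketch: " + url);
        }
        int slash = rest.indexOf('/');
        String authority = slash < 0 ? rest : rest.substring(0, slash);
        String path = slash < 0 ? "/" : rest.substring(slash);   // no path means "/"
        int colon = authority.lastIndexOf(':');
        String host = colon < 0 ? authority : authority.substring(0, colon);
        int port = colon < 0 ? 80 : Integer.parseInt(authority.substring(colon + 1));
        return new Target(host, port, path);
    }

    /** Step 4: the text of a minimal GET request for the parsed target. */
    static String buildRequest(Target t) {
        String hostHeader = t.port == 80 ? t.host : t.host + ":" + t.port;
        return "GET " + t.path + " HTTP/1.1\r\n"
             + "Host: " + hostHeader + "\r\n"
             + "Connection: close\r\n"   // ask the server to close so we can read to end of stream
             + "\r\n";
    }

    /** Part of step 5: everything after the blank line that ends the response headers. */
    static String bodyOf(String rawResponse) {
        int headerEnd = rawResponse.indexOf("\r\n\r\n");
        return headerEnd < 0 ? "" : rawResponse.substring(headerEnd + 4);
    }

    /** Steps 3 to 5: connect, send the request, read until the server closes the connection. */
    static String fetch(String url) throws IOException {
        Target t = parse(url);
        try (Socket socket = new Socket(t.host, t.port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(t).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // ISO-8859-1 maps bytes 1:1 to chars; a real client would honor the declared charset.
            return bodyOf(new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1));
        }
    }
}
```

Something like SocketPageFetcher.fetch("http://www.example.com/") would then return whatever follows the response headers. The sketch ignores the status line, so an error page or a redirect response comes back as if it were the requested content, which is part of why this stops being "a simple class" quickly.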

    Rob Prime wrote:Try Apache's HttpClient. But it may use URL in the background, I'm not sure about that.


    I'm pretty sure it doesn't use URLConnection; that caused problems for me when I tried to use it in an applet.

    Paul Clapham wrote:

    salman khalid wrote:I want to develop a simple class that can fetch the HTML contents of a URL without using the java.net.URL or java.net.URLConnection classes.



    Why?

    Anyway the way to do that would be to write code which does this:

  • Extracts the name of the host from the URL
  • Extracts the port from the URL
  • Connects to that host on that port using a Socket
  • Using the HTTP protocol, sends a GET request to the host
  • Receives the response from the host and interprets it according to the HTTP protocol


    You may find your requirement for "a simple class" conflicts with what actually has to be done. That's why I asked why you want to do this.



    Thanks for the response. I agree with you that it will not be a simple class. I have implemented the method you suggested. The following code snippet shows it, but the approach has a problem: it does not retrieve the HTML contents when the URL also contains a file path.

    For example, the URL "www.google.com" returns HTML contents, but "http://www.oracle.com/technetwork/java/index.html" does not.


    Rob Prime wrote:Try Apache's HttpClient. But it may use URL in the background, I'm not sure about that.





    I will try Apache's HttpClient and then I will let you know....
    Please UseCodeTags next time. It preserves indentation, and adds syntax highlighting. I've added them to your code, and you can see it's much easier to read now.

    salman khalid wrote:... but there is a problem in this approach and that is that it does not retrieve HTML contents when a URL contains the file path as well.



    You will have to format the GET request properly. This URL may help:
    http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html
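As a rough illustration of what "properly formatted" means here: the request line carries the path (and query string, if any) from the URL, while the host name goes in a separate Host header. The posted code is not visible, so whether this was the actual bug is a guess; always sending "GET /" or putting the whole URL where the host belongs are two plausible causes. The class GetRequestFormatter below is a made-up name, and it uses java.net.URI (not java.net.URL) to split the address, which is an assumption about what the original requirement allows.

```java
import java.net.URI;

/** Hypothetical helper: builds the text of a GET request for a given http URL. */
final class GetRequestFormatter {

    static String format(String url) {
        // URI needs a scheme to recognise the host, so add one if it is missing.
        URI uri = URI.create(url.contains("://") ? url : "http://" + url);

        // The request target is the path plus any query string; an empty path becomes "/".
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        String target = uri.getRawQuery() == null ? path : path + "?" + uri.getRawQuery();

        // The Host header carries the host name, with the port only if the URL gave one.
        String host = uri.getPort() == -1 ? uri.getHost() : uri.getHost() + ":" + uri.getPort();

        return "GET " + target + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";   // blank line ends the header section
    }
}
```

For "http://www.oracle.com/technetwork/java/index.html" this produces a request whose first line is "GET /technetwork/java/index.html HTTP/1.1" followed by "Host: www.oracle.com", whereas "www.google.com" yields "GET / HTTP/1.1" with "Host: www.google.com". Every line must end in CRLF, and the request must end with an empty line.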
    Which is why I suggested HttpClient, as it will do all the hard work for you.