retrieve images from the web

 
mj zammit
Ranch Hand
Posts: 49
I have built a client socket to fetch a web page from a server. When I output the result, the images are not shown. How can I get the images of the requested web page to show?
This is the code of my client:
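// needs: import java.io.*; import java.net.*;
try {
    // args[0] is expected to be a full URL, e.g. http://www.yahoo.com/
    URL url = new URL(args[0]);
    String host = url.getHost();
    // getPort() returns -1 when the URL does not name a port
    int port = (url.getPort() == -1) ? 80 : url.getPort();
    String path = url.getFile().isEmpty() ? "/" : url.getFile();

    InetAddress addr = InetAddress.getByName(host);
    SocketAddress sockaddr = new InetSocketAddress(addr, port);
    Socket socket = new Socket();
    int timeoutMS = 2000;
    socket.connect(sockaddr, timeoutMS);

    PrintWriter out = new PrintWriter(socket.getOutputStream());
    InputStream inputStream = socket.getInputStream();
    InputStreamReader isReader = new InputStreamReader(inputStream);
    BufferedReader rd = new BufferedReader(isReader);

    // HTTP lines end in CRLF, so println() (platform line separator) is avoided
    out.print("GET " + path + " HTTP/1.1\r\n");
    out.print("Host: " + host + "\r\n");
    out.print("Connection: close\r\n");
    out.print("\r\n");
    out.flush();

    // Prints the status line, the headers and the HTML body as text
    String s = null;
    while ((s = rd.readLine()) != null)
        System.out.println(s);
    rd.close();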
} catch (MalformedURLException ex) {
} catch (UnknownHostException ex) {
} catch (IOException ex) {
}
}
 
Ulf Dittmer
Rancher
Posts: 43081
You'll need to parse the HTML page you got, and then download the images in separate requests. Note that there are various ways of adding images to web pages (IMG tags, via CSS, via JavaScript), so it's not trivial if you need to be sure you're downloading all the images.

} catch (MalformedURLException ex) {
} catch (UnknownHostException ex) {
} catch (IOException ex) {
}


This is a bad idea. How will you know if there are any problems? At least write the error message to System.err or System.out.
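
As a rough illustration of that advice, here is a minimal sketch of catch blocks that report what went wrong instead of swallowing it. The class name ReportErrors, the describe helper and the deliberately broken default URL are made up for this example; they are not part of the code posted above.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

public class ReportErrors {

    // Hypothetical helper: builds one line naming the exception type and its message.
    static String describe(String what, Exception ex) {
        return what + ": " + ex.getClass().getSimpleName() + ": " + ex.getMessage();
    }

    public static void main(String[] args) {
        // The default value has a deliberately misspelled scheme so that the first catch block fires.
        String spec = args.length > 0 ? args[0] : "htp://www.example.com/";
        try {
            URL url = new URL(spec);
            // Opening the stream may throw UnknownHostException or another IOException.
            url.openStream().close();
            System.out.println("Fetched " + url);
        } catch (MalformedURLException ex) {
            System.err.println(describe("Bad URL", ex));
        } catch (UnknownHostException ex) {
            System.err.println(describe("Unknown host", ex));
        } catch (IOException ex) {
            System.err.println(describe("I/O problem", ex));
            ex.printStackTrace(); // the stack trace shows where the failure happened
        }
    }
}
```

The two specific exceptions are subclasses of IOException, so they have to be caught before it.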
 
mj zammit
Ranch Hand
Posts: 49
Yes, it is a bad idea. I have now added a System.out.println to each catch block to see which error was caught.
Going back to the images: so I have to request not only the HTML file but also the images the page uses?
Any ideas on how to do it, please?
 
Paul Clapham
Marshal
Posts: 28177

Originally posted by mj zammit:
any ideas on how to do it please

Depends what "it" means in that question. Are you asking how to parse the HTML and identify the images you need to download? Are you asking how to download an image? Are you asking how to display the image?

You only said "When i output the result" and didn't say anything about what that "output" looked like, so I don't have any context for answering questions number 1 and 3 of the possibilities. For #2, how to download an image, you do that just like downloading the HTML, but don't use a Reader because that's meant for reading text. Use an InputStream instead.
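
For #2, a minimal sketch of reading an image as raw bytes might look like the following. It swaps the hand-written socket for URL.openStream(), so the JDK deals with the HTTP headers; with a raw socket you would have to skip the header lines before the image bytes start. The class name ImageDownload, the readFully helper and the command-line arguments are assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ImageDownload {

    // Reads every byte from the stream without any character decoding,
    // which is what keeps binary data such as GIF or JPEG intact.
    static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java ImageDownload <image-url> <output-file>");
            return;
        }
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = new URL(args[0]).openStream()) {
            byte[] image = readFully(in);
            Files.write(Paths.get(args[1]), image);
            System.out.println("Saved " + image.length + " bytes to " + args[1]);
        }
    }
}
```

A Reader would decode the bytes into characters and could alter or drop byte values that are not valid text, which is why the sketch stays with InputStream throughout.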
 
mj zammit
Ranch Hand
Posts: 49
With the above code, my output is the HTML of any website (for example, www.yahoo.com), but when I render this HTML the images do not come up, and I would like to fix that.
 
Ulf Dittmer
Rancher
Posts: 43081
The page in all likelihood has relative paths to images - you need to replace those with absolute paths that point to the original site.

Or, as was mentioned above, you need to download the images. In that case, you still need to correct the links in the HTML page, or mimic the server's directory structure locally.
 
mj zammit
Ranch Hand
Posts: 49
Hmm, I see...
How do I find out the absolute path of an image on a website? Are there special functions for this in Java?
 
Ulf Dittmer
Rancher
Posts: 43081
In general, just prepend the domain name. For example, if the relative image path is "/images/foobar.gif", then the absolute path is "http://www.yahoo.com/images/foobar.gif".
 
Paul Clapham
Marshal
Posts: 28177
It's also possible that the HTML has a <base> tag which specifies that the base is something other than the URL of the page itself.

Also, the URL class in Java has methods for producing a URL from a relative path and a base URL.
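
A small sketch of that, using the two-argument URL constructor that takes a context URL and a spec. The class name ResolveImageUrl, the resolve helper, the page address and img.example.com are invented for illustration. If the page has a <base> tag, its href would be passed as the base instead of the page URL.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class ResolveImageUrl {

    // Resolves an image reference against the page URL (or the <base> href, if the page has one).
    static String resolve(String base, String reference) {
        try {
            return new URL(new URL(base), reference).toString();
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException("Cannot resolve " + reference + " against " + base, ex);
        }
    }

    public static void main(String[] args) {
        String page = "http://www.yahoo.com/news/index.html"; // hypothetical page address

        // Path starting with "/": resolved against the site root
        System.out.println(resolve(page, "/images/foobar.gif"));
        // Path without a leading "/": resolved against the page's directory
        System.out.println(resolve(page, "images/foobar.gif"));
        // Already absolute: returned unchanged
        System.out.println(resolve(page, "http://img.example.com/logo.png"));
    }
}
```

The second case is the one that simply prepending the domain name gets wrong, because the reference is relative to the page's directory rather than to the site root. Recent JDKs mark the URL constructors as deprecated and point to java.net.URI, whose resolve method does the same job.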
 
mj zammit
Ranch Hand
Posts: 49
Thanks very much for all your replies; they have helped me a great deal.
I am now having problems extracting the contents of a particular HTML tag (for example, I want the value of the src attribute of each img tag). I would like to use regular expressions. Does anyone have any ideas?
 
Ulf Dittmer
Rancher
Posts: 43081
Using regexps will cause a variety of problems. I'd use a library like TagSoup or NekoXNI to transform the HTML to XML, and then use the SAX API to work with the XML.
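
To give an idea of the SAX half of that approach, here is a minimal sketch that collects the src attribute of every img element. It assumes the markup is already well-formed XML and uses the JDK's own SAX parser so that it runs without extra jars. With TagSoup, the same handler would presumably be handed to its SAX parser instead, which is worth checking against the TagSoup documentation. The class name ImgSrcCollector and the sample markup are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ImgSrcCollector extends DefaultHandler {

    private final List<String> sources = new ArrayList<>();

    // Called by the parser for every opening tag; only img elements with a src are recorded.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("img".equalsIgnoreCase(qName)) {
            String src = attrs.getValue("src");
            if (src != null) {
                sources.add(src);
            }
        }
    }

    // Parses well-formed markup and returns the src values in document order.
    static List<String> collect(String wellFormedMarkup) {
        try {
            ImgSrcCollector handler = new ImgSrcCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(wellFormedMarkup)), handler);
            return handler.sources;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Markup could not be parsed", ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample page; real-world HTML would first go through an HTML-to-XML step.
        String page = "<html><body><p>Hi</p><img src=\"/images/foobar.gif\" alt=\"x\"/>"
                + "<img alt=\"no source\"/></body></html>";
        System.out.println(collect(page));
    }
}
```

This only finds images referenced from img tags; images pulled in via CSS or JavaScript, as mentioned earlier in the thread, would need separate handling.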
 