Downloading a file from web site

 
Ehud Kaldor
Greenhorn
Posts: 8
Hi,
I need to download a file from a web site, mimicking the behavior of a user who goes to the site and clicks the link. The problem is that the link is not the URL of the file itself, but rather a link that triggers the download (I am not sure what the technical name for those is: the kind that opens a small window or a tab, asks you for the open/save option, and then closes the window).
Simplistically, it seems like I need a facility to click that link and catch the file that is returned by the click.

Any idea how to do that?

Thank you,
Ehud
 
Paul Clapham
Marshal
Posts: 28193
Yeah, that's how a browser does it because it needs interaction with the client to decide what to do with a downloaded file. You don't have to do anything like that and you don't have to interact with a browser to do it.

It's really not complicated at all. Here's a tutorial with an example: Reading Directly from a URL.
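A minimal sketch of that idea, adapted to save the bytes to disk instead of printing text: open a stream on the URL and copy it to a file. The class name `UrlDownload`, the `download` helper, and the command-line arguments are illustrative assumptions, not code from this thread, and the sketch only works as-is when the server hands over the file without a login step.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class UrlDownload {

    // Hypothetical helper: copies the raw bytes behind 'source' into 'target'
    // and returns how many bytes were written.
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java UrlDownload <url> <target-file>");
            return;
        }
        long bytes = download(URI.create(args[0]).toURL(), Path.of(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

No browser, window, or save dialog is involved: the program asks for the URL the link points at and stores whatever the server sends back.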
 
Bear Bibeault
Sheriff
Posts: 67746

Ehud Kaldor wrote:the kink that opens a small window or a tab, then ask you for the open/save option and then closes the window).


"Kink"?

Such "links" aren't special -- they're just a link like any other. The behavior of opening up a Save Dialog is browser behavior; not anything specified in the HTML.

[Edit: Paul beat me to it!]
 
Ehud Kaldor
Greenhorn
Posts: 8

Bear Bibeault wrote:

Ehud Kaldor wrote:the kink that opens a small window or a tab, then ask you for the open/save option and then closes the window).


"Kink"?

Such "links" aren't special -- they're just a link like any other. The behavior of opening up a Save Dialog is browser behavior; not anything specified in the HTML.

[Edit: Paul beat me to it!]



That was meant to be "the kind that opens a small window". Corrected.
 
Ehud Kaldor
Greenhorn
Posts: 8
Here is the code:



I am trying to download a binary .iso file. This is the console output I get:


If I paste the URL into the browser, it pops up the download dialog. Why is it saying it is http/text? Why is length=0?
 
Paul Clapham
Marshal
Posts: 28193
Perhaps you haven't done something the browser would have done, like authenticating or returning a cookie to the site or something like that.

why is it saying it is http/text?



Where does it say that?

why is length=0?



The content length header is optional.

By the way you said you were downloading a binary file. So why are you using a Reader to copy the data? Readers are for text, not binary data.
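One way to see what the server actually sent is to print the status code, content type, and content length before saving anything. The sketch below is an illustration only: the class name `ResponseCheck` and the `looksLikeFile` heuristic are assumptions, and treating `text/html` as "not the file" only makes sense when a binary file such as an .iso is expected.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

public class ResponseCheck {

    // Hypothetical heuristic: a 200 response that is not an HTML page is
    // probably the file; anything else is likely a login or error page.
    static boolean looksLikeFile(int status, String contentType) {
        boolean ok = status == HttpURLConnection.HTTP_OK;
        boolean html = contentType != null
                && contentType.toLowerCase(Locale.ROOT).startsWith("text/html");
        return ok && !html;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java ResponseCheck <http-or-https-url>");
            return;
        }
        // The cast assumes an http or https URL.
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(args[0]).toURL().openConnection();
        int status = conn.getResponseCode();
        String type = conn.getContentType();
        long length = conn.getContentLengthLong(); // -1 when the header is absent

        System.out.println("status=" + status + " type=" + type + " length=" + length);
        System.out.println("looks like a file: " + looksLikeFile(status, type));
        conn.disconnect();
    }
}
```

Because the content length header is optional, a copy loop should read until end of stream rather than trust the reported length.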
 
Ehud Kaldor
Greenhorn
Posts: 8

Paul Clapham wrote:Perhaps you haven't done something the browser would have done, like authenticating or returning a cookie to the site or something like that.

why is it saying it is http/text?



Where does it say that?

why is length=0?



The content length header is optional.

By the way you said you were downloading a binary file. So why are you using a Reader to copy the data? Readers are for text, not binary data.



I meant text/html. Typing without thinking.
Why do you think I am using a Reader? The InputStreamReader and BufferedReader in the imports are remnants of prior iterations and are not used in the code itself (they have already been removed from the code file).
 
Rob Spoor
Sheriff
Posts: 22783

Ehud Kaldor wrote:


That last line should be fos.write(buffer, 0, tempCount);. You've only read tempCount bytes, which may be any number between 0 and buffer.length. If it's smaller, and for the last block it probably is, you don't want to write everything in the buffer - only the newly read data.
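A minimal sketch of a copy loop with that fix applied. The names `buffer` and `tempCount` follow the ones mentioned above; the class `StreamCopy`, the `copy` helper, and the 8192-byte buffer size are assumptions made for illustration, since the original code is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {

    // Copies everything from 'in' to 'out' and returns the total byte count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int tempCount;
        // read() returns -1 at end of stream, otherwise the number of bytes
        // it placed at the start of the buffer.
        while ((tempCount = in.read(buffer)) != -1) {
            // Write only the bytes that were just read, not the whole buffer.
            out.write(buffer, 0, tempCount);
            total += tempCount;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the URL stream and the FileOutputStream.
        byte[] data = new byte[20000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        System.out.println(copied + " bytes copied, " + out.size() + " bytes written");
    }
}
```

With `write(buffer)` instead, the final partial block would be padded with stale bytes from the previous read, leaving the file longer than the source.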
 
Ehud Kaldor
Greenhorn
Posts: 8

Rob Spoor wrote:

Ehud Kaldor wrote:


That last line should be fos.write(buffer, 0, tempCount);. You've only read tempCount bytes, which may be any number between 0 and buffer.length. If it's smaller, and for the last block it probably is, you don't want to write everything in the buffer - only the newly read data.



Thanks, Rob,
but I have not reached that problem yet.
URLConnection.getContentLength() returns 0, so I get an empty file created. I tried to set the default authenticator, but to no avail.
 
Paul Clapham
Marshal
Posts: 28193
So why did you try to set an authenticator?
 
Ehud Kaldor
Greenhorn
Posts: 8

Paul Clapham wrote:So why did you try to set an authenticator?


The site is credential protected. I tried http://user:pass@site.com and tried an authenticator, and I am still getting contentLength==0. I brought that up as it might be the issue.
I tried it on another, non-protected site, pointing to the page file (http://web.site.com/index.html), and it worked. On downloadable files on the site I am getting 403. What am I doing wrong here?

 
Paul Clapham
Marshal
Posts: 28193

Ehud Kaldor wrote:

Paul Clapham wrote:So why did you try to set an authenticator?


the site is credential protected.



Okay. But the authenticator you used works with basic authentication. Does the site you are accessing use basic authentication, or does it use its own internal authentication process?
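To make the distinction concrete, here is a small sketch of what basic authentication amounts to on the wire: an `Authorization` header carrying the Base64-encoded `user:password` pair. The class name `BasicAuthSketch`, the `basicHeader` helper, and the credentials are placeholders, not values from this thread.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthSketch {

    // Builds the header value a client sends for HTTP basic authentication.
    static String basicHeader(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        // prints: Authorization: Basic dXNlcjpwYXNz
        System.out.println("Authorization: " + basicHeader("user", "pass"));

        // A default Authenticator is consulted only when the server replies
        // with an HTTP authentication challenge (401 plus WWW-Authenticate).
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication("user", "pass".toCharArray());
            }
        });
    }
}
```

A site with its own login page typically never issues that challenge. It answers with an HTML page, a redirect, or a 403, so the authenticator is never asked for credentials.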
 
Rob Spoor
Sheriff
Posts: 22783
Perhaps Apache's HttpClient is a better solution, especially if you need to login through an HTML form.
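A rough sketch of the form-login flow follows. Apache's HttpClient is a separate library, so to stay self-contained this sketch uses the JDK's built-in `java.net.http.HttpClient` (Java 11 and later), which is a different API. The login URL, file URL, form field names, and output file name are all hypothetical; the real ones would have to be read from the site's login form.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class FormLoginDownload {

    // Encodes fields the way a browser submits an HTML form
    // (application/x-www-form-urlencoded).
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: java FormLoginDownload <user> <password>");
            return;
        }
        // The CookieManager keeps the session cookie set by the login response
        // and sends it back with the download request.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", args[0]); // hypothetical field name
        fields.put("password", args[1]); // hypothetical field name

        HttpRequest login = HttpRequest.newBuilder(URI.create("https://site.example/login"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
        HttpResponse<Void> loginResponse =
                client.send(login, HttpResponse.BodyHandlers.discarding());
        System.out.println("login status: " + loginResponse.statusCode());

        HttpRequest download =
                HttpRequest.newBuilder(URI.create("https://site.example/files/image.iso"))
                        .GET().build();
        HttpResponse<Path> file =
                client.send(download, HttpResponse.BodyHandlers.ofFile(Path.of("image.iso")));
        System.out.println("download status: " + file.statusCode());
    }
}
```

The body is written to the file whatever the status code is, so the download status should be checked before trusting the result. Sites that add hidden form fields or tokens to the login form would need those submitted as well.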
 
Ehud Kaldor
Greenhorn
Posts: 8

Paul Clapham wrote:

Ehud Kaldor wrote:

Paul Clapham wrote:So why did you try to set an authenticator?


the site is credential protected.



Okay. But the authenticator you used works with basic authentication. Does the site you are accessing use basic authentication, or does it use its own internal authentication process?



Good point. It is internal.
 
Ehud Kaldor
Greenhorn
Posts: 8

Rob Spoor wrote:Perhaps Apache's HttpClient is a better solution, especially if you need to login through an HTML form.



I will give it a try.
 