
Grabbing a pdf file returned in browser

 
Al Grant
Greenhorn
Posts: 10
Evening all,

I am looking for suggestions on ways to capture 2 PDFs and send them to the default printer. This is the situation:

The Java app tests whether process xyz.exe is running (xyz.exe is a database application I have no control over; it has a list of customers) and, if so, uses send-keys to make xyz.exe query a customer.
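
The "is xyz.exe running" test can be sketched in a few lines. This is only a sketch under assumptions: it needs Java 9 or later for `ProcessHandle` (the thread does not say which JVM is in use; on an older JVM one would have to parse the output of the Windows `tasklist` command instead), and the class and method names are invented for illustration.

```java
import java.util.Locale;

final class ProcessCheck {

    /** Strips any directory part from a Windows or Unix style path. */
    static String fileName(String path) {
        int cut = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return path.substring(cut + 1);
    }

    /**
     * Returns true if some process visible to this JVM has an executable
     * whose file name equals exeName, ignoring case. Processes whose
     * command cannot be read (for example, owned by another user) are skipped.
     */
    static boolean isRunning(String exeName) {
        String wanted = exeName.toLowerCase(Locale.ROOT);
        return ProcessHandle.allProcesses()
                .map(handle -> handle.info().command().orElse(""))
                .map(ProcessCheck::fileName)
                .anyMatch(name -> name.toLowerCase(Locale.ROOT).equals(wanted));
    }

    public static void main(String[] args) {
        // "xyz.exe" is the placeholder name used in the post.
        System.out.println("xyz.exe running: " + isRunning("xyz.exe"));
    }
}
```

Matching on the file name alone is deliberately loose; if several programs could share the name, comparing the full path would be safer.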

From this query a few more keys are sent to request a report. Application xyz.exe sends a request to another server and, I think, using a certificate and some other complicated security stuff I am not privy to, then gets the report as a PDF, which is displayed in an IE browser window that it opens.

Note the URL of the report doesn't contain any ID numbers, so it can't be predicted that way.

To capture the PDF, I see two ways I could try:
1. A Java applet listens on port 80 to pull the PDF out of the HTTP traffic.
I am not sure how I would know where the file starts and ends, and this way could be quite complicated, since it would need to let all the traffic through except the file.

2. Manipulate IE with one of the libraries to grab the file from the IE browser window.
Not as much fun as playing with sockets but might be easier?

After this PDF is saved/printed, the process is repeated for a second report.
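
Once a PDF is on disk, the printing half is comparatively simple. A minimal sketch, assuming Java 6 or later and a PDF reader that has registered a print action with the operating system; `java.awt.Desktop.print` hands the file to that associated application, which in practice usually means the default printer. The class name, method name and file path are hypothetical.

```java
import java.awt.Desktop;
import java.io.File;
import java.io.IOException;

final class PdfPrinter {

    /**
     * Asks the operating system to print the file with whatever application
     * is registered for it. Returns false if the file is missing, the
     * desktop integration is unavailable, or the print request fails.
     */
    static boolean printPdf(File pdf) {
        if (!pdf.isFile()) {
            return false; // nothing to print
        }
        if (!Desktop.isDesktopSupported()) {
            return false; // e.g. a headless JVM
        }
        Desktop desktop = Desktop.getDesktop();
        if (!desktop.isSupported(Desktop.Action.PRINT)) {
            return false; // platform offers no print action
        }
        try {
            desktop.print(pdf);
            return true;
        } catch (IOException e) {
            return false; // no associated application, or it failed to start
        }
    }

    public static void main(String[] args) {
        // Hypothetical path; pass the real one as the first argument.
        File report = new File(args.length > 0 ? args[0] : "report1.pdf");
        System.out.println("print requested: " + printPdf(report));
    }
}
```

A `true` result only means the request was handed over; the sketch cannot tell whether paper actually came out.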

Love to hear thoughts on how to tackle this.

Regards,

-Al
 
Tim Holloway
Saloon Keeper
Posts: 18792
Welcome to the JavaRanch, Al!

I think you have some confusion there. First and foremost, you seem to be saying that a web browser is fetching a PDF and displaying it in a browser window. Unless someone has changed something I don't know about, that cannot happen if the browser in question is Internet Explorer. Microsoft was sued for half a billion dollars over that capability and lost, and was required many years ago to modify IE so that instead of displaying PDFs in an IE window, PDFs could only be downloaded or opened in a PDF reader application.

Secondly, you seem to be expecting to run an applet as a web server (that's basically what listening on port 80 is). There are 2 problems with that idea. First, unsigned applets cannot open ports to listen on. You'd have to sign the applet and then manually enable it for each client.

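Secondly - and more importantly - ports below 1024 are considered "magic": on Unix-like systems they cannot be opened by any program unless it runs with root privileges. Windows does not enforce the same rule, but port 80 may already be held by another service, and a firewall or security policy may still stop an ordinary user from listening on it.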
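
Whether a given account may listen on a given port is easiest to settle by trying it on the machine in question. A minimal sketch, assuming a plain `java.net.ServerSocket` bound to the loopback address; the class and method names are made up for illustration, and it says nothing about the extra sandbox restrictions an applet would face.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

final class PortProbe {

    /**
     * Tries to open a listening socket on the loopback interface and closes
     * it again immediately. Returns false if the bind is refused, whether
     * because the port is in use or because the user lacks permission.
     * Port 0 asks the operating system for any free port.
     */
    static boolean canListen(int port) {
        try (ServerSocket probe = new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return probe.isBound();
        } catch (IOException | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("port 80:   " + canListen(80));
        System.out.println("port 8080: " + canListen(8080));
    }
}
```

Even where the bind succeeds, a listener on port 80 only sees connections addressed to it; it does not see traffic that the browser sends to some other server.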

As for figuring out where the document data boundaries are, that's the least of your worries. When you transmit a file's contents via HTTP as a form upload, the data is sent as a multipart MIME body and the standard mandates boundary markers around each part. When you make an HTTP request to a web server and the response is a PDF, you'll get an HTTP response back that normally includes a Content-Length header telling you how long the document is, and you get an end-of-data when reading the response stream. End-of-data is the official notice that the document has been received, because not all apps know the content length before the document has been sent (for example, if the document is being produced on-the-fly). As an aside, however, sending PDFs without a valid Content-Length header has been known to annoy web clients, so it's a good idea for the server to provide one.
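
To make the Content-Length versus end-of-data point concrete, here is a minimal sketch of reading a response body from an already opened stream. It assumes the stream carries only the body (as `URLConnection.getInputStream()` does) and that a negative length means "not announced"; `BodyReader` and `readBody` are illustrative names, not part of any library.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class BodyReader {

    /**
     * Reads a response body. If contentLength is zero or more, exactly that
     * many bytes are read and an early end-of-stream is reported as an error.
     * If contentLength is negative (unknown), everything up to end-of-stream
     * is treated as the document.
     */
    static byte[] readBody(InputStream in, long contentLength) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        boolean lengthKnown = contentLength >= 0;
        long remaining = contentLength;
        while (!lengthKnown || remaining > 0) {
            int max = lengthKnown ? (int) Math.min(buffer.length, remaining) : buffer.length;
            int n = in.read(buffer, 0, max);
            if (n == -1) {
                if (lengthKnown) {
                    throw new EOFException("stream ended with " + remaining + " bytes still expected");
                }
                break; // unknown length: end-of-stream marks the end of the document
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }
}
```

With a `java.net.URLConnection` named `conn`, the call would look like `readBody(conn.getInputStream(), conn.getContentLengthLong())`, since `getContentLengthLong()` (Java 7 and later) returns -1 when the server sent no length.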
 
Dave Tolls
Ranch Foreman
Posts: 3056
Microsoft Edge is not only a browser...it is also the default PDF viewer in Windows 10.
 
Al Grant
Greenhorn
Posts: 10
Hi Tim

Thanks for the detailed reply.

Tim Holloway wrote: Welcome to the JavaRanch, Al!

I think you have some confusion there. First and foremost, you seem to be saying that a web browser is fetching a PDF and displaying it in a browser window. Unless someone has changed something I don't know about, that cannot happen if the browser in question is Internet Explorer. Microsoft was sued for half a billion dollars over that capability and lost, and was required many years ago to modify IE so that instead of displaying PDFs in an IE window, PDFs could only be downloaded or opened in a PDF reader application.



A little more information: we are still on XP with IE7. I just checked and I am definitely viewing PDFs in IE, but it's via an Adobe Reader plugin by the looks of it.

Tim Holloway wrote:

Secondly, you seem to be expecting to run an applet as a web server (that's basically what listening on port 80 is). There are 2 problems with that idea. First, unsigned applets cannot open ports to listen on. You'd have to sign the applet and then manually enable it for each client.

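Secondly - and more importantly - ports below 1024 are considered "magic": on Unix-like systems they cannot be opened by any program unless it runs with root privileges. Windows does not enforce the same rule, but port 80 may already be held by another service, and a firewall or security policy may still stop an ordinary user from listening on it.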

As for figuring out where the document data boundaries are, that's the least of your worries. When you transmit a file's contents via HTTP as a form upload, the data is sent as a multipart MIME body and the standard mandates boundary markers around each part. When you make an HTTP request to a web server and the response is a PDF, you'll get an HTTP response back that normally includes a Content-Length header telling you how long the document is, and you get an end-of-data when reading the response stream. End-of-data is the official notice that the document has been received, because not all apps know the content length before the document has been sent (for example, if the document is being produced on-the-fly). As an aside, however, sending PDFs without a valid Content-Length header has been known to annoy web clients, so it's a good idea for the server to provide one.


So it sounds like that approach is quite hard, and potentially impossible given the information about port numbers.

What about interacting with the Adobe Reader plugin?

Regards,

-Al
 