
How to improve file access

 
Nakataa Kokuyo
Ranch Hand
Posts: 193
Good day,

I'm working on a legacy portal that lets users view activity announcements. The announcement section starts from a landing page image (about 1 MB). When a user clicks the image, the portal shows a PDF file (multiple pages, mostly converted from PowerPoint slides).

Each PDF can reach roughly 5-10 MB, and performance becomes horrible, or the server sometimes halts, when many users access the same announcement or PDF content concurrently.

The application serves a few hundred users, who tend to access it at the same time when an activity is announced, and it was developed with the Struts framework.

My intention is to fine-tune this legacy application, and I'm seeking guru advice on the best approach, especially how to handle file size reduction or a better way to deliver this kind of announcement.

PS: A CDN is not allowed because the content is confidential.

Many thanks for your advice!
 
Stephan van Hulst
Bartender
Posts: 15743
First, you need to perform a stress test on your application, and profile it to see which part of it is causing the delay. There is no point in reducing the file size if the delay is caused by something unrelated.

Regardless, if your problem IS caused by a combination of file size and number of concurrent requests, a first step might be to try a higher compression ratio for your PDF. There is a limit to how much you can compress data before it becomes lossy though.

Finally, it might just be an issue of server resources or limited bandwidth. If you can't host your file in the cloud, then you must invest money to scale up network and server capabilities.

Note that hosting in the cloud does not necessarily mean hosting your file on a CDN without any form of protection. You can write a small application that does nothing more than authenticate a user, and then serve the file. The application can then be hosted in the cloud, and redirected to from your legacy application.
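
To illustrate the "authenticate, then serve the file" idea, here is a minimal sketch of such a servlet, assuming the classic javax.servlet API (as used by older Struts/Tomcat stacks) and a hypothetical "user" session attribute set by your existing login; adapt both to whatever your portal really uses.

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Minimal sketch: authenticate the caller, then stream one announcement PDF.
public class SecurePdfServlet extends HttpServlet {

    private static final File PDF_DIR = new File("/opt/portal/announcements"); // hypothetical location

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        HttpSession session = req.getSession(false);
        if (session == null || session.getAttribute("user") == null) {
            resp.sendError(HttpServletResponse.SC_UNAUTHORIZED);
            return;
        }

        // Accept only plain file names, so nobody can walk the file system.
        String name = req.getParameter("file");
        if (name == null || !name.matches("[\\w-]+\\.pdf")) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST);
            return;
        }

        File pdf = new File(PDF_DIR, name);
        if (!pdf.isFile()) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }

        resp.setContentType("application/pdf");
        resp.setContentLength((int) pdf.length());
        try (InputStream in = new BufferedInputStream(new FileInputStream(pdf));
             OutputStream out = resp.getOutputStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }
}

The legacy application would then simply redirect authenticated users to this servlet's URL instead of serving the bytes itself.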
 
Nakataa Kokuyo
Ranch Hand
Posts: 193

Stephan van Hulst wrote:First, you need to perform a stress test on your application, and profile it to see which part of it is causing the delay. ...

Thanks a lot Stephan, very detailed.

For file compression, would it be advisable to convert the PDF to JPG instead? Would the size and quality be better than a compressed PDF?

Thanks again!!
 
Stephan van Hulst
Bartender
Posts: 15743
How do you figure?

If your PDF consists mostly of text, there is no way a JPG would be smaller than a well-compressed PDF.

If your PDF consists mostly of images, then converting it to JPG could reduce its size, but then you might consider just reducing the quality of the images inside the PDF instead, and then once again keeping the file as PDF.

However, step 1 is identifying what is causing the latency in the first place. If you don't do this first, reducing the PDF file size is just like slapping a band-aid on a big gaping wound.
 
Tim Holloway
Saloon Keeper
Posts: 28748

Stephan van Hulst wrote:If your PDF consists mostly of text, there is no way a JPG would be smaller than a well-compressed PDF.



I would not guarantee that myself. Depends on the document and its format.

Still, the #1 rule for optimization is NEVER optimize what you ASSUME is the problem. Always measure to find out where the problem really is. You will often be surprised!
 
Nakataa Kokuyo
Ranch Hand
Posts: 193
Thanks Stephan and Tim for your warm replies.

The root cause is network latency. We have requested an infrastructure upgrade, but that normally takes a long time.

In the meantime, is it possible for us to limit the maximum number of users accessing the same file concurrently? How could the code handle this situation?

Thanks again.
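
One way code could cap concurrent access to the same file is a counting semaphore per file name. A rough sketch, with hypothetical names and limits, assuming requests over the limit should be refused rather than queued forever:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Semaphore;

// Rough sketch: allow at most N simultaneous downloads of any single file.
public final class DownloadThrottle {

    private static final int MAX_CONCURRENT_PER_FILE = 20; // hypothetical limit
    private static final ConcurrentMap<String, Semaphore> PERMITS = new ConcurrentHashMap<>();

    private DownloadThrottle() {
    }

    // Returns true if the caller may stream the file now; callers that get
    // true must call release() in a finally block when the download ends.
    public static boolean tryAcquire(String fileName) {
        return PERMITS
                .computeIfAbsent(fileName, f -> new Semaphore(MAX_CONCURRENT_PER_FILE))
                .tryAcquire();
    }

    public static void release(String fileName) {
        Semaphore permits = PERMITS.get(fileName);
        if (permits != null) {
            permits.release();
        }
    }
}

The download action would wrap its streaming code in tryAcquire/release and answer 503 Service Unavailable when tryAcquire returns false, though as the replies below point out, caching attacks the cause rather than rationing the symptom.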
 
Tim Holloway
Saloon Keeper
Posts: 28748
Network latency is something that you need to take up with your network administrator. Since most networks run gigabit or faster rates in-house these days, actually transferring 5-10MB shouldn't be an unreasonable delay. Heck, these days, just ordinary web page viewing can easily suck down that much data for each and every desktop in the shop*. Which is a sad commentary on most websites, but nonetheless true.

But if you are repeatedly constructing the document from scratch, THAT can hurt. It can take a relatively long time to assemble a PDF or image from scratch, so avoid doing that. Especially if multiple requesters are demanding the CPU at the same time to create the same thing at the same time. Create it once and then cache it. There are all sorts of cache mechanisms available, both for in-memory caching and online storage caching. For good measure, make sure that your document response requests client-side caching as well.

---
* I just did a quick check of a typical Ranch page. 1.58MB for the one page. Fortunately, a lot of it is client cached.
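
To illustrate the "create it once and then cache it" idea, here is a minimal sketch of an in-memory cache in front of PDF generation, assuming the document can be rebuilt from an announcement id by a hypothetical buildPdf method; a real deployment would bound the cache (for example with Ehcache or Caffeine) rather than use an unbounded map.

import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal sketch: build each announcement PDF once, then serve it from memory.
public class CachedAnnouncementServlet extends HttpServlet {

    // Generated PDFs keyed by announcement id.
    private final ConcurrentMap<String, byte[]> cache = new ConcurrentHashMap<>();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String id = req.getParameter("id");
        if (id == null) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST);
            return;
        }

        // computeIfAbsent runs the expensive build at most once per id,
        // even when many users request the same announcement at the same time.
        byte[] pdf = cache.computeIfAbsent(id, this::buildPdf);

        // Ask the browser to keep its own copy, so repeat views skip the server.
        resp.setHeader("Cache-Control", "private, max-age=3600");
        resp.setContentType("application/pdf");
        resp.setContentLength(pdf.length);
        try (OutputStream out = resp.getOutputStream()) {
            out.write(pdf);
        }
    }

    // Hypothetical placeholder: plug in whatever currently assembles the PDF.
    private byte[] buildPdf(String id) {
        throw new UnsupportedOperationException("PDF generation goes here");
    }
}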
 
Paul Clapham
Marshal
Posts: 28425

Tim Holloway wrote:But if you are repeatedly constructing the document from scratch...



I'm sure this must be the case, although the OP didn't say so. Otherwise why would there be a problem with concurrent downloads by different users?
 
Tim Holloway
Saloon Keeper
Posts: 28748

Paul Clapham wrote:I'm sure this must be the case, although the OP didn't say so. Otherwise why would there be a problem with concurrent downloads by different users?

Very probably, though whether regenerating is actually necessary (for example, because the documents are customized per user) is also not known here.

But I've learned not to take certain assumptions for granted, so I wanted to make the point plain.
 
Nakataa Kokuyo
Ranch Hand
Posts: 193

Tim Holloway wrote:Network latency is something that you need to take up with your network administrator. ... Create it once and then cache it. ...

Thanks Tim and all,

Very good point. I need to explore a caching mechanism. Currently, if 1,000 users access the same 5 MB PDF (converted from PowerPoint slides, containing images and text), the system just halts because the network cannot handle roughly 5 GB of traffic (5 MB x 1,000), and Tomcat has to be restarted to release the connections.
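
Since those 1,000 users are all pulling identical bytes, conditional GET support is also worth exploring alongside the server-side cache: once a browser has the file, a repeat request costs a small 304 response instead of another 5 MB. A rough sketch, assuming the PDF bytes are already held in memory (the hash-based ETag is illustrative only; any stable version identifier works):

import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Rough sketch: answer 304 Not Modified when the browser already has this PDF version.
public final class ConditionalPdfWriter {

    private ConditionalPdfWriter() {
    }

    public static void write(HttpServletRequest req, HttpServletResponse resp, byte[] pdf)
            throws IOException {
        // Illustrative ETag: any identifier that changes when the content changes will do.
        String etag = "\"" + Integer.toHexString(Arrays.hashCode(pdf)) + "\"";
        resp.setHeader("ETag", etag);
        resp.setHeader("Cache-Control", "private, max-age=3600");

        // The browser already holds this version: send headers only, no body.
        if (etag.equals(req.getHeader("If-None-Match"))) {
            resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return;
        }

        resp.setContentType("application/pdf");
        resp.setContentLength(pdf.length);
        try (OutputStream out = resp.getOutputStream()) {
            out.write(pdf);
        }
    }
}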
 