
Question on correct use of I/O stream objects

 
Kuldeep Tewari
Ranch Hand
Posts: 35
Hi,
I'm using Apache FOP to generate a PDF document from XML and XSL data that are stored in a database table. I pass the InputStreams connected to these database columns as arguments to the method that does the actual conversion.
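
The conversion code is roughly along these lines (a simplified sketch assuming FOP 1.x and JAXP; the class and method names are just placeholders, and the real method differs in the details):

    import java.io.BufferedInputStream;
    import java.io.InputStream;
    import java.io.OutputStream;

    import javax.xml.transform.Result;
    import javax.xml.transform.Source;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.sax.SAXResult;
    import javax.xml.transform.sax.SAXSource;

    import org.apache.fop.apps.Fop;
    import org.apache.fop.apps.FopFactory;
    import org.apache.fop.apps.MimeConstants;
    import org.xml.sax.InputSource;

    public class PdfRenderer {

        // xmlIn and xslIn are the InputStreams read from the database columns,
        // pdfOut is where the generated PDF goes.
        public void renderPdf(InputStream xmlIn, InputStream xslIn, OutputStream pdfOut)
                throws Exception {
            // Wrap the raw streams in BufferedInputStreams (default buffer size)
            BufferedInputStream bufIn = new BufferedInputStream(xmlIn);
            BufferedInputStream bufXsl = new BufferedInputStream(xslIn);

            FopFactory fopFactory = FopFactory.newInstance();
            Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, pdfOut);

            // The buffered streams are passed to the InputSource constructors
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new SAXSource(new InputSource(bufXsl)));
            Source src = new SAXSource(new InputSource(bufIn));
            Result res = new SAXResult(fop.getDefaultHandler());

            transformer.transform(src, res);
        }
    }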



Now, when I run this code on my local machine, it works fine for all file sizes. But when it is deployed to our dev server (a separate machine), it renders the PDF very slowly for file sizes > 100 KB (the local deployment is almost 10-15 times faster). The database server (another separate machine) is shared by both deployments. I suspect that the slow rendering may be related to how I'm using the I/O stream objects in my code.

So, I'm thinking of doing either of these two changes:

A) Create the BufferedInputStream objects (bufIn and bufXsl) with a bigger buffer size (say 32768) specified in the constructors; the default buffer size is 8192, I believe. (Both options are sketched in code after this list.)

or

B) Don't create the BufferedInputStream objects (bufIn and bufXsl) at all, and pass the raw InputStream objects (xmlIn and xslIn) directly to the InputSource constructors.
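
In code, the two options would look something like this (same imports as the sketch above; just illustrative):

    // Option A: keep the buffering layer, but give it a bigger explicit buffer (32 KB)
    BufferedInputStream bufIn  = new BufferedInputStream(xmlIn, 32768);
    BufferedInputStream bufXsl = new BufferedInputStream(xslIn, 32768);
    Source xmlSrc = new SAXSource(new InputSource(bufIn));
    Source xslSrc = new SAXSource(new InputSource(bufXsl));

    // Option B: drop the buffering layer and pass the raw streams straight through
    Source xmlSrcRaw = new SAXSource(new InputSource(xmlIn));
    Source xslSrcRaw = new SAXSource(new InputSource(xslIn));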


Please advise whether either of the above would be helpful.

Any help will be greatly appreciated.
 
Joe Ess
Bartender
Posts: 9406
K Tewari wrote: when deployed to our dev server (a separate machine), it renders the PDF very slowly for file sizes > 100 KB (the local deployment is almost 10-15 times faster).


The first rule of optimization is not to do it.
The second rule is to identify the bottleneck and concentrate on that. Changing random code is not likely to solve the problem, and may make things worse.
Are the specifications of the two machines similar?
When you say "local deployment" are you testing on the same machine you are running the server on?

 
Kuldeep Tewari
Ranch Hand
Posts: 35
Joe, thanks for your reply.

Joe Ess wrote: The first rule of optimization is not to do it.
The second rule is to identify the bottleneck and concentrate on that. Changing random code is not likely to solve the problem, and may make things worse.

I don't have any profiling tools; I just tried things such as looking at the heap size before and after the method call, the time taken in the method execution, and so on. But the problem is that it doesn't get slow on my development machine, so this information is not very useful, and the dev server where it runs slowly is being used for UAT right now, so I can't deploy the app with these log messages in it at the moment.
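
The measurement was just something crude around the conversion call, roughly like this (renderPdf stands in for the actual conversion method; no profiler involved):

    Runtime rt = Runtime.getRuntime();
    long heapBefore = rt.totalMemory() - rt.freeMemory();
    long start = System.nanoTime();

    renderPdf(xmlIn, xslIn, pdfOut);   // the FOP conversion

    long elapsedMillis = (System.nanoTime() - start) / 1000000L;
    long heapDeltaKb = ((rt.totalMemory() - rt.freeMemory()) - heapBefore) / 1024;
    System.out.println("conversion: " + elapsedMillis + " ms, heap delta ~" + heapDeltaKb + " KB");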

Joe Ess wrote: Are the specifications of the two machines similar?

No. The dev-server is much more powerful than my local dev machine.

Joe Ess wrote: When you say "local deployment" are you testing on the same machine you are running the server on?

Most of the time, yes. But even when we access this deployment from other machines on the intranet, it works fine.


BTW, do the code changes I thought of making make any sense to you?
 
Joe Ess
Bartender
Posts: 9406
K Tewari wrote: No. The dev-server is much more powerful than my local dev machine.

But what are the specifications? Are there any other processes running on this production server? What's the load like?
I've noticed that my Solaris servers, though very powerful, are fairly stingy about allocating a single process a lot of CPU time. That makes them appear slow compared to my Windows laptop when I'm testing, but when serving hundreds of users my Windows machine would burn up, whereas the Solaris machine would churn happily along at 25% load.

K Tewari wrote: BTW, do the code changes I thought of making make any sense to you?


My gut feeling is: probably not. That large buffer still has to be crammed through some tight bottlenecks, like being written to a disk or network connection. Neither of them is going to be able to handle data in chunks much larger than the default buffer size.
 
Kuldeep Tewari
Ranch Hand
Posts: 35
Thanks Joe,
Maybe you are right that the location, the load, and the other processes running on the dev server need to be taken into account. Still, I would like to give it a try.

Also, could you please explain the difference (in terms of performance and memory) between my two planned approaches, for, say, a file size of several hundred KB?

 
Joe Ess
Bartender
Posts: 9406
Probably negligible.
There's a big difference between not using a buffer and using one, but not so much between one buffer size and another.
For example, the worst case for writing 100 KB without a buffer is 100k writes. If you add a 4 KB buffer, your worst case is 25 writes. If you increase the buffer to 30 KB, your worst case is 4 writes. As you can see, there's a big difference between 100k writes and 25, but not so much between 25 and 4.
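
You can see those numbers for yourself with a little sketch that counts how many write() calls actually reach the underlying stream (the class names here are made up; writing one byte at a time is the worst case):

    import java.io.BufferedOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    public class BufferDemo {

        // Pretends to be the slow device: it only counts the write() calls that reach it.
        static class CountingSink extends OutputStream {
            int writes = 0;
            @Override public void write(int b) { writes++; }
            @Override public void write(byte[] b, int off, int len) { writes++; }
        }

        // Pushes ~100 KB through the given stream one byte at a time, then reports
        // how many writes the sink underneath actually saw.
        static int worstCaseWrites(OutputStream out, CountingSink sink) throws IOException {
            for (int i = 0; i < 100 * 1024; i++) {
                out.write(7);
            }
            out.flush();
            return sink.writes;
        }

        public static void main(String[] args) throws IOException {
            CountingSink raw = new CountingSink();
            System.out.println("no buffer   : " + worstCaseWrites(raw, raw) + " writes");

            CountingSink small = new CountingSink();
            System.out.println("4 KB buffer : "
                    + worstCaseWrites(new BufferedOutputStream(small, 4 * 1024), small) + " writes");

            CountingSink big = new CountingSink();
            System.out.println("30 KB buffer: "
                    + worstCaseWrites(new BufferedOutputStream(big, 30 * 1024), big) + " writes");
        }
    }

It should print roughly the 100k / 25 / 4 figures above.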
 
Kuldeep Tewari
Ranch Hand
Posts: 35
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks Joe.

You are right.

 