
IO Performance

 
Steve Grant
Ranch Hand
Posts: 106
Dear Sir,
I am using the BufferedOutputStream class to write a byte array to a file. Is there a more efficient way to write to a file in Java?
I have my own class called VFSIOService, which has the methods addFile and copyFile that are responsible for writing and copying. This class is called by a stateless session EJB, VFSFacade. Many users will upload their files by calling the addFile method on the EJB, and will also copy their files from source to destination.
The addFile method receives a byte array. Here I am using BufferedOutputStream to write this byte array to a file on the hard disk. Is this the right approach, or is there another approach that could improve performance? Following is the sample code of addFile:
void addFile(FileTO fileTO) throws Exception
{
    BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream("c:/tmp.txt"));
    try {
        byte[] b = fileTO.getFileBytes();
        out.write(b);
    } finally {
        out.close(); // release the file handle even if write() fails
    }
}
Similarly, I am using BufferedInputStream and OutputStream to copy a file from source to destination. My project will run on a Linux system, so I was thinking about using the Runtime.exec method to call the Linux cp (copy) command, which would be responsible for copying the file from source to destination. I thought of this because it should be much faster than using BufferedInputStream and OutputStream. Is there any harm in using this approach, and in using Runtime.exec, when multiple users will be calling the above methods?
thx & rgds
Siddharth K
 
Ilja Preuss
author
Sheriff
Posts: 14112
If you are using JDK1.4+, using Java NIO could give you better performance. See http://java.sun.com/j2se/1.4.2/docs/guide/nio/
 
David Harkness
Ranch Hand
Posts: 1646
Originally posted by Siddharth Kirad:
I am using the BufferedOutputStream class to write a byte array to a file. Is there a more efficient way to write to a file in Java?

Buffered streams are useful when you'll be reading and writing small bits at a time. The performance improvement comes by holding the data from several write calls in a buffer and making one write to the underlying stream. The same is done for reads, only the first call reads a larger chunk and hangs onto the result, doling it out to each subsequent read until the buffer is exhausted, at which point it reads again.
Since you're writing the entire contents of the file in one call, buffering is actually slowing you down. You call write(buf) which copies buf into its own internal buffer, and then writes that buffer to the file. Since the whole point of buffering is turning several write calls into one, it cannot help the case of a single write call.
The same goes for copying if you're reading the entire file in one go. If you were performing a block copy (copy the file in n-size chunks one at a time), buffering would help if you chose a small block size and a larger buffer size (e.g. 1k and 4k). In your case, however, you're being hindered by the buffer in the same way as for writing: you're creating an unnecessary array allocation and copy operation.
In fact it's worse than this. You're actually turning your one write call into multiple write calls, unless you're using a buffer at least as large as the file, which you probably aren't. It takes your full array, copies it a chunk at a time into its own buffer, writes each chunk out, and continues.
Since you have the whole file, let the file system work out how best to write it to disk.
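A minimal sketch of the unbuffered write suggested above. It assumes a Java 7+ runtime for try-with-resources (on an older JDK, a try/finally that calls close() would be the equivalent). The class name DirectWriteSketch, the method writeAll and the temp-file target are made up for illustration and are not part of the original VFSIOService.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class DirectWriteSketch {
    // Hands the whole array to an unbuffered FileOutputStream in one write() call.
    static void writeAll(File target, byte[] data) throws IOException {
        // try-with-resources closes the stream even if write() throws.
        try (OutputStream out = new FileOutputStream(target)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical target; a real addFile would derive the path from its input.
        File target = File.createTempFile("addFile", ".bin");
        writeAll(target, "file contents".getBytes(StandardCharsets.UTF_8));
        System.out.println(target.length() + " bytes written to " + target);
        target.delete();
    }
}
```

Whether this is measurably faster than the buffered version depends on file sizes and the platform, so it is worth timing both.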
 
Steve Grant
Ranch Hand
Posts: 106
Dear Sir,
I didn't get your point, because I am calling the same write(byte[], offset, len) method of the BufferedOutputStream class, which does the writing.
Do you mean that I should increase the size of the buffer, i.e. the protected buf field in java.io.BufferedOutputStream?
Following is the addFile and copyFile code:

thx,
siddharth K
[Added [ code ] tags - Jim]
[ December 28, 2003: Message edited by: Jim Yingst ]
 
Steve Grant
Ranch Hand
Posts: 106
Dear Sir,
Do you mean that I should have a buffer whose size equals the size of the file that is to be written to the hard disk?
thx ,
siddharth
 
David Harkness
Ranch Hand
Posts: 1646
Originally posted by Siddharth Kirad:
I didn't get your point, because I am calling the same write(byte[], offset, len) method of the BufferedOutputStream class, which does the writing.
Do you mean that I should increase the size of the buffer, i.e. the protected buf field in java.io.BufferedOutputStream?

I looked into the source code for FileOutputStream and BufferedOutputStream, and I was incorrect in my analysis but still arrived at the correct solution. As it turns out, BOS will bypass its internal buffer if the number of bytes to write from the byte array you give it is larger than its buffer. Since you are writing the entire contents of the file in one call, there are two possibilities:
  • The file is smaller than the buffer size. In this case, the file contents are copied from the array to the buffer and then the buffer is written to the file.
  • The file is larger than the buffer size. In this case, the file contents are written directly to the file from the byte array, bypassing the buffer.

    In the first case, you suffer a needless byte array copy operation, whereas in the second case there is no penalty. Since neither case gives you a gain, and both cases require creating a BOS and its buffer, you're hurting performance slightly with no benefit. I haven't looked at the code for BufferedInputStream, but I'd bet it's the same.
    So in the end, buffering only helps if you're reading/writing chunks that are smaller than the buffer size.
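One way to see the two cases for yourself is to count how many writes actually reach the underlying stream. The sketch below is only an illustration: CountingSink, underlyingWrites and the tiny 16-byte buffer are invented for the demo, and it assumes a Java 7+ runtime for try-with-resources.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class BufferBypassSketch {
    // Hypothetical sink that discards data and counts the write calls it receives.
    static class CountingSink extends OutputStream {
        int writes = 0;
        @Override public void write(int b) { writes++; }
        @Override public void write(byte[] b, int off, int len) { writes++; }
    }

    // Pushes `total` bytes through a BufferedOutputStream in pieces of `chunk`
    // bytes (total is assumed to be a multiple of chunk) and returns how many
    // write calls reached the underlying sink.
    static int underlyingWrites(int total, int chunk, int bufSize) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink, bufSize)) {
            byte[] data = new byte[chunk];
            for (int written = 0; written < total; written += chunk) {
                out.write(data, 0, chunk);
            }
        } // close() flushes whatever is still sitting in the buffer
        return sink.writes;
    }

    public static void main(String[] args) throws IOException {
        // Eight 4-byte writes are coalesced by the 16-byte buffer.
        System.out.println("small chunks: " + underlyingWrites(32, 4, 16));
        // One 32-byte write is larger than the buffer and is passed straight through.
        System.out.println("one big write: " + underlyingWrites(32, 32, 16));
    }
}
```

With these numbers the eight small writes should collapse into two underlying writes, while the single large write reaches the sink as one call, so the buffer adds nothing in that case.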
    Now a separate warning. In your code you have

    If START_OFFSET is anything other than zero, you'll get an ArrayIndexOutOfBoundsException as the third parameter to write() is the number of bytes to write -- not the ending index. Say the array is ten bytes and START_OFFSET is 5. You'll be trying to write bytes 5 through 14 of the array, yet the array only has indexes 0 through 9. It should be fixed as

    Also, if the method fileTO.getFileBytes() does anything other than simply return a member array (for example, it reads the file's contents), you will want to store the result into a temporary variable instead of calling the method twice. To wit:
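A small sketch of both points, under stated assumptions: the value 5 for START_OFFSET comes from the ten-byte example above, the parameter data stands in for the temporary variable holding the result of fileTO.getFileBytes(), and a ByteArrayOutputStream replaces the file stream so the demo needs no disk access. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

class OffsetLengthSketch {
    static final int START_OFFSET = 5; // hypothetical value from the example above

    // Correct call: the third argument is the number of bytes to write,
    // so it must be the array length minus the starting offset.
    static int writeFromOffset(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(data, START_OFFSET, data.length - START_OFFSET);
        return out.size();
    }

    // Buggy call: passing data.length as the count asks for bytes that lie
    // past the end of the array. Returns true if the call is rejected.
    static boolean wrongLengthFails(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            out.write(data, START_OFFSET, data.length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass of this exception.
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[10]; // fetched once and reused, not re-read per call
        System.out.println("bytes written: " + writeFromOffset(data));
        System.out.println("wrong length rejected: " + wrongLengthFails(data));
    }
}
```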

    Is that clearer?
     
    Steve Grant
    Ranch Hand
    Posts: 106
    Dear Sir,
    I got your point. So if I set the buffer size to 0, it will bypass the BufferedOutputStream's buffer. So I feel this code for doing the IO is OK.
    thx,
    siddharth K
     
    Ilja Preuss
    author
    Sheriff
    Posts: 14112
    This is still inefficient: the OS will read the file content into its own buffer, you will copy it to your program's byte array, and then copy it to the OS's buffer for the output stream. Using NIO, you wouldn't need the intermediate array, if I remember correctly.
     
    Steve Grant
    Ranch Hand
    Posts: 106
    Dear Sir,
    My fileTO.getBytes() just returns the byte array that is held in the byte array variable of the FileTO class. This byte array will have the file contents in it, which will be passed by the client. And yes, you may be right that using NIO will improve performance, but we are using WebSphere 5.0, which supports only JDK 1.3 and will not allow me to use the NIO package. So I am sticking with java.io. I was thinking of using FileOutputStream, which will write the contents directly to the file, because I get the whole byte array from the client.
    Thx & rgds ,
    siddharth K
     
    David Harkness
    Ranch Hand
    Posts: 1646
    Originally posted by Siddharth Kirad:
    I was thinking of using FileOutputStream, which will write the contents directly to the file, because I get the whole byte array from the client.

    Yes, since you have the entire array, stick to using FileInputStream and FileOutputStream and bypass the buffered streams.
     
    Steve Grant
    Ranch Hand
    Posts: 106
    dear sir,
    I was thinking of using a C or C++ library instead of Java IO to write the byte array. Will it give me better performance than Java IO? My project will be implemented on a SAN (storage area network).

    thx & rgds,
    siddharth
     
    William Brogden
    Author and all-around good cowpoke
    Rancher
    Posts: 13078
    I can't imagine any reason why the c or c++ version would be faster. The JVM just calls the operating system write routines so the Java overhead is tiny. Look at the source code for the various java.io classes and you will see code like:

    Bill
     
    Dmitry Melnik
    Ranch Hand
    Posts: 328
    I was told that random access file I/O on Windows (NT and later) works noticeably faster when implemented using file-mapping API calls rather than regular file I/O API calls. This might be one of the reasons. I have not verified this myself, though, so take it as another rumor.
     
    Ilja Preuss
    author
    Sheriff
    Posts: 14112
    Originally posted by Dmitry Melnik:
    I was told that random access file I/O on Windows (NT and later) works noticeably faster when implemented using file-mapping API calls rather than regular file I/O API calls.

    That is in fact how the NIO API works, if I understand correctly - by memory mapping.
     
    Jim Yingst
    Wanderer
    Sheriff
    Posts: 18671
    Memory-mapping files is a major feature supported by NIO, but it's not the basis for all NIO, or even all file-based NIO. Another major reason NIO performance can be notably faster than traditional IO is that NIO supports the use of ByteBuffers rather than byte[] arrays. These can use native code to allocate memory in contiguous blocks outside the JVM. This can be advantageous because the native code in Channels can access this memory directly rather than going through the JVM, which often means one less memcopy, and greater opportunity to take advantage of platform-specific optimizations. Memory mapping can make things even faster, but it also incurs more overhead to set up in the first place; it's generally recommended for big files, and/or files that you will be accessing a lot. If instead you're dealing with lots of small or medium-size files, it can be better to forgo the FileChannel's map() operation, and just use plain ByteBuffers. Now where exactly this cutoff is can vary greatly from system to system, application to application - best to test and see which is faster, when performance really matters.
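To make the two options concrete, here is a rough sketch that reads the same file once through a plain direct ByteBuffer and once through FileChannel.map(). It assumes Java 7+ (java.nio.file) and files small enough to fit in an int-sized buffer; the class and method names are invented for the example, and it says nothing about which variant is faster on a given system.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReadStrategiesSketch {
    // Reads the whole file through a direct ByteBuffer, without memory mapping.
    static byte[] readWithBuffer(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocateDirect((int) ch.size());
            // read() may fill the buffer in several steps; stop at end of file.
            while (buf.hasRemaining() && ch.read(buf) != -1) { }
            buf.flip();
            byte[] result = new byte[buf.remaining()];
            buf.get(result);
            return result;
        }
    }

    // Reads the whole file through a read-only memory mapping.
    static byte[] readWithMap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] result = new byte[map.remaining()];
            map.get(result);
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nio", ".bin"); // hypothetical test file
        Files.write(file, "some file contents".getBytes(StandardCharsets.UTF_8));
        System.out.println("buffer read: " + readWithBuffer(file).length + " bytes");
        System.out.println("mapped read: " + readWithMap(file).length + " bytes");
    }
}
```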
    Note that for file copying, FileChannel has a couple of methods, transferTo() and transferFrom(), which are particularly convenient, and much faster than any alternatives I've tried. You don't need to figure out what type of ByteBuffer to use; you just tell the system to copy from one FileChannel to another, and the OS may just tell the hardware to take care of it, without necessarily bothering to transfer anything to/from system RAM. (I.e. it can just rely on the hardware's own buffers.) Very fast.

    Error checking and handling omitted for simplicity. (E.g. put close() in a finally block, check whether the destination already exists, etc.) The last line could also be toFC.transferFrom(fromFC, 0, fromFC.size()); these do the same thing in this case. Two similar methods are provided because for this type of transfer, it's necessary that at least one of the two Channels involved should be a FileChannel - but the other could be some other type, such as a SocketChannel. The two transfer methods allow transfer between a FileChannel and another channel, in either direction. When both channels are FileChannels the methods appear redundant, but they aren't really.
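A channel-to-channel copy along the lines described above might look like the following sketch. It is not the original snippet: it assumes Java 7+ (FileChannel.open and try-with-resources), reuses the names fromFC and toFC from the text, and the class name and the choice to overwrite an existing destination are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelCopySketch {
    // Copies `from` to `to`, overwriting any existing destination file.
    static void copyFile(Path from, Path to) throws IOException {
        try (FileChannel fromFC = FileChannel.open(from, StandardOpenOption.READ);
             FileChannel toFC = FileChannel.open(to, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = fromFC.size();
            long position = 0;
            // transferTo() may move fewer bytes than requested, so loop until done.
            while (position < size) {
                long moved = fromFC.transferTo(position, size - position, toFC);
                if (moved <= 0) {
                    break; // source shrank or nothing more could be transferred
                }
                position += moved;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("copy-src", ".bin"); // hypothetical files
        Path target = Files.createTempFile("copy-dst", ".bin");
        Files.write(source, "copy me".getBytes(StandardCharsets.UTF_8));
        copyFile(source, target);
        System.out.println("copied " + Files.size(target) + " bytes");
    }
}
```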
    Anyway, Siddharth - using NIO you may indeed be able to get much better performance from your system. The copyFile() method is easy to improve; the addFile() method is a bit harder to follow without seeing additional code, but I expect you could modify it to deal with FileChannels instead, and probably get better performance. But be sure to measure the performance you're getting - first to see if there's a problem worth spending time on in the first place, and then to test whether "improvements" really have the desired effect. There are plenty of ways you can accidentally make things slower if you're not careful, so test results as you go.
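For addFile(), one possible shape is to wrap the incoming byte array in a ByteBuffer and write it through a FileChannel. This is only a guess at how the method could be adapted, since the real addFile() code is not shown here: the Path parameter, the class name and the overwrite behaviour are assumptions, and it needs Java 7+. Whether it beats a plain FileOutputStream for a single array is something to measure rather than assume.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelWriteSketch {
    // Writes the whole array to `target` through a FileChannel.
    static void addFile(Path target, byte[] fileBytes) throws IOException {
        // wrap() shares the caller's array instead of copying it.
        ByteBuffer buf = ByteBuffer.wrap(fileBytes);
        try (FileChannel ch = FileChannel.open(target, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // write() is allowed to consume only part of the buffer per call.
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("upload", ".bin"); // hypothetical target
        long start = System.nanoTime();
        addFile(target, "uploaded bytes".getBytes(StandardCharsets.UTF_8));
        long micros = (System.nanoTime() - start) / 1000;
        System.out.println(Files.size(target) + " bytes in " + micros + " us");
    }
}
```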
     
    It is sorta covered in the JavaRanch Style Guide.