NIO not suitable for small files?

 
Pho Tek
Ranch Hand
Posts: 782
According to this weblog post by Toby, NIO does not improve I/O performance (e.g. upload) on small files. Care to comment?
Pho
 
David Weitzman
Ranch Hand
Posts: 1365
That entry seems to refer specifically to memory mapped files. I believe the NIO documentation itself says that memory mapped I/O is not efficient for small files. But NIO is good for quite a lot more than memory mapped I/O. Reading and writing files through standard Channels should work just fine.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Quite right. You can benefit from using a FileChannel rather than FileInputStream or RandomAccessFile, regardless of the size of the file. But you may want to check the size() before you invoke map(). I don't know what size is "too small" to effectively use map(), but you can test the performance on your system, with and without map(), to see.
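To make the "check size() before map()" idea concrete, here is a rough sketch. The class name SizeAwareReader and the 64 KB cutoff are my own assumptions for illustration, not anything from the NIO documentation; the right threshold is whatever your own measurements suggest.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SizeAwareReader {
    // Hypothetical cutoff; measure on your own system to pick a real value.
    static final long MAP_THRESHOLD_BYTES = 64 * 1024;

    /** Returns a buffer, positioned at 0, that holds the whole file's content. */
    static ByteBuffer load(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            if (size > Integer.MAX_VALUE) {
                throw new IOException("file too large for a single buffer: " + size);
            }
            if (size >= MAP_THRESHOLD_BYTES) {
                // Larger file: map it and let the OS page it in on demand.
                // The mapping stays valid after the channel is closed.
                return ch.map(FileChannel.MapMode.READ_ONLY, 0, size);
            }
            // Small file: a plain read avoids the cost of setting up a mapping.
            ByteBuffer buf = ByteBuffer.allocate((int) size);
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or EOF is reached
            }
            buf.flip();
            return buf;
        }
    }
}
```

Either way the caller gets a ByteBuffer back, so the rest of the code does not need to know which path was taken.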
 
Pho Tek
Ranch Hand
Posts: 782
Reading MSDN led me to believe that the virtual memory manager on Windows NT uses 4K pages. So files smaller than 4K might incur a performance penalty when memory mapped. I guess the only way to find out is to write code. Off I go....
 
Michael Zalewski
Ranch Hand
Posts: 168
For small files, NIO memory mapping may not be faster than simply re-reading the file.
But that does not mean there are no applications where it is useful.
Consider a set of applications which write to a control file. If I use a memory map for this control file, I don't have to re-read the control file every time I want to check it. I just access a byte in the mapped buffer, much like reading an element of an array.
But if my intent is simply to read out a file (such as when I download the file to a client), memory mapping may not help until the file becomes large.
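A minimal sketch of the control-file idea, assuming a made-up layout: a 16-byte file with a single status byte at offset 0. The names ControlFile, FLAG_OFFSET and SIZE are hypothetical. Note that the Javadoc describes the visibility of changes between separate mappings as platform dependent, so treat this as an illustration, not a guarantee.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ControlFile {
    static final int FLAG_OFFSET = 0; // hypothetical layout: one status byte at offset 0
    static final int SIZE = 16;       // hypothetical fixed size of the control file

    /** Maps the control file read/write, creating and extending it if needed. */
    static MappedByteBuffer open(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, SIZE);
        }
    }

    /** Writer side: update the status byte and push the change to the file. */
    static void setFlag(MappedByteBuffer map, byte value) {
        map.put(FLAG_OFFSET, value);
        map.force();
    }

    /** Reader side: no re-read of the file, just an absolute get on the buffer. */
    static byte readFlag(MappedByteBuffer map) {
        return map.get(FLAG_OFFSET);
    }
}
```

Each cooperating application would call open() once at startup and then poll readFlag() as often as it likes; there is no per-check read() call.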
 
Ron Hitchens
Author
Ranch Hand
Posts: 30
Memory mapping is intimately linked to the virtual memory behavior of the OS. These numbers will vary by OS type, version and configuration.
The NIO MappedByteBuffer is a Java object wrapped around a chunk of mapped native memory. Unlike C, where the mmap() system call returns a pointer (a virtual address) that the program can dereference directly, the JVM must mediate access to the mapped memory. There is also some overhead in setting everything up.
If you're just reading a file sequentially, creating a MappedByteBuffer is unnecessary. Use the read() method of FileChannel and read into a ByteBuffer. That's what those are for. It may take longer to set up the MappedByteBuffer than it would to just read the data.
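For reference, the plain sequential read looks roughly like this. It is only a sketch: the class name SequentialRead, the byte-sum "checksum" and the 8 KB buffer size are arbitrary choices made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SequentialRead {
    /** Reads the file front to back in chunks and returns the sum of its unsigned bytes. */
    static long checksum(Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(8192); // buffer size is an arbitrary choice
        long sum = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            while (ch.read(buf) != -1) {   // -1 signals end of file
                buf.flip();                // switch the buffer from filling to draining
                while (buf.hasRemaining()) {
                    sum += buf.get() & 0xFF;
                }
                buf.clear();               // make the buffer ready for the next read
            }
        }
        return sum;
    }
}
```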
Under the hood, most OSs use memory mapping to perform I/O anyway. And many do predictive read-ahead buffering to boost throughput. If you're mapping the file yourself you won't get that benefit.
MappedByteBuffers are good for mapping large files and for implementing a sort of persistent, shared memory.
Depending on the OS's virtual memory design, mapping a huge file may not consume any swap space at all (because the virtual memory pages are backed by the file itself), although it does occupy virtual address space. This would allow you to appear to have a humongous data array in memory all at once. The data will be dynamically paged in and out as needed, based on usage. This is similar to accessing a file randomly, but you don't need to seek and read chunks; it all appears to be there all the time.
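As an illustration of the "data array in a file" idea, here is a small sketch that treats a mapped file as an array of 64-bit slots. MappedArray and its methods are hypothetical names invented for this example. A single map() call is limited to Integer.MAX_VALUE bytes, so a truly huge file would need several mappings.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedArray {
    /**
     * Maps the file as `slots` 64-bit values, growing the file to fit.
     * slots * 8 must not exceed Integer.MAX_VALUE (the limit of one mapping).
     */
    static MappedByteBuffer map(Path file, int slots) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) slots * Long.BYTES);
        }
    }

    /** Random-access write: no seek, just an absolute put at the slot's byte offset. */
    static void set(MappedByteBuffer m, int index, long value) {
        m.putLong(index * Long.BYTES, value);
    }

    /** Random-access read: the OS pages the touched region in as needed. */
    static long get(MappedByteBuffer m, int index) {
        return m.getLong(index * Long.BYTES);
    }
}
```

Usage would be along the lines of `MappedByteBuffer m = MappedArray.map(path, 1_000_000); MappedArray.set(m, 999_999, 42L);` with no explicit seek or read calls anywhere.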
If multiple parties map the same file, any updates made by any of them are seen instantly by all the others. This could be used as a status area, or scoreboard type of thing for cooperating processes to communicate (not all of which need to be Java, by the way). And the content will remain in the file after all the processes exit.
Mapped files have their uses, but mapping a file is not necessarily a faster way to read it.
 