Buffered RAF

Ranch Hand
Posts: 56

Does anyone here know of any *working* and/or *tested* implementations of a buffered RandomAccessFile? I know of one from here (but this one seems to be using deprecated API).

Also, when does it make sense to buffer I/O off the disk? For instance, if my code makes more than 10 million seek() calls to a file, each call fetching a line from a different byte offset, won't buffering worsen performance, simply because the same disk I/O now takes one additional step (i.e. buffering) before it delivers the output?

Posts: 9550
This topic is a continuation of a previous topic on seek() vs. seekBytes()

Originally posted by Prashant Sehgal:
(but this one seems to be using deprecated API).

Did you try replacing the deprecated methods with the new API? There are only two deprecated calls, both String constructors, and the current version of String has constructors that take a byte array, an offset, and a length as arguments.
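A minimal sketch of that replacement, assuming the deprecated calls are the old String(byte[] ascii, int hibyte, int offset, int count) form with hibyte set to 0 (I have not checked which form Braf actually uses). The class and method names here are made up for illustration. With hibyte 0 the old constructor maps each byte straight to a char, which is the same thing as decoding ISO-8859-1, so naming that charset explicitly should keep the behavior unchanged:

```java
import java.nio.charset.StandardCharsets;

class DeprecatedStringFix {
    // Hypothetical helper, not part of Braf.
    // Old, deprecated form (assumed): new String(buf, 0, offset, count)
    // New form: byte array, offset, length, plus an explicit charset.
    static String decode(byte[] buf, int offset, int count) {
        return new String(buf, offset, count, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] buf = "xxhello worldxx".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decode(buf, 2, 11)); // prints "hello world"
    }
}
```

The three-argument constructor String(byte[], int, int) also works, but it decodes with the platform default charset, so results can differ from machine to machine if the file holds non-ASCII bytes.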

Also when does it make sense to buffer I/O off the disk?

This is something that will differ from application to application. That is why, in the previous topic, I pointed you to resources to help you understand and quantify what is going on in your application. Without doing some homework and making some measurements, you'll be stumbling around in the dark.
Note that the buffer size in Braf is variable. You may get a substantial performance boost just from setting the buffer size to the approximate size of a record (for you, ~1k). By default, RandomAccessFile.readLine() reads a line one byte at a time. Reading the data in a single block would be much faster, and we know you will read 10 million lines, so that's billions of single-byte reads you could save right there.
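To make the block-read idea concrete, here is a rough sketch that uses plain RandomAccessFile rather than Braf. The class name, method name, and sample data are invented for illustration, and it assumes a line never exceeds the block size and that the file is single-byte text (ISO-8859-1):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class BlockLineReader {
    // Reads up to blockSize bytes starting at offset, then returns the text
    // before the first '\n'. Returns null if offset is at or past end of file.
    // A line longer than blockSize is silently truncated in this sketch.
    static String readLineAt(RandomAccessFile raf, long offset, int blockSize) throws IOException {
        byte[] block = new byte[blockSize];
        raf.seek(offset);
        int n = 0;
        while (n < blockSize) {             // usually satisfied by a single read call
            int r = raf.read(block, n, blockSize - n);
            if (r < 0) break;               // end of file
            n += r;
        }
        if (n == 0) return null;
        int end = 0;
        while (end < n && block[end] != '\n') end++;
        int len = (end > 0 && block[end - 1] == '\r') ? end - 1 : end; // drop a trailing CR
        return new String(block, 0, len, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("braf-demo", ".txt");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.write("alpha\nbeta\r\ngamma".getBytes(StandardCharsets.ISO_8859_1));
            System.out.println(readLineAt(raf, 6, 1024)); // prints "beta"
        }
    }
}
```

Each lookup costs one seek and, in the common case, one read call, instead of one read per byte of the line.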
If you had some grouping of records, where several consecutive indexes would give you hits within a few kilobytes of each other in the data file, it would make sense to find a balance between how many records you load into the buffer and the time and memory it takes to fill it. If you don't have such grouping, then making the buffer larger than 1k would be wasting memory and time. These are factors which need to be tested and tuned. There is no one answer.
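One way to do that testing is a small timing loop over candidate buffer sizes. The sketch below is only a starting point: the class name, the 4 MB scratch file of zeros, and the list of sizes are all placeholders, and a small temporary file will mostly be served from the operating system's cache, so run it against your real data file to get numbers that mean anything:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

class BufferSizeTiming {
    // Performs `reads` block reads of blockSize bytes at random offsets
    // and returns the elapsed time in nanoseconds.
    static long timeReads(File file, int blockSize, int reads, long seed) throws IOException {
        byte[] block = new byte[blockSize];
        Random rnd = new Random(seed);       // fixed seed: same offsets for every size
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long span = Math.max(1, raf.length() - blockSize);
            long start = System.nanoTime();
            for (int i = 0; i < reads; i++) {
                raf.seek((long) (rnd.nextDouble() * span));
                raf.read(block, 0, blockSize);
            }
            return System.nanoTime() - start;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("braf-timing", ".dat");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.write(new byte[4 * 1024 * 1024]); // stand-in data, not real records
        }
        for (int size : new int[] {256, 1024, 4096, 16384}) {
            long ns = timeReads(file, size, 10_000, 42L);
            System.out.println(size + " bytes/read: " + (ns / 1_000_000) + " ms");
        }
    }
}
```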
Again, the order in which you will gain performance is:
1. Hardware. Without fast hardware, software is slow. Period.
2. Hardware. The cost/performance gain is more justifiable than paying you to tweak code.
3. Hardware. 10 million anything takes time. Double your disk throughput and you will likely double your program's execution speed. Try getting that improvement through tweaking code.
4. Software.