Fast Access in Binary Files

 
jones
Greenhorn
Posts: 1
Hello everyone,
I've never dealt with binary files at all, so I'm just looking for a place to get started (and to find out if I'm even barking up the wrong tree).

Hypothetical (*simplified*) example...
Let's say that starting at 00:00 on 4/18/08, I will start recording the number of users accessing a web site once every minute. I need to store this data for long-term retrieval, and I need very fast access to it to answer questions like "How many visitors were there at each minute interval between 5/1/08 08:30 and 5/5/08 15:05?".

I cannot use a database for a number of reasons, but mainly because it will be too much data for a standard RDBMS in our environment (a billion+ rows). Remember, the above example is simplified; in reality there's more than one measure being recorded, etc.

So without knowing a darn thing about binary files, here's my bonehead idea..
I would have a single file for each type of measure. At every minute (or whatever the interval may be), I would append a value (a double). I would not store any date/timestamp information with it, because it would be implied. I would know the timestamp of the very first value, so from there, when a request is made for a specific value at a specific time, I would be able to calculate the offset in number of bytes (because I know the # of minutes between the requested time and the starting time) and grab that value. In other words, I would just grab the 8 bytes starting at offset 50000000. Something like that.
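The offset arithmetic described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name `SampleOffset` and the method `byteOffset` are hypothetical, each sample is taken to be one 8-byte double per minute with no gaps, and timestamps are assumed to be in a zone without daylight-saving shifts (for example UTC), since a skipped or repeated hour would break the implied timestamps.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: maps a timestamp to a byte offset in a file that
// holds one fixed-width sample (an 8-byte double) per minute.
class SampleOffset {
    static final int BYTES_PER_SAMPLE = Double.BYTES; // 8 bytes per double

    // Offset of the sample recorded at `requested`, given the time of the first sample.
    static long byteOffset(LocalDateTime start, LocalDateTime requested) {
        long minutes = ChronoUnit.MINUTES.between(start, requested);
        if (minutes < 0) {
            throw new IllegalArgumentException("requested time precedes the first sample");
        }
        return minutes * BYTES_PER_SAMPLE;
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2008, 4, 18, 0, 0);
        LocalDateTime from = LocalDateTime.of(2008, 5, 1, 8, 30);
        LocalDateTime to = LocalDateTime.of(2008, 5, 5, 15, 5);
        System.out.println(byteOffset(start, from)); // prints 153840
        System.out.println(byteOffset(start, to));   // prints 203080
    }
}
```

For the range in the example question, the first sample sits 19230 minutes after the start and the last one 25385 minutes after it, so the answer is the 6156 doubles stored between those two offsets.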

So, since these files would get quite large, I wouldn't want to read through the entire file just to grab one value at the end. Is it possible to do? If this method is possible, what's the theoretical correlation between file size and access speed? Is it, the bigger the file is, the slower it will get? Or is the impact relatively minimal?

I searched around first and didn't see a good resource to answer these sorts of questions before I spend a lot of time learning how to try it and to see if it works. So I would appreciate anyone telling me if I'm on the right track, or if not, any other suggestions. And maybe a link to some info for noobs that might help me out?

Thanks!!
 
Joe Ess
Bartender
Posts: 9426
Welcome to the JavaRanch.
You must have missed our naming policy on the way in. In short, your displayed name must be a real-sounding first and last name, separated by a space. You can change it here.
As for your question, that is more or less the reason for the RandomAccessFile class.
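To make that concrete, here is a minimal sketch of how `java.io.RandomAccessFile` could be used for the fixed-width layout described in the question. The class name `MinuteSeries` and its methods `append` and `readRange` are hypothetical names chosen for illustration; the file is assumed to contain nothing but consecutive 8-byte doubles, written in the big-endian form that `writeDouble` produces.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical wrapper around a file of consecutive 8-byte doubles.
class MinuteSeries {
    // Appends one sample at the end of the file.
    static void append(Path file, double value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(raf.length()); // move to the end before writing
            raf.writeDouble(value);
        }
    }

    // Reads `count` consecutive samples starting at sample index `firstIndex`.
    // Throws EOFException if the range runs past the end of the file.
    static double[] readRange(Path file, long firstIndex, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(firstIndex * Double.BYTES); // jump straight to the first sample
            double[] out = new double[count];
            for (int i = 0; i < count; i++) {
                out[i] = raf.readDouble();
            }
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("visitors", ".bin");
        for (int minute = 0; minute < 1000; minute++) {
            append(file, minute * 1.5);
        }
        System.out.println(Arrays.toString(readRange(file, 500, 3)));
        // prints [750.0, 751.5, 753.0]
        Files.delete(file);
    }
}
```

`seek` repositions the file pointer without reading the bytes before it, so the cost of fetching a sample should depend mostly on how much data is read rather than on the total file size, although disk caching and the file system will affect real timings. Reopening the file for every append keeps the sketch short; a long-running recorder would more likely keep one handle open, and reading a long range through a buffer would be faster than one `readDouble` call per value.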
 
Don't get me started about those stupid light bulbs.