posted 21 years ago
Performance is a must so I want to know if there is a faster way to do this.
Well, there are going to be a number of different options depending on how you're using this, and how much time you want to spend on more complicated algorithms. I suggest trying the simplest solutions first, then moving to more complex stuff if the performance is not good enough.
How big are the files you want to read? How much memory do you have? Would it be feasible to store all the lines of the file in RAM as an ArrayList of Strings, for example? This is pretty simple:
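Something along these lines, as a minimal sketch. The class and method names (`LineCache`, `loadLines`) are just placeholders I picked for illustration, and I'm assuming the file is UTF-8 text:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: holds every line of a text file in memory.
class LineCache {
    /** Reads all lines of the file, in order, into an ArrayList. */
    static List<String> loadLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        // Assumption: the file is UTF-8; change the charset if yours differs.
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

After `List<String> lines = LineCache.loadLines(path);`, the line at zero-based index `n` is simply `lines.get(n)`.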
Now you can easily access any line you want to. The only downsides are that it may take too much memory, and that it takes a bit of time to load everything into memory at the beginning. Once it's loaded, though, you've got very fast access.
I note that you're already loading all your data into a byte[] (or trying to), so I gather memory isn't too much of a problem. Yet.
To save memory, instead of storing everything as Strings, you could just store a long representing the offset in bytes of each line. This would be similar to what you've already coded - but you save every line's position, not just the last one. Then when you need to retrieve a particular line, you can skip() (or seek()) to that position. You could save even more memory by storing only every 10th offset, or every 100th. So to read line 3745, look up the position of line 3700, jump there, and read 45 lines to get to 3745. Obviously this is more complex, and trades performance for memory - but it's a possibility.
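Here is a rough sketch of the every-Nth-offset idea. `SparseLineIndex`, its `step` parameter and its `readLine(long)` method are names I made up for illustration. It assumes UTF-8 text with `\n` (or `\r\n`) line endings, a file that doesn't change after the index is built, and zero-based line numbers:

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.nio.channels.Channels;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: remembers the byte offset of every `step`-th line.
class SparseLineIndex {
    private final Path file;
    private final int step;
    // offsets.get(i) is the byte offset where line (i * step) starts.
    private final List<Long> offsets = new ArrayList<>();

    SparseLineIndex(Path file, int step) throws IOException {
        this.file = file;
        this.step = step;
        // One pass over the whole file, counting bytes and '\n' characters.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;               // offset of the byte about to be read
            long lineNo = 0;            // zero-based number of the current line
            boolean atLineStart = true;
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    if (lineNo % step == 0) {
                        offsets.add(pos);
                    }
                    atLineStart = false;
                }
                pos++;
                if (b == '\n') {
                    lineNo++;
                    atLineStart = true;
                }
            }
        }
    }

    /** Returns the zero-based line, or null if the file has fewer lines. */
    String readLine(long lineNo) throws IOException {
        if (lineNo < 0 || lineNo / step >= offsets.size()) {
            return null;
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            // Jump to the nearest indexed line at or before the one we want.
            raf.seek(offsets.get((int) (lineNo / step)));
            BufferedReader reader = new BufferedReader(new InputStreamReader(
                    Channels.newInputStream(raf.getChannel()), StandardCharsets.UTF_8));
            // Read and discard the lines between the indexed one and the target.
            for (long i = lineNo % step; i > 0; i--) {
                if (reader.readLine() == null) {
                    return null;
                }
            }
            return reader.readLine();
        }
    }
}
```

With `step` set to 100, `new SparseLineIndex(path, 100).readLine(3745)` seeks to the stored offset of line 3700 and reads forward 45 lines. A larger `step` means a smaller index but more lines to skip per lookup.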
If you just want to read the last line, there are faster methods to do this, where you start reading bytes from the end, not the beginning. (You need a RandomAccessFile or FileChannel for this, not an InputStream.) In this case you never need to read most of the file at all. But it's only useful for the last line, or the second to last line, or the nth to last line. If you want line 3745, then unless you know in advance that there are really only 3747 lines, and thus you can read 3 lines from the back, you probably need to read 3745 lines from the beginning. That's because there's usually no way of knowing the line count without looking at the whole file to see how many \n there are. So these special algorithms for reading lines from the end are of limited use, and they're rather complex, so I doubt you want to do this unless performance is really a problem.
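For the curious, here's one way the read-from-the-end approach might look for the last line only. This is a sketch, not a tuned implementation. `LastLine` and `readLastLine` are hypothetical names, and it assumes UTF-8 text, `\n` or `\r\n` line endings, and a last line that fits comfortably in memory:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: finds the last line by scanning backward from the end.
class LastLine {
    static String readLastLine(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long end = raf.length();   // exclusive end of the last line's bytes
            // Ignore one trailing line terminator ("\n" or "\r\n"), if present.
            if (end > 0 && byteAt(raf, end - 1) == '\n') {
                end--;
                if (end > 0 && byteAt(raf, end - 1) == '\r') {
                    end--;
                }
            }
            byte[] chunk = new byte[4096];
            long start = end;          // becomes the offset where the last line begins
            while (start > 0) {
                // Load the chunk that ends at `start` and search it backward for '\n'.
                int n = (int) Math.min(chunk.length, start);
                raf.seek(start - n);
                raf.readFully(chunk, 0, n);
                int i = n - 1;
                while (i >= 0 && chunk[i] != '\n') {
                    i--;
                }
                if (i >= 0) {
                    start = start - n + i + 1;  // first byte after the newline
                    break;
                }
                start -= n;                     // no newline here; look further back
            }
            byte[] line = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(line);
            return new String(line, StandardCharsets.UTF_8);
        }
    }

    private static int byteAt(RandomAccessFile raf, long pos) throws IOException {
        raf.seek(pos);
        return raf.read();
    }
}
```

Only the tail of the file is ever touched, which is where the speedup comes from. Extending this to the nth-to-last line means counting n newlines on the way back instead of stopping at the first one.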
The APIs for RandomAccessFile and InputStream do not guarantee that read(byte[]) will actually read the full number of bytes you have requested. So you need to check the return value and loop to make sure you've really filled the array:
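A loop like the following would do it. `ReadLoop.fill` is a name I invented for this sketch; the only API it relies on is `InputStream.read(byte[], int, int)`, which returns the number of bytes actually read, or -1 at end of stream:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: keeps calling read() until the array is full.
class ReadLoop {
    static void fill(InputStream in, byte[] buf) throws IOException {
        int off = 0;   // number of bytes stored in buf so far
        while (off < buf.length) {
            // read() may return fewer bytes than requested, so ask for the remainder.
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new EOFException(
                        "stream ended after " + off + " of " + buf.length + " bytes");
            }
            off += n;
        }
    }
}
```

The same loop works with RandomAccessFile, since its `read(byte[], int, int)` has the same contract.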
Why was this different on RandomAccessFile? Well, you were lucky. Neither class guarantees a full read, but for the implementation and platform you were using, and for the particular file and I/O hardware you were dealing with, RandomAccessFile delivered a full read while FileInputStream did not. But in another situation, you might get very different results. So you always need to check the return value and do some sort of loop to make sure you've read everything you needed.
Alternatively, RandomAccessFile offers a different method, readFully(byte[]). This does exactly what you'd expect: it keeps reading until the array is full, and throws an EOFException if the file ends first. But it's not available on FileInputStream. Your choice.
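As a small sketch of the readFully(byte[]) route (the class and method names are placeholders of mine, and I'm assuming the whole file fits in a single byte[]):

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper: loads an entire file with a single readFully() call.
class ReadFullyExample {
    static byte[] readWholeFile(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            // A Java array cannot hold more than about Integer.MAX_VALUE elements.
            if (length > Integer.MAX_VALUE - 8) {
                throw new IOException("file too large for a single byte[]");
            }
            byte[] data = new byte[(int) length];
            raf.readFully(data);   // loops internally; throws EOFException if the file is short
            return data;
        }
    }
}
```

If you'd rather stay with a stream, wrapping the FileInputStream in a DataInputStream also gives you a readFully(byte[]) method.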
"I'm not back." - Bill Harding, Twister