
reading from the end of file

 
Asher Tarnopolski
Ranch Hand
Posts: 260
hey folks,
I have a huge log file (ASCII text), but I only need to read the last few lines of it (the logs from the last day). I really dislike all the IO stuff, and I didn't find any method in the API that lets me read lines from the end. Is there any way to do it without reading all of the lines from the beginning?
thanks.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well, you could use a RandomAccessFile, and use the seek() method to skip to a position you think is probably just before the lines you're interested in. You have to guess how far to skip ahead, and you have to write the code in some sort of loop that will try again if it turns out you've skipped too far ahead in the file. Immediately after skipping to a position, you really have no idea where you are in a given line, so I wouldn't even try to parse that line; I'd just do a single readLine() to skip to the beginning of the next line. Ignore the String returned by this call, and do another readLine() to read the first full line.
How do you know if you've gone too far? If you're looking for the last n lines in a file, you'll have to read all the lines from your skip point to the end, and see how many there are. If there are fewer than n, you'll have to skip to an earlier point and try again. Maybe you'll want to save the lines read so far in a List of some sort, so you won't need to read them again.
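Here is a minimal sketch of that seek-and-retry loop, assuming an ASCII file. The class name TailReader, the method lastLines and the initialGuess parameter are made up for illustration. For simplicity it re-reads the tail on every retry instead of keeping the lines already read, and it doubles the distance from the end of the file each time it comes up short.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

final class TailReader {
    /** Returns up to the last n lines of the file, oldest first. */
    static List<String> lastLines(String path, int n, long initialGuess) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            long window = Math.max(1, initialGuess); // how far back from the end to start
            while (true) {
                long start = Math.max(0, length - window);
                raf.seek(start);
                if (start > 0) {
                    raf.readLine(); // we probably landed mid-line: discard it
                }
                List<String> lines = new ArrayList<>();
                String line;
                while ((line = raf.readLine()) != null) {
                    lines.add(line);
                }
                // Enough lines, or we already started at the top of the file.
                if (lines.size() >= n || start == 0) {
                    int from = Math.max(0, lines.size() - n);
                    return new ArrayList<>(lines.subList(from, lines.size()));
                }
                window *= 2; // guessed too far ahead: back up further and retry
            }
        }
    }
}
```

A call such as TailReader.lastLines("app.log", 200, 64 * 1024) would start 64 KB from the end (the file name and sizes are placeholders). Note that RandomAccessFile.readLine() turns each byte into one char, so this is only safe for single-byte text.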
Alternately, if your goal is something like "read all log entries from the last 24 hours", and each log entry has a timestamp, then you have other options. After you use seek() and readLine() to get to the beginning of a new line, read a full line. Parse that line to determine its timestamp. Is it more than 24 hours in the past? Then just keep reading forward until you cross the 24-hour line. If you're way off, you can use seek() again to skip ahead quickly. If you accidentally seek() to a position less than 24 hours in the past, you'll need to use seek() again to find another position earlier in the file.
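A rough sketch of the timestamp idea follows. Instead of guessing positions by hand it uses a binary search over byte offsets, which is a swapped-in refinement rather than exactly what is described above. It assumes (purely for illustration) that every line starts with a 19-character ISO timestamp such as "2024-05-01T13:45:00", that entries are in time order, and that the file is ASCII. The names RecentLogReader, timestampOf and linesSince are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

final class RecentLogReader {
    // Assumed line format: "2024-05-01T13:45:00 message text"
    static LocalDateTime timestampOf(String line) {
        return LocalDateTime.parse(line.substring(0, 19));
    }

    /** Returns all lines whose timestamp is at or after the cutoff. */
    static List<String> linesSince(String path, LocalDateTime cutoff) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long lo = 0;            // every wanted line starts after lo (or lo is the file start)
            long hi = raf.length();
            while (hi - lo > 1) {
                long mid = lo + (hi - lo) / 2;
                raf.seek(mid);
                raf.readLine();                 // discard the partial line
                String line = raf.readLine();   // first full line after mid
                if (line == null || !timestampOf(line).isBefore(cutoff)) {
                    hi = mid;                   // already inside (or past) the wanted region
                } else {
                    lo = mid;                   // still too old: move forward
                }
            }
            raf.seek(lo);
            if (lo > 0) {
                raf.readLine();                 // discard the partial line at lo
            }
            List<String> result = new ArrayList<>();
            String line;
            while ((line = raf.readLine()) != null) {
                if (!timestampOf(line).isBefore(cutoff)) {
                    result.add(line);           // skip the few old lines before the cutoff
                }
            }
            return result;
        }
    }
}
```

For "the last 24 hours" the cutoff would be something like LocalDateTime.now().minusHours(24). Blank lines or lines without a leading timestamp would break timestampOf, so a real log would need extra handling there.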
Now, for all the logic that will go into getting something like this to work effectively, it may well be easier to just read every line. That's probably what I'd try first - once you establish that this is too slow, you can work with alternatives such as I've suggested above.
Incidentally, for speed you can also use the skip() method of an InputStream or Reader to move to a particular position in a file. This will be a lot faster than a RandomAccessFile, but it's more difficult to move backwards if necessary. To do it you basically will have to discard the old InputStream and create a new one. I'd work out the logic with the RandomAccessFile first, then go to the InputStream when all the other bugs are worked out.
Note that if the file encoding is different from the system default encoding, and especially if it's something like UTF-8 or UTF-16, which use multiple bytes per character, you'll have additional complications to work out when jumping around inside a file. Something to be aware of.
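The skip() variant might look roughly like this, again assuming ASCII text. SkipTail and linesAfter are illustrative names. One detail worth knowing is that InputStream.skip() may skip fewer bytes than requested, so the sketch calls it in a loop. Moving backwards means calling the method again with a smaller offset, which opens a fresh stream.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SkipTail {
    /** Returns the complete lines that follow the given byte offset. */
    static List<String> linesAfter(String path, long offset) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining); // may skip less than asked
                if (skipped <= 0) {
                    break;                         // could not skip any further
                }
                remaining -= skipped;
            }
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
            if (offset > 0) {
                reader.readLine();                 // discard the partial line
            }
            List<String> lines = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
            return lines;
        }
    }
}
```

The offset would typically be the file length (from File.length()) minus a guessed number of bytes, with a retry at a smaller offset if too few lines come back.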
Also, Java SDK 1.4 contains new IO classes that could well be more efficient than what I've discussed so far. The map() method of FileChannel may well be useful to you. However you'll still have to address the issue of what happens if the region you map turns out to not contain all the records you're interested in.
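As a sketch of the map() approach: map only the last chunk of the file and scan it backwards for newline bytes. MappedTail, lastLines and the window parameter are invented names, and the file is assumed to be ASCII with "\n" line endings. FileChannel.open() comes from a later Java release than 1.4. When the mapped region doesn't hold n complete lines the method returns null, and the caller is expected to retry with a larger window (a single mapping is limited to about 2 GB).

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

final class MappedTail {
    /** Returns the text of the last n lines, or null if the window was too small. */
    static String lastLines(String path, int n, long window) throws IOException {
        try (FileChannel ch = FileChannel.open(Paths.get(path), StandardOpenOption.READ)) {
            long size = ch.size();
            if (size == 0) {
                return "";
            }
            long start = Math.max(0, size - window);
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, start, size - start);
            int end = buf.limit();
            if (buf.get(end - 1) == '\n') {
                end--;                      // ignore the newline that ends the last line
            }
            int pos = end;
            int found = 0;
            while (pos > 0) {
                if (buf.get(pos - 1) == '\n' && ++found == n) {
                    break;                  // pos is now the first byte of the n-th last line
                }
                pos--;
            }
            if (pos == 0 && start > 0) {
                return null;                // region starts mid-file and is too small
            }
            byte[] out = new byte[end - pos];
            buf.position(pos);
            buf.get(out);
            return new String(out, StandardCharsets.US_ASCII);
        }
    }
}
```

If the region reaches the start of the file, whatever is there is returned even when the file has fewer than n lines.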
Enjoy...
 