posted 18 years ago
Hm, Stan seems to be making several assumptions about how line terminators are being processed here, and I'm not sure they're warranted. I think the poster needs to determine whether things like line terminators will be changed during processing, and whether a new line terminator may be added at the end of each split file. Offhand I don't think either of those is necessary, though they may be desirable, or not, depending on what this application is for.
If you're concerned about time: counting lines in a file ultimately requires that you read each and every byte of the file to discover whether it's a line terminator (or part of one). If you're going to do that anyway, you might as well also compute some sort of checksum to verify that all the data is valid, not just the number of lines. The time spent calculating a checksum should be small compared to the time spent reading from the file in the first place. If you want something less reliable but much faster, and if you're not changing line terminators (something I typically find unnecessary or even undesirable anyway), then simply adding up the total file sizes should work reasonably well.
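To make the combined approach concrete, here's a rough sketch of counting lines and computing a checksum in the same pass, using CRC32 and CheckedInputStream from java.util.zip (those classes are real; the class and method names here are just made up for illustration):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public class LineCountChecksum {

    // One pass over the stream: returns { lineCount, crc32Value }.
    // CheckedInputStream updates the CRC as a side effect of reading,
    // so the checksum costs essentially nothing extra beyond the I/O.
    static long[] countAndChecksum(InputStream src) throws IOException {
        CRC32 crc = new CRC32();
        long lines = 0;
        try (CheckedInputStream in =
                 new CheckedInputStream(new BufferedInputStream(src), crc)) {
            int b;
            while ((b = in.read()) != -1) {
                // Each CRLF pair contains exactly one LF, so counting LF
                // bytes works for both Unix and Windows line terminators.
                if (b == '\n') {
                    lines++;
                }
            }
        }
        return new long[] { lines, crc.getValue() };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "one\ntwo\n".getBytes("US-ASCII");
        long[] result = countAndChecksum(new ByteArrayInputStream(data));
        System.out.println("lines=" + result[0]
                + " crc32=" + Long.toHexString(result[1]));
    }
}
```

You'd run this over the original file and over the concatenation of the split files; matching line counts and matching CRC values together give much stronger assurance than the line count alone.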
"I'm not back." - Bill Harding, Twister