
Shell Script to retrieve only the recently added records from a file.

 
Kishen Singh Punjabi
Ranch Hand
Posts: 71
Hi,

I want to run a shell script that retrieves only the records added since the last time the script was run.

A log file is created at 12:00 pm, and new records are added to it every second.

The shell script is scheduled to run every 30 minutes to retrieve the newly added records.
If the script runs at 12:30 pm, it must retrieve only the records written between 12:00 and 12:30 pm.
If the script runs at 01:00 pm, it must retrieve only the records written between 12:30 and 01:00 pm.

Note: The log records contain the date along with the time, but I do not want to select records by parsing the date/time.
Can I achieve something like this using the size of the file?

Any suggestions?

Thanks!!
 
Greg Charles
Sheriff
Posts: 3015
Yes, the file size would work, but a line count might be easier to work with in a shell script. You could read the lines in a loop but ignore lines 1 to n, where n is the number of lines already processed. When you have finished whatever processing you want to do, you would store the new value of n. Of course, you'd need a place to store that value. I'd probably put it in a file in the same directory as the log file. I suppose if you were really clever, you could store it at the beginning of the log file itself and overwrite it as needed. I'd personally have trouble doing that with a shell script, but it would be easy enough with a C program. The only issue is that the process that writes to the log file might not allow your program to get simultaneous write access to it. You'd have to experiment with that.
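A minimal sketch of the line-count idea above, assuming the count is kept in a separate state file that the script can write to. The function name `new_lines_since` and both file arguments are hypothetical, and the sketch assumes the log is only ever appended to (if it shrinks, it starts again from the top).

```shell
#!/bin/sh
# Sketch only: print the lines added to a log since the previous run.
# Usage: new_lines_since LOGFILE STATEFILE   (both names are hypothetical)
new_lines_since() {
    log=$1
    state=$2
    last=0

    # Read the line count saved by the previous run, if there was one.
    if [ -f "$state" ]; then
        last=$(cat "$state")
    fi

    # Count the complete (newline-terminated) lines currently in the log.
    # tr strips the padding that some wc implementations print.
    total=$(wc -l < "$log" | tr -d '[:space:]')

    # If the log shrank (rotated or truncated), start again from the top.
    if [ "$total" -lt "$last" ]; then
        last=0
    fi

    # Print lines last+1 .. total. Bounding the range by the count taken
    # above keeps the output consistent with the value stored below, even
    # if more lines arrive while this function is running.
    if [ "$total" -gt "$last" ]; then
        sed -n "$((last + 1)),${total}p" "$log"
    fi

    # Remember where this run stopped.
    echo "$total" > "$state"
}
```

Called from cron every 30 minutes as, for example, `new_lines_since /path/to/app.log /path/to/app.log.count`, each run would print only the lines that arrived since the previous one. Note that `sed` still has to read the file from the beginning to reach line n.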
 
Richard Tookey
Bartender
Posts: 1166
Unless each log file line is the same length, I doubt that counting lines will be fast enough, since to find the n-th line you have to read all the lines up to line n. Saving the length of the file also has potential problems: you can't be sure that the length is updated only after a complete log line has been written (the write is probably not atomic), so you might pick up a partially written line.

Depending on the process(es) writing the log file you might be able to use a named pipe to pass the lines directly to a process/server that stores the lines (either on disk or in memory) and then every half hour processes the lines. See http://en.wikipedia.org/wiki/Named_pipe .
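A rough sketch of the named-pipe idea, assuming the writer can be pointed at a path where a FIFO has been created in advance. The function name `drain_once` and the spool file are hypothetical; a real collector would call it in a loop so that the pipe is reopened each time the writer closes it.

```shell
#!/bin/sh
# Sketch only: collect whatever a writer sends through a named pipe.
# Usage: drain_once FIFO SPOOL   (both names are hypothetical)
drain_once() {
    fifo=$1
    spool=$2

    # Create the named pipe if it does not exist yet. This must happen
    # before the writer opens the path, or the writer creates a plain file.
    if [ ! -p "$fifo" ]; then
        mkfifo "$fifo"
    fi

    # Blocks until a writer opens the pipe, then appends everything it
    # writes to the spool file and returns when the writer closes the pipe.
    cat "$fifo" >> "$spool"
}
```

The half-hourly job would then process the spool file rather than the log itself.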
 
Kishen Singh Punjabi
Ranch Hand
Posts: 71
Greg Charles - Thank you for the suggestion.

I cannot write to the log file, as it is populated by an external system. I do have read access to the file and can read the records from it, but the problem is that I am reading all the records every time, which I do not want.

Richard Tookey - Thank you for the suggestion.

Yes, I agree with you: counting the number of lines will not be fast enough, since finding the n-th line means reading every line before it.
I do not have control over the process that writes the records to the log file.
How do I proceed with the size-based approach?
 
Richard Tookey
Bartender
Posts: 1166
[quote=Kishen Singh Punjabi]
I do not have control over the process of writing the records to the log file.
[/quote]
A named pipe is accessed pretty much as if it were a standard file, so the only control you need in order to use one is control over the name of the log file, which I would expect to be defined through some configuration system and not hard-coded.
[quote]
How do I proceed with the size based approach ?[/quote]

One possible approach - have a separate file that contains only the length of the log file last time it was processed and update this file each time you process the log file. Write a program that uses random access to seek to the last position processed. As I already hinted, there could be problems with having non-atomic operations.
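A minimal sketch of this size-based approach in shell, assuming the last processed byte offset is kept in a separate state file. The function name `new_bytes_since` and both file arguments are hypothetical. To sidestep the partial-line problem, it only emits text up to the last newline and leaves any unfinished line for the next run. It relies on `head -c`, which is widely available but not required by POSIX.

```shell
#!/bin/sh
# Sketch only: print the complete lines appended since the previous run,
# using a saved byte offset instead of a line count.
# Usage: new_bytes_since LOGFILE STATEFILE   (both names are hypothetical)
new_bytes_since() {
    log=$1
    state=$2
    offset=0

    # Read the byte offset saved by the previous run, if there was one.
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi

    # Take a snapshot of the current size in bytes.
    size=$(wc -c < "$log" | tr -d '[:space:]')

    # If the log shrank (rotated or truncated), start again from the top.
    if [ "$size" -lt "$offset" ]; then
        offset=0
    fi

    if [ "$size" -gt "$offset" ]; then
        chunk=$(mktemp)
        # Copy only the bytes between the saved offset and the snapshot.
        tail -c +"$((offset + 1))" "$log" | head -c "$((size - offset))" > "$chunk"

        # wc -l counts newlines, so a trailing partial line is not counted.
        lines=$(wc -l < "$chunk" | tr -d '[:space:]')
        if [ "$lines" -gt 0 ]; then
            # Emit the complete lines and advance the offset by exactly
            # the number of bytes they occupy.
            head -n "$lines" "$chunk"
            consumed=$(head -n "$lines" "$chunk" | wc -c | tr -d '[:space:]')
            offset=$((offset + consumed))
        fi
        rm -f "$chunk"
    fi

    # Remember where this run stopped.
    echo "$offset" > "$state"
}
```

Unlike the line-count variant, `tail -c` can seek straight to the saved position on a regular file, so the cost of each run depends on the amount of new data rather than on the total size of the log.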

 
Ramakanta Sahoo
Ranch Hand
Posts: 256
You can check this out; I have added a few example links for your reference.

Thanks
 