How can I trim log files using Perl?

I recently ran into a situation where I need to trim some fairly large log files once they grow beyond a certain size. Everything except the last 1000 lines in each file is discarded; the job runs every half hour via cron. My solution was simply to iterate over the file list, check each file's size, and trim it if necessary.

 for my $file (@fileList) {
     if ( (-s $file) / (1024 * 1024) > $CSize ) {
         open FH, '<', $file or die "Cannot open ${file}: $!\n";
         my @tLines;
         while (<FH>) {
             push @tLines, $_;
             # keep only the last $CLLimit lines
             shift @tLines if @tLines > $CLLimit;
         }
         close FH;
         open FH, '>', $file or die "Cannot write to ${file}: $!\n";
         print FH @tLines;
         close FH;
     }
 }

This works in its current form, but there is a lot of overhead for large log files (especially those with 100_000+ lines), since every line has to be read, and the buffer shifted if necessary.

Is there a way to read only part of the file? In this case, I only want access to the last $CLLimit lines. Since the script is deployed on a system that has seen better days (think Celeron 700 MHz with 64 MB of RAM), I am looking for a faster alternative in Perl.

+4
3 answers

I understand that you want to use Perl, but if this is a UNIX system, why not use the tail utility to do the trimming? You can do this in bash with a very simple script:

 if [ `stat -f "%z" "$file"` -gt "$MAX_FILE_SIZE" ]; then
     tail -n 1000 "$file" > "$file.tmp"
     # copy and then rm to avoid inode problems
     cp "$file.tmp" "$file"
     rm "$file.tmp"
 fi

However, if you are set on using Perl for this, you are likely to find this post very useful.

+8

Estimate the average length of a line in the log; call it N bytes.

Seek back 1000 * 1.10 * N bytes from the end of the file (the factor of 1.10 gives a 10% margin for error). Read from there, keeping only the most recent 1000 lines.


The question was asked: which function or module?

The built-in seek function looks like the tool to use.
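A minimal sketch of this approach as a Perl subroutine (the name tail_lines and its parameters are illustrative, not from the original post; the 10% margin and the discard of a possibly partial first line follow the reasoning above):

```perl
use strict;
use warnings;

# Return roughly the last $limit lines of $path, assuming an average
# line length of $avg_len bytes, without reading the whole file.
sub tail_lines {
    my ($path, $limit, $avg_len) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my $size = -s $fh;

    # Seek back limit * avg_len * 1.10 bytes (10% margin),
    # but never past the start of the file.
    my $offset = int($limit * $avg_len * 1.10);
    $offset = $size if $offset > $size;
    seek $fh, -$offset, 2;    # 2 = SEEK_END

    # If we landed mid-file, the first read is likely a partial line.
    <$fh> if $offset < $size;

    my @lines;
    while (<$fh>) {
        push @lines, $_;
        shift @lines if @lines > $limit;
    }
    close $fh;
    return @lines;
}
```

If the margin was too small (very uneven line lengths), fewer than $limit lines come back; the caller can retry with a larger $avg_len.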

+4

Consider simply using the logrotate utility; it is included in most modern Linux distributions. A related tool for BSD systems is called newsyslog. These tools are designed for more or less exactly this purpose: they atomically move the log file out of the way, create a new file (with the same name as before) to hold new log entries, instruct the program generating the messages to use the new file, and then (optionally) compress the old file. You can configure how many rotated logs are kept. Here's a potential tutorial: http://www.debian-administration.org/articles/117
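A minimal logrotate configuration fragment illustrating the idea (the log path, size threshold, and pid-file location are assumptions for this example; the tutorial above covers the full option set):

```
/var/log/myapp.log {
    size 10M          # rotate once the file exceeds 10 MB
    rotate 5          # keep 5 rotated copies
    compress          # gzip the old files
    missingok         # no error if the log is absent
    postrotate
        # tell the logging program to reopen its log file
        kill -HUP `cat /var/run/myapp.pid` 2>/dev/null || true
    endscript
}
```

The postrotate hook is what addresses the "notify the program" step; without it, some daemons keep writing to the rotated (renamed) file.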

This is not exactly the interface you asked for (keeping a certain number of lines), but the program will most likely be more reliable than anything you cook up yourself; for example, the other answers here do not address atomically moving the file or notifying the logging program to switch to the new file, so there is a risk that some log messages will be lost.

+4
