__/ [Brian Wakem] on Saturday 31 December 2005 19:13 \__
> Roy Schestowitz wrote:
>
>> __/ [Brian Wakem] on Saturday 31 December 2005 18:21 \__
>>> $ ./bot /usr/local/apache2/logs/access_log
>>> Googlebot 10595
>>> Yahoo! Slurp 1326
>>> msnbot 12
>>
>> Thanks a bunch, Brian. Since Perl is double-dutch to me, is there any way
>> of having the above script separate the numbers by day? I only had a
>> shallow look and I suspect the functionality is there, somewhere.
>
>
> It just totals up all the bot hits in the file. Our logs are rotated daily
> so grouping by date has never been an issue.
>
>
> The following should work, though I don't have a multi-day log to test.
>
> The dates will not sort correctly; I haven't got time to write a sub to
> convert them into a sortable format.
>
>
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> # Substrings that identify each crawler in the logged request line.
> my @bots = ('Googlebot','Yahoo! Slurp','msnbot');
> my %date;    # $date{"dd/Mon/yyyy"}{bot} = number of hits
>
> open (my $log, '<', $ARGV[0]) or die "Can't open log ($ARGV[0]) - $!";
> while(<$log>){
>     chomp;
>     # $1 is the date; $2 is the trailing "referer" "user-agent" pair.
>     if (m!\[(\d+/\w+/\d{4}):.*("(.*?)" ".*?"$)!) {
>         my $date = $1;
>         my $ua = $2;
>         foreach (@bots) {
>             if (index ($ua, $_) != -1){
>                 $date{$date}{$_}++;
>                 last;
>             }
>         }
>     }
> }
> close $log;
>
> # Plain string sort on the date keys: repeatable, but not chronological.
> foreach my $date( sort keys %date ) {
>     foreach my $ua( sort keys %{$date{$date}} ) {
>         printf "%-15s%-18s%d\n",$date,$ua,$date{$date}{$ua};
>     }
>     print "\n";
> }

Thanks again, Brian. I have just tested it. It works brilliantly. It doesn't
show me the numbers I was hoping to see, but it's a valuable tool which I
will definitely use again in the future. I still have (and use) your
"extract URLs from HTML" Perl one-liner. It was tailored for Borek, but it
simplified my problems too.
Roy
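
For anyone who needs the days in chronological order, here is a minimal sketch of the kind of conversion sub Brian mentions. It assumes the dates look like 31/Dec/2005, as in the regex above. The name sortable_date and the sample dates are made up for illustration and are not part of Brian's script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Month abbreviations as they appear in the dd/Mon/yyyy log dates.
my %month_num = (
    Jan => 1, Feb => 2, Mar => 3, Apr => 4,  May => 5,  Jun => 6,
    Jul => 7, Aug => 8, Sep => 9, Oct => 10, Nov => 11, Dec => 12,
);

# sortable_date() is a hypothetical helper, not part of the original script.
# It turns "31/Dec/2005" into "20051231", so that a plain string sort
# is also a chronological sort. Unrecognised input gets a key that sorts first.
sub sortable_date {
    my ($date) = @_;
    my ($day, $mon, $year) = split m{/}, $date;
    return '00000000'
        unless defined $year && exists $month_num{$mon};
    return sprintf '%04d%02d%02d', $year, $month_num{$mon}, $day;
}

# Made-up dates, just to show the ordering.
my @dates  = ('01/Jan/2006', '31/Dec/2005', '30/Dec/2005');
my @sorted = sort { sortable_date($a) cmp sortable_date($b) } @dates;
print "$_\n" for @sorted;
```

In the script itself, the same comparator could be applied to keys %date in the outer foreach, leaving the rest of the output loop unchanged.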