__/ [Roy Schestowitz] on Saturday 31 December 2005 18:48 \__
> __/ [Brian Wakem] on Saturday 31 December 2005 18:21 \__
>
>> Roy Schestowitz wrote:
>>
>>> Does anybody know a (preferably free) tool that will extract crawlers
>>> data from raw log files and produce a day-by-day breakdown of traffic
>>> from each search engine? I can see total (aggregated) volumes using
>>> available tools, but they tend to be more (human) visitor-oriented.
>>>
>>> [...]
>>
>>
>> I only monitor the big 3.
>>
>>
>> [code]
>> $ ./bot /usr/local/apache2/logs/access_log
>> Googlebot 10595
>> Yahoo! Slurp 1326
>> msnbot 12
>
> Thanks a bunch, Brian. Since Perl is double-dutch to me, is there any way
> of having the above script separate the numbers by day? I only had a
> shallow look and I suspect the functionality is there, somewhere.
>
> It's important to me as I suspect a certain batch of links (WordPress
> support) encouraged a lot of crawling, but I can't tell to what extent, if
> at all. I haven't kept track of a daily running sum, so I need to look at
> this in retrospect. Visitors and AWStats haven't got this functionality. I
> don't know about Analytics, but I can never use it properly.
Never mind that. I can slice the log files, which is not ideal (somewhat
manual), but should work nonetheless. I just need to find an editor that can
handle large files or (on second thought) make use of fgrep, something along
the lines of
fgrep "29/Dec/" /usr/local/apache2/logs/access_log > 29_dec_log
fgrep "30/Dec/" /usr/local/apache2/logs/access_log > 30_dec_log
...
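For what it's worth, a single awk pass over the log could give the per-day
counts directly, with no slicing. A rough sketch, assuming the standard
Apache combined log format (the sample lines below are made up; in real use
the input would be /usr/local/apache2/logs/access_log, and the Googlebot
pattern would be swapped for Slurp or msnbot as needed):

```shell
# Made-up sample lines standing in for /usr/local/apache2/logs/access_log
log='1.2.3.4 - - [29/Dec/2005:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"
1.2.3.4 - - [30/Dec/2005:11:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"
5.6.7.8 - - [30/Dec/2005:12:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"'

# Split each line on [ and ] so $2 is the timestamp, then use only the
# day part (everything before the first colon) as the counting key.
counts=$(printf '%s\n' "$log" |
  awk -F'[][]' '/Googlebot/ { split($2, d, ":"); day[d[1]]++ }
                END { for (k in day) print k, day[k] }' | sort)
echo "$counts"
```

Run against the real access_log, that should print one line per day,
e.g. "30/Dec/2005" followed by the number of Googlebot hits that day.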
Thanks again,
Roy