On Thu, 10 Feb 2005 01:41:23 +0000, Roy Schestowitz
<newsgroups@schestowitz.com> wrote:
>Philip Ronan wrote:
>
>> Roy Schestowitz wrote:
>>
>>> Yahoo Slurp often makes traversal mistakes on my site, but not the kind
>>> you describe. It descends to the wrong levels when presented with the
>>> Apache interface to directory contents. Are you sure it is always Slurp?
>>
>> Just Slurp. Which I guess makes it less likely the problem is caused by
>> scrambled links from another website. I certainly haven't seen these
>> requests from anywhere else.
>>
>>> Also, are
>>> the paths they look for somehow meaningful to you? Is it possible that
>>> they all come from a different site? Maybe search these names and find
>>> the commonality...
>>
>> Well... I searched Google for "inurl:FukumoA" and the results are all on
>> the same website (a Japanese ISP). My website is about Japanese
>> translation, so there's some sort of connection.
>>
>> Another filename that got slotted in for no reason is "kataoka.htm". This
>> also appears on a lot of Japanese websites.
>>
>> OTOH, "inurl:v242b" throws up a couple of completely unrelated websites.
>>
>> I thought for a while that maybe the problem is happening because my site
>> is using a shared IP address. In other words, there are other websites on
>> the same server at the same IP address, and the server needs HTTP requests
>> with a "Host" header to work out which site is being requested. Maybe
>> Yahoo is getting confused that way. But so far all the sites that could
>> have provided these URL fragments seem to be on completely different
>> servers.
>>
>> I really don't know what's going on. If only Yahoo could at least reply to
>> my email... :-(
>
>Perhaps purely by coincidence, Google was acting strangely (also
>scrambling links) on my sites last night. Apparently Slurp is not the
>only one to blame.
>
>It would be nice to separate the error logs into bot/non-bot one day. The
>technology is there already. It just needs a Perl script.
>
>Roy
Gee... who do we know who could do that?
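
Something along these lines might be a start, though in Python rather than
Perl. It is only a rough sketch under a few assumptions: it reads Apache's
"combined" access-log format (the error log does not record the user agent
by default), it treats any 4xx/5xx status as an error, and the BOT_MARKERS
list and the function names are made up for illustration, not a complete
list of crawlers.

```python
import re

# Assumed substrings that identify crawlers in the User-Agent field.
# Illustrative only; extend to suit your own logs.
BOT_MARKERS = ("slurp", "googlebot", "msnbot")

# Tail of an Apache "combined" log line:
#   "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$'
)


def is_bot(agent):
    """True if the user agent contains one of the assumed bot markers."""
    agent = agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def split_errors(lines):
    """Return (bot_lines, other_lines) for requests with a 4xx/5xx status."""
    bots, others = [], []
    for line in lines:
        match = LINE_RE.search(line)
        # Skip lines that do not parse or that are not errors.
        if match is None or match.group("status")[0] not in "45":
            continue
        (bots if is_bot(match.group("agent")) else others).append(line)
    return bots, others
```

Since split_errors() accepts any iterable of lines, an open log file can be
passed to it directly, and the two lists written out to separate files.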
BB
--
www.kruse.co.uk SEO@kruse.demon.co.uk
home of SEO that's shiny!