Re: Yahoo slurp acting strangely

Philip Ronan wrote:

> Roy Schestowitz wrote:
> 
>> Yahoo Slurp often makes traversal mistakes on my site, but not the kind
>> you describe. It descends to the wrong levels when presented with the
>> Apache directory listing interface. Are you sure it is always Slurp?
> 
> Just Slurp. Which I guess makes it less likely the problem is caused by
> scrambled links from another website. I certainly haven't seen these
> requests from anywhere else.
> 
>> Also, are
>> the paths they look for somehow meaningful to you? Is it possible that
>> they all come from a different site? Maybe search these names and find
>> the commonality...
> 
> Well... I searched Google for "inurl:FukumoA" and the results are all on
> the same website (a Japanese ISP). My website is about Japanese
> translation, so there's some sort of connection.
> 
> Another filename that got slotted in for no reason is "kataoka.htm". This
> also appears on a lot of Japanese websites.
> 
> OTOH, "inurl:v242b" throws up a couple of completely unrelated websites.
> 
> I thought for a while that maybe the problem is happening because my site
> is using a shared IP address. In other words, there are other websites on
> the same server at the same IP address, and the server needs HTTP requests
> with a "Host" header to work out which site is being requested. Maybe
> Yahoo is getting confused that way. But so far all the sites that could
> have provided these URL fragments seem to be on completely different
> servers.
> 
> I really don't know what's going on. If only Yahoo could at least reply to
> my email... :-(

Maybe purely by coincidence, Google was also acting strangely (scrambling
links) on my sites last night. Apparently Slurp is not the only one to
blame.

It would be nice to separate the error logs into bot/non-bot one day. The
technology is there already. It just needs a Perl script.
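
As a rough illustration of that bot/non-bot split, here is a minimal sketch
in Python rather than Perl. It assumes log lines in Apache's "combined"
format, where the user agent is the last quoted field; a plain error log
does not normally record the user agent, so there you would have to match
on client IP addresses instead. The marker list and the function names are
made up for this example.

```python
import re

# Hypothetical substrings that identify crawlers; extend to taste.
BOT_MARKERS = ("slurp", "googlebot", "msnbot", "crawler", "spider")

# In the "combined" log format the user agent is the last quoted field.
LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def is_bot(line):
    """Return True if the line's user agent contains a known bot marker."""
    match = LAST_QUOTED.search(line)
    if not match:
        return False  # no user agent found: treat as non-bot
    agent = match.group(1).lower()
    return any(marker in agent for marker in BOT_MARKERS)


def split_log(lines):
    """Split an iterable of log lines into (bot_lines, other_lines)."""
    bots, others = [], []
    for line in lines:
        (bots if is_bot(line) else others).append(line)
    return bots, others
```

Calling split_log() on an open log file returns two lists that can be
written out to separate files. Matching on substrings is crude, since a
user agent can be forged, but it should be enough to see which requests
claim to come from Slurp or Googlebot.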

Roy

-- 
Roy Schestowitz
http://schestowitz.com
