
Re: Yahoo slurp acting strangely

  • Subject: Re: Yahoo slurp acting strangely
  • From: Philip Ronan <invalid@invalid.invalid>
  • Date: Wed, 09 Feb 2005 12:02:40 +0000
  • Newsgroups: alt.internet.search-engines
  • References: <BE2F8AB5.2A9BC%invalid@invalid.invalid> <cucrui$2mh4$1@godfrey.mcc.ac.uk>
  • User-agent: Microsoft-Outlook-Express-Macintosh-Edition/1.0.0
  • Xref: news.mcc.ac.uk alt.internet.search-engines:55085
Roy Schestowitz wrote:

> Yahoo Slurp often makes traversal mistakes on my site, but it is not what you
> describe. They descend to wrong levels when presented with the Apache
> interface to directory contents. Are you sure it is always Slurp?

Just Slurp. Which I guess makes it less likely the problem is caused by
scrambled links from another website. I certainly haven't seen these
requests from anywhere else.
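
One rough way to confirm that only Slurp sends these requests is to tally user agents in the access log for any path containing a suspect fragment. The sketch below assumes Apache's "combined" log format; the function name, the sample log lines, and the paths in them are made up for illustration and are not taken from the real server.

```python
import re
from collections import Counter

# Assumes Apache's "combined" log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def agents_requesting(lines, fragment):
    """Count user agents whose requested path contains `fragment`."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and fragment in match.group("path"):
            counts[match.group("agent")] += 1
    return counts


# Hypothetical sample lines (invented IPs, paths, and sizes).
SAMPLE_LOG = [
    '192.0.2.1 - - [09/Feb/2005:11:58:01 +0000] '
    '"GET /translation/FukumoA/index.html HTTP/1.0" 404 208 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp; '
    'http://help.yahoo.com/help/us/ysearch/slurp)"',
    '192.0.2.2 - - [09/Feb/2005:11:59:10 +0000] '
    '"GET /index.html HTTP/1.1" 200 5120 "-" '
    '"Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

SLURP_HITS = agents_requesting(SAMPLE_LOG, "FukumoA")
```

If every key in the resulting counter is a Slurp user-agent string, that supports the idea that the odd URLs originate with the crawler rather than with broken links that other visitors also follow.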

> Also, are
> the paths they look for somehow meaningful to you? Is it possible that they
> all come from a different site? Maybe search these names and find the
> commonality...

Well... I searched Google for "inurl:FukumoA" and the results are all on the
same website (a Japanese ISP). My website is about Japanese translation, so
there's some sort of connection.

Another filename that got slotted in for no reason is "kataoka.htm". This
also appears on a lot of Japanese websites.

OTOH, "inurl:v242b" throws up a couple of completely unrelated websites.

I thought for a while that the problem might be happening because my site
uses a shared IP address. In other words, there are other websites on the
same server at the same IP address, and the server needs each HTTP request
to carry a "Host" header to work out which site is being requested. Maybe
Yahoo is getting confused that way. But so far all the sites that could have
provided these URL fragments seem to be on completely different servers.
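
For anyone unfamiliar with how a shared IP address works, here is a minimal sketch of name-based virtual hosting: the client names the site in the "Host" header, and the server uses that value to choose a document root. The hostnames, directory paths, and function names are hypothetical, and a real server such as Apache does considerably more (port suffixes in the header, for instance, are not handled here).

```python
def build_request(host, path="/"):
    """Return a minimal HTTP/1.1 GET request as bytes."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def parse_host(raw_request):
    """Extract the Host header value from a raw request, or None if absent."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    # Skip the request line; the remaining lines are "Name: value" headers.
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    return None


# Hypothetical sites that share one IP address.
VIRTUAL_HOSTS = {
    "translation.example": "/var/www/translation",
    "other.example": "/var/www/other",
}
DEFAULT_ROOT = "/var/www/default"


def document_root(raw_request):
    """Pick the document root by Host header, falling back to a default."""
    host = parse_host(raw_request)
    return VIRTUAL_HOSTS.get((host or "").lower(), DEFAULT_ROOT)
```

With this mapping, a request for "/kataoka.htm" sent with "Host: other.example" resolves under a different directory than the same path sent with "Host: translation.example", and a request with no Host header falls through to the default site. A crawler that paired the wrong hostname with a path would therefore ask one site for another site's files.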

I really don't know what's going on. If only Yahoo could at least reply to
my email... :-(

phil [dot] ronan @ virgin [dot] net
