Re: Googlebot and robots.txt

Subject: Re: Googlebot and robots.txt
From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
Date: Fri, 26 Aug 2005 08:29:39 +0100
Newsgroups: alt.internet.search-engines
Organization: schestowitz.com / Manchester University
References: <justal-E24F8E.06491126082005@nntp-readers.plus.net> <3n7r9bF8mpmU1@individual.net>
Reply-to: newsgroups@xxxxxxxxxxxxxxx
User-agent: KNode/0.7.2

_____/ On Friday 26 August 2005 07:26, [Tonnie] wrote : \_____

> Alan Cole wrote:
>> Is there any way to stop the Googlebot visiting a site quite so
>> often? I know a robots.txt file can restrict their access to certain
>> pages but I don't want to do that as I still want all pages listed, I'd
>> just rather the Googlebot didn't visit so often as it eats up all of my
>> bandwidth.
>> 
>> It seems to visit my site every few days and manages to crawl through
>> something like 30,000 pages each time it does so.
> 
> Perhaps Google can help:
> 
> http://www.google.com/intl/en/webmasters/bot.html
> 
> 
> 
> Tonnie

Beat you to it. *wink*

To add a little more, I respect Google's approach to this. They ask you to
contact them directly (I assume they reply too) rather than inventing a
non-standard directive. 'Extending' the widely agreed-upon robots.txt sounds
like something that other, less scrupulous companies would be tempted to do.
Tacking 'bits on the end' of it, however, is something that I suspect one (or
more) of the crawlers has already done. Was it actually Google that started
to support wildcards?
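
For anyone unsure what wildcard support means in practice, here is a minimal
Python sketch of how a crawler might match a path against Disallow patterns.
It assumes the commonly documented semantics of the extension ('*' matches any
run of characters, a trailing '$' anchors the end of the path, and anything
else is a plain prefix match). The function names pattern_to_regex and
is_blocked are made up for illustration; this is not any crawler's actual
code.

```python
import re


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regular expression.

    Assumed semantics (an illustration, not an official implementation):
    '*' matches any sequence of characters, a trailing '$' anchors the
    end of the path, and everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any non-empty Disallow pattern matches the path."""
    return any(
        pattern_to_regex(p).match(path) is not None
        for p in disallow_patterns
        if p  # an empty Disallow line blocks nothing
    )


if __name__ == "__main__":
    rules = ["/private/", "/*.pdf$"]
    for candidate in ["/private/a.html", "/docs/file.pdf", "/docs/file.pdf?x=1"]:
        print(candidate, is_blocked(candidate, rules))
```

A crawler that only understands the original convention would read
"/*.pdf$" as a literal prefix and never match it, which is why such
extensions only help with the robots that choose to honour them.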

I am not entirely fond of A9/Amazon's siteinfo.xml. Imagine a site whose
top-level directory is filled with a different data file for each
individual on-line service. It triggers errors and makes setting up
new Web sites a daunting and arduous task.

Roy

-- 
Roy S. Schestowitz        "Black holes are where God is divided by zero"
http://Schestowitz.com
