Re: Google only spiders the robots.txt

  • Subject: Re: Google only spiders the robots.txt
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Fri, 18 Nov 2005 11:48:34 +0000
  • Newsgroups: alt.internet.search-engines
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <437da503$1$25200$dbd45001@news.euronet.nl> <BFA36B25.3B116%invalid@invalid.invalid>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [Philip Ronan] on Friday 18 November 2005 11:25 \__

> Weird: according to the log files googlebot visits my site every day, has a
> look at the robots.txt and leaves:
> 
> crawl-66-249-71-53.googlebot.com - - [15/Nov/2005:05:16:39 +0100] "GET
> /robots.txt HTTP/1.0" 200 25 "-" "Googlebot/2.1
> (+http://www.google.com/bot.html)"
> .........
> crawl-66-249-72-226.googlebot.com - - [16/Nov/2005:02:14:45 +0100] "GET
> /robots.txt HTTP/1.1" 200 25 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)"
> ......
> 
> The same with MSNbot
> 
> The content of the robots.txt is:
>     User-agent: *
>     Disallow: /newsfiles
> 
> Should I change the robots.txt to make Google visit the other pages?

__/ [Nico Schuyt] on Friday 18 November 2005 09:57 \__

> Are you sure? Your log file says that only 25 bytes were transferred. Try
> counting the characters in your robots.txt file.

I count roughly 35 bytes in the two lines quoted above, not 25. Why do you
think the sizes differ? Have you changed the file recently? I see that you
are still experimenting with it, as shown below.
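
The byte count can be checked mechanically. The following is only a rough
sketch in Python: it assumes the two rules quoted above are the complete
file, and the helper name robots_size is made up for illustration. Since
the line-ending style and the presence of a trailing newline are not known
from the thread, it tries each combination and prints the result next to
the 25 bytes reported in the access log.

```python
# Rules as quoted earlier in the thread (assumed to be the whole file).
RULES = ["User-agent: *", "Disallow: /newsfiles"]

# Size field taken from the quoted access-log entries.
LOGGED_BYTES = 25


def robots_size(lines, newline="\n", trailing_newline=True):
    """Return the size in bytes of a robots.txt built from `lines`.

    Hypothetical helper: `newline` is the line terminator and
    `trailing_newline` says whether the last line is terminated too.
    """
    body = newline.join(lines)
    if trailing_newline:
        body += newline
    return len(body.encode("ascii"))


if __name__ == "__main__":
    for newline, label in (("\n", "LF"), ("\r\n", "CRLF")):
        for trailing in (False, True):
            size = robots_size(RULES, newline, trailing)
            print(f"{label}, trailing newline={trailing}: {size} bytes "
                  f"(log reports {LOGGED_BYTES})")
```

If none of the variants comes out at 25, the file that was served on those
dates was probably not the one quoted.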

Is the robots.txt file potentially malformed? It doesn't appear to be. I
also notice that your site has decent traffic and rankings. What happens
if you remove robots.txt altogether? When did you register the site, and
when did you first observe such requests from search engines? Is there
any new content to index? Do you have XML sitemaps?

Looking at:

http://www.nicoschuyt.nl/robots.txt

I now see:

User-agent: *
Disallow: /remkaflex/
Disallow: /cartal/
Disallow: /wsw/
Disallow: /vleeskens/
Disallow: /lease/
Disallow: /vijzelaar/
Disallow: /seo_analyse2.htm
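
To see what these rules actually block, they can be fed to the robots.txt
parser in Python's standard library (urllib.robotparser). This is a sketch
rather than a statement about how Googlebot itself evaluates the file; the
helper name allowed and the example paths /index.htm and /lease/offer.htm
are hypothetical and only serve as test inputs.

```python
from urllib.robotparser import RobotFileParser

# The rules as currently served, copied from the listing above.
RULES = """\
User-agent: *
Disallow: /remkaflex/
Disallow: /cartal/
Disallow: /wsw/
Disallow: /vleeskens/
Disallow: /lease/
Disallow: /vijzelaar/
Disallow: /seo_analyse2.htm
"""

SITE = "http://www.nicoschuyt.nl"

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under RULES (hypothetical helper)."""
    return parser.can_fetch(agent, SITE + path)


if __name__ == "__main__":
    # Example paths; only /seo_analyse2.htm is taken from the rules themselves.
    for path in ("/", "/index.htm", "/lease/offer.htm", "/seo_analyse2.htm"):
        print(path, "allowed" if allowed(path) else "blocked")
```

Under this parser the site root and any path outside the listed directories
remain fetchable, so these rules alone would not explain a crawler stopping
at robots.txt.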


Roy

-- 
Roy S. Schestowitz      | "I feed my 3 penguins with electricity and love"
http://Schestowitz.com  |    SuSE Linux     |     PGP-Key: 0x74572E8E
 11:40am  up 15 days  7:34,  4 users,  load average: 0.41, 0.31, 0.42
      http://iuron.com - next generation of search paradigms
