Re: Google does not obey robots.txt

  • Subject: Re: Google does not obey robots.txt
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Fri, 02 Jun 2006 08:38:00 +0100
  • Newsgroups: alt.internet.search-engines
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <pan.2006.06.01.17.26.23.184904@mail.invalid> <4e94t7F1ci1giU2@individual.net> <Xns97D5B0F4BF82Dcastleamber@130.133.1.4>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ John Bokma ] on Thursday 01 June 2006 23:23 \__

> Brian Wakem <no@xxxxxxxxx> wrote:
> 
>> wd wrote:
>> 
>>> Google is not obeying robots.txt at all now.
>>> 
>>> Here is the typical Drupal robots.txt with a few modifications:
>>> 
>>> User-agent: *
>>> Disallow: /aggregator
>>> Disallow: /tracker
>>> Disallow: /comment/reply
>>> Disallow: /node/add
>>> Disallow: /user
>>> Disallow: /search
>>> Disallow: /admin
>>> 
>>> Google has quite a few pages like the following indexed:
>>> /comment/reply/1
>>> /user/register
>>> /aggregator/sources/1
>>> /user/password
>> 
>> 
>> Sounds interesting.  What's in /user/password?
> 
> 151,000 answers:
> 
> http://www.google.com/search?q=inurl%3A%22/user/password%22

Most of those results seem irrelevant, but in certain CMSs this is a page
that serves a function such as changing a password. Interestingly, if it were
/users/password, you'd probably get plenty of Web sites that have a registered
user whose username is 'password'.
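
For anyone who wants to check that the rules quoted above really do cover the
paths wd listed, here is a rough sketch using Python's standard
urllib.robotparser module. The helper name check_paths and the "Googlebot"
agent string are my own choices for illustration, and the standard-library
parser uses plain prefix matching, which is only an approximation of what any
real crawler does.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules as quoted earlier in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /aggregator
Disallow: /tracker
Disallow: /comment/reply
Disallow: /node/add
Disallow: /user
Disallow: /search
Disallow: /admin
"""

# The paths reported as indexed.
INDEXED_PATHS = [
    "/comment/reply/1",
    "/user/register",
    "/aggregator/sources/1",
    "/user/password",
]


def check_paths(robots_txt, paths, agent="Googlebot"):
    """Return {path: True if the agent may fetch it, False otherwise}.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


if __name__ == "__main__":
    for path, allowed in check_paths(ROBOTS_TXT, INDEXED_PATHS).items():
        print(path, "allowed" if allowed else "disallowed")
```

Under prefix matching, all four listed paths fall under a Disallow line, so
the file itself looks fine as far as this parser is concerned.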

Best wishes,

Roy

PS - I think that's what they call Google hacking. I once found someone's
entire PDA data on some Webspace and reported this to him. He was an MIT
sysadmin, as ironic as it may seem.

-- 
Roy S. Schestowitz
http://Schestowitz.com  |  GNU is Not UNIX  ¦     PGP-Key: 0x74572E8E
  8:35am  up 35 days 15:07,  16 users,  load average: 3.22, 3.11, 3.03
      http://iuron.com - proposing a non-profit search engine
