__/ [ John Bokma ] on Thursday 01 June 2006 23:23 \__
> Brian Wakem <no@xxxxxxxxx> wrote:
>
>> wd wrote:
>>
>>> Google is not obeying robots.txt at all now.
>>>
>>> Here is the typical Drupal robots.txt with a few modifications:
>>>
>>> User-agent: *
>>> Disallow: /aggregator
>>> Disallow: /tracker
>>> Disallow: /comment/reply
>>> Disallow: /node/add
>>> Disallow: /user
>>> Disallow: /search
>>> Disallow: /admin
>>>
>>> Google has quite a few pages like the following indexed:
>>> /comment/reply/1
>>> /user/register
>>> /aggregator/sources/1
>>> /user/password
>>
>>
>> Sounds interesting. What's in /user/password?
>
> 151,000 answers:
>
> http://www.google.com/search?q=inurl%3A%22/user/password%22
Most of those results seem irrelevant, but in certain CMSs this is a page
that serves a function such as password changes. Interestingly, if it were
/users/password, you'd probably get plenty of Web sites that have a
registered user whose username is 'password'.
Best wishes,
Roy
PS - I think that's what they call Google hacking. I once found someone's
entire PDA data on some Webspace and reported this to him. He was an MIT
sysadmin, as ironic as it may seem.
--
Roy S. Schestowitz
http://Schestowitz.com | GNU is Not UNIX ¦ PGP-Key: 0x74572E8E
8:35am up 35 days 15:07, 16 users, load average: 3.22, 3.11, 3.03
http://iuron.com - proposing a non-profit search engine