
Re: Robots.txt - syntax question.

  • Subject: Re: Robots.txt - syntax question.
  • From: Roy Schestowitz <newsgroups@schestowitz.com>
  • Date: Mon, 25 Jul 2005 01:43:54 +0100
  • Newsgroups: alt.internet.search-engines
  • Organization: schestowitz.com / Manchester University
  • References: <1122227122.544038.152980@g14g2000cwa.googlegroups.com> <op.sufluamy584cds@borek> <nSVEe.15656$vv6.12994@newsfe6-gui.ntli.net>
  • Reply-to: newsgroups@schestowitz.com
  • User-agent: KNode/0.7.2
Wolfman's Brother wrote:

> Borek wrote:
>> On Sun, 24 Jul 2005 19:45:22 +0200, ted <occasionaluse@hotmail.com>
>> wrote:
>> 
>>> Disallow: /*.gif$
>> 
>> 
>> If I recall correctly wildcards are not allowed by the standard.
>> 
>> Check at www.robotstxt.org
>> 
>> Best,
>> Borek
> 
> To be pedantic about it.. It's not that wildcards aren't ALLOWED by the
> standard, but that they aren't HANDLED by it. So a "*" is not illegal,
> but simply means the literal character "*" rather than some other
> special meaning.
> 
> However .. Google's handling of robots.txt DOES include special meanings
> for wildcards, and in that sense is non-standard.
> 
> Chris
> --
> http://www.lowth.com/rope - Scriptable packet match logic for IPCop and
>                              other linux-based firewalls.

From http://www.robotstxt.org/wc/faq.html#info

<snip>

Two common errors:

    * Wildcards are _not_ supported: instead of 'Disallow: /tmp/*' just say
      'Disallow: /tmp/'.
    
...

</snip>
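
To make the difference concrete, here is a rough Python sketch of the two
behaviours discussed in this thread. It is only an illustration, not any
crawler's real code: the function names standard_match and wildcard_match are
made up for this example, and the wildcard version assumes the Google-style
reading where "*" matches any run of characters and a trailing "$" anchors the
rule at the end of the path.

```python
import re


def standard_match(rule_path: str, url_path: str) -> bool:
    """Original standard: a Disallow value is a plain path prefix.

    "*" and "$" have no special meaning here; they are literal characters.
    """
    return url_path.startswith(rule_path)


def wildcard_match(rule_path: str, url_path: str) -> bool:
    """Assumed Google-style extension (hypothetical helper).

    "*" matches any sequence of characters; a trailing "$" means the
    rule must match up to the end of the URL path.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape everything literally, then join the pieces with ".*" where
    # the rule had a "*".
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        return re.fullmatch(pattern, url_path) is not None
    # Unanchored rules still behave as prefixes.
    return re.match(pattern, url_path) is not None


if __name__ == "__main__":
    rule = "/*.gif$"
    for path in ("/images/logo.gif", "/images/logo.gif.html", "/tmp/a.html"):
        print(path, standard_match(rule, path), wildcard_match(rule, path))
```

Under the plain prefix rule, 'Disallow: /*.gif$' blocks nothing unless a URL
path literally starts with "/*.gif$", which is why 'Disallow: /tmp/' is the
portable way to block a directory.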

Roy

-- 
Roy S. Schestowitz
http://Schestowitz.com
