star green wrote:
> "John Bokma" <firstname.lastname@example.org> wrote in message
>> star green wrote:
>>> I've been trying to figure out how to write a robots.txt file that will
>>> allow the robots to access the homepage (index.html), but no other
>>> page on the site.
>> Note that not all robots honor robots.txt. The major search engine ones
>> do (most likely), but ratware bots don't.
> I know, I was just wondering if there's something that would work for most
> But, I guess there's nothing to be done short of completely reorganizing
> the site (which I don't wish to do) or listing every page in the
> robots.txt file.
> Ah, well.
> Thanks to everyone for the advice!
Use a script. Have it list every HTML file on your site except index.html
and write each one to robots.txt as a Disallow line. Contact me if you need
help on this.
Having said that, if your site has, let us say, 10,000 pages, then
robots.txt becomes very large and hard to maintain.
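
A rough sketch of such a script, in Python. It assumes the site is a tree of static files on disk whose paths map directly to URLs, which may not hold for your setup. The names build_robots_txt, site_root and keep are made up for this example. Also note that Disallow matches by prefix, so each line blocks any URL that starts with that path.

```python
import os

def build_robots_txt(site_root, keep=("index.html",)):
    """Return robots.txt text that disallows every HTML file under
    site_root except the relative paths listed in keep (hypothetical helper)."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(site_root):
        for name in filenames:
            # Only HTML pages are listed; images, CSS etc. are left alone.
            if not name.lower().endswith((".html", ".htm")):
                continue
            full = os.path.join(dirpath, name)
            # Turn the file path into a URL path relative to the site root.
            rel = os.path.relpath(full, site_root).replace(os.sep, "/")
            if rel in keep:
                continue
            paths.append("/" + rel)
    lines = ["User-agent: *"]
    lines.extend("Disallow: " + p for p in sorted(paths))
    return "\n".join(lines) + "\n"
```

To use it, write the returned text to robots.txt in the site root, for example with open("robots.txt", "w").write(build_robots_txt(".")) when run from that directory. Rerun it whenever pages are added, since new pages are not covered until they are listed.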