
Re: What Filetypes will Search Engine Follow?

  • Subject: Re: What Filetypes will Search Engine Follow?
  • From: Roy Schestowitz <newsgroups@schestowitz.com>
  • Date: Tue, 14 Jun 2005 13:18:40 +0100
  • Newsgroups: alt.internet.search-engines
  • References: <d8lk9r$1187$1@godfrey.mcc.ac.uk> <42aec883$0$306$7a628cd7@news.club-internet.fr>
  • User-agent: KNode/0.7.2
davidof wrote:

> Roy Schestowitz wrote:
>> I have a couple of questions which I could not find an answer to on the
>> Web.
>> 
>> 1. What files will search engines follow? I have seen Google descending
>> into text files, files without any extension and even C files.
> 
> Ahhh, I think you may be taking a Microsoft-centric viewpoint to this
> question. In the Web world, filetype is not determined by the extension a
> file has but by the MIME document type returned by the Web server. A
> robot will index any text/html file regardless of the extension, which is
> totally meaningless to the robot or to a correctly written Web browser
> (IE is not one of those).
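
To make the point above concrete, here is a minimal sketch of how a crawler might decide what to treat as HTML by looking only at the Content-Type header the server returns. The function names and example URLs are hypothetical, and real robots certainly apply more rules than this; it only illustrates that the extension plays no part in the decision.

```python
from email.message import Message


def media_type(content_type_header):
    """Return the bare MIME type from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() drops parameters such as charset and lowercases.
    return msg.get_content_type()


def looks_indexable_as_html(url, content_type_header):
    """Hypothetical check: decide from the MIME type, never from the URL."""
    # The url argument is deliberately ignored to show that the
    # extension does not influence the outcome.
    return media_type(content_type_header) == "text/html"


# A ".c" URL served as text/html counts as HTML; a ".html" URL served
# as video does not.
print(looks_indexable_as_html("http://example.com/results.c", "text/html"))
print(looks_indexable_as_html("http://example.com/page.html", "video/x-msvideo"))
```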


I was trying to steer away from a document type versus O/S discussion. Sadly,
I tend to label files with extensions. Many Linux applications require them
because they use filetype filters.


> Google also indexes some other document types: Word docs, PDF,
> PowerPoint and even Flash but these are non-optimal from an SEO
> perspective.


Point taken. I would be glad if someone could provide an answer to my second
question, which is more important to me. I output HTML from experiments and
due to the nature of the method, I have many (thousands of) AVI, MAT (the
reason for my first question) and JPEG links that are broken.

I don't know if I should add an exclusion rule to robots.txt. I set it up
this morning, but changed my mind hours later and then sought some advice
from you guys.

Thanks in advance,

Roy

-- 
Roy S. Schestowitz
http://Schestowitz.com
