
Re: Robots.txt help

__/ [ Roy Schestowitz ] on Monday 18 September 2006 17:29 \__

> __/ [ danish ] on Monday 18 September 2006 17:05 \__
> 
>> I have a site that uses PHP session IDs. I know that total
>> elimination of these from the URL is what is recommended for optimal
>> bot crawling and I am working on that, but is there any way, for now, to
>> include a line in robots.txt that would ignore the "PHPSESSID"
>> parameter?
>> 
>> For example, the site works just fine when you visit this page:
>> 
>> 
>> http://fixmyfamily.com/search_details.php?cid=41
>> 
>> 
>> But by default it generates a URL like this:
>> http://fixmyfamily.com/search_details.php?cid=41&PHPSESSID=0d8ff46dbd...
>> 
>> 
>> 
>> What can be done right now so that Google doesn't crawl these session
>> IDs, store them, and come back to them later? Thanks in
>> advance for your help. BTW, I don't want to disallow all
>> "search_details.php" URLs.
> 
> Hi, this would probably be handled best by altering the generation of
> URLs in the CMS, either by omitting these duplicates or by moving them to a
> (virtual) directory structure so that robots.txt can exclude them. The
> original robots.txt convention matches path prefixes only and has no
> wildcards (Google is pushing towards breaking/'extending' the standards
> and conventions).
> 
> Session IDs are tricky. Are you sure bots are being assigned a cookie? I
> know that spyware-type tools will be passed such URLs, but I don't think
> search engines will browse (crawl) with a cookie. There were similar
> questions before in this newsgroup (sessionid and duplicates), so it's
> definitely worth browsing the archive. It's also worth looking at the logs,
> filtering by crawler type (or IP address) to see what is going on underneath
> the surface. Another possibility is to view the cache, e.g. using
> "site:yoursite.suffix".
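To make the log check concrete, here is a rough Python sketch. It assumes the server writes the common "combined" access-log format (request line, referrer and user agent each in double quotes); the crawler names in BOT_PATTERN and the function name are only illustrative, so adjust them to whatever your own logs show.

```python
import re

# Illustrative crawler names only; extend this to match your own logs.
BOT_PATTERN = re.compile(r"googlebot|slurp|msnbot", re.IGNORECASE)

def crawler_session_hits(log_lines, param="PHPSESSID"):
    """Return request lines where a known crawler fetched a URL carrying `param`.

    Assumes combined log format, so splitting on double quotes gives:
    fields[1] = request line, fields[3] = referrer, fields[5] = user agent.
    """
    hits = []
    for line in log_lines:
        fields = line.split('"')
        if len(fields) < 6:
            continue  # not a combined-format line; skip it
        request, user_agent = fields[1], fields[5]
        if BOT_PATTERN.search(user_agent) and (param + "=") in request:
            hits.append(request)
    return hits
```

Feed it the lines of the access log (for instance `crawler_session_hits(open("access.log"))`, where the file name is a placeholder). An empty result suggests the crawlers are not picking up session IDs at all; a long list shows which URLs they were handed.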

Addendum: the following has just been published.

http://www.webpronews.com/expertarticles/expertarticles/wpn-62-20060918SessionIDsMakeEcommerceDifficult.html
http://tinyurl.com/zsfzo

                Session ID's Make Ecommerce Difficult

It might help.
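As for fixing the URL generation itself, the idea is simply to drop the session parameter before a link is written out. The site in question is PHP, where (as far as I know) the session.use_trans_sid and session.use_only_cookies settings govern whether the ID is appended to links, so that is the first place to look. The Python below is only a language-neutral sketch of the logic; the function name is made up and "abc123" is a placeholder session value.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session_id(url, param="PHPSESSID"):
    """Return `url` with the `param` query parameter removed, all others kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# "abc123" is a made-up session value used only for illustration.
print(strip_session_id(
    "http://fixmyfamily.com/search_details.php?cid=41&PHPSESSID=abc123"
))
# prints http://fixmyfamily.com/search_details.php?cid=41
```

The same function could be used to redirect requests that arrive with a session ID to the clean URL, but whether that is appropriate depends on how the site relies on sessions.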

-- 
Roy S. Schestowitz      |    Community is code, code is community
http://Schestowitz.com  |    SuSE Linux     |     PGP-Key: 0x74572E8E
  6:20pm  up 60 days  6:32,  7 users,  load average: 1.06, 0.82, 0.77
      http://iuron.com - Open Source knowledge engine project
