
Re: Step-by-Step: How to Get BILLIONS of Pages Indexed by Google

  • Subject: Re: Step-by-Step: How to Get BILLIONS of Pages Indexed by Google
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Sun, 18 Jun 2006 11:49:37 +0100
  • Newsgroups: alt.internet.search-engines
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <Xns97E62F32CBE5castleamber@130.133.1.4> <FI5lg.93010$ii4.62905@fe08.news.easynews.com> <1150626236.579451.272490@p79g2000cwp.googlegroups.com>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ canadafred ] on Sunday 18 June 2006 11:23 \__

> www.1-script.com wrote:
>> John Bokma wrote:
>>
>> > <snip stuff I replied to already />


@Dmitri (and Paul, who I know is in the same boat, among others): I think I
have some encouraging news. As you already know, Google saturation has been
tremendously low and rather stable for several weeks (in my case, it has
fallen from 100,000+ pages, which I never truly had)...

64.233.167.99   1               569     569     20,800
64.233.171.99   1               569     569     20,800
64.233.179.99   1               569     569     20,800
64.233.183.99   1               569     569     6,750
64.233.185.99   1               569     569     20,800
64.233.187.99   1               569     569     20,800
64.233.189.104  1               569     569     20,800
66.102.7.99     1               569     569     20,800
66.102.9.99     1               569     569     6,750
66.102.11.99    1               569     569     6,750
66.249.85.99    1               569     569     725
66.249.93.99    1               569     569     6,750
72.14.203.99    1               569     569     20,800
72.14.207.99    1               569     569     30,100
216.239.37.99   1               569     569     30,100
216.239.39.99   1               569     569     30,100
216.239.51.99   1               569     569     20,800
216.239.53.99   1               569     569     20,800
216.239.57.99   1               569     569     20,800
216.239.59.99   1               569     569     6,750
216.239.63.99   1               569     569     20,800

                                                ^^^^^^^
                                                Saturation

                                (Thanks, Tippy)

As you can see, more pages are beginning to emerge in the datacentres. The
old value was around 6,700. This morning I was getting these occasional
700-ish oddities, but suddenly there is positive movement, too. Taking the
fullest datacentre, I am seeing pages which I know previously dropped out of
the index/cache.

http://216.239.39.99/search?hl=en&q=site%3Aschestowitz.com+graveyard&btnG=Google+Search
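(For anyone who would rather script this sort of check than paste site:
queries into each datacentre by hand, here is a minimal Python sketch. It is
not the Tippy script behind the tables above; the /search parameters and the
"of about N" results markup are assumptions based on what the result pages
looked like at the time, so treat it purely as an illustration.)

# saturation_sketch.py -- rough illustration only, not the Tippy tool.
# Queries each Google datacentre IP with a site: search and scrapes the
# advertised result count. The URL parameters and the "of about <b>N</b>"
# phrase are assumptions; adjust them to whatever the pages actually return.

import re
import urllib.request
from urllib.parse import quote

DATACENTRES = [
    "64.233.167.99", "66.102.7.99", "216.239.39.99", "72.14.207.99",
]

SITE = "schestowitz.com"   # the site whose saturation is being checked


def saturation(ip, site):
    # Mirror the hand-built query used above, e.g.
    # http://216.239.39.99/search?hl=en&q=site%3Aschestowitz.com
    url = "http://%s/search?hl=en&q=%s" % (ip, quote("site:" + site))
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    html = urllib.request.urlopen(req, timeout=10).read().decode("latin-1", "replace")
    # Look for the advertised count, e.g. "of about <b>20,800</b>"
    match = re.search(r"of about <b>([\d,]+)</b>", html)
    return match.group(1) if match else "n/a"


if __name__ == "__main__":
    for ip in DATACENTRES:
        try:
            print("%-16s %s" % (ip, saturation(ip, SITE)))
        except Exception as exc:   # datacentres come and go; keep going
            print("%-16s error: %s" % (ip, exc))

Run it every morning and diff the output, and you get roughly the per-IP
saturation column shown in the tables above.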

I must admit that the formatting/fields (particularly the titles) in the
results are truly incorrect. It's almost as if there is a bug. I only came to
discover all of this because one of my targeted SERPs has improved
significantly overnight.

Checking a different site of mine (same host and, I suspect, same IP address
as well):

64.233.161.99   2               208     208     1,530
64.233.167.99   2               208     208     998
64.233.171.99   2               208     208     998
64.233.179.99   2               208     208     998
64.233.183.99   2               208     208     128
64.233.185.99   2               208     208     998
64.233.187.99   2               208     208     998
64.233.189.104  2               208     208     998
66.102.7.99     2               208     208     998
66.102.9.99     2               208     208     128

                                                ^^^^^^^
                                                Saturation

If I recall correctly, the old value was 128. Prior to the Big Daddy fluke,
it was somewhere over 1,000.

 
> This has been quietly going on for months.
> 
> Google pledged before Christmas 2005 to try to rid itself of the
> zillions of useless web pages now occupying exploited sub-domains.
> You'll notice that some free servers (such as Geocities and MSN
> MySpace) will be immune to the SERP extraction, as they have received
> exemptions from the garbage collection due to having passed sub-domain
> management tests.


Ironically, perhaps, Matt Mullenweg has just had his site extracted from the
datacentres. He is responsible for wordpress.com, which I believe is, or could
be, becoming a victim of subdomain spam. As in the case of MSN.com and
zombies, it appears as though individuals eat their own poison. What goes
around comes around?

Best wishes,

Roy


-- 
Roy S. Schestowitz
http://Schestowitz.com  | Free as in Free Beer ¦  PGP-Key: 0x74572E8E
Cpu(s):  20.2% user,   3.8% system,  17.6% nice,  58.4% idle
      http://iuron.com - semantic engine to gather information
