____/ JeffM on Thursday 21 Jul 2011 18:05 : \____

>>JeffM wrote:
>>>I like Google's *Cached* pages.
>>>If you append  &strip=1  to the URL, it feeds a page without graphics
>>>--well, except for background images 8-( --and without scripts.
> Roy Schestowitz wrote:
>>Doesn't Google cache expire quite quickly
>>compared to Web Archive (which has financial issues)?
> I believe you're thinking of New York University's Coral Cache
> which purposely has a 24-hour timeout.
> Folks use that a lot to avoid the Slashdot Effect.
> Occasionally, I do find a Google Cache that is broken[1]
> but when you consider the trillions of pages they handle,
> it is a relatively rare occurrence.
> Large numbers of broken Caches within *one* domain
> (in for a penny; in for a pound, apparently)
> are more typical for some reason unknown to me.
> Most often, when a Google Cache no longer works
> it's because the original page is now 404/500
> so nothing exists to be cached.
> Google circles back periodically and checks pages,
> so there is a latency when you can find the cache,
> but not the actual page.
> It seems that very popular pages have a shorter wait.
> This meme is another possibility for what you were thinking
> ...and I have seen the Wayback Machine
> list a whole bunch of dates for a page and
> upon clicking those, found that most of them were dead links
> so it's not like they have a magic potion either.
> .
> .
> [1] This has changed significantly **very** recently.
> Now, when Google can't locate one of its own Caches,
> instead of a basically-blank
> You-are-too-stupid-to-use-Google-properly page,
> it actually repeats the search using the available data
> and shows those results.
> The #1 item is invariably the original, intended page
> (though if you click *that* Cached link,
> the color-coded highlighting won't be as originally intended).

I used to run a nightly script that made a local copy of
each item I referenced. After a while it became a huge (10GB)
archive, so I stopped doing this. It rarely came in handy.
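
A minimal sketch of such a nightly copy job, in Python, might look like
the following. It is an illustration, not the original script: the names
local_name, fetch and archive are hypothetical, and it assumes the
referenced items are plain URLs whose raw HTML is worth keeping (no
images, stylesheets or scripts are fetched).

```python
import hashlib
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name(url):
    """Build a stable, filesystem-safe file name for a URL (hypothetical scheme)."""
    host = (urlparse(url).netloc or "unknown-host").replace(":", "_")
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return f"{host}-{digest}.html"


def fetch(url, timeout=30):
    """Download the raw bytes of a page; linked resources are not followed."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def archive(urls, dest, fetcher=fetch):
    """Save each URL under dest, skipping ones saved on an earlier run.

    Returns the list of URLs that could not be fetched.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    failed = []
    for url in urls:
        target = dest / local_name(url)
        if target.exists():
            continue  # already archived, do not download again
        try:
            # fetcher runs before the file is opened, so a failed
            # download leaves no empty file behind
            target.write_bytes(fetcher(url))
        except OSError:
            # urllib's URLError/HTTPError and socket timeouts are all OSError
            failed.append(url)
    return failed
```

Run from cron once a night with the list of referenced URLs, this only
downloads items it has not seen before, which slows the growth of the
archive but does not stop it.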

Internet rot is a scary thing because the Internet is still young,
and research into the early Internet will be hard within a few decades.

