Re: Mirroring website without wget?

  • Subject: Re: Mirroring website without wget?
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Wed, 22 Feb 2006 09:36:47 +0000
  • Newsgroups: uk.comp.os.linux
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <Xns9771EC461BF02steveNNhogg@>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ SteveN ] on Tuesday 21 February 2006 23:13 \__

> Dear All,
> Are there any other tools that I can use to mirror a website without using
> wget (which doesn't seem to work).

scp and ftp, among others. Are you *mirroring* a site or just *scraping* it?
Is it /your/ site?

> The site needs a username and password (which I have) and I copied the
> cookies from Firefox to the wget directory after properly logging in.
> Firefox, Opera, and for that matter Internet Explorer have no problems once
> I have logged in, but it seems wget is getting confused by some javascript
> nastiness that sends it off-site.

Maybe the site denies grabbers as a matter of principle. Maybe some user-agent
sniffing is involved, in which case you must spoof a browser's user-agent
string.
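
If user-agent sniffing is the culprit, a rough sketch like the one below might
help you test the theory. It is only an illustration, not a full mirroring
tool: the browser string, the cookie name and the helper names build_request
and fetch are all made up for the example, and you would substitute the values
from your own logged-in session. (wget itself also takes --user-agent and
--load-cookies, which may be worth trying first.)

```python
import urllib.request

# Hypothetical browser string; replace it with the one your own browser sends.
BROWSER_UA = "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8) Gecko/20051111 Firefox/1.5"


def build_request(url, user_agent=BROWSER_UA, cookies=None):
    """Return a Request that presents a browser User-Agent and optional cookies."""
    headers = {"User-Agent": user_agent}
    if cookies:
        # A Cookie header has the form "name1=value1; name2=value2".
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    return urllib.request.Request(url, headers=headers)


def fetch(url, **kwargs):
    """Fetch a single page and return its body as bytes (does network I/O)."""
    with urllib.request.urlopen(build_request(url, **kwargs), timeout=30) as resp:
        return resp.read()
```

Calling fetch("http://example.com/", cookies={"session": "abc"}) would request
one page while posing as the browser above. If that succeeds where a plain
wget fails, the user-agent or the cookies are the likely cause. It will not
help if the redirect really is done in JavaScript, since nothing here runs
scripts.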

> I *think* what I am asking is if there is an extension to Firefox which
> allows it to be used as a mirroring tool?  Googling just seems to give me
> lots of 'mirrors of firefox' rather than what I am after.

There are mirroring tools for Web sites, but they are meant for sites owned by
the 'mirrorer'. I follow the feeds of new Firefox plug-ins on a daily basis and
I have not come across such an extension.

> I know it's a bit of a lame question, but can anybody point me in the right
> direction?
> Many thanks,

There are Google scrapers in the wild, so you might be able to re-use one of
them. They should be easy to find on the Net.

Hope it helps,


Roy S. Schestowitz      | Windows all-in-one: Word, IE (for E-mail) & iTunes
http://Schestowitz.com  |    SuSE Linux     |     PGP-Key: 0x74572E8E
  9:30am  up 4 days 21:49,  8 users,  load average: 1.88, 1.30, 0.93
      http://iuron.com - help build a non-profit search engine
