
Re: Spidering Sites (was: Konqueror off-line)

__/ [ wbarwell ] on Saturday 11 March 2006 10:21 \__

> Roy Schestowitz wrote:
>> __/ [ wbarwell ] on Saturday 11 March 2006 09:04 \__
>>>  news@xxxxxxxxxxxxxx wrote:
>>>> Hi,
>>>> It seems to me that Konqueror can't save the fetched page.
>>>> I need to fetch & save several pages, to be read off-line later,
>>>> and to extract URLs which I will fetch next time I go on-line.
>>>> Is it possible that Konqueror can't do this?!
>>> It should do it, but I have found that Konqueror
>>> as a browser is slow and a bit buggy. For reading
>>> and saving stuff Opera works very well, and it's a
>>> lot better than Konqueror for stuff like printing too.
>>> I'd say google up the Opera website and download Opera
>>> and use that.  I use KDE and the file manager/Konqueror
>>> works quite well with Opera as a system.
>>> Get Opera, you won't be sorry.
>> Using Konqueror in isolation is hard. Many zealous sites will not even let
>> it in. For that reason, I keep Firefox and Opera installed. I do not
>> believe that Konqueror has the functionality you described, but you can
>> browse the Konqueror cache, which probably resides in
>> ~/.kde/share/apps/konqueror/ although I can't seem to find it.
>> Best wishes,
>> Roy
> If there is something I want to read offline, I just save the pages I want
> with Opera.
> I keep meaning to explore spiders some day but never seem to get around
> to it.
> Your Konqueror cache will be in the .kde cache-your-systems-name folder.

Don't be intimidated by spidering tools. They are /very/ easy to use.

If you want to download example.org, create a directory 'example'

$ mkdir example
$ cd example

Now simply

$ wget -r http://example.org/

Since you probably don't want to download the entire site, have a look at the
options you have:

$ man wget

So, for example, consider:

$ wget -r -l2 -t1 -N -np -erobots=off http://example.org/

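The above limits the /depth/ explored in the site to two levels (-l2),
tries each file only once (-t1), skips files that are not newer than the
local copy (-N) and never ascends to the parent directory (-np). Note that
-erobots=off makes wget /ignore/ the site's robots.txt rules, so use it
considerately. None of these options authenticates with the site; if a
login is necessary, look at --user and --password in the man page.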
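
As for extracting URLs to fetch next time you go on-line, here is a rough
sketch, not a polished tool. It assumes the pages were saved under a
directory such as 'example', that links are written as lowercase
href="http://..." attributes, and extract_urls is just a name I made up for
the helper. Relative links are ignored.

```shell
# Sketch: list the absolute links found in pages already saved on disk.
# extract_urls is a hypothetical helper name; it is not part of wget.
extract_urls() {
    dir="${1:-.}"
    # -r recurse, -h omit file names, -o print only the match, -E extended regex
    grep -rhoE 'href="https?://[^"]+"' "$dir" \
        | sed -e 's/^href="//' -e 's/"$//' \
        | sort -u
}

# While off-line:
#   extract_urls example > urls.txt
# Next time on-line (-i reads the URLs to fetch from a file):
#   wget -N -t1 -i urls.txt
```

Going through a plain text file means you can prune the list by hand
before fetching anything.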

Best wishes,


Roy S. Schestowitz      |    #00ff00 Day - Basket Case
http://Schestowitz.com  |    SuSE Linux     |     PGP-Key: 0x74572E8E
 10:20am  up 3 days  2:57,  7 users,  load average: 0.73, 0.59, 0.54
      http://iuron.com - Open Source knowledge engine project
