
Re: saving search results with bash script, curl, wget...

__/ [ Nospam ] on Friday 17 March 2006 13:29 \__

> "Roy Schestowitz" <newsgroups@xxxxxxxxxxxxxxx> wrote in message
> news:dve54r$c40$1@xxxxxxxxxxxxxxxxxxxx
>> __/ [ canadafred ] on Friday 17 March 2006 02:31 \__
>>
>> >> I am wondering how it is possible to place the results from a search
>> >> query url with wget, curl, or bash scripts (or any other command line
>> >> tool) and get the results (including the results of the next page)
>> >> into a txt file, with a newline for each url of the result?
>> >> i.e. for a search query; something like www.example.com=query?...
>> >> each result can be placed on a newline in a text file.
>> >
>> > cross posting removed
>>
>> Thanks, Fred.
>>
>> At risk of answering an Internet troll:
>>
>> Use wget, e.g.:
>>
>> ,----[ Commands ]
>> | cd ~; wget -O index.html 'http://www.google.co.uk/search?&ie=utf-8&oe=utf-8&q=nospam'
>> `----
>>
>> Then, you could use Perl (script from Brian Wakem):
>>
>> ,----[ get_urls.sh ]
>> | cat ~/index.html| perl -ne '@url=m!(http://[^>"]+)!g;print "$_\n"
>> | foreach @url' > ~/googleurls
>> `----
> 
> Can you elaborate on the above script? I tried to run it with bash, but it
> says cat command not found. I am on Windows, using bash from Cygwin.


cat (concatenate) is a fundamental and simple command. If bash cannot find
it, your Cygwin installation is incomplete, so I advise you to install and
run Linux as a virtual machine under Windows (VMware is now free) or set up
a dual-boot system.

I am sorry if this answer is unhelpful, but if you don't have cat(1), then I
doubt you will have Perl, among other valuable utilities.
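
To see what your shell can actually find, something along these lines may
help. It is only a rough sketch: the function names check_tools and
extract_urls are made up for illustration, and extract_urls assumes a grep
that supports the -o option (GNU grep does) in place of the Perl one-liner
quoted above. It reads the file directly, so it does not need cat at all.

```shell
# Report which of the tools mentioned in this thread are on the PATH.
# (check_tools is a hypothetical helper name, not a standard command.)
check_tools() {
    local tool
    for tool in cat perl wget grep; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# Print every http:// URL found in the file given as $1, one per line.
# Uses the same pattern as the Perl script: a URL ends at '>' or '"'.
# Assumes grep supports -o (print only the matching part of each line).
extract_urls() {
    grep -oE 'http://[^>"]+' "$1"
}
```

After pasting the two functions into bash, run "check_tools" first, and if
grep is found, "extract_urls ~/index.html > ~/googleurls" should give the
same kind of list as the Perl version.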

Hope it helps,

Roy

-- 
Roy S. Schestowitz, Ph.D. Candidate in Medical Biophysics
http://Schestowitz.com  |    SuSE Linux     ¦     PGP-Key: 0x74572E8E
  3:50pm  up 9 days  8:27,  7 users,  load average: 0.08, 0.02, 0.02
      http://iuron.com - Open Source knowledge engine project
