__/ [ Nospam ] on Friday 17 March 2006 13:29 \__
> "Roy Schestowitz" <newsgroups@xxxxxxxxxxxxxxx> wrote in message
> news:dve54r$c40$1@xxxxxxxxxxxxxxxxxxxx
>> __/ [ canadafred ] on Friday 17 March 2006 02:31 \__
>>
>> >> I am wondering how it is possible to place the results from a search
>> >> query url with wget, curl, or bash scripts (or any other command line
>> >> tool) and get the results (including the results of the next page) into
>> >> a txt file, with a newline for each url of the result?
>> >> i.e for a search query; something like www.example.com=query?...
>> >> each result can be placed on a newline in a text file.
>> >
>> > cross posting removed
>>
>> Thanks, Fred.
>>
>> At risk of answering an Internet troll:
>>
>> Use wget, e.g.:
>>
>> ,----[ Commands ]
>> | cd ~; wget -O index.html "http://www.google.co.uk/search?&ie=utf-8&oe=utf-8&q=nospam"
>> `----
>>
>> Then, you could use Perl (script from Brian Wakem):
>>
>> ,----[ get_urls.sh ]
>> | cat ~/index.html| perl -ne '@url=m!(http://[^>"]+)!g;print "$_\n"
>> | foreach @url' > ~/googleurls
>> `----
>
> Can you elaborate on the above script? I tried to run it with bash, but it
> says cat command not found. I am on Windows, using bash from Cygwin.
cat (concatenate) is a very fundamental and simple command, so your Cygwin
installation appears to be incomplete. I advise you to install and run Linux
as a virtual machine under Windows (VMware is now free) or to set up a dual
boot.
I am sorry if this answer is unhelpful, but if you don't have cat(1), then I
doubt you will have Perl, among other valuable utilities.
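
To elaborate a little: the sketch below first reports which of the tools
mentioned in this thread are missing from your PATH, then pulls the URLs out
of a saved page without needing cat at all. It is only a rough sketch. The
function names missing_tools and extract_urls are made up for illustration,
and it swaps Brian's Perl one-liner for grep -oE with the same pattern, on
the assumption that grep is installed even where Perl is not.

```shell
# Print the name of every listed tool that cannot be found on PATH.
# (missing_tools is a hypothetical helper name, not a standard command.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Read HTML on stdin and print each http:// URL on its own line.
# Uses the same pattern as the Perl one-liner: everything after
# "http://" up to the next '>' or '"'.
# (extract_urls is a hypothetical helper name as well.)
extract_urls() {
    grep -oE 'http://[^>"]+'
}

# Which of the tools from this thread are missing here?
missing_tools cat perl wget grep

# Typical use, reading the saved page directly instead of piping from cat:
# extract_urls < ~/index.html > ~/googleurls
```

If missing_tools prints nothing, everything is installed and the original
script should run as posted.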
Hope it helps,
Roy
--
Roy S. Schestowitz, Ph.D. Candidate in Medical Biophysics
http://Schestowitz.com | SuSE Linux ¦ PGP-Key: 0x74572E8E
3:50pm up 9 days 8:27, 7 users, load average: 0.08, 0.02, 0.02
http://iuron.com - Open Source knowledge engine project