__/ [Borek] on Tuesday 20 December 2005 08:06 \__
> On Tue, 20 Dec 2005 02:09:32 +0100, Roy Schestowitz
> <newsgroups@xxxxxxxxxxxxxxx> wrote:
>
>> I'd be /very/ interested in an answer/solution to that too, John. I need
>> to generate files that contain newline-separated URLs rather than copy
>> and paste from Web pages. The closest I could ever get to minimal manual
>> labour was:
>>
>> less search.html | grep http://
>
> In Google's case it will not help - the whole answer is one line.
What complicates matters are syntaxes like:
<A href="foo.bar"></A>
<a title="linky thing" href="foo.bar"></A>
<a HREF='./foo.bar'></a>
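To make something that covers all cases, you couldn't just lazily scan the
text and spew out whatever sits between '<a href="' and '"'. Tag and
attribute names vary in case, the quotes can be single or double, and other
attributes may come before href.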
That's where standards like XHTML and standards-compliant pages come in
handy. As far as I know, Google are very fond of standards, which is rare
among large companies.
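For what it's worth, here is a rough sketch of the parser-based approach, as
opposed to grepping for a fixed string. It assumes Python and its standard
html.parser module are available; LinkExtractor and extract_links are names
made up for illustration, and the sample input is just the three variants
above squeezed onto one line, as Borek describes for Google's output.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag, whatever its case or quoting."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this,
        # so <A HREF=...> and <a href=...> look the same here.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html_text):
    """Return the href values found in html_text, in document order."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical input: the three variants from this post, all on one line.
sample = (
    '<A href="foo.bar"></A>'
    '<a title="linky thing" href="foo.bar"></A>'
    "<a HREF='./foo.bar'></a>"
)

# One URL per line, ready to redirect into a file.
print("\n".join(extract_links(sample)))
```

Because the parser works on tags rather than on lines, it does not matter
whether the page arrives as one long line or many. Relative links such as
./foo.bar come out as written; turning them into full URLs would need the
page's base address, which this sketch does not attempt.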
Roy
--
Roy S. Schestowitz
http://Schestowitz.com | SuSE Linux | PGP-Key: 0x74572E8E
2:25pm up 9 days 21:36, 5 users, load average: 0.05, 0.19, 0.16
http://iuron.com - next generation of search paradigms