[H]omer <spam@xxxxxxx> espoused:
> Verily I say unto thee, that Mark Kent spake thusly:
>
>> Has anyone tried to do anything like this already and perhaps has
>> solutions for these issues?
>
> How about running this on a leafnode spool?:
>
> ######
> #!/usr/bin/perl -w
> # parse-urls.pl -- extract URIs from stdin and fetch each with lynx
>
> use strict;
> use URI::Find;
>
> my @uris;
> my $finder = URI::Find->new(
>     sub {
>         my ($uri, $orig_uri) = @_;
>         push @uris, $uri;   # collect each URI found
>         return $orig_uri;   # leave the input text unchanged
>     });
>
> while (<>) {
>     $finder->find(\$_);
> }
>
> # system(), not exec() -- exec would replace this process with lynx
> # on the first URI and never reach the rest
> for my $uri (@uris) {
>     system("lynx", "-source", $uri) == 0
>         or warn "lynx -source failed for $uri: $?";
> }
>
> 1;
> ######
>
> - http://search.cpan.org/dist/URI-Find/
>
> I'll play around with this, and see about adding URI verification.
>
> Also IMHO the final output should be something like:
>
> Article Name: <html title>
> Archive Date: <date fetched>
> Article URI : <orig_uri>
> Article Body: <output from parse-urls.pl>
>
> Getting the *real* posting date for an upstream article is a more
> difficult proposition, since that info is not always available.
>
> Also, for a proper citation, the upstream article *author* should be
> included, where possible.
>
Homer - it's a great start, but part of my issue was about dealing with
the web-page end, where many articles are broken up over multiple pages.
Still, let me know how you get along.
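In the meantime, the citation layout you proposed can be mocked up with core
Perl alone. A rough sketch -- the function name, the naive single-line
<title> regex, and the canned HTML are just my placeholders, not anything
from your script:

```perl
#!/usr/bin/perl -w
# cite-article.pl -- assemble the proposed citation format for one article
use strict;

sub make_citation {
    my ($uri, $html, $body) = @_;
    # Naive title extraction; only handles a simple <title>...</title> pair
    my ($title) = $html =~ m{<title[^>]*>(.*?)</title>}is;
    $title = '(no title)' unless defined $title;
    # "Archive Date" here is simply the fetch date, in UTC
    my @now  = gmtime();
    my $date = sprintf "%04d-%02d-%02d",
        $now[5] + 1900, $now[4] + 1, $now[3];
    return "Article Name: $title\n"
         . "Archive Date: $date\n"
         . "Article URI : $uri\n"
         . "Article Body: $body\n";
}

# Demo with canned HTML; a live version would feed in lynx -source output
print make_citation(
    "http://www.example.com/story",
    "<html><head><title>Test Story</title></head></html>",
    "story text here");

1;
```

A live run would pipe each URI through lynx -source for the HTML and
lynx -dump for the body, but that still leaves your point about the real
posting date and author unanswered.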
--
| Mark Kent -- mark at ellandroad dot demon dot co dot uk |
| Cola faq: http://www.faqs.org/faqs/linux/advocacy/faq-and-primer/ |
| Cola trolls: http://colatrolls.blogspot.com/ |