__/ [ Roy Culley ] on Saturday 06 May 2006 01:22 \__
> begin risky.vbs
> <1146872219.535116.192710@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
> "Ramon F Herrera" <ramon@xxxxxxxxxxx> writes:
>>
>>> Google *automatically* posts stats for every newsgroup it hosts. I
>>> believe this happens around midnight on Sunday.
Google, being an outsider and a third party, /might/ be irrelevant.
>> I beg to differ, and would love to be proven wrong.
>
> Larry does tend to put his foot in it every so often. Google may well
> produce stats but they certainly don't post them.
>
>> How come the COLA stats are posted under the name of Roy Culley and
>> the only postings that he has all go to COLA? Roy is a regular
>> poster in COLA, isn't he? It seems like he wrote some gateway and
>> sends the summary to COLA and only to COLA.
>
> I used to use turqstat to generate the COLA stats. It is written in
> C++ and has a few bugs that kept annoying me. The code is awful. It
> is full of hard coded crap. For example, things like column width
> are hard coded instead of being a variable.
I happen to use that code and I even modified it slightly. I didn't find it
all that bad. Another package that I use is MLS (targeted at mailing lists)
and its code seems rather elegant. It can probably be used to process
newsgroup data, stored as mbox-formatted mail archives. Some output
'sections' might be irrelevant, but other than that, consider exploring:
http://freshmeat.net/projects/mls
I had to mend something with its handling of dates, so contact me off list
(NG) if you want some code/advice.
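For what it's worth, the mbox approach can be sketched in a few lines of
Python. This is only an illustration under my own assumptions, not how MLS or
turqstat actually work: the names count_posters, format_table and name_width
are made up for the example. It counts messages per sender in an mbox archive
and takes the column width as a parameter rather than hard coding it.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_posters(mbox_path):
    """Return a Counter mapping each sender address to its message count."""
    counts = Counter()
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            # parseaddr splits 'Name <addr>' into (name, addr).
            _name, address = parseaddr(str(message.get("From", "")))
            # Lower-case so the same poster is not counted twice.
            counts[address.lower() or "(unknown)"] += 1
    finally:
        box.close()
    return counts


def format_table(counts, name_width=30, top=10):
    """Render the top posters; the column width is a parameter (hypothetical name)."""
    lines = [f"{'Poster':<{name_width}} {'Posts':>5}"]
    for address, n in counts.most_common(top):
        # Truncate long addresses so the columns stay aligned.
        lines.append(f"{address[:name_width]:<{name_width}} {n:>5}")
    return "\n".join(lines)
```

Something like print(format_table(count_posters("archive.mbox"), name_width=40))
would then print the table, where "archive.mbox" is a placeholder path.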
> Anyway, I decided one day to see how hard it would be to do the same
> task with perl. Within a few hours I had a script that did what I
> wanted. What amazed me was that the perl script was faster than
> turqstat! The script has been 'enhanced' since to what it is today.
MLS takes between 0 and 1 seconds to process approximately 500 messages.
Turqstat takes over a minute for the same task, but in its defence, it pulls
the data off the Net 'on the fly' (still, on a 100Mbit connection, it's no
excuse).
> I keep a local news spool using leafnode. The COLA stats are produced
> from this local news spool.
>
>> Hello Roy, are you out there?
>
> I'm always here. :-)
Why not post more often?
> I just don't post much these days. There are much better Linux
> advocates to tear the trolls apart than me.
>
>> Would you host stats of a NG for me? I don't have the computer
>> resources I used to, being between companies.
>
> How much are you going to pay me? :-)
>
> Unless it's a newsgroup I subscribe to and am interested in I doubt
> I would be interested.
>
> To do what you want isn't difficult. Install a Linux box, run leafnode
> and use one of the stats programmes available.
Ramon, if you tell me which newsgroup, I'll happily subscribe and produce stats.
I already have the process cronned for 3 newsgroups (Palm, search engines
and SuSE), so adding another one would be trivial.
Best wishes,
Roy
--
Roy S. Schestowitz, Ph.D. Candidate (Medical Biophysics)
http://Schestowitz.com | GNU/Linux ¦ PGP-Key: 0x74572E8E
5:25am up 8 days 12:22, 12 users, load average: 0.35, 0.65, 0.68
http://iuron.com - next generation of search paradigms