
Re: Who keeps the COLA stats?

  • Subject: Re: Who keeps the COLA stats?
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Sat, 06 May 2006 05:38:29 +0100
  • Newsgroups: comp.os.linux.advocacy
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <1146819374.128715.279480@u72g2000cwu.googlegroups.com> <1146832280.604592.283380@y43g2000cwc.googlegroups.com> <1146872219.535116.192710@e56g2000cwe.googlegroups.com> <eq6ti3-3gc.ln1@dog.did.it>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ Roy Culley ] on Saturday 06 May 2006 01:22 \__

> begin  risky.vbs
> <1146872219.535116.192710@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
> "Ramon F Herrera" <ramon@xxxxxxxxxxx> writes:
>>
>>> Google *automatically* posts stats for every newsgroup it hosts. I
>>> believe this happens around midnight on Sunday.


Google, being an outsider and a third party, /might/ be irrelevant.


>> I beg to differ, and would love to be proven wrong.
> 
> Larry does tend to put his foot in it every so often. Google may well
> produce stats but they certainly don't post them.
> 
>> How come the COLA stats are posted under the name of Roy Culley and
>> the only postings that he has all go to COLA?  Roy is a regular
>> poster in COLA, isn't he?.  It seems like he wrote some gateway and
>> sends the summary to COLA and only to COLA.
> 
> I used to use turqstat to generate the COLA stats. It is written in
> C++ and has a few bugs that kept annoying me. The code is awful. It
> is full of hard coded crap. For example, things like column width
> are hard coded instead of being a variable.


I happen to use that code and I even modified it slightly. I didn't find it
all that bad. Another package that I use is MLS (targeted at mailing lists),
and its code seems rather elegant. It can probably be used to process
newsgroup data stored as mbox-formatted mail archives. Some output
'sections' might be irrelevant, but other than that, consider exploring:

        http://freshmeat.net/projects/mls

I had to fix something in its handling of dates, so contact me off list
(NG) if you want some code/advice.
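
To give a rough idea of what processing an mbox archive involves, here is a
minimal sketch. It is not MLS or turqstat code; it only assumes Python's
standard mailbox module, and the names count_posters and format_table are
made up for illustration. It counts messages per poster address and prints
a small ranked table.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_posters(mbox_path):
    """Return a Counter mapping each poster's address to a message count."""
    counts = Counter()
    # create=False: raise an error for a missing file instead of creating it.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits 'Name <addr>' into (name, addr); either may be ''.
            name, address = parseaddr(str(message.get("From", "")))
            counts[address.lower() or name or "(unknown)"] += 1
    finally:
        box.close()
    return counts


def format_table(counts, top=10):
    """Format the `top` most active posters as rank, count, address."""
    lines = []
    for rank, (poster, n) in enumerate(counts.most_common(top), start=1):
        lines.append(f"{rank:>3}  {n:>5}  {poster}")
    return "\n".join(lines)
```

Calling print(format_table(count_posters("cola.mbox"))) would list the ten
most active posters, where "cola.mbox" is a hypothetical archive path.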


> Anyway, I decided one day to see how hard it would be to do the same
> task with perl. Within a few hours I had a script that did what I
> wanted. What amazed me was that the perl script was faster than
> turqstat! The script has been 'enhanced' since to what it is today.


MLS takes under a second to process approximately 500 messages.
Turqstat takes over a minute for the same task, but in its defence, it pulls
the data off the Net 'on the fly' (still, on a 100Mbit connection, that's no
excuse).


> I keep a local news spool using leafnode. The COLA stats are produced
> from this local news spool.
> 
>> Hello Roy, are you out there?
> 
> I'm always here. :-)


Why not post more often?


> I just don't post much these days. There are much better Linux
> advocates to tear the trolls apart than me.
> 
>> Would you host stats of a NG for me?  I don't have the computer
>> resources I used to, being between companies.
> 
> How much are you going to pay me? :-)
> 
> Unless it's a newsgroup I subscribe to and am interested in, I doubt
> I would be interested.
> 
> To do what you want isn't difficult. Install a Linux box, run leafnode
> and use one of the stats programmes available.


Ramon, if you tell me which newsgroup, I'll happily subscribe and produce
stats. I already have the process cronned for 3 newsgroups (Palm, search
engines and SuSE), so adding another one would be trivial.

Best wishes,

Roy

-- 
Roy S. Schestowitz, Ph.D. Candidate (Medical Biophysics)
http://Schestowitz.com  |     GNU/Linux     ¦     PGP-Key: 0x74572E8E
  5:25am  up 8 days 12:22,  12 users,  load average: 0.35, 0.65, 0.68
      http://iuron.com - next generation of search paradigms
