
COLA Stats 'Quality' Poster calculation

  • Subject: COLA Stats 'Quality' Poster calculation
  • From: "[H]omer" <spam@xxxxxxx>
  • Date: Fri, 24 Aug 2007 17:12:26 +0100
  • Bytes: 8453
  • Newsgroups: comp.os.linux.advocacy
  • Openpgp: id=BF436EC9; url=http://slated.org/files/GPG-KEY-SLATED.asc
  • Organization: Slated.org
  • User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.6) Gecko/20070811 Remi/2.0.0.6-1.fc6.remi Thunderbird/2.0.0.6 Mnenhy/0.7.5.666
  • Xref: ellandroad.demon.co.uk comp.os.linux.advocacy:553707
This is the relevant (computational) section of Roy Culley's ngstats.pl,
WRT 'Quality' Posters.

This is the last ... and I /mean/ the *very* last time I am *ever* going
to discuss this subject.

Any further comment on the matter will be completely ignored by me.
If anyone still doesn't believe in the integrity of these stats from now
on, that's their own problem. I simply don't give a rat's arse.

Although no explicit license was stated, copyright remains with Roy
Culley. The only modifications I have made were with explicit
permission, and affected only formatting (and the removal of references
to specific posters like Kadaicha Man, who is long gone). Further to
that, I use various wrappers (bash) for error checking and post
submission. None of that has any bearing on actual statistics
whatsoever. This is as much of the script as I will *ever* make public.
Don't like it? Tough:

####
sub QualityPosters {
    my($XP, $Q, $F, $T, $FullFrom, $LastXPostQ);
    my $Cnt = 1;
    foreach my $Poster (keys %FromName) {
        next if $FromName{$Poster} < 6;
        $XP = int((($FromName{$Poster} - $XPost{$Poster}) /
                      $FromName{$Poster}) * 100);
        if ($QuoteFrom{$Poster}) {
            $Q = 100 - $QuoteFrom{$Poster};
        }
        else {
            $Q = 0;
        }
        if ($DirReply{$Poster} > $FromName{$Poster}) {
            $F = $DirReply{$Poster} - $FromName{$Poster} + 50;
        }
        else {
            $F = $FromName{$Poster} - $DirReply{$Poster} ;
        }
        if (exists $TrollCntFrom{$Poster}) {
            $T = (100 - (2.0 * $TrollCntFrom{$Poster}));
        }
        else {
            $T = 100;
        }

        $XPostQ{$Poster} = int(($XP + $Q + $F + $T) / 4);
        $XPostQ{$Poster} -= 75 if exists $FromTroll{$Poster};
    }


IOW, just like it reads in the weekly posting:

"The poster 'quality' stats is based on:
       a) quoting: 100 - %'age quoted
       b) cross-posting: 100 - %'age cross-posted
       c) number of direct followups posters articles get
       d) troll feeding: 100 - 2.0 * %'age troll followups
       e) 75 deducted for known trolls"
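
As a worked example of the arithmetic above, here is a minimal,
self-contained sketch that applies the same formula to one poster's
counters. The helper name quality_score, its named arguments, and the
sample numbers are all hypothetical and are not part of ngstats.pl; only
the arithmetic is taken from the excerpt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in ngstats.pl): same arithmetic as the loop
# body of QualityPosters, applied to one poster's counters.
#   posts       total articles posted
#   xposts      articles that were cross-posted
#   quoted      percentage of quoted text (0 or missing gives Q = 0)
#   replies     direct followups received
#   troll_pct   percentage of troll followups (undef if none recorded)
#   known_troll true if the poster is on the trolls list
sub quality_score {
    my (%c) = @_;
    return if $c{posts} < 6;    # fewer than 6 posts are skipped

    my $xp = int((($c{posts} - $c{xposts}) / $c{posts}) * 100);
    my $q  = $c{quoted} ? 100 - $c{quoted} : 0;
    my $f  = $c{replies} > $c{posts}
           ? $c{replies} - $c{posts} + 50
           : $c{posts} - $c{replies};
    my $t  = defined $c{troll_pct} ? 100 - 2.0 * $c{troll_pct} : 100;

    my $score = int(($xp + $q + $f + $t) / 4);
    $score -= 75 if $c{known_troll};
    return $score;
}

# Made-up sample counters, for illustration only.
my %sample = (posts => 40, xposts => 10, quoted => 30,
              replies => 60, troll_pct => 5);

my $plain = quality_score(%sample);                    # XP=75 Q=70 F=70 T=90
my $troll = quality_score(%sample, known_troll => 1);  # same, minus 75

print "$plain\n";   # prints 76
print "$troll\n";   # prints 1
```

With these sample numbers the four components are 75, 70, 70 and 90,
which average to 76.25 and truncate to 76; the same poster on the trolls
list would score 1.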

Everything else in the script is merely about counting total posts, and
formatting the output. There is no "conspiracy". What you see is what
you get.

There are very infrequent errors (mostly caused by power outages beyond
my control). E.g. DFS noted that too many posts had been attributed to
him in the course of one particular week (timestamps were messed up).
The fact that *too many* posts were attributed to someone I classify as
a Troll should be some indication to the cynics that anomalies (if any)
are *not* being deliberately introduced.

I do now have a UPS in place, although the quality of my energy
supplier's service in this area is so poor that it occasionally takes
them longer to restore service than the UPS charge lasts. As a result, I
have chained several UPSes together, which should provide a full 24
hours of battery life for that server at full load (yes, it did once
take my energy supplier *that* long to restore service). Anomalies with
the Stats should now be very rare.

For the purposes of these stats, the "known Trolls" are as follows:

(ACDC|amicus_curious):::amicus_curious - Troll
Alexander Terekhov:::Alexander Terekhov - Facist Troll
Aunty Diluvian:::flatfish [nyms] - Mental-Ward Escapee
Bill Gates:::Oxford - Mac Looney
cc:::cc - Shifty MS Apologist
Derk den Klotsoksel:::Derk den Klotsoksel - Troll
(dfs|DFS):::DooFy - Money-Obsessed Goon
Dom:::Dom - Idiot
Erik Funkenbusch:::Ewik FUDenbusch
Ian Semmel:::Ian Semmel - WinTroll / Idiot
i@xxxxxxxxxx:::"Scottish" Mike - Troll Apprentice
JD Cantafio:::JD Cantafio - Troll
Jeff_Relf:::Jeff_Relf - .Net k00k
Joerg Schilling:::Joerg Schilling - 'Superior' Software Developer
Larry Qualig:::Larry 'Msg-ID Larry' Qualig - Vile MS Shill
Lintard:::Lintard - Moronic Goon
linux-sux:::Scott [Nudds] Douglas [nyms] - Looney Troll
Meat Plow:::flatfish [nyms] - Mental-Ward Escapee
Mike Cox:::Idiotic Attention Seeker
OK:::OK - MS Apologist and Liar
Oliver Wong:::Oliver Wrong - MS Apologist
(oxford|Oxford):::Oxford - Mac Looney
Sandman:::Sandman - Mac Troll
snit:::snit - Ridiculous MAC Troll
Tim Smith:::Timmy 'Funkenbusch Wannabee' Smith
Toad:::Toad - Troll
waterskidoo:::waterskidoo - Troll

This is by no means the full list, but merely some of the more infamous
or recent entries. The full list is far too long (goes back years) to
post here. In particular, the number of flatfish nyms would require
several pages just by itself.
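
The entries above follow a "pattern:::label" layout. How ngstats.pl
actually reads them is not shown in the excerpt, so the following is
only a sketch under assumptions: that the left-hand side is a Perl
regular expression matched (unanchored) against the poster's name, and
that the right-hand side is the label shown in the stats. The function
names parse_troll_list and troll_label are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical parser: split each "pattern:::label" line into a
# compiled regex and its label. Blank or malformed lines are skipped.
sub parse_troll_list {
    my (@lines) = @_;
    my @rules;
    for my $line (@lines) {
        next unless $line =~ /\S/;
        my ($pattern, $label) = split /:::/, $line, 2;
        next unless defined $label;
        push @rules, { re => qr/$pattern/, label => $label };
    }
    return @rules;
}

# Return the label of the first rule whose pattern matches the name,
# or nothing if no rule matches. Unanchored matching is an assumption.
sub troll_label {
    my ($from_name, @rules) = @_;
    for my $rule (@rules) {
        return $rule->{label} if $from_name =~ $rule->{re};
    }
    return;
}

# Two entries copied from the list above.
my @rules = parse_troll_list(
    '(ACDC|amicus_curious):::amicus_curious - Troll',
    'Toad:::Toad - Troll',
);

for my $name ('amicus_curious', 'Toad', 'Some Other Poster') {
    my $label = troll_label($name, @rules);
    printf "%-18s => %s\n", $name, defined $label ? $label : '(not listed)';
}
```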

Certain posters are dropped completely and simply not counted at all,
due to the extremity of their behaviour (leafnode filter):

^From:.*(?i)Alexander Terekhov
^From:.*(?i)raylopez99
^From:.*(?i)Capt.*Morgan
^From:.*(?i)Dr(\.|\ )
^From:.*(?i)Mr(\.|\ )
^Newsgroups:.*24hoursupport.helpdesk.*
^Newsgroups:.*alt\.idiots.*
^Newsgroups:.*alt\.flame.*
^Newsgroups:.*alt\.politics.*
^Newsgroups:.*comp\.sys\.mac\.advocacy.*

The above is the *complete* filter used by COLA Stats. Anything else
that you think /should/ have gotten through but didn't is an issue with
SuperNews (my uplink), not the script.
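
To illustrate what those patterns catch, here is a small sketch that
runs the same ten expressions over some article headers in Perl. This
is not how leafnode itself is implemented; the assumption here is
simply that each pattern is tried against each header line and any
match drops the article. The function name is_dropped and the sample
headers are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The ten patterns from the filter list, copied verbatim.
my @filters = (
    '^From:.*(?i)Alexander Terekhov',
    '^From:.*(?i)raylopez99',
    '^From:.*(?i)Capt.*Morgan',
    '^From:.*(?i)Dr(\.|\ )',
    '^From:.*(?i)Mr(\.|\ )',
    '^Newsgroups:.*24hoursupport.helpdesk.*',
    '^Newsgroups:.*alt\.idiots.*',
    '^Newsgroups:.*alt\.flame.*',
    '^Newsgroups:.*alt\.politics.*',
    '^Newsgroups:.*comp\.sys\.mac\.advocacy.*',
);
my @compiled = map { qr/$_/ } @filters;

# Hypothetical helper: true if any header line matches any pattern.
sub is_dropped {
    my (@headers) = @_;
    for my $header (@headers) {
        for my $re (@compiled) {
            return 1 if $header =~ $re;
        }
    }
    return 0;
}

# Made-up sample articles, one array of header lines each.
my @articles = (
    [ 'From: Example Poster <poster@example.invalid>',
      'Newsgroups: comp.os.linux.advocacy' ],
    [ 'From: Example Poster <poster@example.invalid>',
      'Newsgroups: comp.os.linux.advocacy,alt.flame' ],
    [ 'From: mr. nobody <nobody@example.invalid>',
      'Newsgroups: comp.os.linux.advocacy' ],
);

for my $article (@articles) {
    printf "%-8s %s\n", is_dropped(@$article) ? 'dropped' : 'kept',
        join(' | ', @$article);
}
```

Note that the (?i) sits after "^From:.*", so only the part of each From
pattern that follows it is matched case-insensitively; the Newsgroups
patterns are case-sensitive.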

Placement in the killfile or Trolls list is not necessarily permanent
(e.g. this week sees the return of Hadron from the bitbucket, and Bailo
is out of the Trolls list).

There is *zero* manual intervention with this automated process. No
post-processing is done by me at all. The only time that would ever
happen would be if the script fscked up and sent an empty/corrupted
post (did happen once - my fault). If the text file
"advocate-of-the-week.txt" is present, it is automatically picked up by
a wrapper for the COLA Stats script; if not, that section is left
blank. Again, this requires no manual intervention with the actual stats
process (once a week, I pick out a good post and save it to that text
file, although I haven't done that recently).

Final comments:

COLA is an unmoderated group. Neither I nor anyone else "owns" COLA. The
weekly Stats post is merely a voluntary contribution that I do not have
any exclusive rights to, which I make for the sake of interest and fun only.

The script used was *not* written by me and is not "owned" by me; I
merely use it. Refusing to *release* the source to software that is
*not* publicly distributed is *not* in contradiction to the principles
of Free Software. *If* it *is* distributed, then it should be licensed,
and if it is licensed, then I'd prefer software to be licensed under the
GPL. This does *not* mean that *everyone* is *obligated* to release
*every* piece of software they ever write for their own purposes ... or
any other private works. Those who cry "hypocrisy" (including some who
purport to be "advocates") need to shut their fat gobs and engage their
braincells ... assuming they have that capacity.

As for the weekly post itself ... don't like it? Don't read it. But quit
bitching and get a life. Better yet, just stop Trolling. If you don't
like GNU/Linux, you don't belong here - so piss off. Just my opinion.

-- 
K.
http://slated.org

.----
| "Proprietary licenses, the crack cocaine of software finance."
|  - Matt Asay, CNET
`----

Fedora release 7 (Moonshine) on sky, running kernel 2.6.22.1-41.fc7
 17:11:05 up 15 days, 16:06,  3 users,  load average: 0.25, 0.19, 0.20
