
Re: Blog Spam: Banned word/ string CSV

  • Subject: Re: Blog Spam: Banned word/ string CSV
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Sun, 19 Mar 2006 06:59:55 +0000
  • Newsgroups: alt.www.webmaster
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <Xns978AC18697C08karlkarlcorecom@216.196.97.136> <1fdp12tg38l9jkkn9rlhbhmka22pgrcs0o@4ax.com>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ hug ] on Sunday 19 March 2006 01:41 \__

> Karl Groves <karl@xxxxxxxxxxxxxxxxxx> wrote:
> 
>>Anyone know where I can get a CSV or SQL dump of banned strings to fight
>>blog spam?
>>
>>I'm looking for something to validate against to eliminate blog/ guestbook
>>spamming.
> 
> Good luck, but I don't think it'll get you where you want to go, you
> really need 0Em s0ftware shipped cheep worldwide, a 10w-interest h0me
> loan, and <g> the latest erection-pack -- I mean, how many successful
> antispam programs operate via a banned string list, vs how many
> operate via a whitelist?  I assume you have some kind of turing test
> in place so the shitbags at least have to do the work by hand?

Hi Karl,

Have a look at Akismet, which uses a repository of IPs, sites and words to
distinguish between ham and spam comments. The filter currently draws on
input (training data) from many blogs, and it has hooks for many
programming languages, which build on its open APIs.

http://akismet.com/

Also, for what it's worth, here is my list of blacklist terms. I accumulated
them over the past year and a half, whenever I got flooded:

poker
shoes
ambien
diamond
pills
drugs
metformin
rakeback
rake
baccarat
soma
craps
protonix
tramadol
slots
enlargement
amitriptyline 
Lexapro
smoking
diet
gambling
fioricet
wsop
effexor
adipex
backgammon
supplements
hoodia
viagra
levitra
propecia
loan
credit
casino
pharmacy
roulette
Prozac
Cialis
phentermine
free
sex
xxx
texas
porn
blackjack

Expect some false positives, so be sure to queue matching comments for
moderation rather than discarding them immediately. Also add a notice for
commenters explaining that comments may be held for moderation.
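
To illustrate the idea, here is a minimal Python sketch of checking a comment
against such a list and holding matches for moderation instead of dropping
them. The names BLACKLIST, build_pattern and triage are made up for this
example, the term list is only a small subset of the one above, and matching
on whole words is my assumption about how you would want to apply it.

```python
import re

# Small illustrative subset of the blacklist above; load the full list as needed.
BLACKLIST = ["poker", "ambien", "tramadol", "viagra", "casino", "free", "sex", "texas"]


def build_pattern(terms):
    """Compile one case-insensitive, whole-word regex from the term list."""
    # Normalise and de-duplicate, since hand-kept lists tend to repeat entries.
    unique = sorted({t.strip().lower() for t in terms if t.strip()})
    alternation = "|".join(re.escape(t) for t in unique)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


PATTERN = build_pattern(BLACKLIST)


def triage(comment):
    """Return ("moderate", matched_terms) or ("publish", [])."""
    hits = sorted({m.group(0).lower() for m in PATTERN.finditer(comment)})
    status = "moderate" if hits else "publish"
    return status, hits
```

Whole-word matching keeps a comment mentioning "Essex" or "freedom" from
tripping on "sex" or "free", but an honest comment about Texas is still held,
which is exactly why matches go to a queue rather than straight to the bin.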

Best wishes,

Roy

-- 
Roy S. Schestowitz      |    while (sig==sig) sig=!sig;
http://Schestowitz.com  |    SuSE Linux     ¦     PGP-Key: 0x74572E8E
  6:50am  up 10 days 23:27,  10 users,  load average: 0.61, 0.67, 0.72
      http://iuron.com - Open Source knowledge engine project
