Re: [OT] Tom's/Marc's spam filters? - Mailing list pgsql-general

From Joe Conway
Subject Re: [OT] Tom's/Marc's spam filters?
Msg-id 4086D49B.3060403@joeconway.com
In response to Re: [OT] Tom's/Marc's spam filters?  ("Marc G. Fournier" <scrappy@postgresql.org>)
List pgsql-general
Marc G. Fournier wrote:
> On Wed, 21 Apr 2004, Joe Conway wrote:
>>   /usr/bin/sa-learn --mbox --spam /path/to/false-neg.mbox
>>
>>Now I just drop all false negatives into that mailbox, and clean them
>>out periodically. Hopefully that will make a significant improvement.
>
> This, for me, has made the big difference, since the false-negatives don't
> get autolearned :(

Actually, even much of what does (correctly) get marked as spam ends up
with autolearn=no, because SpamAssassin seems somewhat conservative
about autolearning. I just sent this off-list to Michael Chaney:
---------------------------------------------------------------------

I've noticed that the threshold for autolearn seems too high, i.e. a
high proportion of email correctly marked as spam has autolearn=no.
Here's an example:

X-Spam-Status: Yes, hits=3.7 required=2.5
       tests=BAYES_44,HTML_FONT_INVISIBLE,HTML_IMAGE_ONLY_04,
       HTML_MESSAGE,MIME_HTML_NO_CHARSET,MIME_HTML_ONLY,
       MIME_HTML_ONLY_MULTI autolearn=no version=2.63

Now in /etc/mail/spamassassin/local.cf I have this setting:

   # Enable Bayes auto-learning
   auto_learn              1
   bayes_auto_learn_threshold_spam    6

From the SA docs, I get the impression that autolearn cannot be made
more aggressive.

So in order to counteract that, I just made an additional change -- I
put in a mail filter rule that automatically forwards any mail marked as
spam, but with autolearn=no, to false-neg.mbox. This should help too, I
think.
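
For anyone who wants the same effect without a mail-client rule, here is
a rough sketch of that filter in Python. It is only an illustration, not
the rule I actually use: the function names are made up, and it assumes
the X-Spam-Status header looks like the example above (a leading
"Yes"/"No" and an autolearn=... token). It appends matching messages to
an mbox file, which can then be fed to sa-learn as shown earlier.

```python
import email
import mailbox
import re
from email.message import Message


def needs_manual_learning(msg: Message) -> bool:
    """True if SpamAssassin flagged the message as spam but did not autolearn it.

    Hypothetical helper; assumes a header of the form
    'X-Spam-Status: Yes, hits=... tests=... autolearn=no version=...'.
    """
    status = msg.get("X-Spam-Status", "")
    # Collapse folded continuation lines into a single line.
    status = " ".join(status.split())
    is_spam = status.lower().startswith("yes")
    match = re.search(r"\bautolearn=(\S+)", status)
    autolearn = match.group(1).lower() if match else None
    return is_spam and autolearn == "no"


def file_for_learning(msg: Message, mbox_path: str) -> bool:
    """Append msg to the mbox at mbox_path if it needs manual learning.

    Returns True when the message was appended, False otherwise.
    """
    if not needs_manual_learning(msg):
        return False
    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    box.lock()
    try:
        box.add(msg)
        box.flush()
    finally:
        box.close()  # also releases the lock
    return True
```

Something like file_for_learning(email.message_from_string(raw), path)
could be called from a delivery script, with path pointing at the same
false-neg.mbox that the periodic sa-learn run reads.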

Joe

