Re: Stopping link spam on the lists - Mailing list pgsql-www
From | Stefan Kaltenbrunner |
---|---|
Subject | Re: Stopping link spam on the lists |
Date | |
Msg-id | 4F85D024.6020509@kaltenbrunner.cc |
In response to | Re: Stopping link spam on the lists (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Stopping link spam on the lists
|
List | pgsql-www |
On 04/08/2012 05:14 AM, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> The remaining question, in my mind, is: is there a way to reliably
>> detect that link spam is just link spam and reject it altogether in
>> Spamassassin? If that's the case, then we could do it at that level and
>> save the work downstream. This is something that Stefan would have to
>> answer.
>
> FWIW, all the examples I have seen recently bore all of these traits:
>
> * empty subject line (other than the [LISTNAME] prefix attached by our
>   own forwarding code)
> * no content to speak of except the payload link
> * To: addressed to multiple unrelated addresses

Well, in principle there is no reason why we cannot give more weight to mails matching that description in our inbound mail system. That would probably push them, in a relatively selective way, over the current hard inbound-reject threshold (which is fairly conservative at the moment, given that we are still fine-tuning the "new" system).

>
> I'm not sure how much the last point helps, unfortunately, because a
> heck of a lot of what passes through our lists has multiple To:, and
> I doubt it's practical for the spam filter to test how many of the
> target addresses are people subscribed to the lists. The empty subject
> would be easy to test for, but surely the spammers will figure out
> not to do that soon.
>
> Anyway, what I've been seeing lately has all had X-pg-spam-score 3.5 or
> more, which is what made me suggest that moderating on that basis would
> improve matters.

Any chance you can provide us with some pointers to these kinds of mails? I don't really have the bandwidth to follow that many lists, and I don't think I have seen one come by on the lists I actually read regularly...

One important point to note is that only ~2% of our rejects are actually based on heavyweight content filtering (SA and ClamAV); the remaining 98% are dealt with much earlier in the pipeline, using much lighter-weight checks.

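For illustration only, the three traits Tom lists could be turned into extra score roughly as follows. This is a minimal sketch, not how the inbound system or SpamAssassin is actually configured: the function names, the weights, the 20-character and 3-recipient cutoffs, and the `[LISTNAME]` prefix pattern are all assumptions made up for the example.

```python
import re
from email import message_from_string
from email.utils import getaddresses

# Hypothetical weights; the thread does not propose any concrete values.
WEIGHTS = {"empty_subject": 1.5, "link_only_body": 1.5, "many_recipients": 0.5}

# One or more leading "[LISTNAME]" tags, as added by the list forwarding code.
LIST_PREFIX = re.compile(r"^\s*(\[[^\]]+\]\s*)+")
URL = re.compile(r"https?://\S+")


def link_spam_traits(raw_message: str) -> dict:
    """Report which of the three traits a plain-text message shows."""
    msg = message_from_string(raw_message)

    # Trait 1: nothing left in the subject once the list prefix is removed.
    subject = LIST_PREFIX.sub("", msg.get("Subject", "")).strip()

    # Trait 2: a link is present and almost no other text (assumed cutoff: 20 chars).
    payload = msg.get_payload()
    body = payload if isinstance(payload, str) else ""  # multipart is ignored here
    text_without_links = URL.sub("", body).strip()

    # Trait 3: several To: addresses (assumed cutoff: 3). Whether they are
    # "unrelated" cannot be decided from the headers alone.
    recipients = getaddresses(msg.get_all("To", []))

    return {
        "empty_subject": subject == "",
        "link_only_body": bool(URL.search(body)) and len(text_without_links) < 20,
        "many_recipients": len(recipients) >= 3,
    }


def extra_score(traits: dict) -> float:
    """Sum the hypothetical weights of all traits that matched."""
    return sum(WEIGHTS[name] for name, hit in traits.items() if hit)
```

As Tom points out, the recipient-count trait alone matches a lot of legitimate list traffic, which is why it gets the smallest weight in this sketch.
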
FWIW, we actually passed approximately 10000 mails (excluding traffic we get back from hub.org as bounces) on to the actual listserver on April 10th. Out of those, a total of 140 mails would have exceeded an X-Pg-Spam-Score of 3.5 (across all lists). I have no idea whether making those "moderated by default" would put an enormous additional burden on the moderators, given that I have no idea what kind of mails need to be dealt with on a typical day.

Stefan
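To put the numbers above in proportion: 140 out of roughly 10000 mails is about 1.4% of one day's traffic. The sketch below shows how such a count might be taken from stored messages. It assumes the `X-Pg-Spam-Score` header carries a plain decimal number, which this thread does not confirm, and all function names are hypothetical. It treats "3.5 or more" (Tom's wording) as the cutoff.

```python
from email import message_from_string
from typing import Optional

THRESHOLD = 3.5  # the score discussed in this thread


def spam_score(raw_message: str) -> Optional[float]:
    """Return the X-Pg-Spam-Score of a message, or None if absent or unparsable."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Pg-Spam-Score")  # header lookup is case-insensitive
    if value is None:
        return None
    try:
        # Assumption: the header value starts with a plain decimal number.
        return float(value.strip().split()[0])
    except (ValueError, IndexError):
        return None


def would_be_moderated(raw_message: str, threshold: float = THRESHOLD) -> bool:
    """True if the message scores at or above the threshold."""
    score = spam_score(raw_message)
    return score is not None and score >= threshold


def moderated_share(moderated: int, total: int) -> float:
    """Percentage of the day's mails that would land in the moderation queue."""
    return 100.0 * moderated / total


# Figures quoted in the mail above: 140 of ~10000 mails.
share = moderated_share(140, 10000)
```
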