Re: Postgres vs. DSpam - Mailing list pgsql-performance
From | Jim C. Nasby |
---|---|
Subject | Re: Postgres vs. DSpam |
Date | |
Msg-id | 20041129215056.GZ41545@decibel.org Whole thread Raw |
In response to | Re: Postgres vs. DSpam (Andrew McMillan <andrew@catalyst.net.nz>) |
List | pgsql-performance |
FWIW, those queries won't be able to use an index. A better WHERE clause would be: AND last_hit < CURRENT_DATE - 60 On Fri, Nov 26, 2004 at 02:37:12PM +1300, Andrew McMillan wrote: > On Wed, 2004-11-24 at 14:14 +0100, Evilio del Rio wrote: > > Hi, > > > > I have installed the dspam filter > > (http://www.nuclearelephant.com/projects/dspam) on our mail server > > (RedHat 7.3 Linux with sendmail 8.13 and procmail). I have ~300 users > > with a quite low traffic of 4000 messages/day. So it's a quite common > > platform/environment, nothing spectacular. > > I am using DSpam with PostgreSQL here. I have a daily job that cleans > the DSpam database up, as follows: > > DELETE FROM dspam_token_data > WHERE (innocent_hits*2) + spam_hits < 5 > AND CURRENT_DATE - last_hit > 60; > > DELETE FROM dspam_token_data > WHERE innocent_hits = 1 > AND CURRENT_DATE - last_hit > 30; > > DELETE FROM dspam_token_data > WHERE CURRENT_DATE - last_hit > 180; > > DELETE FROM dspam_signature_data > WHERE CURRENT_DATE - created_on > 14; > > VACUUM dspam_token_data; > > VACUUM dspam_signature_data; > > > > I also occasionally do a "VACUUM FULL ANALYZE;" on the database as well. > > > In all honesty though, I think that MySQL is better suited to DSpam than > PostgreSQL is. > > > > Please, could anyone explain me this difference? > > Is Postgres that bad? > > Is MySQL that good? > > Am I the only one to have observed this behavior? > > I believe that what DSpam does that is not well-catered for in the way > PostgreSQL operates, is that it does very frequent updates to rows in > (eventually) quite large tables. In PostgreSQL the UPDATE will result > internally in a new record being written, with the old record being > marked as deleted. That old record won't be re-used until after a > VACUUM has run, and this means that the on-disk tables will have a lot > of dead rows in them quite quickly. > > The reason that PostgreSQL operates this way, is a direct result of the > way transactional support is implemented, and it may well change in a > version or two. It's got better over the last few versions, with things > like pg_autovacuum, but that approach still doesn't suit some types of > database updating. > > Cheers, > Andrew. > ------------------------------------------------------------------------- > Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington > WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St > DDI: +64(4)803-2201 MOB: +64(272)DEBIAN OFFICE: +64(4)499-2267 > These PRESERVES should be FORCE-FED to PENTAGON OFFICIALS!! > ------------------------------------------------------------------------- > -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
pgsql-performance by date: