Re: Performance problems testing with Spamassassin 3.1.0 - Mailing list pgsql-performance

From Matthew Schumacher
Subject Re: Performance problems testing with Spamassassin 3.1.0
Date
Msg-id 42E9749F.6000709@aptalaska.net
In response to Re: Performance problems testing with Spamassassin 3.1.0  (Karim Nassar <karim.nassar@nau.edu>)
Responses Re: Performance problems testing with Spamassassin 3.1.0  (Gavin Sherry <swm@alcove.com.au>)
Re: Performance problems testing with Spamassassin 3.1.0  (Andrew McMillan <andrew@catalyst.net.nz>)
List pgsql-performance
Karim Nassar wrote:
> On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote:
>
>
>>I put the rest of the schema up at
>>http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone
>>needs to see it too.
>
>
> Do you have sample data too?
>

Ok, I finally got some test data together so that others can test
without installing SA.

The schema and test dataset are over at
http://www.aptalaska.net/~matt.s/bayes/bayesBenchmark.tar.gz

I have a pretty fast machine with a tuned postgres and it takes about
2 minutes 30 seconds to load the test data.  Since the test data is the
bayes information on 616 spam messages, that comes out to about 250ms
per message.  While that is doable, it does add quite a bit of overhead
to the email system.
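
For anyone checking the arithmetic, here is a minimal sketch of the
per-message figure.  The function name per_message_ms is just an
illustrative helper, not part of SA or the benchmark; the inputs are the
numbers quoted above.  The exact average works out to roughly 243.5 ms,
which rounds to the 250ms quoted.

```python
def per_message_ms(total_seconds: float, messages: int) -> float:
    """Average load time per message, in milliseconds."""
    return total_seconds * 1000.0 / messages


# Figures from the post: 2 minutes 30 seconds for 616 spam messages.
total_seconds = 2 * 60 + 30
messages = 616

print(f"{per_message_ms(total_seconds, messages):.1f} ms per message")
```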

Perhaps this is as fast as I can expect it to go.  If that's the case, I
may have to look at mysql, but I really don't want to do that.

I will be working on some other benchmarks, and reading through exactly
how bayes works, but at least there is some data to play with.

schu
