Re: Performance problems testing with Spamassassin 3.1.0 - Mailing list pgsql-performance

From Gavin Sherry
Subject Re: Performance problems testing with Spamassassin 3.1.0
Msg-id Pine.LNX.4.58.0507291550180.11044@linuxworld.com.au
In response to Re: Performance problems testing with Spamassassin 3.1.0  (Matthew Schumacher <matt.s@aptalaska.net>)
List pgsql-performance
On Thu, 28 Jul 2005, Matthew Schumacher wrote:

> Gavin Sherry wrote:
>
> >
> > I had a look at your data -- thanks.
> >
> > I have a question though: put_token() is invoked 120596 times in your
> > benchmark... for 616 messages. That's nearly 200 queries (not even
> > counting the 1-8 (??) inside the function itself) per message. Something
> > doesn't seem right there....
> >
> > Gavin
>
> I am pretty sure that's right because it is doing word statistics on
> email messages.
>
> I need to spend some time studying the code; I just haven't found time yet.
>
> Would it be safe to say that there aren't any glaring performance
> penalties other than the sheer volume of queries?

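Well, first, everything relating to one message should be issued in a single
transaction block, so that the roughly 200 put_token() calls share one commit
instead of each committing on its own. Secondly, the initial select may be
unnecessary -- I haven't looked at the logic that closely.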

There is, potentially, some parser overhead. In C, you could get around
this with PQprepare() et al.

It would also be interesting to look at the cost of a C function.

Gavin
