Re: Postgres vs. MySQL - Mailing list pgsql-performance

From Steinar H. Gunderson
Subject Re: Postgres vs. MySQL
Date
Msg-id 20041125011823.GA4640@uio.no
Whole thread Raw
In response to Postgres vs. MySQL  (Evilio del Rio <edelrio@cmima.csic.es>)
List pgsql-performance
On Wed, Nov 24, 2004 at 02:14:18PM +0100, Evilio del Rio wrote:
> I have installed the dspam filter
> (http://www.nuclearelephant.com/projects/dspam) on our mail server
> (RedHat 7.3 Linux with sendmail 8.13 and procmail). I have ~300 users
> with a quite low traffic of 4000 messages/day. So it's a quite common
> platform/environment, nothing spectacular.

We just had a case just like this on #postgresql. The (somewhat surprising)
solution was increasing the statistics target on the "token" column to
something like 200, which makes the planner choose an index scan instead of a
sequential scan.

For the people who did not follow the case: The culprit is a query like

  SELECT * FROM table WHERE token IN ('346369873476346', '4376376034', ...)

(token is a numeric(20,0)) With one entry in the IN (), the cost of an index
scan was estimated to 4.77; with ten entries, it was about 48, but with 574
entries the estimated cost was 513565 (!!), making the planner prefer an
index scan to 574 consecutive index scans. Upping the statistics target made
the planner estimate the cost to about ~4000, and thus select the index scan,
which was two orders of magnitude faster.

BTW, this case was with PostgreSQL 7.4.6, not 7.3 as the poster here is
reporting.

/* Steinar */
--
Homepage: http://www.sesse.net/

pgsql-performance by date:

Previous
From: Mike Mascari
Date:
Subject: Re: Slow execution time when querying view with WHERE clause
Next
From: "Dave Page"
Date:
Subject: Re: [pgsql-hackers-win32] scalability issues on win32