Re: SQL command speed - Mailing list pgsql-sql

From Tom Lane
Subject Re: SQL command speed
Date
Msg-id 9901.958772341@sss.pgh.pa.us
In response to Re: SQL command speed  (Kate Collins <klcollins@wsicorp.com>)
List pgsql-sql
Kate Collins <klcollins@wsicorp.com> writes:
> You are correct in your assessment of the nature of the distribution
> of the elements in the item_a column.  Some values return none or a
> few rows.  While other values return many rows (100 or more).  It is
> not an even distribution.  I had never really considered what effect
> this would have on searching before.

Apparently you have at least one value with close to 2000 rows (IIRC,
the rows estimate is basically half the row count of the most common
value, or MCV, except in the special case where the planner can see
that you are comparing against the MCV itself).

If that's a real data value and not a dummy, then of course you can't
replace it by NULL.  In that case I'm kind of stuck for a good answer
for 7.0; we can't do much to distinguish the few-distinct-values case
from the many-distinct-values-with-some-frequency-outliers case until
we have better statistics than 7.0 keeps.  What you could do as a
stopgap is to force the system to use an indexscan against its
judgment, by doing this before issuing the problem query:

    SET enable_seqscan = OFF;

Naturally I recommend keeping this ON most of the time, but sometimes
you just have to get out the blunt instruments ;-)
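
A minimal sketch of how that stopgap might look in a session, so that
the override only covers the problem query. The table name my_table and
the literal 'some_value' are made-up placeholders; only the item_a
column comes from this thread:

```sql
-- Hypothetical table and value; substitute the real ones.
-- Discourage sequential scans for the statements that follow.
SET enable_seqscan = OFF;

-- Check which plan the planner now picks before running the query.
EXPLAIN SELECT * FROM my_table WHERE item_a = 'some_value';

-- Run the problem query itself.
SELECT * FROM my_table WHERE item_a = 'some_value';

-- Turn sequential scans back on so later queries are planned normally.
SET enable_seqscan = ON;
```

The setting applies only to the current session, so it would have to be
issued on each connection that runs the problem query.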
        regards, tom lane

