Re: Very slow select - Mailing list pgsql-general

From Tom Lane
Subject Re: Very slow select
Date
Msg-id 11978.977499339@sss.pgh.pa.us
Whole thread Raw
In response to Re: Very slow select  (Stefano Bargioni <bargioni@usc.urbe.it>)
List pgsql-general
Stefano Bargioni <bargioni@usc.urbe.it> writes:
>> What does EXPLAIN say about the two queries?

> ucc2=> explain select count(*) from document where xtitle='a';
> NOTICE:  QUERY PLAN:

> Aggregate  (cost=70.61 rows=759 width=4)
>  -> Index Scan using xtitle_index on document  (cost=70.61 rows=759 width=4)

> EXPLAIN
> ucc2=> explain select count(*) from document where xothertitles='a';
> NOTICE:  QUERY PLAN:

> Aggregate  (cost=26005.44 rows=287195 width=4)
>   -> Seq Scan on document  (cost=26005.44 rows=287195 width=4)

The reason for the difference in plan is the difference in estimated
number of rows retrieved.  You probably have some extremely common
value in xothertitles, which is skewing the statistics that the planner
uses for this estimate.

Sometimes, what's skewing the stats is a value that's actually just a
placeholder for "unknown", such as an empty string or 'UNKNOWN' or
something like that.  If possible, I recommend coding your application
to use NULL for this purpose, rather than any ordinary data value.
The stats account for NULLs separately, so the system won't be fooled
by having a lot of them.

            regards, tom lane

pgsql-general by date:

Previous
From: "Adam Lang"
Date:
Subject: Re: Re: [HACKERS] Merry X-Mass
Next
From: Tom Lane
Date:
Subject: Re: allowing users access to a trusted C function