Re: CTE query plan ignores selective index

From: Tom Lane
Subject: Re: CTE query plan ignores selective index
Date: ,
Msg-id: 1727.1417558023@sss.pgh.pa.us
(view: Whole thread, Raw)
In response to: CTE query plan ignores selective index  (Patrick Krecker)
List: pgsql-performance

Patrick Krecker <> writes:
>            ->  Nested Loop  (cost=0.57..6148400.28 rows=1 width=24)
> (actual time=0.063..4.054 rows=88 loops=9)
>                  ->  WorkTable Scan on path path_1  (cost=0.00..33.40
> rows=1670 width=16) (actual time=0.000..0.006 rows=112 loops=9)
>                  ->  Index Scan using component_document_id on
> component t_1  (cost=0.57..3681.65 rows=1 width=16) (actual
> time=0.023..0.036 rows=1 loops=1007)
>                        Index Cond: (document_id = path_1.document_id)
>                        Filter: (path_1.parent_id = internal_id)
>                        Rows Removed by Filter: 237

> I would think that it has decided that the document_id index is not
> very selective for the given mix of rows, however I checked the
> statistics for the table and I found that n_distinct for document_id
> is 101559 (the true value is 162545). The value of pg_class.reltuples
> for the table is 96055600, which is very close to the true value
> 94613537.

> In the first query, it appears to me that postgres thinks the index
> scan is much more expensive than it really is. However, given the
> accurate statistics, I can't see how.

I think the problem is that it doesn't have any stats for the output of
path_1, so it's probably falling back on some rather generic assumptions
about how many component rows will match each of the two join conditions.
That causes it to think that the indexscan will reject a lot of rows at
the filter step and therefore be expensive.  Possibly that could be
improved, but it won't happen overnight.

The most expeditious way to fix this would likely be to provide an
index on component(document_id, internal_id).  The planner should
then think an indexscan on that is cheap, regardless of whether the
check on internal_id is really doing much of anything.

            regards, tom lane



pgsql-performance by date:

From: Tom Lane
Date:
Subject: Re: CTE query plan ignores selective index
From: Josh Berkus
Date:
Subject: Re: issue in postgresql 9.1.3 in using arrow key in Solaris platform