Thread: CTE query plan ignores selective index

CTE query plan ignores selective index

From
Patrick Krecker
Date:
Hi all -- I am trying to do a better job of understanding why the
planner chooses some plans over others, and I ran into this issue this
morning where the planner ends up choosing a query that's 15000x
slower. I have a table that represents nodes (here called
"components") in a tree. Each node has a parent_id; the root node has
a NULL parent_id. I wanted to find the route from each node to its
root. Here is my query:

# explain analyze  WITH RECURSIVE path(start, id, internal_id,
parent_id, document_id, depth) AS (
        SELECT t.id, t.id, t.internal_id, t.parent_id, t.document_id, 1
        FROM component t
        WHERE id < 6361197
    UNION ALL
        SELECT path.start, t.id, t.internal_id, t.parent_id,
t.document_id, path.depth+1
        FROM component t, path
        WHERE t.internal_id = path.parent_id AND t.document_id=path.document_id
)
SELECT * FROM path ;

          QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on path  (cost=61484650.85..61484654.39 rows=177 width=24)
(actual time=0.007..36.958 rows=1007 loops=1)
   CTE path
     ->  Recursive Union  (cost=0.57..61484650.85 rows=177 width=24)
(actual time=0.007..36.755 rows=1007 loops=1)
           ->  Index Scan using component_pkey on component t
(cost=0.57..644.56 rows=167 width=16) (actual time=0.006..0.076
rows=218 loops=1)
                 Index Cond: (id < 6361197)
           ->  Nested Loop  (cost=0.57..6148400.28 rows=1 width=24)
(actual time=0.063..4.054 rows=88 loops=9)
                 ->  WorkTable Scan on path path_1  (cost=0.00..33.40
rows=1670 width=16) (actual time=0.000..0.006 rows=112 loops=9)
                 ->  Index Scan using component_document_id on
component t_1  (cost=0.57..3681.65 rows=1 width=16) (actual
time=0.023..0.036 rows=1 loops=1007)
                       Index Cond: (document_id = path_1.document_id)
                       Filter: (path_1.parent_id = internal_id)
                       Rows Removed by Filter: 237
 Total runtime: 37.039 ms


However, when I add one more row to the seed query of the CTE, it
changes the plan to this:

# explain analyze  WITH RECURSIVE path(start, id, internal_id,
parent_id, document_id, depth) AS (
        SELECT t.id, t.id, t.internal_id, t.parent_id, t.document_id, 1
        FROM component t
        WHERE id < 6361198
    UNION ALL
        SELECT path.start, t.id, t.internal_id, t.parent_id,
t.document_id, path.depth+1
        FROM component t, path
        WHERE t.internal_id = path.parent_id AND t.document_id=path.document_id
)
SELECT * FROM path ;

     QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on path  (cost=61122640.05..61122643.61 rows=178 width=24)
(actual time=0.008..587814.729 rows=1008 loops=1)
   CTE path
     ->  Recursive Union  (cost=0.57..61122640.05 rows=178 width=24)
(actual time=0.006..587814.346 rows=1008 loops=1)
           ->  Index Scan using component_pkey on component t
(cost=0.57..648.36 rows=168 width=16) (actual time=0.006..0.076
rows=219 loops=1)
                 Index Cond: (id < 6361198)
           ->  Hash Join  (cost=5543292.40..6112198.81 rows=1
width=24) (actual time=64743.932..65312.625 rows=88 loops=9)
                 Hash Cond: ((path_1.parent_id = t_1.internal_id) AND
(path_1.document_id = t_1.document_id))
                 ->  WorkTable Scan on path path_1  (cost=0.00..33.60
rows=1680 width=16) (actual time=0.001..0.015 rows=112 loops=9)
                 ->  Hash  (cost=3627866.96..3627866.96 rows=96335696
width=16) (actual time=64572.641..64572.641 rows=94613537 loops=9)
                       Buckets: 4096  Batches: 8192  Memory Usage: 537kB
                       ->  Seq Scan on component t_1
(cost=0.00..3627866.96 rows=96335696 width=16) (actual
time=0.005..43364.346 rows=94613537 loops=9)
 Total runtime: 587814.885 ms

I would think that it has decided that the document_id index is not
very selective for the given mix of rows, however I checked the
statistics for the table and I found that n_distinct for document_id
is 101559 (the true value is 162545). The value of pg_class.reltuples
for the table is 96055600, which is very close to the true value
94613537.

In the first query, it appears to me that postgres thinks the index
scan is much more expensive than it really is. However, given the
accurate statistics, I can't see how.

BTW I tried playing with random_page_cost. If I lower it to 2.0 then
it chooses the fast plan. At 3.0 it chooses the slow plan.

Thanks!
Patrick


Re: CTE query plan ignores selective index

From
Tom Lane
Date:
Patrick Krecker <patrick@judicata.com> writes:
>            ->  Nested Loop  (cost=0.57..6148400.28 rows=1 width=24)
> (actual time=0.063..4.054 rows=88 loops=9)
>                  ->  WorkTable Scan on path path_1  (cost=0.00..33.40
> rows=1670 width=16) (actual time=0.000..0.006 rows=112 loops=9)
>                  ->  Index Scan using component_document_id on
> component t_1  (cost=0.57..3681.65 rows=1 width=16) (actual
> time=0.023..0.036 rows=1 loops=1007)
>                        Index Cond: (document_id = path_1.document_id)
>                        Filter: (path_1.parent_id = internal_id)
>                        Rows Removed by Filter: 237

> I would think that it has decided that the document_id index is not
> very selective for the given mix of rows, however I checked the
> statistics for the table and I found that n_distinct for document_id
> is 101559 (the true value is 162545). The value of pg_class.reltuples
> for the table is 96055600, which is very close to the true value
> 94613537.

> In the first query, it appears to me that postgres thinks the index
> scan is much more expensive than it really is. However, given the
> accurate statistics, I can't see how.

I think the problem is that it doesn't have any stats for the output of
path_1, so it's probably falling back on some rather generic assumptions
about how many component rows will match each of the two join conditions.
That causes it to think that the indexscan will reject a lot of rows at
the filter step and therefore be expensive.  Possibly that could be
improved, but it won't happen overnight.

The most expeditious way to fix this would likely be to provide an
index on component(document_id, internal_id).  The planner should
then think an indexscan on that is cheap, regardless of whether the
check on internal_id is really doing much of anything.

            regards, tom lane