Re: PostgreSQL 12.3 slow index scan chosen - Mailing list pgsql-performance

From Tom Lane
Subject Re: PostgreSQL 12.3 slow index scan chosen
Date
Msg-id 1755412.1592597470@sss.pgh.pa.us
Whole thread Raw
In response to PostgreSQL 12.3 slow index scan chosen  (Kenneth Marshall <ktm@rice.edu>)
Responses Re: PostgreSQL 12.3 slow index scan chosen  (Kenneth Marshall <ktm@rice.edu>)
List pgsql-performance
Kenneth Marshall <ktm@rice.edu> writes:
> I was looking at a slow query in our CMDB that using postgresql-12.3 as its
> backend. I since I am using the pg_trgm extension elsewhere I decided to give
> it a try. To my surprise, the query plan did not change. But when I disabled
> the index scan I got the much, much faster scan using a bitmap index scan.
> Any ideas about why that is being chosen? Here are the details:

It looks like the planner is being too optimistic about how quickly the
mergejoin will end:

>          ->  Merge Join  (cost=0.71..892.64 rows=1 width=137) (actual time=21165.453..21165.453 rows=0 loops=1)
>                Merge Cond: (main.id = objectcustomfieldvalues_1.objectid)
>                ->  Index Scan using articles_pkey on articles main  (cost=0.14..9.08 rows=142 width=137) (actual
time=0.007..0.007rows=1 loops=1) 
>                      Filter: (disabled = '0'::smallint)
>                ->  Index Scan using objectcustomfieldvalues3 on objectcustomfieldvalues objectcustomfieldvalues_1
(cost=0.56..807603.40rows=915 width=4) (actual time=21165.441..21165.441 rows=0 loops=1) 
>                      Filter: ((disabled = 0) AND ((largecontent ~~* '%958575%'::text) OR ((content)::text ~~*
'%958575%'::text)))
>                      Rows Removed by Filter: 19030904

This merge cost estimate is way lower than the sum of the input scan
estimates, where normally it would be that sum plus a nontrivial charge
for comparisons.  So the planner must think that the input scans won't
run to completion.  Which is something that can happen; merge join
will stop as soon as either input is exhausted.  But in this case it
looks like the objectcustomfieldvalues scan is the one that ran to
completion, while the articles scan had only one row demanded from it.
(We can see from the other plan that articles has 146 rows satisfying
the filter, so that scan must have been shut down before completion.)
The planner must have been expecting the other way around, with not
very much of the expensive objectcustomfieldvalues scan actually getting
done.

The reason for such an estimation error usually is that the maximum
join key values recorded in pg_stats are off: the join side that is
going to be exhausted is the one with the smaller max join key.
"articles" seems to be small enough that the stats for it will be
exact, so your problem is a poor estimate of the max value of
objectcustomfieldvalues.objectid.  You might try raising the statistics
target for that table.  Or maybe it's just that ANALYZE hasn't been
done lately on one table or the other?

            regards, tom lane



pgsql-performance by date:

Previous
From: Kenneth Marshall
Date:
Subject: PostgreSQL 12.3 slow index scan chosen
Next
From: "Benjamin Coutu"
Date:
Subject: Re: Unclamped row estimates whith OR-ed subplans