Re: PostgreSQL 12.3 slow index scan chosen - Mailing list pgsql-performance

From Kenneth Marshall
Subject Re: PostgreSQL 12.3 slow index scan chosen
Date
Msg-id 20200619204950.GL1497@aart.rice.edu
Whole thread Raw
In response to Re: PostgreSQL 12.3 slow index scan chosen  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: PostgreSQL 12.3 slow index scan chosen  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Fri, Jun 19, 2020 at 04:11:10PM -0400, Tom Lane wrote:
> 
> It looks like the planner is being too optimistic about how quickly the
> mergejoin will end:
> 
> >          ->  Merge Join  (cost=0.71..892.64 rows=1 width=137) (actual time=21165.453..21165.453 rows=0 loops=1)
> >                Merge Cond: (main.id = objectcustomfieldvalues_1.objectid)
> >                ->  Index Scan using articles_pkey on articles main  (cost=0.14..9.08 rows=142 width=137) (actual
time=0.007..0.007rows=1 loops=1)
 
> >                      Filter: (disabled = '0'::smallint)
> >                ->  Index Scan using objectcustomfieldvalues3 on objectcustomfieldvalues objectcustomfieldvalues_1
(cost=0.56..807603.40rows=915 width=4) (actual time=21165.441..21165.441 rows=0 loops=1)
 
> >                      Filter: ((disabled = 0) AND ((largecontent ~~* '%958575%'::text) OR ((content)::text ~~*
'%958575%'::text)))
> >                      Rows Removed by Filter: 19030904
> 
> This merge cost estimate is way lower than the sum of the input scan
> estimates, where normally it would be that sum plus a nontrivial charge
> for comparisons.  So the planner must think that the input scans won't
> run to completion.  Which is something that can happen; merge join
> will stop as soon as either input is exhausted.  But in this case it
> looks like the objectcustomfieldvalues scan is the one that ran to
> completion, while the articles scan had only one row demanded from it.
> (We can see from the other plan that articles has 146 rows satisfying
> the filter, so that scan must have been shut down before completion.)
> The planner must have been expecting the other way around, with not
> very much of the expensive objectcustomfieldvalues scan actually getting
> done.
> 
> The reason for such an estimation error usually is that the maximum
> join key values recorded in pg_stats are off: the join side that is
> going to be exhausted is the one with the smaller max join key.
> "articles" seems to be small enough that the stats for it will be
> exact, so your problem is a poor estimate of the max value of
> objectcustomfieldvalues.objectid.  You might try raising the statistics
> target for that table.  Or maybe it's just that ANALYZE hasn't been
> done lately on one table or the other?
> 
>             regards, tom lane

Hi Tod,

Thank you for the information and suggestion. I tried bumping the statistics for the
objectcustomfieldvalues.objectid column to 2k, 5k and 10k followed by an analyze and
the query plan stayed the same. I also analyzed the article table
repeatedly and their was no change in the plan. The table articles only has 151 rows
while the objectcustomfieldvalues table has 19031909 rows. Any idea
about why it is so far off?

Regards,
Ken



pgsql-performance by date:

Previous
From: "Benjamin Coutu"
Date:
Subject: Re: Unclamped row estimates whith OR-ed subplans
Next
From: Tom Lane
Date:
Subject: Re: PostgreSQL 12.3 slow index scan chosen