Re: PostgreSQL 12.3 slow index scan chosen - Mailing list pgsql-performance

From Tom Lane
Subject Re: PostgreSQL 12.3 slow index scan chosen
Date
Msg-id 1762066.1592608033@sss.pgh.pa.us
Whole thread Raw
In response to PostgreSQL 12.3 slow index scan chosen  (Kenneth Marshall <ktm@rice.edu>)
Responses Re: PostgreSQL 12.3 slow index scan chosen  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
[ please keep the mailing list cc'd ]

Kenneth Marshall <ktm@rice.edu> writes:
> Here are the stats for articles.id:

> 4,7,9,11,13,14,16,17,18,19,20,21,22,23,
> 24,25,26,32,33,34,36,40,41,42,43,44,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,99,100,101,102,106,107,108,109,113,1
14,115,116,117,118,119,120,121,122,123,125,126,127,128,129,130,131,133,134,135,136,137,140,141,142,143,144,145,146,14
7,148,149,150,151,152,153,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177
,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,
207,208,1209,1210,1212,1213,1214,1215,1216,1219,1220,1221,1222,1223}          
> That completely matches the max(id) for articles.id.

Hm, well it's clear why the planner is going for the mergejoin strategy:
it expects to only have to scan a very small fraction of the other table
before it's up past objectid = 1223 and can stop merging.  And it
seems like it's right ...

... oh, now I see: apparently, your filter condition is such that *no*
rows of the objectcustomfieldvalues table get past the filter:

              ->  Index Scan using objectcustomfieldvalues3 on objectcustomfieldvalues objectcustomfieldvalues_1
(cost=0.56..807603.40rows=915 width=4) (actual time=21165.441..21165.441 rows=0 loops=1) 
                     Filter: ((disabled = 0) AND ((largecontent ~~* '%958575%'::text) OR ((content)::text ~~*
'%958575%'::text)))
                     Rows Removed by Filter: 19030904

"rows=0" is the telltale.  So even after we're past objectid = 1223, that
scan continues, because the mergejoin needs to see a higher key before it
knows it can stop.

That's kind of annoying :-(.  I wonder if there's a way to be smarter?
This case would work a lot better if the filter conditions were not
applied till after the merge; but of course that wouldn't be an
improvement in general.  Or maybe we should penalize the mergejoin
cost estimate if there's a highly selective filter in the way.
(It does look like the planner is correctly estimating that the
filter is quite selective --- it's just not considering the potential
impact on the scan-until-finding-a-greater-key behavior.)

Right now I don't have any better suggestion than disabling mergejoin
if you think the filter is going to be very selective.

            regards, tom lane



pgsql-performance by date:

Previous
From: Kenneth Marshall
Date:
Subject: Re: PostgreSQL 12.3 slow index scan chosen
Next
From: Tom Lane
Date:
Subject: Re: PostgreSQL 12.3 slow index scan chosen