Re: PostgreSQL 12.3 slow index scan chosen - Mailing list pgsql-performance

From Tom Lane
Subject Re: PostgreSQL 12.3 slow index scan chosen
Date
Msg-id 1815407.1592677323@sss.pgh.pa.us
Whole thread Raw
In response to Re: PostgreSQL 12.3 slow index scan chosen  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: PostgreSQL 12.3 slow index scan chosen  (Kenneth Marshall <ktm@rice.edu>)
Re: PostgreSQL 12.3 slow index scan chosen  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-performance
I wrote:
> ... oh, now I see: apparently, your filter condition is such that *no*
> rows of the objectcustomfieldvalues table get past the filter:
>
>               ->  Index Scan using objectcustomfieldvalues3 on objectcustomfieldvalues objectcustomfieldvalues_1
(cost=0.56..807603.40rows=915 width=4) (actual time=21165.441..21165.441 rows=0 loops=1) 
>                      Filter: ((disabled = 0) AND ((largecontent ~~* '%958575%'::text) OR ((content)::text ~~*
'%958575%'::text)))
>                      Rows Removed by Filter: 19030904

> That's kind of annoying :-(.  I wonder if there's a way to be smarter?
> This case would work a lot better if the filter conditions were not
> applied till after the merge; but of course that wouldn't be an
> improvement in general.  Or maybe we should penalize the mergejoin
> cost estimate if there's a highly selective filter in the way.

I experimented with this some more, with the intention of creating a
planner patch that would do the latter, and was surprised to find that
there already is such a penalty.  It's sort of indirect and undocumented,
but nonetheless the estimate for whether a mergejoin can stop early
does get heavily de-rated if the planner realizes that the table is
being heavily filtered.  So now I'm thinking that your problem is that
"rows=915" is not a good enough estimate of what will happen in this
indexscan.  It looks good in comparison to the table size of 19M rows,
but on a percentage basis compared to the true value of 0 rows, it's
still pretty bad.  You said you'd increased the stats target for
objectcustomfieldvalues.objectid, but maybe the real problem is needing
to increase the targets for content and largecontent, in hopes of driving
down the estimate for how many rows will pass this filter condition.

            regards, tom lane



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: PostgreSQL 12.3 slow index scan chosen
Next
From: Kenneth Marshall
Date:
Subject: Re: PostgreSQL 12.3 slow index scan chosen