Home > mailing lists

Re: Avoid full GIN index scan when possible - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Avoid full GIN index scan when possible
Date	June 28, 2019 22:54:23
Msg-id	20190628225423.dgtyyux7tlk7p3fk@development Whole thread
In response to	Re: Avoid full GIN index scan when possible (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tree view

On Fri, Jun 28, 2019 at 04:16:23PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:
>>> I not only don't want that function in indxpath.c, I don't even want
>>> it to be known/called from there.  If we need the ability for the index
>>> AM to editorialize on the list of indexable quals (which I'm not very
>>> convinced of yet), let's make an AM interface function to do it.
>
>> Wouldn't it be better to have a function that inspects a single qual and
>> says whether it's "optimizable" or not? That could be part of the AM
>> implementation, and we'd call it and it'd be us messing with the list.
>
>Uh ... we already determined that the qual is indexable (ie is a member
>of the index's opclass), or allowed the index AM to derive an indexable
>clause from it, so I'm not sure what you envision would happen
>additionally there.  If I understand what Julien is concerned about
>--- and I may not --- it's that the set of indexable clauses *as a whole*
>may have or lack properties of interest.  So I'm thinking the answer
>involves some callback that can do something to the whole list, not
>qual-at-a-time.  We've already got facilities for the latter case.
>

I'm not sure I understand the problem either.

I don't think "indexable" is the thing we care about here - in Julien's
original example the qual with '%a%' is indexable. And we probably want
to keep it that way.

The problem is that evaluating some of the quals may be inefficient with
a given index - but only if there are other quals. In Julien's example
it makes sense to just drop the '%a%' qual, but only when there are some
quals that work with the trigram index. But if there are no such 'good'
quals, it may be better to keep al least the bad ones.

So I think you're right we need to look at the list as a whole.

>> But that kinda resembles stuff we already have - selectivity/cost. So
>> why shouldn't this be considered as part of costing?
>
>Yeah, I'm not entirely convinced that we need anything new here.
>The cost estimate function can detect such situations, and so can
>the index AM at scan start --- for example, btree checks for
>contradictory quals at scan start.  There's a certain amount of
>duplicative effort involved there perhaps, but you also have to
>keep in mind that we don't know the values of run-time-determined
>comparison values until scan start.  So if you want certainty rather
>than just a cost estimate, you may have to do these sorts of checks
>at scan start.
>

Right, that's why I suggested doing this as part of costing, but you're
right scan start would be another option. I assume it should affect cost
estimates in some way, so the cost function would be my first choice.

But does the cost function really has enough info to make such decision?
For example, ignoring quals is valid only if we recheck them later. For
GIN that's not an issue thanks to the bitmap index scan.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-hackers by date:

From: Nikita Glukhov
Date: 28 June 2019, 22:50:18
Subject: Re: Avoid full GIN index scan when possible

From: Amit Kapila
Date: 28 June 2019, 23:46:34
Subject: Re: [HACKERS] WAL logging problem in 9.4.3?

Re: Avoid full GIN index scan when possible - Mailing list pgsql-hackers

Previous

Next