Re: Avoid full GIN index scan when possible - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Avoid full GIN index scan when possible |
Date | |
Msg-id | 20190628225423.dgtyyux7tlk7p3fk@development Whole thread Raw |
In response to | Re: Avoid full GIN index scan when possible (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Fri, Jun 28, 2019 at 04:16:23PM -0400, Tom Lane wrote: >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote: >>> I not only don't want that function in indxpath.c, I don't even want >>> it to be known/called from there. If we need the ability for the index >>> AM to editorialize on the list of indexable quals (which I'm not very >>> convinced of yet), let's make an AM interface function to do it. > >> Wouldn't it be better to have a function that inspects a single qual and >> says whether it's "optimizable" or not? That could be part of the AM >> implementation, and we'd call it and it'd be us messing with the list. > >Uh ... we already determined that the qual is indexable (ie is a member >of the index's opclass), or allowed the index AM to derive an indexable >clause from it, so I'm not sure what you envision would happen >additionally there. If I understand what Julien is concerned about >--- and I may not --- it's that the set of indexable clauses *as a whole* >may have or lack properties of interest. So I'm thinking the answer >involves some callback that can do something to the whole list, not >qual-at-a-time. We've already got facilities for the latter case. > I'm not sure I understand the problem either. I don't think "indexable" is the thing we care about here - in Julien's original example the qual with '%a%' is indexable. And we probably want to keep it that way. The problem is that evaluating some of the quals may be inefficient with a given index - but only if there are other quals. In Julien's example it makes sense to just drop the '%a%' qual, but only when there are some quals that work with the trigram index. But if there are no such 'good' quals, it may be better to keep al least the bad ones. So I think you're right we need to look at the list as a whole. >> But that kinda resembles stuff we already have - selectivity/cost. So >> why shouldn't this be considered as part of costing? > >Yeah, I'm not entirely convinced that we need anything new here. >The cost estimate function can detect such situations, and so can >the index AM at scan start --- for example, btree checks for >contradictory quals at scan start. There's a certain amount of >duplicative effort involved there perhaps, but you also have to >keep in mind that we don't know the values of run-time-determined >comparison values until scan start. So if you want certainty rather >than just a cost estimate, you may have to do these sorts of checks >at scan start. > Right, that's why I suggested doing this as part of costing, but you're right scan start would be another option. I assume it should affect cost estimates in some way, so the cost function would be my first choice. But does the cost function really has enough info to make such decision? For example, ignoring quals is valid only if we recheck them later. For GIN that's not an issue thanks to the bitmap index scan. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: