Re: Bitmap index scans use of filters on available columns - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Bitmap index scans use of filters on available columns
Date
Msg-id 563ABBC3.1050309@2ndquadrant.com
In response to Re: Bitmap index scans use of filters on available columns  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bitmap index scans use of filters on available columns  (Robert Haas <robertmhaas@gmail.com>)
Re: Bitmap index scans use of filters on available columns  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

On 11/04/2015 11:32 PM, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> On Wed, Nov 4, 2015 at 7:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> You're missing my point: that is possible in an indexscan, but
>>> *not* in a bitmap indexscan, because the index AM APIs are
>>> totally different in the two cases. In a bitmap scan, nothing
>>> more than a TID bitmap is ever returned out to anyplace that
>>> could execute arbitrary expressions.
>
>> I had thought it must already be able to execute arbitrary
>> expressions, due to the ability to already support user-defined
>> btree ops (and ops of non-btree types in the case of other index
>> types).
>
> No. An index AM is only expected to be able to evaluate clauses of
> the form <indexed_column> <indexable_operator> <constant>, and the
> key restriction there is that the operator is one that the AM has
> volunteered to support. Well, actually, it's the opclass more than
> the AM that determines this, but anyway it's not just some random
> operator; more than likely, the AM and/or opclass has got special
> logic about the operator.

Isn't that pretty much exactly the point made by Jeff and Simon? The 
index AM is currently only allowed to handle the indexable operators, 
i.e. operators that it can explicitly optimize (e.g. use to walk the 
btree and such), and it completely ignores the other operators even 
though all the columns are available in the index. That means we have 
to do the heap fetch just to evaluate those other clauses, which 
usually means a significant performance hit.
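
To make the cost concrete, here is a toy Python model of the situation. It is not PostgreSQL code and every name in it is made up for illustration. It assumes an index on (a, b) where "a = 1" is an indexable clause and "b % 10 = 3" is a clause on an indexed column that the AM cannot use. It only counts how many heap visits are needed when the second clause is checked on the heap versus on the index tuple:

```python
# Toy model only: a list of (a, b, tid) tuples stands in for an index on (a, b).
# All names are hypothetical; none of this is PostgreSQL API.

index_entries = [(a, b, tid) for tid, (a, b) in
                 enumerate((i % 4, i) for i in range(1000))]


def indexable(entry):
    # Clause the index AM can use directly: a = 1
    return entry[0] == 1


def extra_filter(entry):
    # Clause on an indexed column that is not an indexable operator:
    # b % 10 = 3
    return entry[1] % 10 == 3


def heap_fetches(filter_on_index_tuple):
    """Count heap visits needed to answer: a = 1 AND b % 10 = 3."""
    fetches = 0
    for entry in index_entries:
        if not indexable(entry):
            continue
        if filter_on_index_tuple and not extra_filter(entry):
            continue  # rejected using the index tuple alone
        fetches += 1  # the heap tuple still has to be visited
    return fetches


current = heap_fetches(filter_on_index_tuple=False)
proposed = heap_fetches(filter_on_index_tuple=True)
print(current, proposed)
```

Rows that pass both clauses are still counted as heap visits in both variants, since the model makes no assumption about skipping visibility checks.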

>
> This also ties into Robert's point about evaluation of operators
> against index entries for dead or invisible rows. Indexable operators
> are much less likely than others to have unexpected side-effects.

I certainly understand there are cases that require care - like the 
leakproof thing pointed out by Robert for example. I don't immediately 
see why evaluation against dead rows would be a problem.

But maybe we can derive a set of rules that the operators would have 
to satisfy? Say, only those marked as leakproof when RLS is enabled on 
the table, and perhaps additional things.

A "brute-force" way would be to extend each index AM with every 
possible operator, but I guess that's not quite manageable. But why 
couldn't we provide a generic infrastructure that would pick out the 
"safe" expressions and evaluate them against an index tuple?
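
A rough sketch of what such a rule check might look like, written in Python purely for illustration. The types, the function names and the operator names are all hypothetical, and the "immutable" rule is only my guess at one of the "additional things"; the only rule actually mentioned above is leakproofness under RLS:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    leakproof: bool   # stands in for an operator whose function is leakproof
    immutable: bool   # assumed extra rule, not something from this thread


@dataclass(frozen=True)
class Clause:
    column: str
    op: Operator
    constant: object


def is_safe_for_index_filter(clause, index_columns, rls_enabled):
    """Decide whether a clause may be evaluated on an index tuple."""
    if clause.column not in index_columns:
        return False  # the value is simply not available in the index
    if rls_enabled and not clause.op.leakproof:
        return False  # could leak data from rows the user must not see
    if not clause.op.immutable:
        return False  # assumed rule: result must depend only on the inputs
    return True


def split_clauses(clauses, index_columns, rls_enabled):
    """Split clauses into (checked on index tuple, checked after heap fetch)."""
    on_index, on_heap = [], []
    for clause in clauses:
        if is_safe_for_index_filter(clause, index_columns, rls_enabled):
            on_index.append(clause)
        else:
            on_heap.append(clause)
    return on_index, on_heap


# Hypothetical operators and clauses, index on (a, b).
op_leakproof = Operator("op_leakproof", leakproof=True, immutable=True)
op_leaky = Operator("op_leaky", leakproof=False, immutable=True)
clauses = [
    Clause("a", op_leakproof, 1),
    Clause("b", op_leaky, "x"),
    Clause("c", op_leakproof, 2),
]

with_rls = split_clauses(clauses, {"a", "b"}, rls_enabled=True)
without_rls = split_clauses(clauses, {"a", "b"}, rls_enabled=False)
```

With RLS enabled only the clause on "a" would be checked on the index tuple; without RLS the clause on "b" qualifies too, while the clause on "c" always needs the heap because "c" is not in the index.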


kind regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


