Re: Tid scan improvements - Mailing list pgsql-hackers

From Edmund Horner
Subject Re: Tid scan improvements
Date
Msg-id CAMyN-kDaFgpkbYw1dVSvADp4KZisq7oNdaQ1GkyVAqEZ0Xvk=Q@mail.gmail.com
Whole thread Raw
In response to Re: Tid scan improvements  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Tid scan improvements
Re: Tid scan improvements
List pgsql-hackers
On Sat, 19 Jan 2019 at 05:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Edmund Horner <ejrh00@gmail.com> writes:
> My patch uses the same path type and executor for all extractable tidquals.

> This worked pretty well, but I am finding it difficult to reimplement it in
> the new tidpath.c code.

I didn't like that approach to begin with, and would suggest that you go
over to using a separate path type and executor node.  I don't think the
amount of commonality for the two cases is all that large, and doing it
as you had it required some ugly ad-hoc conventions about the semantics
of the tidquals list.  Where I think this should go is that the tidquals
list still has OR semantics in the existing path type, but you use AND
semantics in the new path type, so that "ctid > ? AND ctid < ?" is just
represented as an implicit-AND list of two simple RestrictInfos.

Thanks for the advice.  This approach resembles my first draft, which had a separate executor type.  However, it did have a combined path type, with an enum TidPathMethod to determine how tidquals was interpreted.  At this point, I think a different path type is clearer, though generation of both types can live in tidpath.c (just as indxpath.c generates different index path types).
 
Now admittedly, this wouldn't give us an efficient way to execute
queries with conditions like "WHERE ctid = X OR (ctid > Y AND ctid < Z)",
but I find myself quite unable to get excited about supporting that.
I see no reason for the new code to worry about any cases more complex
than one or two TID inequalities at top level of the restriction list.

I'm a bit sad to see support for multiple ranges go, though I never saw such queries as ever being particularly common.  (And there was always a nagging feeling that tidpath.c was beginning to perform feats of boolean acrobatics out of proportion to its importance.  Perhaps in some distant future, TID quals will become another way of supplying TIDs to a bitmap heap scan, which would enable complicated boolean queries using both indexes and TID scans.  But that's just musing, not a proposal.)

> In the query information given to the path generator, there is no existing
> RestrictInfo relating to the whole expression "ctid > ? AND ctid < ?".  I
> am still learning about RestrictInfos, but my understanding is it doesn't
> make sense to have a RestrictInfo for an AND clause, anyway; you're
> supposed to have them for the sub-expressions of it.

FWIW, the actual data structure for cases like that is that there's
a RestrictInfo for the whole clause ctid = X OR (ctid > Y AND ctid < Z),
and if you look into its "orclause" field, you will find RestrictInfos
attached to the primitive clauses ctid = X, ctid > Y, ctid < Z.  (The
old code in tidpath.c didn't know that, because it'd never been rewritten
since RestrictInfos were invented.)  However, I think this new code should
not worry about OR cases at all, but just pull out top-level TID
comparison clauses.

Thanks for the explanation.

> And it doesn't seem a good idea to try to create new RestrictInfos in the
> path generation just to pass the tidquals back to plan creation.

No, you should avoid that.  There are places that assume there's only
one RestrictInfo for any given original clause (or sub-clause).
 

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Fixing findDependentObjects()'s dependency on scan order(regressions in DROP diagnostic messages)
Next
From: Michael Paquier
Date:
Subject: Re: [HACKERS] REINDEX CONCURRENTLY 2.0