Re: More efficient RI checks - take 2 - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: More efficient RI checks - take 2
Date
Msg-id 20200423124046.GD13712@tamriel.snowman.net
Whole thread Raw
In response to Re: More efficient RI checks - take 2  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Apr 22, 2020 at 6:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > But it's not entirely clear to me that we know the best plan for a
> > statement-level RI action with sufficient certainty to go that way.
> > Is it really the case that the plan would not vary based on how
> > many tuples there are to check, for example?  If we *do* know
> > exactly what should happen, I'd tend to lean towards Andres'
> > idea that we shouldn't be using the executor at all, but just
> > hard-wiring stuff at the level of "do these table scans".
>
> Well, I guess I'd naively think we want an index scan on a plain
> table. It is barely possible that in some corner case a sequential
> scan would be faster, but could it be enough faster to save the cost
> of planning? I doubt it, but I just work here.
>
> On a partitioning hierarchy we want to figure out which partition is
> relevant for the value we're trying to find, and then scan that one.
>
> I'm not sure there are any other cases. We have to have a UNIQUE
> constraint or we can't be referencing this target table. So it can't
> be a plain inheritance hierarchy, nor (I think) a foreign table.

In the cases where we have a UNIQUE constraint, and therefore a clear
index to use, I tend to agree that we should just be getting to it and
avoiding the planner/executor, as Andres suggest.

I'm not super thrilled about the idea of throwing an ERROR when we
haven't got an index to use though, and we don't require an index on the
referring side, meaning that, with such a change, a DELETE or UPDATE on
the referred table with an ON CASCADE FK will just start throwing
errors.  That's not terribly friendly, even if it's not really best
practice to not have an index to help with those cases.

I'd hope that we would at least teach pg_upgrade to look for such cases
and throw errors (though maybe that could be downgraded to a WARNING
with a flag..?) if it finds any when upgrading, so that users don't
upgrade and then suddenly start getting errors for simple statements
that used to work just fine.

> > On the whole I still think that generating a Query tree and then
> > letting the planner do its thing might be the best approach.
>
> Maybe, but it seems awfully heavy-weight. Once you go into the planner
> it's pretty hard to avoid considering indexes we don't care about,
> bitmap scans we don't care about, a sequential scan we don't care
> about, etc.  You'd certainly save something just from avoiding
> parsing, but planning's pretty expensive too.

Agreed.

Thanks,

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Ranier Vilela
Date:
Subject: [PATCH] Fix division by zero (explain.c)
Next
From: Tomas Vondra
Date:
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays