Re: More efficient RI checks - take 2 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: More efficient RI checks - take 2
Date
Msg-id 20200422154231.6shz4kdor4yb5w5b@alap3.anarazel.de
In response to Re: More efficient RI checks - take 2  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: More efficient RI checks - take 2  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2020-04-21 11:34:54 -0400, Alvaro Herrera wrote:
> On 2020-Apr-20, Corey Huinker wrote:
> 
> > > I can imagine removal of the SPI from the current implementation (and
> > > constructing the plans "manually"), but note that the queries I use in my
> > > patch are no longer that trivial. So the SPI makes sense to me because it
> > > ensures regular query planning.
> > 
> > As an intermediate step, in the case where we have one row, it should be
> > simple enough to extract that row manually, and do an SPI call with fixed
> > values rather than the join to the ephemeral table, yes?
> 
> I do wonder if the RI stuff would actually end up being faster without
> SPI.

I would suspect so. How much is another question.

I assume that by constructing plans "manually" you don't mean creating
a plan tree, but invoking the parser/planner directly? I think that'd
likely be better than going through SPI, and there's precedent too.


But honestly, my gut feeling is that for a lot of cases it'd be best to
just bypass parser, planner *and* executor, and just do manual
systable_beginscan() style checks. For most cases we know exactly what
plan shape we expect, and going through the overhead of creating a query
string, parsing, planning, caching the results of those steps, and
creating an executor tree for every check is a lot. Even just the amount
of memory for caching the plans can be substantial.
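
To illustrate the shape of such a check, here is a toy model in plain C.
It is only a sketch: ToyPkIndex, toy_pk_exists() and toy_ri_check() are
hypothetical names, a sorted array stands in for the unique index on the
referenced column, and none of this is PostgreSQL backend code. The point
is just that each referencing value becomes one direct index probe, with
no query string, parse, plan, or executor tree in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a unique index on the referenced (PK) column:
 * a sorted array of integer keys. Not a PostgreSQL structure.
 */
typedef struct ToyPkIndex
{
    const int  *keys;           /* sorted ascending, no duplicates */
    size_t      nkeys;
} ToyPkIndex;

/* Probe the toy index for a single key using binary search. */
static bool
toy_pk_exists(const ToyPkIndex *idx, int key)
{
    size_t      lo = 0;
    size_t      hi;

    assert(idx != NULL);
    hi = idx->nkeys;

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (idx->keys[mid] == key)
            return true;
        if (idx->keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}

/*
 * Check a batch of referencing (FK) values against the toy index.
 * Returns the position of the first value with no matching key,
 * or -1 if every value has a match.
 */
static long
toy_ri_check(const ToyPkIndex *idx, const int *fk_values, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (!toy_pk_exists(idx, fk_values[i]))
            return (long) i;
    }
    return -1;
}
```

A real implementation would also have to deal with snapshots, locking,
multi-column keys and collations, which this model leaves out entirely.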

Side note: I for one would appreciate a setting that just made all RI
actions requiring a seqscan error out...


> If not, we'd only end up writing more code to do the same thing.  Now
> that tables can be partitioned, it is much more of a pain than when
> only regular tables could be supported.  Obviously without SPI you
> wouldn't *have* to go through the planner, which might be a win in
> itself if the execution tree to use were always perfectly clear
> ... but now that the queries get more complex per partitioning and
> this optimization, is it?

I think it's actually a good case where we will commonly be able to do
*better* than generic planning. The infrastructure for efficient
partition pruning exists (for COPY etc) - but isn't easily applicable to
generic plans.
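
As a rough illustration of why knowing the key up front helps, here is a
toy routing function in plain C. It is a sketch under assumptions:
ToyRangeParts and toy_route_key() are hypothetical names, the table is
assumed to be range-partitioned on a single integer key, and this is not
the actual tuple-routing code. With the key in hand, the one partition
that can contain it is found by a binary search over the bounds, so only
that partition's index needs to be probed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical range-partitioned table: partition i holds keys in the
 * half-open range [bounds[i], bounds[i + 1]). The bounds array has
 * nparts + 1 entries in ascending order.
 */
typedef struct ToyRangeParts
{
    const int  *bounds;
    size_t      nparts;
} ToyRangeParts;

/*
 * Return the number of the partition that can contain key,
 * or -1 if the key falls outside every partition.
 */
static long
toy_route_key(const ToyRangeParts *p, int key)
{
    size_t      lo;
    size_t      hi;

    assert(p != NULL && p->nparts > 0);

    if (key < p->bounds[0] || key >= p->bounds[p->nparts])
        return -1;

    /* Invariant: bounds[lo] <= key < bounds[hi]. */
    lo = 0;
    hi = p->nparts;
    while (hi - lo > 1)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (key >= p->bounds[mid])
            lo = mid;
        else
            hi = mid;
    }
    return (long) lo;
}
```

A generic plan cannot make this choice at plan time because it does not
know the key yet, which is the gap a hand-written check could close.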

Greetings,

Andres Freund


