Thread: More efficient RI checks - take 2

More efficient RI checks - take 2

From
Antonin Houska
Date:
After having reviewed [1] more than a year ago (the problem I found was that
the transient table is not available for deferred constraints), I've tried to
implement the same in an alternative way. The RI triggers still work as row
level triggers, but if multiple events of the same kind appear in the queue,
they are all passed to the trigger function at once. Thus the check query does
not have to be executed that frequently.
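
To illustrate with the tables from the Setup section below (a sketch only,
not the patch's actual query text): the row-level trigger effectively runs

SELECT 1 FROM ONLY p x WHERE i = $1 FOR KEY SHARE OF x;

once per inserted row, whereas a whole batch of queued events of the same
kind can be validated with a single query over all collected keys, e.g.

SELECT k.i
FROM (VALUES (1), (2), (3)) AS k(i)   -- keys taken from the event queue
LEFT JOIN p ON p.i = k.i
WHERE p.i IS NULL;                    -- any row returned is a violation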

Some performance comparisons are below. (Besides the execution time, please
note the difference in the number of trigger function executions.) In general,
the checks are significantly faster if there are many rows to process, and a
bit slower when we only need to check a single row. However I'm not sure about
the accuracy if only a single row is measured (if a single row check is
performed several times, the execution time appears to fluctuate).

Comments are welcome.

Setup
=====

CREATE TABLE p(i int primary key);
INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
CREATE TABLE f(i int REFERENCES p);


Insert many rows into the FK table
==================================

master:

EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..163.84 rows=16384 width=4) (actual time=32.741..32.741 rows=0 loops=1)
   ->  Function Scan on generate_series g  (cost=0.00..163.84 rows=16384 width=4) (actual time=2.403..4.802 rows=16384 loops=1)
 Planning Time: 0.050 ms
 Trigger for constraint f_i_fkey: time=448.986 calls=16384
 Execution Time: 485.444 ms
(5 rows)

patched:

EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..163.84 rows=16384 width=4) (actual time=34.053..34.053 rows=0 loops=1)
   ->  Function Scan on generate_series g  (cost=0.00..163.84 rows=16384 width=4) (actual time=2.223..4.448 rows=16384 loops=1)
 Planning Time: 0.047 ms
 Trigger for constraint f_i_fkey: time=105.164 calls=8
 Execution Time: 141.201 ms


Insert a single row into the FK table
=====================================

master:

EXPLAIN ANALYZE INSERT INTO f VALUES (1);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..0.01 rows=1 width=4) (actual time=0.060..0.060 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
 Planning Time: 0.026 ms
 Trigger for constraint f_i_fkey: time=0.435 calls=1
 Execution Time: 0.517 ms
(5 rows)

patched:

EXPLAIN ANALYZE INSERT INTO f VALUES (1);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..0.01 rows=1 width=4) (actual time=0.066..0.066 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
 Planning Time: 0.025 ms
 Trigger for constraint f_i_fkey: time=0.578 calls=1
 Execution Time: 0.670 ms


Check if FK row exists during deletion from the PK
==================================================

master:

DELETE FROM p WHERE i=16384;
ERROR:  update or delete on table "p" violates foreign key constraint "f_i_fkey" on table "f"
DETAIL:  Key (i)=(16384) is still referenced from table "f".
Time: 3.381 ms

patched:

DELETE FROM p WHERE i=16384;
ERROR:  update or delete on table "p" violates foreign key constraint "f_i_fkey" on table "f"
DETAIL:  Key (i)=(16384) is still referenced from table "f".
Time: 5.561 ms


Cascaded DELETE --- many PK rows
================================

DROP TABLE f;
CREATE TABLE f(i int REFERENCES p ON UPDATE CASCADE ON DELETE CASCADE);
INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);

master:

EXPLAIN ANALYZE DELETE FROM p;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Delete on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=38.334..38.334 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=0.019..3.925 rows=16384 loops=1)
 Planning Time: 0.049 ms
 Trigger for constraint f_i_fkey: time=31348.756 calls=16384
 Execution Time: 31390.784 ms

patched:

EXPLAIN ANALYZE DELETE FROM p;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Delete on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=33.360..33.360 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=0.012..3.183 rows=16384 loops=1)
 Planning Time: 0.094 ms
 Trigger for constraint f_i_fkey: time=9.580 calls=8
 Execution Time: 43.941 ms


Cascaded DELETE --- a single PK row
===================================

INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);

master:

DELETE FROM p WHERE i=16384;
DELETE 1
Time: 5.754 ms

patched:

DELETE FROM p WHERE i=16384;
DELETE 1
Time: 8.098 ms


Cascaded UPDATE - many rows
===========================

master:

EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Update on p  (cost=0.00..277.80 rows=16384 width=10) (actual time=166.954..166.954 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..277.80 rows=16384 width=10) (actual time=0.013..7.780 rows=16384 loops=1)
 Planning Time: 0.177 ms
 Trigger for constraint f_i_fkey on p: time=60405.362 calls=16384
 Trigger for constraint f_i_fkey on f: time=455.874 calls=16384
 Execution Time: 61036.996 ms

patched:

EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Update on p  (cost=0.00..277.77 rows=16382 width=10) (actual time=159.512..159.512 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..277.77 rows=16382 width=10) (actual time=0.014..7.783 rows=16382 loops=1)
 Planning Time: 0.146 ms
 Trigger for constraint f_i_fkey on p: time=169.628 calls=9
 Trigger for constraint f_i_fkey on f: time=124.079 calls=2
 Execution Time: 456.072 ms


Cascaded UPDATE - a single row
==============================

master:

UPDATE p SET i = i - 16384 WHERE i=32767;
UPDATE 1
Time: 4.858 ms

patched:

UPDATE p SET i = i - 16384 WHERE i=32767;
UPDATE 1
Time: 11.955 ms


[1] https://commitfest.postgresql.org/22/1975/

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: More efficient RI checks - take 2

From
Pavel Stehule
Date:


On Wed, Apr 8, 2020 at 6:36 PM Antonin Houska <ah@cybertec.at> wrote:
After having reviewed [1] more than a year ago (the problem I found was that
the transient table is not available for deferred constraints), I've tried to
implement the same in an alternative way. The RI triggers still work as row
level triggers, but if multiple events of the same kind appear in the queue,
they are all passed to the trigger function at once. Thus the check query does
not have to be executed that frequently.

Some performance comparisons are below. (Besides the execution time, please
note the difference in the number of trigger function executions.) In general,
the checks are significantly faster if there are many rows to process, and a
bit slower when we only need to check a single row. However I'm not sure about
the accuracy if only a single row is measured (if a single row check is
performed several times, the execution time appears to fluctuate).

It is a hard task to choose a good strategy for immediate constraints, but for deferred constraints you know how many rows should be checked, and then you can choose a better strategy.

Is it possible to use an estimate to choose the method of RI checks?
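
For example (a sketch): with a deferrable constraint the whole batch is known
before the check runs, because the queued events are only processed at COMMIT:

CREATE TABLE f(i int REFERENCES p DEFERRABLE INITIALLY DEFERRED);
BEGIN;
INSERT INTO f SELECT i FROM generate_series(1, 1000) g(i);
COMMIT;  -- all 1000 queued checks fire here, so the batch size is known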


Re: More efficient RI checks - take 2

From
Corey Huinker
Date:
On Wed, Apr 8, 2020 at 1:06 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


On Wed, Apr 8, 2020 at 6:36 PM Antonin Houska <ah@cybertec.at> wrote:
After having reviewed [1] more than a year ago (the problem I found was that
the transient table is not available for deferred constraints), I've tried to
implement the same in an alternative way. The RI triggers still work as row
level triggers, but if multiple events of the same kind appear in the queue,
they are all passed to the trigger function at once. Thus the check query does
not have to be executed that frequently.

I'm excited that you picked this up!
 

Some performance comparisons are below. (Besides the execution time, please
note the difference in the number of trigger function executions.) In general,
the checks are significantly faster if there are many rows to process, and a
bit slower when we only need to check a single row. However I'm not sure about
the accuracy if only a single row is measured (if a single row check is
performed several times, the execution time appears to fluctuate).

These numbers are very promising, and much more in line with my initial expectations. Obviously the impact on single-row DML is of major concern, though.

It is a hard task to choose a good strategy for immediate constraints, but for deferred constraints you know how many rows should be checked, and then you can choose a better strategy.

Is it possible to use an estimate to choose the method of RI checks?

In doing my initial attempt, the feedback I was getting was that the people who truly understood the RI checks fell into the following groups:
1. people who wanted to remove the SPI calls from the triggers
2. people who wanted to completely refactor RI to not use triggers
3. people who wanted to completely refactor triggers

While #3 is clearly beyond the scope for an endeavor like this, #1 seems like it would nearly eliminate the 1-row penalty (we'd still have the TupleStore init penalty, but it would just be a handy queue structure, and maybe that cost would be offset by removing the SPI overhead), and once that is done, we could see about step #2.
 

Re: More efficient RI checks - take 2

From
Antonin Houska
Date:
Pavel Stehule <pavel.stehule@gmail.com> wrote:

>> On Wed, Apr 8, 2020 at 6:36 PM Antonin Houska <ah@cybertec.at> wrote:
>
>>  Some performance comparisons are below. (Besides the execution time, please
>>  note the difference in the number of trigger function executions.) In general,
>>  the checks are significantly faster if there are many rows to process, and a
>>  bit slower when we only need to check a single row. However I'm not sure about
>>  the accuracy if only a single row is measured (if a single row check is
>>  performed several times, the execution time appears to fluctuate).
>
> It is a hard task to choose a good strategy for immediate constraints, but
> for deferred constraints you know how many rows should be checked, and then
> you can choose a better strategy.
>
> Is it possible to use an estimate to choose the method of RI checks?

The exact number of rows ("batch size") is always known before the query is
executed. So one problem to solve is that, when only one row is affected, we
need to convince the planner that the "transient table" really contains a
single row. Otherwise it can, for example, produce a hash join where the hash
eventually contains a single row.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: More efficient RI checks - take 2

From
Antonin Houska
Date:
Corey Huinker <corey.huinker@gmail.com> wrote:

> These numbers are very promising, and much more in line with my initial
> expectations. Obviously the impact on single-row DML is of major concern,
> though.

Yes, I agree.

> In doing my initial attempt, the feedback I was getting was that the people
> who truly understood the RI checks fell into the following groups:

> 1. people who wanted to remove the SPI calls from the triggers
> 2. people who wanted to completely refactor RI to not use triggers
> 3. people who wanted to completely refactor triggers
> 
> While #3 is clearly beyond the scope for an endeavor like this, #1 seems
> like it would nearly eliminate the 1-row penalty (we'd still have the
> TupleStore init penalty, but it would just be a handy queue structure, and
> maybe that cost would be offset by removing the SPI overhead),

I can imagine removal of the SPI from the current implementation (and
constructing the plans "manually"), but note that the queries I use in my
patch are no longer that trivial. So the SPI makes sense to me because it
ensures regular query planning.

As for the tuplestore, I'm not sure the startup cost is a problem: if you're
concerned about the 1-row case, the row should usually be stored in memory.

> and once that is done, we could see about step #2.

As I said during my review of your patch last year, I think the RI semantics
has too much in common with that of triggers. I'd need more info to imagine
such a change.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: More efficient RI checks - take 2

From
Corey Huinker
Date:
I can imagine removal of the SPI from the current implementation (and
constructing the plans "manually"), but note that the queries I use in my
patch are no longer that trivial. So the SPI makes sense to me because it
ensures regular query planning.

As an intermediate step, in the case where we have one row, it should be simple enough to extract that row manually, and do an SPI call with fixed values rather than the join to the ephemeral table, yes?
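
Something like this, presumably (a sketch; transient_f is a made-up name for
the ephemeral table):

-- one queued row: probe with fixed values, as the per-row triggers do today
SELECT 1 FROM ONLY p x WHERE i = $1 FOR KEY SHARE OF x;

-- general case: join against the ephemeral table
SELECT 1 FROM p JOIN transient_f t ON p.i = t.i FOR KEY SHARE OF p;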
 
As for the tuplestore, I'm not sure the startup cost is a problem: if you're
concerned about the 1-row case, the row should usually be stored in memory.

 
> and once that is done, we could see about step #2.

As I said during my review of your patch last year, I think the RI semantics
has too much in common with that of triggers. I'd need more info to imagine
such a change.

As a general outline, I think that DML would iterate over the 2 sets of potentially relevant RI definitions rather than iterating over the triggers. 

The similarities between RI and general triggers are obvious, which explains why they went that route initially, but they're also a crutch: all RI operations boil down to either an iteration over a tuplestore doing lookups in an index (when checking for referenced rows), or a hash join of the transient data against the un-indexed table (when checking for referencing rows), and people who know this stuff far better than me seem to think that SPI overhead is best avoided when possible. I'm looking forward to having more time to spend on this.

Re: More efficient RI checks - take 2

From
Alvaro Herrera
Date:
On 2020-Apr-20, Corey Huinker wrote:

> > I can imagine removal of the SPI from the current implementation (and
> > constructing the plans "manually"), but note that the queries I use in my
> > patch are no longer that trivial. So the SPI makes sense to me because it
> > ensures regular query planning.
> 
> As an intermediate step, in the case where we have one row, it should be
> simple enough to extract that row manually, and do an SPI call with fixed
> values rather than the join to the ephemeral table, yes?

I do wonder if the RI stuff would actually end up being faster without
SPI.  If not, we'd only end up writing more code to do the same thing.
Now that tables can be partitioned, it is much more of a pain than when
only regular tables could be supported.  Obviously without SPI you
wouldn't *have* to go through the planner, which might be a win in
itself if the execution tree to use were always perfectly clear ... but
now that the queries get more complex per partitioning and this
optimization, is it?

You could remove the crosscheck_snapshot feature from SPI, I suppose,
but that's not that much code.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: More efficient RI checks - take 2

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I do wonder if the RI stuff would actually end up being faster without
> SPI.  If not, we'd only end up writing more code to do the same thing.
> Now that tables can be partitioned, it is much more of a pain than when
> only regular tables could be supported.  Obviously without SPI you
> wouldn't *have* to go through the planner, which might be a win in
> itself if the execution tree to use were always perfectly clear ... but
> now that the queries get more complex per partitioning and this
> optimization, is it?

AFAIK, we do not have any code besides the planner that is capable of
building a plan tree at all, and I'd be pretty hesitant to try to create
such; those things are complicated.

It'd likely only make sense to bypass the planner if the required work
is predictable enough that you don't need a plan tree at all, but can
just hard-wire what has to be done.  That seems a bit unlikely in the
presence of partitioning.

Instead of a plan tree, you could build a parse tree to pass through the
planner, rather than building a SQL statement string that has to be
parsed.  The code jumps through enough hoops to make sure the string will
be parsed "just so" that this might net out to about an equal amount of
code in ri_triggers.c, and it'd save a nontrivial amount of parsing work.
But you'd have to abandon SPI, probably, or at least it's not clear how
much that'd be doing for you anymore.

            regards, tom lane



Re: More efficient RI checks - take 2

From
Andres Freund
Date:
Hi,

On 2020-04-21 11:34:54 -0400, Alvaro Herrera wrote:
> On 2020-Apr-20, Corey Huinker wrote:
> 
> > > I can imagine removal of the SPI from the current implementation (and
> > > constructing the plans "manually"), but note that the queries I use in my
> > > patch are no longer that trivial. So the SPI makes sense to me because it
> > > ensures regular query planning.
> > 
> > As an intermediate step, in the case where we have one row, it should be
> > simple enough to extract that row manually, and do an SPI call with fixed
> > values rather than the join to the ephemeral table, yes?
> 
> I do wonder if the RI stuff would actually end up being faster without
> SPI.

I would suspect so. How much is another question.

I assume that with constructing plans "manually" you don't mean to
create a plan tree, but to invoke parser/planner directly? I think
that'd likely be better than going through SPI, and there's precedent
too.


But honestly, my gut feeling is that for a lot of cases it'd be best
just bypass parser, planner *and* executor. And just do manual
systable_beginscan() style checks. For most cases we exactly know what
plan shape we expect, and going through the overhead of creating a query
string, parsing, planning, caching the previous steps, and creating an
executor tree for every check is a lot. Even just the amount of memory
for caching the plans can be substantial.

Side note: I for one would appreciate a setting that just made all RI
actions requiring a seqscan error out...
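
For reference, a catalog query along these lines can list the FKs whose
referencing side lacks a covering index, i.e. the ones whose RI actions need
a seqscan (a sketch; the <@ containment test is only an approximation, since
index column order matters):

SELECT conrelid::regclass AS referencing_table, conname
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (SELECT 1 FROM pg_index i
                  WHERE i.indrelid = c.conrelid
                    AND c.conkey <@ i.indkey::int2[]);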


> If not, we'd only end up writing more code to do the same thing.  Now
> that tables can be partitioned, it is much more of a pain than when
> only regular tables could be supported.  Obviously without SPI you
> wouldn't *have* to go through the planner, which might be a win in
> itself if the execution tree to use were always perfectly clear
> ... but now that the queries get more complex per partitioning and
> this optimization, is it?

I think it's actually a good case where we will commonly be able to do
*better* than generic planning. The infrastructure for efficient
partition pruning exists (for COPY etc) - but isn't easily applicable to
generic plans.

Greetings,

Andres Freund



Re: More efficient RI checks - take 2

From
Andres Freund
Date:
Hi,

On 2020-04-21 16:14:53 -0400, Tom Lane wrote:
> AFAIK, we do not have any code besides the planner that is capable of
> building a plan tree at all, and I'd be pretty hesitant to try to create
> such; those things are complicated.

I suspect what was meant was not to create the plan tree directly, but
to bypass SPI when creating the plan / executing the query.


IMO SPI for most uses in core PG really adds more complication and
overhead than warranted.  The whole concept of having a global tuptable,
a stack and xact.c integration to repair that design defficiency... The
hiding of what happens behind a pretty random set of different
abstractions.  That all makes it appear as if SPI did something super
complicated, but it really doesn't. It just is a bad and
over-complicated abstraction layer.

Greetings,

Andres Freund



Re: More efficient RI checks - take 2

From
Alvaro Herrera
Date:
On 2020-Apr-22, Andres Freund wrote:

> I assume that with constructing plans "manually" you don't mean to
> create a plan tree, but to invoke parser/planner directly? I think
> that'd likely be better than going through SPI, and there's precedent
> too.

Well, I was actually thinking in building ready-made execution trees,
bypassing the planner altogether.  But apparently no one thinks that
this is a good idea, and we don't have any code that does that already,
so maybe it's not a great idea.

However:

> But honestly, my gut feeling is that for a lot of cases it'd be best
> just bypass parser, planner *and* executor. And just do manual
> systable_beginscan() style checks. For most cases we exactly know what
> plan shape we expect, and going through the overhead of creating a query
> string, parsing, planning, caching the previous steps, and creating an
> executor tree for every check is a lot. Even just the amount of memory
> for caching the plans can be substantial.

Avoiding the executor altogether scares me, but I can't say exactly why.
For example, you couldn't use foreign tables at either side of the FK --
but we don't allow FKs on those tables and we'd have to use some
specific executor node for such a thing anyway.  So this is not a real
argument against going that route.

> Side note: I for one would appreciate a setting that just made all RI
> actions requiring a seqscan error out...

Hmm, interesting thought.  I guess there are actual cases where it's
not strictly necessary, for example where the referencing table is
really tiny -- not the *referenced* table, note, since you need the
UNIQUE index on that side in any case.  I suppose that's not a really
interesting case.  I don't think this is implementable when going
through SPI.

> I think it's actually a good case where we will commonly be able to do
> *better* than generic planning. The infrastructure for efficient
> partition pruning exists (for COPY etc) - but isn't easily applicable to
> generic plans.

True.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: More efficient RI checks - take 2

From
Robert Haas
Date:
On Wed, Apr 22, 2020 at 1:18 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Well, I was actually thinking in building ready-made execution trees,
> bypassing the planner altogether.  But apparently no one thinks that
> this is a good idea, and we don't have any code that does that already,
> so maybe it's not a great idea.

If it's any consolation, I had the same idea very recently while
chatting with Amit Langote. Maybe it's a bad idea, but you're not the
only one who had it. :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: More efficient RI checks - take 2

From
Andres Freund
Date:
Hi,

On 2020-04-22 13:46:22 -0400, Robert Haas wrote:
> On Wed, Apr 22, 2020 at 1:18 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Well, I was actually thinking in building ready-made execution trees,
> > bypassing the planner altogether.  But apparently no one thinks that
> > this is a good idea, and we don't have any code that does that already,
> > so maybe it's not a great idea.

I was commenting on what I understood Corey to say, but was fairly
unclear about it. But I'm also far from sure that I understood Corey
correctly...


> If it's any consolation, I had the same idea very recently while
> chatting with Amit Langote. Maybe it's a bad idea, but you're not the
> only one who had it. :-)

That seems extremely hard, given our current infrastructure. I think
there's actually a good case to be made for the idea in the abstract,
but ...  The amount of logic the ExecInit* routines have is substantial,
the state they set up is complicated. A lot of nodes have state that is
private to their .c files. All executor nodes reference the
corresponding Plan nodes, so you also need to mock up those.

Greetings,

Andres Freund



Re: More efficient RI checks - take 2

From
Andres Freund
Date:
Hi,

On 2020-04-08 13:55:55 -0400, Corey Huinker wrote:
> In doing my initial attempt, the feedback I was getting was that the people
> who truly understood the RI checks fell into the following groups:
> 1. people who wanted to remove the SPI calls from the triggers
> 2. people who wanted to completely refactor RI to not use triggers
> 3. people who wanted to completely refactor triggers

FWIW, for me these three are largely independent avenues:

WRT 1: There's a lot of benefit in reducing the per-call overhead of
RI. Not going through SPI is one way to do that. Even if RI were not to
use triggers, we'd still want to reduce the per-statement costs.

WRT 2: Not using the generic per-row trigger framework for RI has significant
benefits too - checking multiple rows at once, deduplicating repeated
checks, reducing the per-row storage overhead ...

WRT 3: Fairly obviously improving the generic trigger code (more
efficient fetching of tuple versions, spilling etc) would have benefits
entirely independent of other RI improvements.

Greetings,

Andres Freund



Re: More efficient RI checks - take 2

From
Corey Huinker
Date:
On Wed, Apr 22, 2020 at 2:36 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2020-04-22 13:46:22 -0400, Robert Haas wrote:
> On Wed, Apr 22, 2020 at 1:18 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Well, I was actually thinking in building ready-made execution trees,
> > bypassing the planner altogether.  But apparently no one thinks that
> > this is a good idea, and we don't have any code that does that already,
> > so maybe it's not a great idea.

I was commenting on what I understood Corey to say, but was fairly
unclear about it. But I'm also far from sure that I understood Corey
correctly...

I was unclear because, even after my failed foray into statement level triggers for RI checks, I'm still pretty inexperienced in this area.

I'm just happy that it's being discussed.

Re: More efficient RI checks - take 2

From
Robert Haas
Date:
On Wed, Apr 22, 2020 at 2:36 PM Andres Freund <andres@anarazel.de> wrote:
> > If it's any consolation, I had the same idea very recently while
> > chatting with Amit Langote. Maybe it's a bad idea, but you're not the
> > only one who had it. :-)
>
> That seems extremely hard, given our current infrastructure. I think
> there's actually a good case to be made for the idea in the abstract,
> but ...  The amount of logic the ExecInit* routines have is substantial,
> the state they set up is complicated. A lot of nodes have state that is
> private to their .c files. All executor nodes reference the
> corresponding Plan nodes, so you also need to mock up those.

Right -- the idea I was talking about was to create a Plan tree
without using the main planner. So it wouldn't bother costing an index
scan on each index, and a sequential scan, on the target table - it
would just make an index scan plan, or maybe an index path that it
would then convert to an index plan. Or something like that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: More efficient RI checks - take 2

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Right -- the idea I was talking about was to create a Plan tree
> without using the main planner. So it wouldn't bother costing an index
> scan on each index, and a sequential scan, on the target table - it
> would just make an index scan plan, or maybe an index path that it
> would then convert to an index plan. Or something like that.

Consing up a Path tree and then letting create_plan() make it into
an executable plan might not be a terrible idea.  There's a whole
boatload of finicky details that you could avoid that way, like
everything in setrefs.c.

But it's not entirely clear to me that we know the best plan for a
statement-level RI action with sufficient certainty to go that way.
Is it really the case that the plan would not vary based on how
many tuples there are to check, for example?  If we *do* know
exactly what should happen, I'd tend to lean towards Andres'
idea that we shouldn't be using the executor at all, but just
hard-wiring stuff at the level of "do these table scans".

Also, it's definitely not the case that create_plan() has an API
that's so clean that you would be able to use it without major
hassles.  You'd still have to generate a pile of lookalike planner
data structures, and make sure that expression subtrees have been
fed through eval_const_expressions(), etc etc.

On the whole I still think that generating a Query tree and then
letting the planner do its thing might be the best approach.

            regards, tom lane



Re: More efficient RI checks - take 2

From
Robert Haas
Date:
On Wed, Apr 22, 2020 at 6:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But it's not entirely clear to me that we know the best plan for a
> statement-level RI action with sufficient certainty to go that way.
> Is it really the case that the plan would not vary based on how
> many tuples there are to check, for example?  If we *do* know
> exactly what should happen, I'd tend to lean towards Andres'
> idea that we shouldn't be using the executor at all, but just
> hard-wiring stuff at the level of "do these table scans".

Well, I guess I'd naively think we want an index scan on a plain
table. It is barely possible that in some corner case a sequential
scan would be faster, but could it be enough faster to save the cost
of planning? I doubt it, but I just work here.

On a partitioning hierarchy we want to figure out which partition is
relevant for the value we're trying to find, and then scan that one.

I'm not sure there are any other cases. We have to have a UNIQUE
constraint or we can't be referencing this target table. So it can't
be a plain inheritance hierarchy, nor (I think) a foreign table.

> Also, it's definitely not the case that create_plan() has an API
> that's so clean that you would be able to use it without major
> hassles.  You'd still have to generate a pile of lookalike planner
> data structures, and make sure that expression subtrees have been
> fed through eval_const_expressions(), etc etc.

Yeah, that's annoying.

> On the whole I still think that generating a Query tree and then
> letting the planner do its thing might be the best approach.

Maybe, but it seems awfully heavy-weight. Once you go into the planner
it's pretty hard to avoid considering indexes we don't care about,
bitmap scans we don't care about, a sequential scan we don't care
about, etc.  You'd certainly save something just from avoiding
parsing, but planning's pretty expensive too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: More efficient RI checks - take 2

From
Antonin Houska
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
> > Right -- the idea I was talking about was to create a Plan tree
> > without using the main planner. So it wouldn't bother costing an index
> > scan on each index, and a sequential scan, on the target table - it
> > would just make an index scan plan, or maybe an index path that it
> > would then convert to an index plan. Or something like that.
> 
> Consing up a Path tree and then letting create_plan() make it into
> an executable plan might not be a terrible idea.  There's a whole
> boatload of finicky details that you could avoid that way, like
> everything in setrefs.c.
> 
> But it's not entirely clear to me that we know the best plan for a
> statement-level RI action with sufficient certainty to go that way.
> Is it really the case that the plan would not vary based on how
> many tuples there are to check, for example?

I'm concerned about that too. With my patch the checks become a bit slower if
only a single row is processed. The problem seems to be that the planner is
not entirely convinced about the number of input rows, so it can still
build a plan that expects many rows. For example (as I mentioned elsewhere in
the thread), a hash join where the hash table only contains one tuple. Or
similarly a sort node for a single input tuple.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: More efficient RI checks - take 2

From
Pavel Stehule
Date:


On Thu, Apr 23, 2020 at 7:06 AM Antonin Houska <ah@cybertec.at> wrote:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
> > Right -- the idea I was talking about was to create a Plan tree
> > without using the main planner. So it wouldn't bother costing an index
> > scan on each index, and a sequential scan, on the target table - it
> > would just make an index scan plan, or maybe an index path that it
> > would then convert to an index plan. Or something like that.
>
> Consing up a Path tree and then letting create_plan() make it into
> an executable plan might not be a terrible idea.  There's a whole
> boatload of finicky details that you could avoid that way, like
> everything in setrefs.c.
>
> But it's not entirely clear to me that we know the best plan for a
> statement-level RI action with sufficient certainty to go that way.
> Is it really the case that the plan would not vary based on how
> many tuples there are to check, for example?

I'm concerned about that too. With my patch the checks become a bit slower if
only a single row is processed. The problem seems to be that the planner is
not entirely convinced about the number of input rows, so it can still
build a plan that expects many rows. For example (as I mentioned elsewhere in
the thread), a hash join where the hash table only contains one tuple. Or
similarly a sort node for a single input tuple.

without statistics the planner expects about 2000 rows in a table, no?

Pavel


--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Re: More efficient RI checks - take 2

From
Antonin Houska
Date:
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> On Thu, Apr 23, 2020 at 7:06 AM Antonin Houska <ah@cybertec.at> wrote:
>
>  Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>  > But it's not entirely clear to me that we know the best plan for a
>  > statement-level RI action with sufficient certainty to go that way.
>  > Is it really the case that the plan would not vary based on how
>  > many tuples there are to check, for example?
>
>  I'm concerned about that too. With my patch the checks become a bit slower if
>  only a single row is processed. The problem seems to be that the planner is
>  not entirely convinced about the number of input rows, so it can still
>  build a plan that expects many rows. For example (as I mentioned elsewhere in
>  the thread), a hash join where the hash table only contains one tuple. Or
>  similarly a sort node for a single input tuple.
>
> without statistics the planner expects about 2000 rows in a table, no?

I think that at some point it estimates the number of rows from the number of
table pages, but I don't remember details.

I wanted to say that if we constructed the plan "manually", we'd need at least
two substantially different variants: one to check many rows and the other to
check a single row.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: More efficient RI checks - take 2

From
Pavel Stehule
Date:


On Thu, Apr 23, 2020 at 8:28 AM Antonin Houska <ah@cybertec.at> wrote:
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> On Thu, Apr 23, 2020 at 7:06 AM Antonin Houska <ah@cybertec.at> wrote:
>
>  Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>  > But it's not entirely clear to me that we know the best plan for a
>  > statement-level RI action with sufficient certainty to go that way.
>  > Is it really the case that the plan would not vary based on how
>  > many tuples there are to check, for example?
>
>  I'm concerned about that too. With my patch the checks become a bit slower if
>  only a single row is processed. The problem seems to be that the planner is
>  not entirely convinced about the number of input rows, so it can still
>  build a plan that expects many rows. For example (as I mentioned elsewhere in
>  the thread), a hash join where the hash table only contains one tuple. Or
>  similarly a sort node for a single input tuple.
>
> without statistics the planner expects about 2000 rows in a table, no?

I think that at some point it estimates the number of rows from the number of
table pages, but I don't remember details.

I wanted to say that if we constructed the plan "manually", we'd need at least
two substantially different variants: one to check many rows and the other to
check a single row.

There can be more variants - a hash join may not be good enough for bigger data.

The overhead of RI is too big, so I think any solution that is faster than the current one and can make it into Postgres 14 would be perfect.

But when you know the input is only one row, you can build a query without a join.





--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Re: More efficient RI checks - take 2

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Apr 22, 2020 at 6:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > But it's not entirely clear to me that we know the best plan for a
> > statement-level RI action with sufficient certainty to go that way.
> > Is it really the case that the plan would not vary based on how
> > many tuples there are to check, for example?  If we *do* know
> > exactly what should happen, I'd tend to lean towards Andres'
> > idea that we shouldn't be using the executor at all, but just
> > hard-wiring stuff at the level of "do these table scans".
>
> Well, I guess I'd naively think we want an index scan on a plain
> table. It is barely possible that in some corner case a sequential
> scan would be faster, but could it be enough faster to save the cost
> of planning? I doubt it, but I just work here.
>
> On a partitioning hierarchy we want to figure out which partition is
> relevant for the value we're trying to find, and then scan that one.
>
> I'm not sure there are any other cases. We have to have a UNIQUE
> constraint or we can't be referencing this target table. So it can't
> be a plain inheritance hierarchy, nor (I think) a foreign table.

In the cases where we have a UNIQUE constraint, and therefore a clear
index to use, I tend to agree that we should just be getting to it and
avoiding the planner/executor, as Andres suggest.

I'm not super thrilled about the idea of throwing an ERROR when we
haven't got an index to use though, and we don't require an index on the
referring side, meaning that, with such a change, a DELETE or UPDATE on
the referred table with an ON DELETE/UPDATE CASCADE FK will just start throwing
errors.  That's not terribly friendly, even if it's not really best
practice to not have an index to help with those cases.

I'd hope that we would at least teach pg_upgrade to look for such cases
and throw errors (though maybe that could be downgraded to a WARNING
with a flag..?) if it finds any when upgrading, so that users don't
upgrade and then suddenly start getting errors for simple statements
that used to work just fine.

> > On the whole I still think that generating a Query tree and then
> > letting the planner do its thing might be the best approach.
>
> Maybe, but it seems awfully heavy-weight. Once you go into the planner
> it's pretty hard to avoid considering indexes we don't care about,
> bitmap scans we don't care about, a sequential scan we don't care
> about, etc.  You'd certainly save something just from avoiding
> parsing, but planning's pretty expensive too.

Agreed.

Thanks,

Stephen


Re: More efficient RI checks - take 2

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Apr 22, 2020 at 6:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> But it's not entirely clear to me that we know the best plan for a
>> statement-level RI action with sufficient certainty to go that way.

> Well, I guess I'd naively think we want an index scan on a plain
> table. It is barely possible that in some corner case a sequential
> scan would be faster, but could it be enough faster to save the cost
> of planning? I doubt it, but I just work here.

I think we're failing to communicate here.  I agree that if the goal
is simply to re-implement what the RI triggers currently do --- that
is, retail one-row-at-a-time checks --- then we could probably dispense
with all the parser/planner/executor overhead and directly implement
an indexscan using an API at about the level genam.c provides.
(The issue of whether it's okay to require an index to be available is
annoying, but we could always fall back to the old ways if one is not.)

However, what I thought this thread was about was switching to
statement-level RI checking.  At that point, what we're talking
about is performing a join involving a not-known-in-advance number
of tuples on each side.  If you think you can hard-wire the choice
of join technology and have it work well all the time, I'm going to
say with complete confidence that you are wrong.  The planner spends
huge amounts of effort on that and still doesn't always get it right
... but it does better than a hard-wired choice would do.

Maybe there's room to pursue both things --- you could imagine,
perhaps, looking at the planner's estimate of number of affected
rows at executor startup and deciding from that whether to fire
per-row or per-statement RI triggers.  But we're really going to
want different implementations within those two types of triggers.

            regards, tom lane



Re: More efficient RI checks - take 2

From
Amit Langote
Date:
On Thu, Apr 23, 2020 at 2:18 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2020-Apr-22, Andres Freund wrote:
> > I assume that with constructing plans "manually" you don't mean to
> > create a plan tree, but to invoke parser/planner directly? I think
> > that'd likely be better than going through SPI, and there's precedent
> > too.
>
> Well, I was actually thinking in building ready-made execution trees,
> bypassing the planner altogether.  But apparently no one thinks that
> this is a good idea, and we don't have any code that does that already,
> so maybe it's not a great idea.

We do have an instance in validateForeignKeyConstraint() of "manually"
enforcing RI:

If RI_Initial_Check() (a relatively complex query) cannot be
performed, the referencing table is scanned manually and each tuple
thus found is looked up in the referenced table by using
RI_FKey_check_ins(), a simpler query.

Ironically though, RI_Initial_Check() exists precisely to short-circuit the manual algorithm.
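
For reference, the fast path RI_Initial_Check() takes is a single anti-join
of roughly this shape (paraphrasing ri_triggers.c, using the p/f tables from
upthread; any row returned identifies a violating key):

SELECT fk.i
FROM ONLY f fk LEFT OUTER JOIN ONLY p pk ON pk.i = fk.i
WHERE pk.i IS NULL AND fk.i IS NOT NULL;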

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: More efficient RI checks - take 2

From
Robert Haas
Date:
On Thu, Apr 23, 2020 at 10:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think we're failing to communicate here.  I agree that if the goal
> is simply to re-implement what the RI triggers currently do --- that
> is, retail one-row-at-a-time checks --- then we could probably dispense
> with all the parser/planner/executor overhead and directly implement
> an indexscan using an API at about the level genam.c provides.
> (The issue of whether it's okay to require an index to be available is
> annoying, but we could always fall back to the old ways if one is not.)
>
> However, what I thought this thread was about was switching to
> statement-level RI checking.  At that point, what we're talking
> about is performing a join involving a not-known-in-advance number
> of tuples on each side.  If you think you can hard-wire the choice
> of join technology and have it work well all the time, I'm going to
> say with complete confidence that you are wrong.  The planner spends
> huge amounts of effort on that and still doesn't always get it right
> ... but it does better than a hard-wired choice would do.

Oh, yeah. If we're talking about that, then getting by without using
the planner doesn't seem feasible. Sorry, I guess I didn't read the
thread carefully enough.

As you say, perhaps there's room for both things, but also as you say,
it's not obvious how to decide intelligently between them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: More efficient RI checks - take 2

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Apr 23, 2020 at 10:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I think we're failing to communicate here.  I agree that if the goal
> > is simply to re-implement what the RI triggers currently do --- that
> > is, retail one-row-at-a-time checks --- then we could probably dispense
> > with all the parser/planner/executor overhead and directly implement
> > an indexscan using an API at about the level genam.c provides.
> > (The issue of whether it's okay to require an index to be available is
> > annoying, but we could always fall back to the old ways if one is not.)
> >
> > However, what I thought this thread was about was switching to
> > statement-level RI checking.  At that point, what we're talking
> > about is performing a join involving a not-known-in-advance number
> > of tuples on each side.  If you think you can hard-wire the choice
> > of join technology and have it work well all the time, I'm going to
> > say with complete confidence that you are wrong.  The planner spends
> > huge amounts of effort on that and still doesn't always get it right
> > ... but it does better than a hard-wired choice would do.
>
> Oh, yeah. If we're talking about that, then getting by without using
> the planner doesn't seem feasible. Sorry, I guess I didn't read the
> thread carefully enough.

Yeah, I had been thinking about what we might do with the existing
row-level RI checks too.  If we're able to get statement-level without
much impact on the single-row-statement case then that's certainly
interesting, although it sure feels like we're ending up with a lot left
on the table.

> As you say, perhaps there's room for both things, but also as you say,
> it's not obvious how to decide intelligently between them.

The single-row case seems pretty clear and also seems common enough that
it'd be worth paying the cost to figure out if it's a single-row
statement or not.

Perhaps we start with row-level for the first row, implemented directly
using an index lookup, and when we hit some threshold (maybe even just
"more than one") switch to using the transient table and queue'ing
the rest to check at the end.

What bothers me the most about this approach (though, to be clear, I
think we should still pursue it) is the risk that we might end up
picking a spectacularly bad plan that ends up taking a great deal more
time than the index-probe based approach we almost always have today.
If we limit that impact to only cases where >1 row is involved, then
that's certainly better (though maybe we'll need a GUC for this
anyway..?  If we had the single-row approach + the statement-level one,
presumably the GUC would just make us always take the single-row method,
so it hopefully wouldn't be too grotty to have).

Thanks,

Stephen


Re: More efficient RI checks - take 2

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> As you say, perhaps there's room for both things, but also as you say,
>> it's not obvious how to decide intelligently between them.

> The single-row case seems pretty clear and also seems common enough that
> it'd be worth paying the cost to figure out if it's a single-row
> statement or not.

That seems hard to do in advance ... but it would be easy to code
a statement-level AFTER trigger along the lines of

    if (transition table contains one row)
        // fast special case here
    else
        // slow general case here.
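
Spelled out as a user-level sketch with a transition table (illustrative
only: it ignores the row locking discussed elsewhere in this thread, and the
names are made up):

CREATE FUNCTION f_fk_check() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF (SELECT count(*) FROM new_rows) = 1 THEN
        -- fast special case: a single probe against the PK table
        PERFORM 1 FROM new_rows n JOIN p ON p.i = n.i;
        IF NOT FOUND THEN
            RAISE foreign_key_violation
                USING MESSAGE = 'insert on f violates FK against p';
        END IF;
    ELSE
        -- slow general case: set-oriented anti-join over the whole batch
        PERFORM 1 FROM new_rows n
            WHERE NOT EXISTS (SELECT 1 FROM p WHERE p.i = n.i);
        IF FOUND THEN
            RAISE foreign_key_violation
                USING MESSAGE = 'insert on f violates FK against p';
        END IF;
    END IF;
    RETURN NULL;
END $$;

CREATE TRIGGER f_fk_stmt AFTER INSERT ON f
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT EXECUTE FUNCTION f_fk_check();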

I think the question really comes down to this: is the per-row overhead of
the transition-table mechanism comparable to that of the AFTER trigger
queue?  Or if not, can we make it so?

            regards, tom lane



Re: More efficient RI checks - take 2

From
Andres Freund
Date:
Hi,

On 2020-04-28 10:44:58 -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Robert Haas (robertmhaas@gmail.com) wrote:
> >> As you say, perhaps there's room for both things, but also as you say,
> >> it's not obvious how to decide intelligently between them.
>
> > The single-row case seems pretty clear and also seems common enough that
> > it'd be worth paying the cost to figure out if it's a single-row
> > statement or not.

It's not that obvious to me that it's going to be beneficial to go down
a planned path in all that many cases. If all that the RI check does is
an index_rescan() && index_getnext_slot(), there aren't that many
realistic types of plans that are going to be better.

IIUC a query to check a transition table would, simplified, boil down to
either:

SELECT * FROM transition_table tt
WHERE
    -- for an insertion/update into the referencing table and
    (
        NOT EXISTS (SELECT * FROM referenced_table rt WHERE rt.referenced_column = tt.referencing_column)
        [AND ... , ]
    )
    -- for update / delete of referenced table
    OR EXISTS (SELECT * FROM referencing_table rt1 WHERE rt1.referencing_column = tt.referenced_column1)
    [OR ... , ]
LIMIT 1;

where a returned row would signal an error. But it would need to handle
row locking, CASCADE/SET NULL/SET DEFAULT etc. More on that below.

While it's tempting to write the latter check as

-- for update / delete of referenced table
SELECT * FROM referencing_table rt
WHERE referencing_column IN (SELECT referenced_column FROM transition_table tt)
LIMIT 1;
that'd make it harder to know the violating row.
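
The per-transition-row EXISTS formulation, in contrast, returns the
violating key directly, e.g. (schematic, single-column, locking elided):

SELECT tt.referenced_column
FROM transition_table tt
WHERE EXISTS (SELECT 1 FROM referencing_table rt
              WHERE rt.referencing_column = tt.referenced_column)
LIMIT 1;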


As the transition table isn't ordered, it seems like there aren't that
many realistic ways to execute this:

1) A nestloop semi/anti-join with an inner index scan
2) Sort transition table, do a merge semi/anti-join between sort and an
  ordered index scan on the referenced / referencing table(s).
3) Hash semi/anti-join, requiring a full table scan of the tables


I think 1) would be worse than just doing the indexscan manually. 2)
would probably be beneficial if there are a lot of rows on the inner side,
due to the ordered access and deduplication. 3) would sometimes be
beneficial because it'd avoid an index scan for each tuple in the
transition table.

The cases in which it is clear to me that a bulk check could
theoretically be significantly better than a fast per-row check are:

1) There are a *lot* of rows in the transition table relative to
  comparatively small referenced / referencing tables. As those tables
  can cheaply be hashed, a hashjoin will be more efficient than doing
  an index lookup for each transition table row.
2) There's a lot of duplicate content in the transition
  table, e.g. because there are a lot of references to the same row.

Did I forget important ones?


With regard to the row locking etc. that I elided above: I think that
will actually prevent most if not all interesting optimizations: because
of the FOR KEY SHARE that's required, the plan will pretty much always
degrade to a per-row subplan check anyway.  Is there any formulation of
the necessary queries that doesn't suffer from this problem?
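
For instance, with the locking included, the insert-side check
(schematic, single-column) becomes something like:

SELECT tt.referencing_column
FROM transition_table tt
WHERE NOT EXISTS (SELECT 1 FROM ONLY referenced_table rt
                  WHERE rt.referenced_column = tt.referencing_column
                  FOR KEY SHARE OF rt)
LIMIT 1;

and presumably it's that correlated, locking subquery that the planner
can't flatten into a proper join.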


> That seems hard to do in advance ... but it would be easy to code
> a statement-level AFTER trigger along the lines of
>
>     if (transition table contains one row)
>         // fast special case here
>     else
>         // slow general case here.

I suspect we'd need something more complicated than this for it to be
beneficial. My gut feeling would be that the case where a transition
table style check would be most commonly beneficial is when you have a
very small referenced table, and a *lot* of rows get inserted.  But
clearly we wouldn't want to have bulk insertion suddenly also store all
rows in a transition table.

Nor would we want to have a bulk UPDATE cause all the updated rows to be
stored in the transition table, even though none of the relevant columns
changed (i.e. the RI_FKey_[f]pk_upd_check_required logic in
AfterTriggerSaveEvent()).


I still don't quite see how shunting RI checks through triggers saves us
more than it costs:

Especially for the stuff we do as AFTER: Most of the time we could do
the work we defer till query end immediately, rather than building up an
in-memory queue. Besides saving memory, in a lot of cases that'd also
make it unnecessary to refetch the row at a later time, possibly needing
to chase updated row versions.

But even for the BEFORE checks, largely going through generic trigger
code means it's much harder to batch checks without suddenly requiring
memory proportional to the number of inserted rows.


There obviously are cases where it's not possible to check just after
each row. Deferrable constraints, as well as CASCADE / SET NULL /
SET DEFAULT on tables with user-defined triggers, for example. But it'd
likely be sensible to handle that in the way we already handle deferred
uniqueness checks, i.e. we only queue something if there's a potential
for a problem.


> I think the question really comes down to this: is the per-row overhead of
> the transition-table mechanism comparable to that of the AFTER trigger
> queue?  Or if not, can we make it so?

It's probably more expensive, in some ways, at the moment. The biggest
difference is that the transition table stores complete rows, valid as
of the moment they've been inserted/updated/deleted, whereas the trigger
queue only stores enough information to fetch the tuple again during
trigger execution.  Several RI checks, however, re-check visibility
before executing, so that's another fetch that'd likely not be elided
by a simple change to using transition tables.

Both have significant downsides, obviously. Storing complete rows can
take a lot more memory, and refetching rows is expensive, especially if
it happens much later (with the row pushed out of shared_buffers
potentially).


I think it was a mistake to have these two different systems in
trigger.c. When transition tables were added, we shouldn't have kept
per-tuple state both in the queue and in the transition tuplestore.
Instead we should have used only the tuplestore, and optimized what
information we store in it depending on the needs of the various
after triggers.

Greetings,

Andres Freund



Re: More efficient RI checks - take 2

From
Antonin Houska
Date:
Antonin Houska <ah@cybertec.at> wrote:

> In general, the checks are significantly faster if there are many rows to
> process, and a bit slower when we only need to check a single row.

Attached is a new version that uses the existing simple queries if there's
only one row to check. SPI is used for both single-row and bulk checks - as
discussed in this thread, it can perhaps be replaced with a different approach
if that appears to be beneficial, at least for the single-row checks.

I think using a separate query for the single-row check is more practicable
than convincing the planner that the bulk-check query should only check a
single row. So this patch version tries to show what it'd look like.
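
For reference, the single-row query that master generates (and that this
version keeps for the one-row case) has roughly this shape, as built by
ri_triggers.c:

SELECT 1 FROM ONLY <pktable> x
    WHERE pkatt1 = $1 [AND ...]
    FOR KEY SHARE OF x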

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: More efficient RI checks - take 2

From
Andres Freund
Date:
Hi,

I was looking at this patch with Corey during a patch-review session. So
these are basically our "combined" comments.


On 2020-06-05 17:16:43 +0200, Antonin Houska wrote:
> From 6c1cb8ae7fbf0a8122d8c6637c61b9915bc57223 Mon Sep 17 00:00:00 2001
> From: Antonin Houska <ah@cybertec.at>
> Date: Fri, 5 Jun 2020 16:42:34 +0200
> Subject: [PATCH 1/5] Check for RI violation outside ri_PerformCheck().

Probably good to add a short comment to the commit message explaining
why you're doing this.

The change makes sense to me. Unless somebody protests I think we should
just apply it regardless of the rest of the series - the code seems
clearer afterwards.


> From 6b09e5598553c8e57b4ef9342912f51adb48f8af Mon Sep 17 00:00:00 2001
> From: Antonin Houska <ah@cybertec.at>
> Date: Fri, 5 Jun 2020 16:42:34 +0200
> Subject: [PATCH 2/5] Changed ri_GenerateQual() so it generates the whole
>  qualifier.
> 
> This way we can use the function to reduce the amount of copy&pasted code a
> bit.


>  /*
> - * ri_GenerateQual --- generate a WHERE clause equating two variables
> + * ri_GenerateQual --- generate WHERE/ON clause.
> + *
> + * Note: to avoid unnecessary explicit casts, make sure that the left and
> + * right operands match what eq_oprs expects (i.e. don't swap the left and
> + * right operands accidentally).
> + */
> +static void
> +ri_GenerateQual(StringInfo buf, char *sep, int nkeys,
> +                const char *ltabname, Relation lrel,
> +                const int16 *lattnums,
> +                const char *rtabname, Relation rrel,
> +                const int16 *rattnums,
> +                const Oid *eq_oprs,
> +                GenQualParams params,
> +                Oid *paramtypes)
> +{
> +    for (int i = 0; i < nkeys; i++)
> +    {
> +        Oid            ltype = RIAttType(lrel, lattnums[i]);
> +        Oid            rtype = RIAttType(rrel, rattnums[i]);
> +        Oid            lcoll = RIAttCollation(lrel, lattnums[i]);
> +        Oid            rcoll = RIAttCollation(rrel, rattnums[i]);
> +        char        paramname[16];
> +        char       *latt,
> +                   *ratt;
> +        char       *sep_current = i > 0 ? sep : NULL;
> +
> +        if (params != GQ_PARAMS_NONE)
> +            sprintf(paramname, "$%d", i + 1);
> +
> +        if (params == GQ_PARAMS_LEFT)
> +        {
> +            latt = paramname;
> +            paramtypes[i] = ltype;
> +        }
> +        else
> +            latt = ri_ColNameQuoted(ltabname, RIAttName(lrel, lattnums[i]));
> +
> +        if (params == GQ_PARAMS_RIGHT)
> +        {
> +            ratt = paramname;
> +            paramtypes[i] = rtype;
> +        }
> +        else
> +            ratt = ri_ColNameQuoted(rtabname, RIAttName(rrel, rattnums[i]));


Why do we need support for having params on left or right side, instead
of just having them on one side?


> +        ri_GenerateQualComponent(buf, sep_current, latt, ltype, eq_oprs[i],
> +                                 ratt, rtype);
> +
> +        if (lcoll != rcoll)
> +            ri_GenerateQualCollation(buf, lcoll);
> +    }
> +}
> +
> +/*
> + * ri_GenerateQualComponent --- generate a component of a WHERE/ON clause
> + * equating two variables, to be AND-ed to the other components.
>   *
>   * This basically appends " sep leftop op rightop" to buf, adding casts
>   * and schema qualification as needed to ensure that the parser will select
> @@ -1828,17 +1802,86 @@ quoteRelationName(char *buffer, Relation rel)
>   * if they aren't variables or parameters.
>   */
>  static void
> -ri_GenerateQual(StringInfo buf,
> -                const char *sep,
> -                const char *leftop, Oid leftoptype,
> -                Oid opoid,
> -                const char *rightop, Oid rightoptype)
> +ri_GenerateQualComponent(StringInfo buf,
> +                         const char *sep,
> +                         const char *leftop, Oid leftoptype,
> +                         Oid opoid,
> +                         const char *rightop, Oid rightoptype)
>  {
> -    appendStringInfo(buf, " %s ", sep);
> +    if (sep)
> +        appendStringInfo(buf, " %s ", sep);
>      generate_operator_clause(buf, leftop, leftoptype, opoid,
>                               rightop, rightoptype);
>  }

Why is this handled inside ri_GenerateQualComponent() instead of
ri_GenerateQual()? Especially because the latter now has code to pass in
a different sep into ri_GenerateQualComponent().


> +/*
> + * ri_ColNameQuoted() --- return column name, with both table and column name
> + * quoted.
> + */
> +static char *
> +ri_ColNameQuoted(const char *tabname, const char *attname)
> +{
> +    char        quoted[MAX_QUOTED_NAME_LEN];
> +    StringInfo    result = makeStringInfo();
> +
> +    if (tabname && strlen(tabname) > 0)
> +    {
> +        quoteOneName(quoted, tabname);
> +        appendStringInfo(result, "%s.", quoted);
> +    }
> +
> +    quoteOneName(quoted, attname);
> +    appendStringInfoString(result, quoted);
> +
> +    return result->data;
> +}

Why does this new function accept a NULL / zero length string? I guess
that's because we currently don't qualify in all places?


> +/*
> + * Check that RI trigger function was called in expected context
> + */
> +static void
> +ri_CheckTrigger(FunctionCallInfo fcinfo, const char *funcname, int tgkind)
> +{
> +    TriggerData *trigdata = (TriggerData *) fcinfo->context;
> +
> +    if (!CALLED_AS_TRIGGER(fcinfo))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                 errmsg("function \"%s\" was not called by trigger manager", funcname)));
> +
> +    /*
> +     * Check proper event
> +     */
> +    if (!TRIGGER_FIRED_AFTER(trigdata->tg_event) ||
> +        !TRIGGER_FIRED_FOR_ROW(trigdata->tg_event))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                 errmsg("function \"%s\" must be fired AFTER ROW", funcname)));
> +
> +    switch (tgkind)
> +    {
> +        case RI_TRIGTYPE_INSERT:
> +            if (!TRIGGER_FIRED_BY_INSERT(trigdata->tg_event))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                         errmsg("function \"%s\" must be fired for INSERT", funcname)));
> +            break;
> +        case RI_TRIGTYPE_UPDATE:
> +            if (!TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                         errmsg("function \"%s\" must be fired for UPDATE", funcname)));
> +            break;
> +
> +        case RI_TRIGTYPE_DELETE:
> +            if (!TRIGGER_FIRED_BY_DELETE(trigdata->tg_event))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                         errmsg("function \"%s\" must be fired for DELETE", funcname)));
> +            break;
> +    }
> +}
> +

Why did you move this around, as part of this commit?




> From 208c733d759592402901599446b4f7e7197c1777 Mon Sep 17 00:00:00 2001
> From: Antonin Houska <ah@cybertec.at>
> Date: Fri, 5 Jun 2020 16:42:34 +0200
> Subject: [PATCH 4/5] Introduce infrastructure for batch processing RI events.
> 
> Separate storage is used for the RI trigger events because the "transient
> table" that we provide to statement triggers would not be available for
> deferred constraints. Also, the regular statement level trigger is not ideal
> for the RI checks because it requires the query execution to complete before
> the RI checks even start. On the other hand, if we use batches of row trigger
> events, we only need to tune the batch size so that the user gets the RI
> violation error reasonably soon.
> 
> This patch only introduces the infrastructure; the trigger function is
> still called per event. This is just to reduce the size of the diffs.
> ---
>  src/backend/commands/tablecmds.c    |   68 +-
>  src/backend/commands/trigger.c      |  406 ++++++--
>  src/backend/executor/spi.c          |   16 +-
>  src/backend/utils/adt/ri_triggers.c | 1385 +++++++++++++++++++--------
>  src/include/commands/trigger.h      |   25 +
>  5 files changed, 1381 insertions(+), 519 deletions(-)

My first comment here is that this is too large a change and should be
broken up.

I think there's also not enough explanation in here of what the new
design is. I can infer some of that from the code, but that's imo
shifting work to the reviewer / reader unnecessarily.



> +static void AfterTriggerExecuteRI(EState *estate,
> +                                  ResultRelInfo *relInfo,
> +                                  FmgrInfo *finfo,
> +                                  Instrumentation *instr,
> +                                  TriggerData *trig_last,
> +                                  MemoryContext batch_context);
>  static AfterTriggersTableData *GetAfterTriggersTableData(Oid relid,
>                                                           CmdType cmdType);
>  static void AfterTriggerFreeQuery(AfterTriggersQueryData *qs);
> @@ -3807,13 +3821,16 @@ afterTriggerDeleteHeadEventChunk(AfterTriggersQueryData *qs)
>   *    fmgr lookup cache space at the caller level.  (For triggers fired at
>   *    the end of a query, we can even piggyback on the executor's state.)
>   *
> - *    event: event currently being fired.
> + *    event: event currently being fired. Pass NULL if the current batch of RI
> + *        trigger events should be processed.
>   *    rel: open relation for event.
>   *    trigdesc: working copy of rel's trigger info.
>   *    finfo: array of fmgr lookup cache entries (one per trigger in trigdesc).
>   *    instr: array of EXPLAIN ANALYZE instrumentation nodes (one per trigger),
>   *        or NULL if no instrumentation is wanted.
> + *    trig_last: trigger info used for the last trigger execution.
>   *    per_tuple_context: memory context to call trigger function in.
> + *    batch_context: memory context to store tuples for RI triggers.
>   *    trig_tuple_slot1: scratch slot for tg_trigtuple (foreign tables only)
>   *    trig_tuple_slot2: scratch slot for tg_newtuple (foreign tables only)
>   * ----------
> @@ -3824,39 +3841,55 @@ AfterTriggerExecute(EState *estate,
>                      ResultRelInfo *relInfo,
>                      TriggerDesc *trigdesc,
>                      FmgrInfo *finfo, Instrumentation *instr,
> +                    TriggerData *trig_last,
>                      MemoryContext per_tuple_context,
> +                    MemoryContext batch_context,
>                      TupleTableSlot *trig_tuple_slot1,
>                      TupleTableSlot *trig_tuple_slot2)
>  {
>      Relation    rel = relInfo->ri_RelationDesc;
>      AfterTriggerShared evtshared = GetTriggerSharedData(event);
>      Oid            tgoid = evtshared->ats_tgoid;
> -    TriggerData LocTriggerData = {0};
>      HeapTuple    rettuple;
> -    int            tgindx;
>      bool        should_free_trig = false;
>      bool        should_free_new = false;
> +    bool        is_new = false;
>  
> -    /*
> -     * Locate trigger in trigdesc.
> -     */
> -    for (tgindx = 0; tgindx < trigdesc->numtriggers; tgindx++)
> +    if (trig_last->tg_trigger == NULL)
>      {
> -        if (trigdesc->triggers[tgindx].tgoid == tgoid)
> +        int            tgindx;
> +
> +        /*
> +         * Locate trigger in trigdesc.
> +         */
> +        for (tgindx = 0; tgindx < trigdesc->numtriggers; tgindx++)
>          {
> -            LocTriggerData.tg_trigger = &(trigdesc->triggers[tgindx]);
> -            break;
> +            if (trigdesc->triggers[tgindx].tgoid == tgoid)
> +            {
> +                trig_last->tg_trigger = &(trigdesc->triggers[tgindx]);
> +                trig_last->tgindx = tgindx;
> +                break;
> +            }
>          }
> +        if (trig_last->tg_trigger == NULL)
> +            elog(ERROR, "could not find trigger %u", tgoid);
> +
> +        if (RI_FKey_trigger_type(trig_last->tg_trigger->tgfoid) !=
> +            RI_TRIGGER_NONE)
> +            trig_last->is_ri_trigger = true;
> +
> +        is_new = true;
>      }
> -    if (LocTriggerData.tg_trigger == NULL)
> -        elog(ERROR, "could not find trigger %u", tgoid);
> +
> +    /* trig_last for non-RI trigger should always be initialized again. */
> +    Assert(trig_last->is_ri_trigger || is_new);
>  
>      /*
>       * If doing EXPLAIN ANALYZE, start charging time to this trigger. We want
>       * to include time spent re-fetching tuples in the trigger cost.
>       */
> -    if (instr)
> -        InstrStartNode(instr + tgindx);
> +    if (instr && !trig_last->is_ri_trigger)
> +        InstrStartNode(instr + trig_last->tgindx);

I'm pretty unhappy about the amount of new infrastructure this adds to
trigger.c. We're now going to have a third copy of the tuples (for a
time). trigger.c is already a pretty complicated / archaic piece of
infrastructure, and this patchset seems to make that even worse. We'll
grow yet another separate representation of tuples, there are a lot of
new branches (less concerned about the runtime costs, more about the
code complexity), etc.



> +/* ----------
> + * Construct the query to check inserted/updated rows of the FK table.
> + *
> + * If "insert" is true, the rows are inserted, otherwise they are updated.
> + *
> + * The query string built is
> + *    SELECT t.fkatt1 [, ...]
> + *        FROM <tgtable> t LEFT JOIN LATERAL
> + *            (SELECT t.fkatt1 [, ...]
> + *               FROM [ONLY] <pktable> p
> + *                 WHERE t.fkatt1 = p.pkatt1 [AND ...]
> + *                 FOR KEY SHARE OF p) AS m
> + *             ON t.fkatt1 = m.fkatt1 [AND ...]
> + *        WHERE m.fkatt1 ISNULL
> + *        LIMIT 1
> + *

Why do we need the lateral query here?


Greetings,

Andres Freund



Re: More efficient RI checks - take 2

From
Justin Pryzby
Date:
On Fri, Jun 05, 2020 at 05:16:43PM +0200, Antonin Houska wrote:
> Antonin Houska <ah@cybertec.at> wrote:
> 
> > In general, the checks are significantly faster if there are many rows to
> > process, and a bit slower when we only need to check a single row.
> 
> Attached is a new version that uses the existing simple queries if there's
> only one row to check. SPI is used for both single-row and bulk checks - as
> discussed in this thread, it can perhaps be replaced with a different approach
> if that appears to be beneficial, at least for the single-row checks.
> 
> I think using a separate query for the single-row check is more practicable
> than convincing the planner that the bulk-check query should only check a
> single row. So this patch version tries to show what it'd look like.

I'm interested in testing this patch; however, there are a lot of
internals to digest.

Are there any documentation updates or regression tests to add?  If FKs
support "bulk" validation, users should know when that applies, and be able to
check that it's working as intended.  Even if the test cases are overly verbose
or not stable, and not intended for commit, that would be a useful temporary
addition.

I think that calls=4 indicates this is using bulk validation.

postgres=# begin; explain(analyze, timing off, costs off, summary off, verbose) DELETE FROM t WHERE i<999; rollback;
BEGIN
                              QUERY PLAN                               
-----------------------------------------------------------------------
 Delete on public.t (actual rows=0 loops=1)
   ->  Index Scan using t_pkey on public.t (actual rows=998 loops=1)
         Output: ctid
         Index Cond: (t.i < 999)
 Trigger RI_ConstraintTrigger_a_16399 for constraint t_i_fkey: calls=4
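
The schema for t isn't shown here; a guessed reproduction, with a
self-referencing FK so that a single table exercises both sides of the
constraint, would be:

-- guessed schema, not from the thread
CREATE TABLE t (i int PRIMARY KEY REFERENCES t);
INSERT INTO t SELECT g FROM generate_series(1, 10000) g;

BEGIN;
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, VERBOSE)
DELETE FROM t WHERE i < 999;
ROLLBACK;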

I started thinking about this 1+ years ago, wondering whether a BRIN
index could be used for (bulk) FK validation.

So I would like to be able to see the *plan* for the query.

I was able to show the plan and see that BRIN can be used like so (with
auto_explain already loaded, e.g. via LOAD 'auto_explain'):

SET auto_explain.log_nested_statements = on;
SET client_min_messages = debug;
SET auto_explain.log_min_duration = 0;

Should the plan be visible in explain (not auto-explain)?

BTW did you see this older thread?
https://www.postgresql.org/message-id/flat/CA%2BU5nMLM1DaHBC6JXtUMfcG6f7FgV5mPSpufO7GRnbFKkF2f7g%40mail.gmail.com

-- 
Justin



Re: More efficient RI checks - take 2

From
Michael Paquier
Date:
On Sat, Sep 26, 2020 at 09:59:17PM -0500, Justin Pryzby wrote:
> I'm interested in testing this patch, however there's a lot of internals to
> digest.

Even with that, the thread has been waiting on author for a couple of
weeks now, so I have marked the entry as RwF.
--
Michael


Re: More efficient RI checks - take 2

From
Antonin Houska
Date:
Justin Pryzby <pryzby@telsasoft.com> wrote:

> I'm interested in testing this patch, however there's a lot of internals to
> digest.
>
> Are there any documentation updates or regression tests to add?

I'm not sure if user documentation should be changed unless a new GUC or
statistics information is added. As for regression tests, perhaps in the next
version of the patch. But right now I don't know how to implement the feature
in a less invasive way (see the complaint by Andres in [1]), nor do I have
enough time to work on the patch.

> If FKs support "bulk" validation, users should know when that applies, and
> be able to check that it's working as intended.  Even if the test cases are
> overly verbose or not stable, and not intended for commit, that would be a
> useful temporary addition.
>
> I think that calls=4 indicates this is using bulk validation.
>
> postgres=# begin; explain(analyze, timing off, costs off, summary off, verbose) DELETE FROM t WHERE i<999; rollback;
> BEGIN
>                               QUERY PLAN
> -----------------------------------------------------------------------
>  Delete on public.t (actual rows=0 loops=1)
>    ->  Index Scan using t_pkey on public.t (actual rows=998 loops=1)
>          Output: ctid
>          Index Cond: (t.i < 999)
>  Trigger RI_ConstraintTrigger_a_16399 for constraint t_i_fkey: calls=4

> I started thinking about this 1+ years ago wondering if a BRIN index could be
> used for (bulk) FK validation.
>
> So I would like to be able to see the *plan* for the query.

> I was able to show the plan and see that BRIN can be used like so:
> |SET auto_explain.log_nested_statements=on; SET client_min_messages=debug; SET auto_explain.log_min_duration=0;
> Should the plan be visible in explain (not auto-explain) ?

For development purposes, I think I could get the plan this way:

SET debug_print_plan TO on;
SET client_min_messages TO debug;

(The plan is cached, so I think the query will only be displayed during
the first execution in the session.)

Do you think that the documentation should advise the user to create a
BRIN index on the FK table?
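
With hypothetical names, that would be just:

-- BRIN index on the referencing column (hypothetical fk_tbl/fk_col)
CREATE INDEX fk_tbl_fk_col_brin ON fk_tbl USING brin (fk_col);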

> BTW did you see this older thread ?
> https://www.postgresql.org/message-id/flat/CA%2BU5nMLM1DaHBC6JXtUMfcG6f7FgV5mPSpufO7GRnbFKkF2f7g%40mail.gmail.com

Not yet. Thanks.

[1] https://www.postgresql.org/message-id/20200630011729.mr25bmmbvsattxe2%40alap3.anarazel.de

--
Antonin Houska
Web: https://www.cybertec-postgresql.com