Re: More efficient RI checks - take 2 - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: More efficient RI checks - take 2
Date
Msg-id CAFj8pRAGtHrJ4YKUpC5YNWrLyyTsxjfucDp5f7tx6CS7po1xDA@mail.gmail.com
Whole thread Raw
In response to More efficient RI checks - take 2  (Antonin Houska <ah@cybertec.at>)
Responses Re: More efficient RI checks - take 2
Re: More efficient RI checks - take 2
List pgsql-hackers


st 8. 4. 2020 v 18:36 odesílatel Antonin Houska <ah@cybertec.at> napsal:
After having reviewed [1] more than a year ago (the problem I found was that
the transient table is not available for deferred constraints), I've tried to
implement the same in an alternative way. The RI triggers still work as row
level triggers, but if multiple events of the same kind appear in the queue,
they are all passed to the trigger function at once. Thus the check query does
not have to be executed that frequently.

Some performance comparisons are below. (Besides the execution time, please
note the difference in the number of trigger function executions.) In general,
the checks are significantly faster if there are many rows to process, and a
bit slower when we only need to check a single row. However I'm not sure about
the accuracy if only a single row is measured (if a single row check is
performed several times, the execution time appears to fluctuate).

It is hard task to choose good strategy for immediate constraints, but for deferred constraints you know how much rows should be checked, and then you can choose better strategy.

Is possible to use estimation for choosing method of RI checks?



Comments are welcome.

Setup
=====

CREATE TABLE p(i int primary key);
INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
CREATE TABLE f(i int REFERENCES p);


Insert many rows into the FK table
==================================

master:

EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..163.84 rows=16384 width=4) (actual time=32.741..32.741 rows=0 loops=1)
   ->  Function Scan on generate_series g  (cost=0.00..163.84 rows=16384 width=4) (actual time=2.403..4.802 rows=16384 loops=1)
 Planning Time: 0.050 ms
 Trigger for constraint f_i_fkey: time=448.986 calls=16384
 Execution Time: 485.444 ms
(5 rows)

patched:

EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..163.84 rows=16384 width=4) (actual time=34.053..34.053 rows=0 loops=1)
   ->  Function Scan on generate_series g  (cost=0.00..163.84 rows=16384 width=4) (actual time=2.223..4.448 rows=16384 loops=1)
 Planning Time: 0.047 ms
 Trigger for constraint f_i_fkey: time=105.164 calls=8
 Execution Time: 141.201 ms


Insert a single row into the FK table
=====================================

master:

EXPLAIN ANALYZE INSERT INTO f VALUES (1);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..0.01 rows=1 width=4) (actual time=0.060..0.060 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
 Planning Time: 0.026 ms
 Trigger for constraint f_i_fkey: time=0.435 calls=1
 Execution Time: 0.517 ms
(5 rows)

patched:

EXPLAIN ANALYZE INSERT INTO f VALUES (1);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Insert on f  (cost=0.00..0.01 rows=1 width=4) (actual time=0.066..0.066 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
 Planning Time: 0.025 ms
 Trigger for constraint f_i_fkey: time=0.578 calls=1
 Execution Time: 0.670 ms


Check if FK row exists during deletion from the PK
==================================================

master:

DELETE FROM p WHERE i=16384;
ERROR:  update or delete on table "p" violates foreign key constraint "f_i_fkey" on table "f"
DETAIL:  Key (i)=(16384) is still referenced from table "f".
Time: 3.381 ms

patched:

DELETE FROM p WHERE i=16384;
ERROR:  update or delete on table "p" violates foreign key constraint "f_i_fkey" on table "f"
DETAIL:  Key (i)=(16384) is still referenced from table "f".
Time: 5.561 ms


Cascaded DELETE --- many PK rows
================================

DROP TABLE f;
CREATE TABLE f(i int REFERENCES p ON UPDATE CASCADE ON DELETE CASCADE);
INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);

master:

EXPLAIN ANALYZE DELETE FROM p;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Delete on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=38.334..38.334 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=0.019..3.925 rows=16384 loops=1)
 Planning Time: 0.049 ms
 Trigger for constraint f_i_fkey: time=31348.756 calls=16384
 Execution Time: 31390.784 ms

patched:

EXPLAIN ANALYZE DELETE FROM p;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Delete on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=33.360..33.360 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..236.84 rows=16384 width=6) (actual time=0.012..3.183 rows=16384 loops=1)
 Planning Time: 0.094 ms
 Trigger for constraint f_i_fkey: time=9.580 calls=8
 Execution Time: 43.941 ms


Cascaded DELETE --- a single PK row
===================================

INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);

master:

DELETE FROM p WHERE i=16384;
DELETE 1
Time: 5.754 ms

patched:

DELETE FROM p WHERE i=16384;
DELETE 1
Time: 8.098 ms


Cascaded UPDATE - many rows
===========================

master:

EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Update on p  (cost=0.00..277.80 rows=16384 width=10) (actual time=166.954..166.954 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..277.80 rows=16384 width=10) (actual time=0.013..7.780 rows=16384 loops=1)
 Planning Time: 0.177 ms
 Trigger for constraint f_i_fkey on p: time=60405.362 calls=16384
 Trigger for constraint f_i_fkey on f: time=455.874 calls=16384
 Execution Time: 61036.996 ms

patched:

EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Update on p  (cost=0.00..277.77 rows=16382 width=10) (actual time=159.512..159.512 rows=0 loops=1)
   ->  Seq Scan on p  (cost=0.00..277.77 rows=16382 width=10) (actual time=0.014..7.783 rows=16382 loops=1)
 Planning Time: 0.146 ms
 Trigger for constraint f_i_fkey on p: time=169.628 calls=9
 Trigger for constraint f_i_fkey on f: time=124.079 calls=2
 Execution Time: 456.072 ms


Cascaded UPDATE - a single row
==============================

master:

UPDATE p SET i = i - 16384 WHERE i=32767;
UPDATE 1
Time: 4.858 ms

patched:

UPDATE p SET i = i - 16384 WHERE i=32767;
UPDATE 1
Time: 11.955 ms


[1] https://commitfest.postgresql.org/22/1975/

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: where should I stick that backup?
Next
From: Tom Lane
Date:
Subject: Re: A problem about partitionwise join