Re: More efficient RI checks - take 2 - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: More efficient RI checks - take 2
Date
Msg-id CADkLM=fOUrqTTB2etCG2v4MuVdXDKa4vnO+FoFdj_Lj=+svEgg@mail.gmail.com
In response to Re: More efficient RI checks - take 2  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
On Wed, Apr 8, 2020 at 1:06 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


On Wed, Apr 8, 2020 at 18:36, Antonin Houska <ah@cybertec.at> wrote:
After having reviewed [1] more than a year ago (the problem I found was that
the transient table is not available for deferred constraints), I've tried to
implement the same in an alternative way. The RI triggers still work as row
level triggers, but if multiple events of the same kind appear in the queue,
they are all passed to the trigger function at once. Thus the check query does
not have to be executed that frequently.
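
As a rough illustration of the batching described above (not the patch itself, which is C code inside PostgreSQL), here is a minimal Python sketch. The event shape and the names `make_checker`, `run_per_row`, and `run_batched` are all hypothetical and invented for this example; the only point is that consecutive queued events of the same kind can share one check call.

```python
from itertools import groupby
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical event shape: (event kind, foreign-key value to verify).
Event = Tuple[str, int]
Check = Callable[[str, List[int]], None]


def make_checker(parent_keys: Set[int]) -> Check:
    """Return a check that fails if any key is missing from the parent set."""
    def check(kind: str, keys: List[int]) -> None:
        missing = [k for k in keys if k not in parent_keys]
        if missing:
            raise ValueError(f"{kind}: keys not present in parent: {missing}")
    return check


def run_per_row(events: Iterable[Event], check: Check) -> int:
    """One check call per queued event; returns the number of calls made."""
    calls = 0
    for kind, key in events:
        check(kind, [key])
        calls += 1
    return calls


def run_batched(events: Iterable[Event], check: Check) -> int:
    """One check call per run of consecutive events of the same kind."""
    calls = 0
    for kind, group in groupby(events, key=lambda e: e[0]):
        check(kind, [key for _, key in group])
        calls += 1
    return calls
```

With 1000 queued events of a single kind, `run_per_row` makes 1000 check calls while `run_batched` makes one, which mirrors the drop in trigger function executions mentioned in the message.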

I'm excited that you picked this up!
 

Some performance comparisons are below. (Besides the execution time, please
note the difference in the number of trigger function executions.) In general,
the checks are significantly faster if there are many rows to process, and a
bit slower when we only need to check a single row. However I'm not sure about
the accuracy if only a single row is measured (if a single row check is
performed several times, the execution time appears to fluctuate).

These numbers are very promising, and much more in line with my initial expectations. Obviously the impact on single-row DML is of major concern, though.

Choosing a good strategy for immediate constraints is hard, but for deferred constraints you know how many rows have to be checked, so you can choose a better strategy.

Is it possible to use an estimate to choose the method of RI checks?
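
To make the question concrete, here is a minimal sketch of choosing a check method from a row-count estimate. Everything in it is an assumption for illustration: the `CostModel` class, its three constants, and `choose_strategy` are hypothetical, and the numbers are placeholders, not measurements from this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    # All three constants are made-up placeholders, not measured values.
    per_row_check: float = 1.0   # cost of one single-row check
    batch_setup: float = 5.0     # fixed cost of setting up a batched check
    batch_per_row: float = 0.1   # incremental cost per row inside a batch

    def per_row_cost(self, n_rows: int) -> float:
        return n_rows * self.per_row_check

    def batched_cost(self, n_rows: int) -> float:
        return self.batch_setup + n_rows * self.batch_per_row


def choose_strategy(n_rows: int, model: CostModel = CostModel()) -> str:
    """Pick the cheaper method for the given (estimated or known) row count."""
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    if model.batched_cost(n_rows) < model.per_row_cost(n_rows):
        return "batched"
    return "per_row"
```

Under these placeholder constants a single row stays on the per-row path and large row counts switch to the batched path. For deferred constraints `n_rows` would be the known queue length; for immediate constraints it could only be an estimate.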

During my initial attempt, the feedback I got was that the people who truly understood the RI checks fell into the following groups:
1. people who wanted to remove the SPI calls from the triggers
2. people who wanted to completely refactor RI to not use triggers
3. people who wanted to completely refactor triggers

While #3 is clearly beyond the scope of an endeavor like this, #1 seems like it would nearly eliminate the 1-row penalty. We'd still have the TupleStore initialization penalty, but it would just be a handy queue structure, and maybe that cost would be offset by removing the SPI overhead. Once that is done, we could see about #2.
 
