simplifying foreign key/RI checks - Mailing list pgsql-hackers

From Amit Langote
Subject simplifying foreign key/RI checks
Date
Msg-id CA+HiwqGkfJfYdeq5vHPh6eqPKjSbfpDDY+j-kXYFePQedtSLeg@mail.gmail.com
Whole thread Raw
Responses Re: simplifying foreign key/RI checks
Re: simplifying foreign key/RI checks
List pgsql-hackers
While discussing the topic of foreign key performance off-list with
Robert and Corey (also came up briefly on the list recently [1], [2]),
a few ideas were thrown around to simplify our current system of RI
checks to enforce foreign keys with the aim of reducing some of its
overheads.  The two main aspects of  how we do these checks that
seemingly cause the most overhead are:

* Using row-level triggers that are fired during the modification of
the referencing and the referenced relations to perform them

* Using plain SQL queries issued over SPI

There is a discussion nearby titled "More efficient RI checks - take
2" [2] to address this problem from the viewpoint that it is using
row-level triggers that causes the most overhead, although there are
some posts mentioning that SQL-over-SPI is not without blame here.  I
decided to focus on the latter aspect and tried reimplementing some
checks such that SPI can be skipped altogether.

I started with the check that's performed when inserting into or
updating the referencing table to confirm that the new row points to a
valid row in the referenced relation.  The corresponding SQL is this:

SELECT 1 FROM pk_rel x WHERE x.pkey = $1 FOR KEY SHARE OF x

$1 is the value of the foreign key of the new row.  If the query
returns a row, all good.  Thanks to SPI, or its use of plan caching,
the query is re-planned only a handful of times before making a
generic plan that is then saved and reused, which looks like this:

              QUERY PLAN
--------------------------------------
 LockRows
   ->  Index Scan using pk_pkey on pk x
         Index Cond: (a = $1)
(3 rows)

So in most cases, the trigger's function would only execute the plan
that's already there, at least in a given session.  That's good but
what we realized would be even better is if we didn't have to
"execute" a full-fledged "plan" for this, that is, to simply find out
whether a row containing the key we're looking for exists in the
referenced relation and if found lock it.  Directly scanning the index
and locking it directly with table_tuple_lock() like ExecLockRows()
does gives us exactly that behavior, which seems simple enough to be
done in a not-so-long local function in ri_trigger.c.  I gave that a
try and came up with the attached.  It also takes care of the case
where the referenced relation is partitioned in which case its
appropriate leaf partition's index is scanned.

The patch results in ~2x improvement in the performance of inserts and
updates on referencing tables:

create table p (a numeric primary key);
insert into p select generate_series(1, 1000000);
create table f (a bigint references p);

-- unpatched
insert into f select generate_series(1, 2000000, 2);
INSERT 0 1000000
Time: 6340.733 ms (00:06.341)

update f set a = a + 1;
UPDATE 1000000
Time: 7490.906 ms (00:07.491)

-- patched
insert into f select generate_series(1, 2000000, 2);
INSERT 0 1000000
Time: 3340.808 ms (00:03.341)

update f set a = a + 1;
UPDATE 1000000
Time: 4178.171 ms (00:04.178)

The improvement is even more dramatic when the referenced table (that
we're no longer querying over SPI) is partitioned.  Here are the
numbers when the PK relation has 1000 hash partitions.

Unpatched:

insert into f select generate_series(1, 2000000, 2);
INSERT 0 1000000
Time: 35898.783 ms (00:35.899)

update f set a = a + 1;
UPDATE 1000000
Time: 37736.294 ms (00:37.736)

Patched:

insert into f select generate_series(1, 2000000, 2);
INSERT 0 1000000
Time: 5633.377 ms (00:05.633)

update f set a = a + 1;
UPDATE 1000000
Time: 6345.029 ms (00:06.345)

That's over ~5x improvement!

While the above case seemed straightforward enough for skipping SPI,
it seems a bit hard to do the same for other cases where we query the
*referencing* relation during an operation on the referenced table
(for example, checking if the row being deleted is still referenced),
because the plan in those cases is not predictably an index scan.
Also, the filters in those queries are more than likely to not match
the partition key of a partitioned referencing relation, so all
partitions will have to scanned.  I have left those cases as future
work.

The patch seems simple enough to consider for inclusion in v14 unless
of course we stumble into some dealbreaker(s).  I will add this to
March CF.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CADkLM%3DcTt_8Fg1Jtij5j%2BQEBOxz9Cuu4DiMDYOwdtktDAKzuLw%40mail.gmail.com

[2] https://www.postgresql.org/message-id/1813.1586363881%40antos

Attachment

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: ResourceOwner refactoring
Next
From: Heikki Linnakangas
Date:
Subject: Re: Yet another fast GiST build