BUG #16256: Checking deferred FK constraints when updating PK is ordrs of magnitude slower than when updating FK - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #16256: Checking deferred FK constraints when updating PK is ordrs of magnitude slower than when updating FK
Date
Msg-id 16256-2cdabbe938e57aff@postgresql.org
Whole thread Raw
Responses Re: BUG #16256: Checking deferred FK constraints when updating PK isordrs of magnitude slower than when updating FK
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16256
Logged by:          Valentin Kovalenko
Email address:      valentin.male.kovalenko@gmail.com
PostgreSQL version: 12.1
Operating system:   Ubuntu 18.04.2 LTS
Description:

Environment:
OS:
  Ubuntu 18.04.2 LTS
PostgreSQL: 
  select version(); returns 'PostgreSQL 12.1 (Ubuntu 12.1-1.pgdg18.04+1) on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0,
64-bit'.
  PG is setup as specified at
https://www.postgresql.org/download/linux/ubuntu/.

Setup:
1) Install PG, start PG, start psql (the session is in auto-commit mode).
2) Run the following commands to create three functions which simplify
reproducing of the problem:

drop function if exists init_test;
create function init_test(in pk_holder_size bigint, in fk_holder_size
bigint) returns void
language plpgsql
as $$
  begin
    perform setseed(0);
    <<pk_holder>>begin
      drop table if exists pk_holder cascade;
      create unlogged table pk_holder (pk bigint primary key);
      insert into pk_holder select * from generate_series(0, pk_holder_size
- 1);
      analyze pk_holder;
    end;
    <<fk_holder>>begin
      drop table if exists fk_holder;
      create unlogged table fk_holder (
        fk bigint not null,
        serial bigint unique not null,
        constraint fk_holder_fk_fkey foreign key (fk) references pk_holder
(pk) deferrable initially immediate);
      insert into fk_holder (fk, serial) select floor(random() *
pk_holder_size), gs.v from generate_series(0, fk_holder_size - 1) as gs
(v);
      analyze fk_holder;
    end;
  end;
$$;

drop function if exists update_pk;
create function update_pk(in size bigint) returns void
language plpgsql
as $$
  declare
    min_free_pk bigint;
  begin
    select max(pk) + 1 into min_free_pk from pk_holder;
    set constraints fk_holder_fk_fkey deferred;
    --Simply doing update pk_holder set pk = pk where pk < size; does not do
the trick
    update pk_holder set pk = pk + min_free_pk where pk < size;
    update pk_holder set pk = pk - min_free_pk where pk >= min_free_pk;
  end;
$$;

drop function if exists update_fk;
create function update_fk(in size bigint, in deferred boolean) returns
void
language plpgsql
as $$
  declare
    max_pk bigint;
  begin
    if deferred then set constraints fk_holder_fk_fkey deferred; end if;
    update fk_holder set fk = min + floor(random() * (max - min + 1))
    from (select min(pk) as min, max(pk) as max from pk_holder) as
boundaries
    where serial < size;
  end;
$$;

Reproduce:
1) Run "select init_test(1000000, 1000000);" to init the test state (it also
analyzes the created and populated tables). This step creates two unlogged
tables pk_holder and fk_holder with a single FK constraint checking that
fk_holder.fk values are among pk_holder.pk values.
2) Run "\timing" to enable timing.
3.1) Run "begin; select update_pk(2000);" to defer the FK constraint and
update 2000 rows in the pk_holder by changing pk_holder.pk values back and
forth so that at the end their values stay the same, but the rows are
updated and FK constraints will be checked on commit (simply assigning
pk_holder.pk = pk_holder.pk is not enough for this test).
3.2) Run "commit;" to initiate validation of the deferred FK constraint for
the rows updated in the previous step. Take a note on how long it takes to
be executed as reported by psql (about 2 minutes for me).
4.1) Run "begin; select update_fk(2000, true);" to defer the FK constraint
and update 2000 rows in the fk_holder by randomly changing fk_holder.fk
values.
4.2) Run "commit;" to initiate validation of the deferred FK constraint for
the rows updated in the previous step. Take a note on how long it takes to
be executed as reported by psql (about 30 milliseconds for me).

Facts:
In the proposed experiment committing a transaction that updates PK values
on 2000 rows in the pk_holder table takes 2 minutes / 30 milliseconds = 4000
times more time than committing a transaction that updated FK values in 2000
rows in the fk_holder table. Both transactions use the same deferred FK
constraint, both tables are unlogged.

Problem:
Latency of the commit command that validates deferred FK constraints as a
result of primary key values being updated is so large that it makes the
functionality virtually unusable. If the performance in such scenarios
cannot be improved, then the documentation should at least explicitly state
that such scenarios are discouraged as the implementation is not supposed to
cope with them; otherwise it would be great to implement the validation in a
way that allows using such scenarios in practice.


pgsql-bugs by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: ERROR: subtransaction logged without previous top-level txn record
Next
From: "David G. Johnston"
Date:
Subject: Re: BUG #16256: Checking deferred FK constraints when updating PK isordrs of magnitude slower than when updating FK