Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY - Mailing list pgsql-hackers

From Amit Langote
Subject Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY
Date
Msg-id CA+HiwqEeuQz_dDM8Sa2MtSxDssSyCV7UYQaGakdRcdBhBc=Q4g@mail.gmail.com
Whole thread Raw
In response to Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Mon, Apr 12, 2021 at 4:42 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Mon, Apr 12, 2021 at 6:20 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > On 2021-Mar-31, Tom Lane wrote:
> >
> > > diff -U3
/home/buildfarm/trilobite/buildroot/HEAD/pgsql.build/src/test/isolation/expected/detach-partition-concurrently-4.out
/home/buildfarm/trilobite/buildroot/HEAD/pgsql.build/src/test/isolation/output_iso/results/detach-partition-concurrently-4.out
> > > ---
/home/buildfarm/trilobite/buildroot/HEAD/pgsql.build/src/test/isolation/expected/detach-partition-concurrently-4.out
2021-03-29 20:14:21.258199311 +0200 
> > > +++
/home/buildfarm/trilobite/buildroot/HEAD/pgsql.build/src/test/isolation/output_iso/results/detach-partition-concurrently-4.out
  2021-03-30 18:54:34.272839428 +0200 
> > > @@ -324,6 +324,7 @@
> > >  1
> > >  2
> > >  step s1insert: insert into d4_fk values (1);
> > > +ERROR:  insert or update on table "d4_fk" violates foreign key constraint "d4_fk_a_fkey"
> > >  step s1c: commit;
> > >
> > >  starting permutation: s2snitch s1b s1s s2detach s1cancel s3vacfreeze s1s s1insert s1c
> >
> > Hmm, actually, looking at this closely, I think the expected output is
> > bogus and trilobite is doing the right thing by throwing this error
> > here.  The real question is why isn't this case behaving in that way in
> > every *other* animal.
>
> Indeed.
>
> I can see a wrong behavior of RelationGetPartitionDesc() in a case
> that resembles the above test case.
>
> drop table if exists d4_primary, d4_primary1, d4_fk, d4_pid;
> create table d4_primary (a int primary key) partition by list (a);
> create table d4_primary1 partition of d4_primary for values in (1);
> create table d4_primary2 partition of d4_primary for values in (2);
> insert into d4_primary values (1);
> insert into d4_primary values (2);
> create table d4_fk (a int references d4_primary);
> insert into d4_fk values (2);
> create table d4_pid (pid int);
>
> Session 1:
> begin isolation level serializable;
> select * from d4_primary;
>  a
> ---
>  1
>  2
> (2 rows)
>
> Session 2:
> alter table d4_primary detach partition d4_primary1 concurrently;
> <waits>
>
> Session 1:
> -- can see d4_primary1 as detached at this point, though still scans!
> select * from d4_primary;
>  a
> ---
>  1
>  2
> (2 rows)
> insert into d4_fk values (1);
> INSERT 0 1
> end;
>
> Session 2:
> <alter-finishes>
> ALTER TABLE
>
> Session 1:
> -- FK violated
> select * from d4_primary;
>  a
> ---
>  2
> (1 row)
> select * from d4_fk;
>  a
> ---
>  1
> (1 row)
>
> The 2nd select that session 1 performs adds d4_primary1, whose detach
> it *sees* is pending, to the PartitionDesc, but without setting its
> includes_detached.  The subsequent insert's RI query, because it uses
> that PartitionDesc as-is, returns 1 as being present in d4_primary,
> leading to the insert succeeding.  When session 1's transaction
> commits, the waiting ALTER proceeds with committing the 2nd part of
> the DETACH, without having a chance again to check if the DETACH would
> lead to the foreign key of d4_fk being violated.
>
> It seems problematic to me that the logic of setting includes_detached
> is oblivious of the special check in find_inheritance_children() to
> decide whether "force"-include a detach-pending child based on
> cross-checking its pg_inherit tuple's xmin against the active
> snapshot.  Maybe fixing that would help, although I haven't tried that
> myself yet.

I tried that in the attached.  It is indeed the above failing
isolation test whose output needed to be adjusted.

While at it, I tried rewording the comment around that special
visibility check done to force-include detached partitions, as I got
confused by the way it's worded currently.  Actually, it may be a good
idea to add some comments around the intended include_detached
behavior in the places where PartitionDesc is used; e.g.
set_relation_partition_info() lacks a one-liner on why it's okay for
the planner to not see detached partitions.  Or perhaps, a comment for
includes_detached of PartitionDesc should describe the various cases
in which it is true and the cases in which it is not.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Luc Vlaming
Date:
Subject: interaction between csps with dummy tlists and set_customscan_references
Next
From: Fujii Masao
Date:
Subject: Re: TRUNCATE on foreign table