Thread: partitioned tables referenced by FKs

partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Here's a patch to allow partitioned tables to be referenced by foreign
keys.  Current state is WIP, but everything should work; see below for
the expected exception.

The design is very simple: have one pg_constraint row for each partition
on each side, each row pointing to the topmost table on the other side;
triggers appear on each leaf partition (and naturally they don't appear
on any intermediate partitioned table).

There are tests that should cover all the basic features.  pg_upgrade
tests work (containing relevant tables, as regress/foreign_key.sql
leaves them behind for this reason: partitioned-references-partitioned).

There is one requisite feature still missing from this patch: when a
partition on the referenced side is detached or dropped, we must ensure
no referencing row remains on the other side.  For this, I have an
(unmerged and thus unsubmitted here) new ri_triggers.c routine
RI_Inverted_Initial_Check (need to come up with better name, heh) that
verifies this, invoked at DETACH/DROP time.

Also, some general code cleanup and documentation patching is needed.

I'm posting this now so that I can take my hands off it for a few days;
will return to post an updated version at some point before next
commitfest.  I wanted to have this ready for this commitfest, but RL
dictated otherwise.  This patch took a *very long time* to write ... I
wrote three different recursion models for this.


One thing I realized while writing this, is that my commit
3de241dba86f ("Foreign keys on partitioned tables") put function
CloneForeignKeyConstraints() in catalog/pg_constraint.c that should
really have been in tablecmds.c.  In this patch I produced some siblings
of that function still in pg_constraint.c, but I intend to move the
whole lot to tablecmds.c before commit.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Oh, I forgot to mention one thing.  When creating a constraint, an index
OID is normally given.  I'm not sure what is this for.  In the patch
it's a little convoluted to get the correct index OID, so I'm just
passing InvalidOid.  Things work nonetheless.  I wonder if we shouldn't
just do away with the index OID.  There are several /*FIXME*/ notes in
the code due to this.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Corey Huinker
Date:
On Fri, Nov 2, 2018 at 7:42 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Here's a patch to allow partitioned tables to be referenced by foreign
keys.  Current state is WIP, but everything should work; see below for
the expected exception.

The design is very simple: have one pg_constraint row for each partition
on each side, each row pointing to the topmost table on the other side;
triggers appear on each leaf partition (and naturally they don't appear
on any intermediate partitioned table).

This is an important and much needed feature!

Based on my extremely naive reading of this code, I have two perhaps equally naive questions:

1. it seems that we will continue to to per-row RI checks for inserts and updates. However, there already exists a bulk check in RI_Initial_Check(). Could we modify this bulk check to do RI checks on a per-statement basis rather than a per-row basis?

2. If #1 is possible, is the overhead of transitions tables too great for the single-row case?

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2018-Nov-05, Corey Huinker wrote:

> This is an important and much needed feature!

I agree :-)

> Based on my extremely naive reading of this code, I have two perhaps
> equally naive questions:
> 
> 1. it seems that we will continue to to per-row RI checks for inserts and
> updates. However, there already exists a bulk check in RI_Initial_Check().
> Could we modify this bulk check to do RI checks on a per-statement basis
> rather than a per-row basis?

One of the goals when implementing trigger transition tables was to
supplant the current per-row implementation of RI triggers with
per-statement.  I haven't done that, but AFAIK it remains possible :-)

Changing that is definitely not a goal of this patch.

> 2. If #1 is possible, is the overhead of transitions tables too great for
> the single-row case?

Without an implementation, I can't say, but if I had to guess, I would
assume so.  Or maybe there are clever optimizations for that particular
case.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Corey Huinker
Date:

> 1. it seems that we will continue to to per-row RI checks for inserts and
> updates. However, there already exists a bulk check in RI_Initial_Check().
> Could we modify this bulk check to do RI checks on a per-statement basis
> rather than a per-row basis?

One of the goals when implementing trigger transition tables was to
supplant the current per-row implementation of RI triggers with
per-statement.  I haven't done that, but AFAIK it remains possible :-)

Changing that is definitely not a goal of this patch.

Then I may try to tackle it myself in a separate thread.

Without an implementation, I can't say, but if I had to guess, I would
assume so.  Or maybe there are clever optimizations for that particular
case.

But in this case there is no actual defined trigger, it's internal code making an SPI call...is there an indicator that tells us whether this change was multi-row or not?

 

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Here's a more credible version of this patch series.

0001 refactors some duplicative code that interprets a pg_constraint row
for a foreign key back into a Constraint node.  Only what is necessary
for current features is read.

0002 moves some code that I added in 3de241dba86f to
src/backend/catalog/pg_constraint.c so that it appears in tablecmds.c
instead.  After developing the current patch I realized that the latter
is its natural home.

0003 changes the shape of a loop in deleteObjectsInList, so that I can
add object type-specific functions to be called before deleting an
object.  This is needed by 0004 and makes no sense on its own; I would
probably push both together, but splitting it like this makes it very
obvious what it is I'm doing.

0004 adds the feature itself

In 0004 there's also a new function "index_get_partition" which allows
to simplify some code I added in 8b08f7d4820f.  I will probably split it
up and commit separately because it's an obvious code beautify.

0005 modifies psql to show the constraint in a different way, which
makes more sense overall (rather than show the "internal" constraint
that concerns the partition, show the top-level contraint that concerns
the ancestor table on which it is defined.  This includes naming the
table in which the constraint is defined).

There's one half of 0005 that I think should be applied to pg11, because
when partitions have foreign keys, the display looks a bit odd
currently.  The other half I would probably merge into 0004 instead of
committing separately.  Even if not backpatched, it makes sense to apply
ahead of the rest because it changes expected output for a regression
test.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi,

On 11/30/18 2:12 PM, Alvaro Herrera wrote:
> Here's a more credible version of this patch series.
> 

The patch series applies with hunks, and passes check-world.

No comments for 0001, 0002, 0003 and 0005.

While I'm still looking at 0004 - should we have a test that updates one 
of the constraints of fk to another partition in pk ?

I'll didn't try this yet with [1], so sending you a flamegraph of 
INSERTs off-list, since data loading takes a while during an OLTP based 
work load.

Thanks for working on this !

[1] https://commitfest.postgresql.org/21/1897/

Best regards,
  Jesper


Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi,

On 1/7/19 1:07 PM, Jesper Pedersen wrote:
> While I'm still looking at 0004 - should we have a test that updates one 
> of the constraints of fk to another partition in pk ?
> 

In addition:

* Document pre_drop_class_check()
* I think index_get_partition() would be more visible in partition.c
* Remove code in #if 0 block

I have tested this with a hash based setup.

Thanks again !

Best regards,
  Jesper


Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi Alvaro,

On 2018/12/01 4:12, Alvaro Herrera wrote:
> Here's a more credible version of this patch series.

Are you going to post rebased patches soon?

Thanks,
Amit



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Hello,

On 2019-Jan-21, Amit Langote wrote:

> On 2018/12/01 4:12, Alvaro Herrera wrote:
> > Here's a more credible version of this patch series.
> 
> Are you going to post rebased patches soon?

Yes, very soon -- right now, in fact :-)

Two preliminary patches in this series are already pushed, per a nearby
bugfix.  I also split out the new index_get_partition routine to
catalog/partition.c, per comment from Jesper, and put it on its own
patch.  0003 is the interesting bits in this submission.

Note that there is a change in constraint naming on partitions.  This
affects some preexisting test output ... and I'm not liking the changes
much, anymore.  I'll have a look about how to revert to the previous
behavior.

As you noticed in the other thread, the part of the FK clone routine
that attaches to an existing constraint needed to be refactored into its
own routine.  I did that, though the split is different from what you
were proposing.

Jesper also mentioned removing the "#if 0" code.  Actually what I need
to be doing is reinstating that check in the cases where it's possible.
I haven't done that yet.

I also dropped the part that changed how psql reports constraints in \d,
since there's a separate commitfest entry for that one.

Hmm, I just noticed that there's an ereport that fails i18n by way of
using a ternary operator in the first argument of errmsg.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Michael Paquier
Date:
On Tue, Jan 22, 2019 at 06:45:48PM -0300, Alvaro Herrera wrote:
> Yes, very soon -- right now, in fact :-)

This needs a rebase.  Moved to next CF, waiting on author.
--
Michael

Attachment

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Here's another rebase.  The code is structured somewhat differently from
the previous one, mostly because of the backpatched bugfixes, but also
because of the changes in dependency handling.

I think there's also at least one more bugfix to backpatch (included in
0003 here), related to whether foreign tables are allowed as having FKs,
but AFAICS it just cosmetic: you get an error if you have a foreign
partition and add a FK, but it says "you cannot have constraint
triggers" rather than "FKs are not supported", so it's a bit odd.
0003 also includes Amit L's patch to remove the parentConstraint
argument from ATExecAddForeignKey, which I suppose I should commit
separately.

Odd bugs fixed: a) handle the case where the default partition is the
only one and it's being detached or dropped.  b) when a partition was
dropped/detached from the referenced side, the child constraint was left
in place.

I have not yet fixed the "#if 0" section.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi Alvaro,

I looked at the latest patch and most of the issues/bugs that I was going
to report based on the late January version of the patch seem to have been
taken care of; sorry that I couldn't post sooner which would've saved you
some time.   The patch needs to be rebased on top of ff11e7f4b9 which
touched ri_triggers.c.

In the following case:

create table pk (a int primary key) partition by list (a);
create table pk1 (a int primary key);
create table fk (a int references pk1);

-- adds FK referencing pk1 via ATAddForeignKeyConstraint recursion
alter table pk attach partition pk1 for values in (1);
alter table fk add foreign key (a) references pk;

or:

-- adds FK referencing pk1 via CloneForeignKeyConstraints
alter table fk add foreign key (a) references pk;
alter table pk attach partition pk1 for values in (1);

\d fk
                 Table "public.fk"
 Column │  Type   │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
 a      │ integer │           │          │
Foreign-key constraints:
    "fk_a_fkey" FOREIGN KEY (a) REFERENCES pk1(a)
    "fk_a_fkey1" FOREIGN KEY (a) REFERENCES pk(a)
    "fk_a_fkey2" FOREIGN KEY (a) REFERENCES pk1(a)

Could we have avoided creating fk_a_fkey2 which must be same as fk_a_fkey
as far as the user-visible behavior is concerned?


* Regarding 0003

I'd like to hear your thoughts on some suggestions to alter the structure
of the reorganized code around foreign key addition/cloning.  With this
patch adding support for foreign keys to reference partitioned tables, the
code now has to consider various cases due to the possibility of having
partitioned tables on both sides of a foreign key, which is reflected in
the complexity of the new code.  Following functions have been introduced
in the foreign key DDL code in the course of adding support for partitioning:

addFkRecurseReferenced
addFkRecurseReferencing
CloneForeignKeyConstraints
CloneFkReferencing
CloneFkReferenced
tryAttachPartitionForeignKey

We have addFkRecurseReferencing and CloneForeignKeyConstraints calling
CloneFkReferencing to recursively add a foreign key constraint being
defined on (or being cloned to) a partitioned table to its partitions.
The latter invokes tryAttachPartitionForeignKey to see if an existing
foreign key constraint is functionally same as one being cloned to avoid
creating a duplicate constraint.

However, we have CloneFkReferenced() calling addFkRecurseReferenced, not
the other way around.  Neither of these functions attempts to reuse an
existing, functionally equivalent, foreign key constraint referencing a
given partition that's being recursed to.  So we get the behavior in the
above example, which again, I'm not sure is intentional.

Also, it seems a bit confusing that there is a CreateConstraintEntry call
in addFkRecurseReferenced() which is creating a constraint on the
*referencing* relation, which I think should be in
ATAddForeignKeyConstraint, the caller.  I know we need to create a copy of
the constraint to reference the partitions of the referenced table, but we
might as well put it in CloneFkReferenced and reverse who calls who --
make addFkRecurseReferenced() call CloneFkReferenced and have the code to
create the cloned constraint and action triggers in the latter.  That will
make the code to handle different sides of foreign key look similar, and
imho, easier to follow.


+/*
+ * addFkRecurseReferenced
+ *      recursive subroutine for ATAddForeignKeyConstraint, referenced side

How about:

Subroutine for ATAddForeignKeyConstraint to add action trigger on
referenced relation, recursing if partitioned, in which case, both the
constraint referencing a given partition and the action trigger are cloned
in all partitions

+/*
+ * addFkRecurseReferenced
+ *      recursive subroutine for ATAddForeignKeyConstraint, referenced side

How about:

Subroutine for ATAddForeignKeyConstraint to add check trigger on
referencing relation, recursing if partitioned, in which case, both the
constraint and the check trigger are cloned in all partitions

I noticed however that this function itself is not recursively called;
CloneFkReferencing handles the recursion.

/*
 * CloneForeignKeyConstraints
 *      Clone foreign keys from a partitioned table to a newly acquired
 *      partition.

Maybe:

Clone foreign key of (or referencing) a partitioned table to a newly
acquired partition


* In 0002,

 /*
+ * Return the OID of the index, in the given partition, that is a child
of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)

Maybe add a comma between "...given index" and "or InvalidOid", or maybe
rewrite the sentence as:

Return the OID of index of the given partition that is a child of the
given index, or InvalidOid if there isn't one.


* Unrelated to this thread, but while reviewing, I noticed this code in
ATExecAttachPartitionIdx:

    /* no deadlock risk: RangeVarGetRelidExtended already acquired the lock */
    partIdx = relation_open(partIdxId, AccessExclusiveLock);

    /* we already hold locks on both tables, so this is safe: */
    parentTbl = relation_open(parentIdx->rd_index->indrelid, AccessShareLock);
    partTbl = relation_open(partIdx->rd_index->indrelid, NoLock);


I wonder why not NoLock in *all* of these relation_open calls?  As
the comment says, RangeVarGetRelidExtended already got the lock for
'partIdxId'.  Then there must already be a lock on 'parentTbl' as it's the
target relation of the ALTER INDEX command (actually the target index's
table).

Thanks,
Amit



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Feb-28, Amit Langote wrote:

> Hi Alvaro,
> 
> I looked at the latest patch and most of the issues/bugs that I was going
> to report based on the late January version of the patch seem to have been
> taken care of; sorry that I couldn't post sooner which would've saved you
> some time.   The patch needs to be rebased on top of ff11e7f4b9 which
> touched ri_triggers.c.

Rebased to current master.  I'll reply later to handle the other issues
you point out.

Given the comments on 0002 in this thread and elsewhere, I'm inclined to
push that one today with minor tweaks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi Alvaro,

On 2/28/19 1:28 PM, Alvaro Herrera wrote:
> Rebased to current master.  I'll reply later to handle the other issues
> you point out.
> 

Applying with hunks.

With 0003 using

export CFLAGS="-DCOPY_PARSE_PLAN_TREES -O0 -fno-omit-frame-pointer" && 
CC="ccache gcc" ./configure --prefix /usr/local/packages/postgresql-12.x 
--enable-dtrace --with-openssl --with-gssapi --with-libxml --with-llvm 
--enable-debug --enable-depend --enable-tap-tests --enable-cassert

I'm getting a failure in the pg_upgrade test:

  --
+-- Name: pk5 pk5_pkey; Type: CONSTRAINT; Schema: regress_fk; Owner: 
jpedersen
+--
+
+ALTER TABLE ONLY regress_fk.pk5
+    ADD CONSTRAINT pk5_pkey PRIMARY KEY (a);
+
+
+--

when running check-world.

Best regards,
  Jesper


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Feb-28, Amit Langote wrote:

Hello

> In the following case:
> 
> create table pk (a int primary key) partition by list (a);
> create table pk1 (a int primary key);
> create table fk (a int references pk1);
> 
> -- adds FK referencing pk1 via ATAddForeignKeyConstraint recursion
> alter table pk attach partition pk1 for values in (1);
> alter table fk add foreign key (a) references pk;
> 
> or:
> 
> -- adds FK referencing pk1 via CloneForeignKeyConstraints
> alter table fk add foreign key (a) references pk;
> alter table pk attach partition pk1 for values in (1);
> 
> \d fk
>                  Table "public.fk"
>  Column │  Type   │ Collation │ Nullable │ Default
> ────────┼─────────┼───────────┼──────────┼─────────
>  a      │ integer │           │          │
> Foreign-key constraints:
>     "fk_a_fkey" FOREIGN KEY (a) REFERENCES pk1(a)
>     "fk_a_fkey1" FOREIGN KEY (a) REFERENCES pk(a)
>     "fk_a_fkey2" FOREIGN KEY (a) REFERENCES pk1(a)
> 
> Could we have avoided creating fk_a_fkey2 which must be same as fk_a_fkey
> as far as the user-visible behavior is concerned?

So I wrote the code that does as you describe, and it worked
beautifully.

Once I was finished, fixed bugs and tested it, I realized that that was
a stupid thing to have done -- because THOSE ARE DIFFERENT CONSTRAINTS.
When you say "fk (a) references pk1" you're saying that all the values
in fk(a) must appear in pk1.  OTOH when you say "fk references pk" you
mean that the values might appear anywhere in pk, not just pk1.  Have a
look at the triggers in pg_trigger that appear when you do "references
pk1" vs. when you do "references pk".  The constraint itself might
appear identical, but the former adds check triggers that are not
present for the latter.

The main problem here is the way the constraints are presented to the
user in psql (fk_a_fkey2 ought not to appear at all, because it's an
implementation detail for fk_a_fkey), which as you know I have another
patch for.

> * Regarding 0003
> 
> I'd like to hear your thoughts on some suggestions to alter the structure
> of the reorganized code around foreign key addition/cloning.  With this
> patch adding support for foreign keys to reference partitioned tables, the
> code now has to consider various cases due to the possibility of having
> partitioned tables on both sides of a foreign key, which is reflected in
> the complexity of the new code.  Following functions have been introduced
> in the foreign key DDL code in the course of adding support for partitioning:
> 
> addFkRecurseReferenced
> addFkRecurseReferencing
> CloneForeignKeyConstraints
> CloneFkReferencing
> CloneFkReferenced
> tryAttachPartitionForeignKey

I can understand how the lack of consistency in how we handle recursion
can be confusing.  I'll see if I can move the recursion so that it's
handled similarly for both cases, but I'm not convinced that it can.

Note there's a single call of tryAttachPartitionForeignKey().  I
invented that function because in the original code pattern there were
two callers, but after some backpatched code restructuring to fix a bug,
I changed that and now that function is no longer necessary.  I might
put the code back into CloneFkReferencing.

> The latter invokes tryAttachPartitionForeignKey to see if an existing
> foreign key constraint is functionally same as one being cloned to avoid
> creating a duplicate constraint.

Do note that "functionally the same" does not mean the same thing when
applied to the referencing side than when applied to the referenced
side.

> However, we have CloneFkReferenced() calling addFkRecurseReferenced, not
> the other way around.  Neither of these functions attempts to reuse an
> existing, functionally equivalent, foreign key constraint referencing a
> given partition that's being recursed to.

You'll have to take back the assertion that those constraints are
funtionally equivalent, because they are not.

> Also, it seems a bit confusing that there is a CreateConstraintEntry call
> in addFkRecurseReferenced() which is creating a constraint on the
> *referencing* relation, which I think should be in
> ATAddForeignKeyConstraint, the caller.  I know we need to create a copy of
> the constraint to reference the partitions of the referenced table, but we
> might as well put it in CloneFkReferenced and reverse who calls who --
> make addFkRecurseReferenced() call CloneFkReferenced and have the code to
> create the cloned constraint and action triggers in the latter.  That will
> make the code to handle different sides of foreign key look similar, and
> imho, easier to follow.

I'm looking into this code restructuring you suggest.  I'm not 100% sold
on it yet ... it took me quite a while to arrive to a working
arrangement, although I admit the backpatched bugfixes dislodged things.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Robert Haas
Date:
On Thu, Mar 14, 2019 at 1:40 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Once I was finished, fixed bugs and tested it, I realized that that was
> a stupid thing to have done -- because THOSE ARE DIFFERENT CONSTRAINTS.

This made me laugh.

> When you say "fk (a) references pk1" you're saying that all the values
> in fk(a) must appear in pk1.  OTOH when you say "fk references pk" you
> mean that the values might appear anywhere in pk, not just pk1.  Have a
> look at the triggers in pg_trigger that appear when you do "references
> pk1" vs. when you do "references pk".  The constraint itself might
> appear identical, but the former adds check triggers that are not
> present for the latter.

It's probably not uncommon to have FKs between compatibly-partitioned
tables, though, and in that case they are really equivalent.  For
example, if you have an orders table and an order_lines table and the
latter has an FK pointing at the former, and both are partitioned on
the order number, then it must be that every order_line references an
order in the matching partition.  Even if it's not practical to use
the FK itself, it's a good excuse for skipping any validation scan you
otherwise might have performed.

But there are other cases in which that logic doesn't hold.  I can't
quite wrap my brain around the exact criteria at the moment.  I think
it's something like: the partitioning columns of the referring table
are a prefix of the foreign key columns, and the opfamilies match, and
likewise for the referenced table.  And the partition bounds match up
well enough.  And maybe some other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-14, Robert Haas wrote:

> On Thu, Mar 14, 2019 at 1:40 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> > When you say "fk (a) references pk1" you're saying that all the values
> > in fk(a) must appear in pk1.  OTOH when you say "fk references pk" you
> > mean that the values might appear anywhere in pk, not just pk1.  Have a
> > look at the triggers in pg_trigger that appear when you do "references
> > pk1" vs. when you do "references pk".  The constraint itself might
> > appear identical, but the former adds check triggers that are not
> > present for the latter.
> 
> It's probably not uncommon to have FKs between compatibly-partitioned
> tables, though, and in that case they are really equivalent.  For
> example, if you have an orders table and an order_lines table and the
> latter has an FK pointing at the former, and both are partitioned on
> the order number, then it must be that every order_line references an
> order in the matching partition.  Even if it's not practical to use
> the FK itself, it's a good excuse for skipping any validation scan you
> otherwise might have performed.

Well, I suppose that can be implemented as an optimization on top of
what we have, but I think that we should first get this feature right,
and later we can see about improving it.  In any case, since the RI
queries are run via SPI, any unnecessary partitions should get pruned by
partition pruning based on each partition's constraint.  So I'm not so
certain that this is a problem worth spending much time/effort/risk of
bugs on.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Robert Haas
Date:
On Thu, Mar 14, 2019 at 3:36 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Well, I suppose that can be implemented as an optimization on top of
> what we have, but I think that we should first get this feature right,
> and later we can see about improving it.

Sure, not arguing with that.

> In any case, since the RI
> queries are run via SPI, any unnecessary partitions should get pruned by
> partition pruning based on each partition's constraint.  So I'm not so
> certain that this is a problem worth spending much time/effort/risk of
> bugs on.

I doubt that partition pruning would just automatically DTRT in a case
like this, but if I'm wrong, great!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi,

On 2019/03/15 2:31, Alvaro Herrera wrote:
> Once I was finished, fixed bugs and tested it, I realized that that was
> a stupid thing to have done -- because THOSE ARE DIFFERENT CONSTRAINTS.
> When you say "fk (a) references pk1" you're saying that all the values
> in fk(a) must appear in pk1.  OTOH when you say "fk references pk" you
> mean that the values might appear anywhere in pk, not just pk1.

Sure, then just drop the check trigger that queries only pk1, in favor of
one that checks pk, that's already in place due to the parent constraint.
Actually, if one doesn't drop it (that is, by way of dropping the
constraint that created it), they won't be able to insert into fk anything
but the subset of rows that pk1 allows as a partition; see this:

create table pk1 (a int primary key);
insert into pk1 values (1);
create table pk (a int primary key) partition by list (a);
create table fk (a int, foreign key (a) references pk1, foreign key (a)
references pk);

insert into fk values (1);
ERROR:  insert or update on table "fk" violates foreign key constraint
"fk_a_fkey1"
DETAIL:  Key (a)=(1) is not present in table "pk".

alter table pk attach partition pk1 for values in (1);
insert into fk values (1);

create table pk2 partition of pk for values in (2);
insert into pk values (2);

insert into fk values (2);
ERROR:  insert or update on table "fk" violates foreign key constraint
"fk_a_fkey"
DETAIL:  Key (a)=(2) is not present in table "pk1".

You can no longer insert (2) into pk1 though.

Now it's true that a user can drop the constraint directly referencing pk1
themselves to make the above error go away, but it would've been nice if
they didn't have to do it.  That would be if it was internally converted
into a child constraint when attaching pk1 to pk, as I'm suggesting.

> Have a
> look at the triggers in pg_trigger that appear when you do "references
> pk1" vs. when you do "references pk".  The constraint itself might
> appear identical, but the former adds check triggers that are not
> present for the latter.

Let's consider both triggers.

1. The action trigger on the referenced table does the same thing no
matter whether it's a partition or not, which is this:

select 1 from fk x where $1 = a for key share of x;

Note that the trigger is invoked both when the referenced table is
directly mentioned in the query (delete from pk1) and when it's indirectly
mentioned via its parent table (delete from pk).  Duplication of triggers
in the latter case causes the same set of rows to be added twice for
checking in the respective triggers' contexts, which seems pointless.

2. The check trigger on the referencing table checks exactly the primary
key table that's mentioned in the foreign key constraint.  So, when the
foreign key constraint is "referencing pk1", the check query is:

select 1 from pk1 x where a = $1 for key share of x;

whereas when the foreign key constraint is "referencing pk", it's:

select 1 from pk x where a = $1 for key share of x;

The latter scans appropriate partition of pk, which may be pk1, so we can
say the first check trigger is redundant, and in fact, even annoying as
shown above.


So, when attaching pk1 to pk (where both are referenced by fk using
exactly the same columns and same other constraint attributes), we *can*
reuse the foreign key on fk referencing pk1, as the child constraint of
foreign key on fk referencing pk, whereby, we:

1. don't need to create a new action trigger on pk1, because the existing
one is enough,

2. can drop the check trigger on fk that checks pk1, because the one that
checks pk will be enough

As you know, we've done similar stuff in 123cc697a8eb + cb90de1aac18 for
the case where the partition is on the referencing side.  In that case, we
can reuse the check trigger as it checks the same table in all partitions
and drop the redundant action trigger.

Thanks,
Amit



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Hi Amit

On 2019-Mar-18, Amit Langote wrote:

> On 2019/03/15 2:31, Alvaro Herrera wrote:
> > Once I was finished, fixed bugs and tested it, I realized that that was
> > a stupid thing to have done -- because THOSE ARE DIFFERENT CONSTRAINTS.
> > When you say "fk (a) references pk1" you're saying that all the values
> > in fk(a) must appear in pk1.  OTOH when you say "fk references pk" you
> > mean that the values might appear anywhere in pk, not just pk1.
> 
> Sure, then just drop the check trigger that queries only pk1, in favor of
> one that checks pk, that's already in place due to the parent constraint.
> Actually, if one doesn't drop it (that is, by way of dropping the
> constraint that created it), they won't be able to insert into fk anything
> but the subset of rows that pk1 allows as a partition;

That's true, and it is the correct behavior.  What they should be doing
(to prevent the insertion into the referencing relation from failing all
the time) is drop the constraint, as you suggest.

Converting the constraint so that it refers to a completely different
set of permitted values is second-guessing the user about what they
wanted to do; what if we get it wrong, and they really wanted the FK to
reference just exactly the subset of values that they specified, and not
a much larger superset of that?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Hi Jesper

On 2019-Mar-01, Jesper Pedersen wrote:

> I'm getting a failure in the pg_upgrade test:
> 
>  --
> +-- Name: pk5 pk5_pkey; Type: CONSTRAINT; Schema: regress_fk; Owner:
> jpedersen
> +--
> +
> +ALTER TABLE ONLY regress_fk.pk5
> +    ADD CONSTRAINT pk5_pkey PRIMARY KEY (a);
> +
> +
> +--
> 
> when running check-world.

So I tested this case just now (after fixing a couple of bugs) and see a
slightly different problem now: those excess lines are actually just
that pg_dump chose to print the constraints in different order, as there
are identical lines with "-" in the diff I get.  The databases actually
*are* identical as far as I can tell.  I haven't yet figured out how to
fix this; of course, the easiest solution is to just drop the regress_fk
schema at the end of the foreign_key.sql regression test, but I would
prefer a different solution.

I'm going to post the new version of the patch separately.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Feb-28, Amit Langote wrote:

> I'd like to hear your thoughts on some suggestions to alter the structure
> of the reorganized code around foreign key addition/cloning.  With this
> patch adding support for foreign keys to reference partitioned tables, the
> code now has to consider various cases due to the possibility of having
> partitioned tables on both sides of a foreign key, which is reflected in
> the complexity of the new code.

I spent a few hours studying this and my conclusion is the opposite of
yours: we should make addFkRecurseReferencing the recursive one, and
CloneFkReferencing a non-recursive caller of that.  So we end up with
both addFkRecurseReferenced and addFkRecurseReferencing as recursive
routines, and CloneFkReferenced and CloneFkReferencing being
non-recursive callers of those.  With this structure change there is one
more call to CreateConstraintEntry than before, and now there are two
calls of tryAttachPartitionForeignKey instead of one; I think with this
new structure things are much simpler.  I also changed
CloneForeignKeyConstraints's API: instead of returning a list of cloned
constraint giving its caller the responsibility of adding FK checks to
phase 3, we now give CloneForeignKeyConstraints the 'wqueue' list, so
that it can add the FK checks itself.  It seems much cleaner this way.

> Also, it seems a bit confusing that there is a CreateConstraintEntry call
> in addFkRecurseReferenced() which is creating a constraint on the
> *referencing* relation, which I think should be in
> ATAddForeignKeyConstraint, the caller.  I know we need to create a copy of
> the constraint to reference the partitions of the referenced table, but we
> might as well put it in CloneFkReferenced and reverse who calls who --
> make addFkRecurseReferenced() call CloneFkReferenced and have the code to
> create the cloned constraint and action triggers in the latter.  That will
> make the code to handle different sides of foreign key look similar, and
> imho, easier to follow.

Well, if you think about it, *all* the constraints created by all these
routines are in the referencing relations.  The only question here is
*why* we create those tuples; in the case of the ...Referenced routines,
it's because of the partitions in the referenced side; in the case of
the ...Referencing routines, it's because of partitions in the
referencing side.  I think changing it the way you suggest would be even
more confusing.


As discussed in the other subthread, I'm not making any effort to reuse
an existing constraint defined in a partition of the referenced side; as
far as I can tell that's a nonsensical transformation.


A pretty silly bug remains here.  Watch:

create table pk (a int primary key) partition by list (a);
create table pk1 partition of pk for values in (1);
create table fk (a int references pk);
insert into pk values (1);
insert into fk values (1);
drop table pk cascade;

Note that the DROP of the main PK table is pretty aggressive: since it
cascades, you want it to drop the constraint on the FK side.  This would
work without a problem if 'pk' was a non-partitioned table, but in this
case it fails:

alvherre=# drop table pk cascade;
NOTICE:  drop cascades to constraint fk_a_fkey on table fk
ERROR:  removing partition "pk1" violates foreign key constraint "fk_a_fkey1"
DETALLE:  Key (a)=(1) still referenced from table "fk".

The reason is that the "pre drop check" that's done before allow a drop
of the partition doesn't realize that the constraint is also being
dropped (and so the check ought to be skipped).  If you were to do just
"DROP TABLE pk1" then this complaint would be correct, since it would
leave the constraint in an invalid state.  But here, it's bogus and
annoying.  You can DELETE the matching values from table FK and then the
DROP works.  Here's another problem caused by the same misbehavior:

alvherre=# drop table pk, fk;
ERROR:  removing partition "pk1" violates foreign key constraint "fk_a_fkey1"
DETALLE:  Key (a)=(1) still referenced from table "fk".

Note here we want to get rid of table 'fk' completely.  If you split it
up in a DROP of fk, followed by a DROP of pk, it works.

I'm not yet sure what's the best way to attack this.  Maybe the
"pre-drop check" needs a completely different approach.  The simplest
approach is to prohibit a table drop or detach for any partitioned table
that's referenced by a foreign key, but that seems obnoxious and
inconvenient.


I still haven't put back the code in "#if 0".

FWIW I think we should add an index on pg_constraint.confrelid now.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi,

Thanks for updating the patch.  I'll reply to other parts separately.

On 2019/03/19 7:02, Alvaro Herrera wrote:
> A pretty silly bug remains here.  Watch:
> 
> create table pk (a int primary key) partition by list (a);
> create table pk1 partition of pk for values in (1);
> create table fk (a int references pk);
> insert into pk values (1);
> insert into fk values (1);
> drop table pk cascade;
> 
> Note that the DROP of the main PK table is pretty aggressive: since it
> cascades, you want it to drop the constraint on the FK side.  This would
> work without a problem if 'pk' was a non-partitioned table, but in this
> case it fails:
> 
> alvherre=# drop table pk cascade;
> NOTICE:  drop cascades to constraint fk_a_fkey on table fk
> ERROR:  removing partition "pk1" violates foreign key constraint "fk_a_fkey1"
> DETALLE:  Key (a)=(1) still referenced from table "fk".
> 
> The reason is that the "pre drop check" that's done before allow a drop
> of the partition doesn't realize that the constraint is also being
> dropped (and so the check ought to be skipped).  If you were to do just
> "DROP TABLE pk1" then this complaint would be correct, since it would
> leave the constraint in an invalid state.  But here, it's bogus and
> annoying.  You can DELETE the matching values from table FK and then the
> DROP works.  Here's another problem caused by the same misbehavior:
> 
> alvherre=# drop table pk, fk;
> ERROR:  removing partition "pk1" violates foreign key constraint "fk_a_fkey1"
> DETALLE:  Key (a)=(1) still referenced from table "fk".
> 
> Note here we want to get rid of table 'fk' completely.  If you split it
> up in a DROP of fk, followed by a DROP of pk, it works.
> 
> I'm not yet sure what's the best way to attack this.  Maybe the
> "pre-drop check" needs a completely different approach.  The simplest
> approach is to prohibit a table drop or detach for any partitioned table
> that's referenced by a foreign key, but that seems obnoxious and
> inconvenient.

Agreed.

Will it suffice or be OK if we skipped invoking the pre-drop callback for
objects that are being "indirectly" dropped?  I came up with the attached
patch to resolve this problem, if that idea has any merit.  We also get
slightly better error message as seen the expected/foreign_key.out changes.

Thanks,
Amit

Attachment

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-19, Amit Langote wrote:

> Will it suffice or be OK if we skipped invoking the pre-drop callback for
> objects that are being "indirectly" dropped?  I came up with the attached
> patch to resolve this problem, if that idea has any merit.  We also get
> slightly better error message as seen the expected/foreign_key.out changes.

I don't think this works ... consider putting some partition in a
different schema and then doing DROP SCHEMA CASCADE.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
On Tue, Mar 19, 2019 at 8:49 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2019-Mar-19, Amit Langote wrote:
>
> > Will it suffice or be OK if we skipped invoking the pre-drop callback for
> > objects that are being "indirectly" dropped?  I came up with the attached
> > patch to resolve this problem, if that idea has any merit.  We also get
> > slightly better error message as seen the expected/foreign_key.out changes.
>
> I don't think this works ... consider putting some partition in a
> different schema and then doing DROP SCHEMA CASCADE.

Ah, I did test DROP SCHEMA CASCADE, but only tested putting the top
table into the schema, not a partition.

Thanks,
Amit


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-19, Amit Langote wrote:

> On Tue, Mar 19, 2019 at 8:49 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > On 2019-Mar-19, Amit Langote wrote:
> >
> > > Will it suffice or be OK if we skipped invoking the pre-drop callback for
> > > objects that are being "indirectly" dropped?  I came up with the attached
> > > patch to resolve this problem, if that idea has any merit.  We also get
> > > slightly better error message as seen the expected/foreign_key.out changes.
> >
> > I don't think this works ... consider putting some partition in a
> > different schema and then doing DROP SCHEMA CASCADE.
> 
> Ah, I did test DROP SCHEMA CASCADE, but only tested putting the top
> table into the schema, not a partition.

:-(

I think this is fixable by using a two-step strategy where we first
compile a list of constraints being dropped, and in the second step we
know to ignore those when checking partitions that are being dropped.
Now, maybe the order of objects visited guarantees that the constraint
is seen (by drop) before each corresponding partition, and in that case
we don't need two steps, just one.  I'll have a look at that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-14, Robert Haas wrote:

> On Thu, Mar 14, 2019 at 3:36 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> > In any case, since the RI
> > queries are run via SPI, any unnecessary partitions should get pruned by
> > partition pruning based on each partition's constraint.  So I'm not so
> > certain that this is a problem worth spending much time/effort/risk of
> > bugs on.
> 
> I doubt that partition pruning would just automatically DTRT in a case
> like this, but if I'm wrong, great!

Uh, it was you that came up with the partition pruning system, so of
course it DsTRT.

I did test (some of?) the queries issued by ri_triggers with partitioned
tables on the PK side, and indeed it seemed to me that partitions were
being pruned at the right times.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-19, Alvaro Herrera wrote:

> I think this is fixable by using a two-step strategy where we first
> compile a list of constraints being dropped, and in the second step we
> know to ignore those when checking partitions that are being dropped.
> Now, maybe the order of objects visited guarantees that the constraint
> is seen (by drop) before each corresponding partition, and in that case
> we don't need two steps, just one.  I'll have a look at that.

I tried this, it's pretty horrible.  I think it's not completely
impossible, but seems to require a "suppression list" of constraints,
that is, a list of constraints that ought not to be checked because
they're also being dropped in the same command.  Here's what ugly: when
you drop a partition, its part of the FK constraint (the corresponding
pg_constraint row) is also dropped, but its parent constraint is what
remains.  We need to know to suppress the error when the *parent
constraint* is dropped, but *not* suppress the error when only the child
constraint is dropped.  I can only think of ugly data structure to
support this, and adding enough hooks in dependency.c to support this is
going to be badly received.


I think a better idea is to prevent DROP of a partition that's
referenced by a foreign key, period.  We would only allow ALTER..DETACH
(they can drop the partition once it's detached).  It seems to me that
the check can be done more naturally during detach than during drop,
because we're not constrained to do it within the dependency.c system.

However, this still doesn't solve the "DROP TABLE pk CASCADE" problem,
unless we register the dependencies differently.  I've been trying to
understand how to make that work.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi Alvaro,

On 3/18/19 6:02 PM, Alvaro Herrera wrote:
> I spent a few hours studying this and my conclusion is the opposite of
> yours: we should make addFkRecurseReferencing the recursive one, and
> CloneFkReferencing a non-recursive caller of that.  So we end up with
> both addFkRecurseReferenced and addFkRecurseReferencing as recursive
> routines, and CloneFkReferenced and CloneFkReferencing being
> non-recursive callers of those.  With this structure change there is one
> more call to CreateConstraintEntry than before, and now there are two
> calls of tryAttachPartitionForeignKey instead of one; I think with this
> new structure things are much simpler.  I also changed
> CloneForeignKeyConstraints's API: instead of returning a list of cloned
> constraint giving its caller the responsibility of adding FK checks to
> phase 3, we now give CloneForeignKeyConstraints the 'wqueue' list, so
> that it can add the FK checks itself.  It seems much cleaner this way.
> 

Using

-- ddl.sql --
CREATE TABLE t1 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);
CREATE TABLE t2 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);

\o /dev/null
SELECT 'CREATE TABLE t1_p' || x::text || ' PARTITION OF t1
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

\o /dev/null
SELECT 'CREATE TABLE t2_p' || x::text || ' PARTITION OF t2
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

ALTER TABLE t1 ADD CONSTRAINT fk_t1_i2_t2_i1 FOREIGN KEY (i2) REFERENCES 
t2(i1);

ANALYZE;


with

-- select.sql --
\set a random(1, 10)
SELECT t1.i1 AS t1i1, t1.i2 AS t1i2, t2.i1 AS t2i1, t2.i2 AS t2i2 FROM 
t1, t2 WHERE t1.i1 = :a;

running

pgbench -M prepared -f select.sql ....

I'm seeing 82.64% spent in GetCachedPlan(). plan_cache_mode is auto.

Best regards,
  Jesper


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-18, Alvaro Herrera wrote:

> > I'm getting a failure in the pg_upgrade test:
> > 
> >  --
> > +-- Name: pk5 pk5_pkey; Type: CONSTRAINT; Schema: regress_fk; Owner:
> > jpedersen
> > +--
> > +
> > +ALTER TABLE ONLY regress_fk.pk5
> > +    ADD CONSTRAINT pk5_pkey PRIMARY KEY (a);
> > +
> > +
> > +--
> > 
> > when running check-world.
> 
> So I tested this case just now (after fixing a couple of bugs) and see a
> slightly different problem now: those excess lines are actually just
> that pg_dump chose to print the constraints in different order, as there
> are identical lines with "-" in the diff I get.  The databases actually
> *are* identical as far as I can tell.  I haven't yet figured out how to
> fix this; of course, the easiest solution is to just drop the regress_fk
> schema at the end of the foreign_key.sql regression test, but I would
> prefer a different solution.

I figured this out.  It's actually a bug in pg11, whereby we're setting
a dependency wrongly.  If you try to do pg_describe_object() the
pg_depend entries for tables set up the way the regression test does it,
it'll fail with a "cache lookup failed for constraint XYZ".  In other
words, pg_depend contains bogus data :-(

Here's one possible fix:

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c339a2bb779..8f62b454f75 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1357,6 +1357,8 @@ index_constraint_create(Relation heapRelation,
      */
     if (OidIsValid(parentConstraintId))
     {
+        ObjectAddress    referenced;    /* XXX shadow outer variable */
+
         ObjectAddressSet(myself, ConstraintRelationId, conOid);
         ObjectAddressSet(referenced, ConstraintRelationId, parentConstraintId);
         recordDependencyOn(&myself, &referenced, DEPENDENCY_PARTITION_PRI);

I can tell the raised eyebrows from a mile away :-(

The problem (and the reason shadowing the variable solves it) is that
the outer 'referenced' is the function's return value.  This block was
clobbering the value previously set, which is what we really wanted to
return.

/me dons paper bag

I'll go figure out what a better solution is.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-21, Alvaro Herrera wrote:

> I figured this out.  It's actually a bug in pg11, whereby we're setting
> a dependency wrongly.  If you try to do pg_describe_object() the
> pg_depend entries for tables set up the way the regression test does it,
> it'll fail with a "cache lookup failed for constraint XYZ".  In other
> words, pg_depend contains bogus data :-(

Oh, actually, it's not a problem in PG11 ... or at least, it's not THIS
problem in pg11.  The bug was introduced by 1d92a0c9f7dd which ended the
DEPENDENCY_INTERNAL_AUTO saga, so the bug is only in master, and that
explains why I hadn't seen the problem before with my posted patch
series.

I think the attached is a better solution, which I'll go push shortly.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-18, Alvaro Herrera wrote:

> A pretty silly bug remains here.  Watch:
> 
> create table pk (a int primary key) partition by list (a);
> create table pk1 partition of pk for values in (1);
> create table fk (a int references pk);
> insert into pk values (1);
> insert into fk values (1);
> drop table pk cascade;
> 
> Note that the DROP of the main PK table is pretty aggressive: since it
> cascades, you want it to drop the constraint on the FK side.  This would
> work without a problem if 'pk' was a non-partitioned table, but in this
> case it fails:
> 
> alvherre=# drop table pk cascade;
> NOTICE:  drop cascades to constraint fk_a_fkey on table fk
> ERROR:  removing partition "pk1" violates foreign key constraint "fk_a_fkey1"
> DETALLE:  Key (a)=(1) still referenced from table "fk".

Here's v7; known problems have been addressed except the one above.
Jesper's GetCachedPlan() slowness has not been addressed either.

As I said before, I'm thinking of getting rid of the whole business of
checking partitions on the referenced side of an FK at DROP time, and
instead jut forbid the DROP completely if any FKs reference an ancestor
of that partition.  We can allow DETACH for the case, which can deal
with scanning the table referenced tuples and remove the appropriate
dependencies.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Hi Jesper

On 2019-Mar-21, Jesper Pedersen wrote:

> running
> 
> pgbench -M prepared -f select.sql ....
> 
> I'm seeing 82.64% spent in GetCachedPlan(). plan_cache_mode is auto.

Hmm, I can't reproduce this at all ...  I don't even see GetCachedPlan
in the profile.  Do you maybe have some additional patch in your tree?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi Alvaro,

On 3/21/19 4:49 PM, Alvaro Herrera wrote:
> I think the attached is a better solution, which I'll go push shortly.
> 

I see you pushed this, plus 0002 as well.

Thanks !

Best regards,
  Jesper


Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi Alvaro,

On 3/21/19 6:18 PM, Alvaro Herrera wrote:
> On 2019-Mar-21, Jesper Pedersen wrote:
>> pgbench -M prepared -f select.sql ....
>>
>> I'm seeing 82.64% spent in GetCachedPlan(). plan_cache_mode is auto.
> 
> Hmm, I can't reproduce this at all ...  I don't even see GetCachedPlan
> in the profile.  Do you maybe have some additional patch in your tree?
> 

No, with 7df159a62 and v7 compiled with "-O0 -fno-omit-frame-pointer" I 
still see it.

plan_cache_mode = auto
   2394 TPS w/ GetCachePlan() @ 81.79%

plan_cache_mode = force_generic_plan
  10984 TPS w/ GetCachePlan() @ 23.52%

perf sent off-list.

Best regards,
  Jesper


Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi Jesper,

On 2019/03/22 22:01, Jesper Pedersen wrote:
> Hi Alvaro,
> 
> On 3/21/19 6:18 PM, Alvaro Herrera wrote:
>> On 2019-Mar-21, Jesper Pedersen wrote:
>>> pgbench -M prepared -f select.sql ....
>>>
>>> I'm seeing 82.64% spent in GetCachedPlan(). plan_cache_mode is auto.
>>
>> Hmm, I can't reproduce this at all ...  I don't even see GetCachedPlan
>> in the profile.  Do you maybe have some additional patch in your tree?
>>
> 
> No, with 7df159a62 and v7 compiled with "-O0 -fno-omit-frame-pointer" I
> still see it.
> 
> plan_cache_mode = auto
>   2394 TPS w/ GetCachePlan() @ 81.79%
> 
> plan_cache_mode = force_generic_plan
>  10984 TPS w/ GetCachePlan() @ 23.52%

Wouldn't you get the same numbers on HEAD too?  IOW, I'm not sure how the
patch here, which seems mostly about getting DDL in order to support
foreign keys on partitioned tables, would have affected the result of this
benchmark.  Can you clarify your intention of running this benchmark
against these patches?

Thanks,
Amit



Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi Alvaro,

On 2019/03/22 6:54, Alvaro Herrera wrote:
> Here's v7;

Needs rebasing on top of 940311e4bb3.

0001:

+        Oid            objectClass = getObjectClass(thisobj);

I guess you meant to use ObjectClass, not Oid here.

Tested 0002 a bit more and found some problems.

create table p (a int primary key) partition by list (a);
create table p1 partition of p for values in (1) partition by list (a);
create table p11 partition of p1 for values in (1);

create table q (a int) partition by list (a);
create table q1 partition of q for values in (1) partition by list (a);
create table q11 partition of q1 for values in (1);

alter table q add foreign key (a) references p;

create table p2 partition of p for values in (2);
ERROR:  index for 0 not found in partition p2

The problem appears to be that CloneFkReferenced() is stumbling across
inconsistent pg_constraint rows previously created by
addFkRecurseReferencing() when we did alter table q add foreign key:

select oid, conname, conrelid::regclass, confrelid::regclass, conindid,
conparentid from pg_constraint where conname like '%fkey%';
  oid  │  conname  │ conrelid │ confrelid │ conindid │ conparentid
───────┼───────────┼──────────┼───────────┼──────────┼─────────────
 60336 │ q_a_fkey  │ q        │ p         │    60315 │           0
 60337 │ q_a_fkey1 │ q        │ p1        │    60320 │       60336
 60338 │ q_a_fkey2 │ q        │ p11       │    60325 │       60337
 60341 │ q_a_fkey  │ q1       │ p         │        0 │       60336 <=
 60342 │ q_a_fkey  │ q11      │ p         │        0 │       60341 <=
(5 rows)

I think the last two rows contain invalid value of conindid, perhaps as a
result of a bug of addFkRecurseReferencing().  There's this code in it to
fetch the index partition of the referencing partition:

+            /*
+             * No luck finding a good constraint to reuse; create our own.
+             */
+            partIndexOid = index_get_partition(partition, indexOid);

That seems unnecessary.  Maybe, it's a remnant of the code copied from
addFkRecurseReferenced(), where using the partition index is necessary
(although without the elog for when we get an InvalidOid, which would've
caught this.)  The above index_get_partition() returns InvalidOid,
because, obviously, we don't require referencing side to have the index.
Attached a fix (addFkRecurseReferencing-fix.patch).

Once we apply the above fix, we run into another problem:

create table p2 partition of p for values in (2);

select oid, conname, conrelid::regclass, confrelid::regclass, conindid,
conparentid from pg_constraint where conname like '%fkey%';
  oid  │  conname   │ conrelid │ confrelid │ conindid │ conparentid
───────┼────────────┼──────────┼───────────┼──────────┼─────────────
 60336 │ q_a_fkey   │ q        │ p         │    60315 │           0
 60337 │ q_a_fkey1  │ q        │ p1        │    60320 │       60336
 60338 │ q_a_fkey2  │ q        │ p11       │    60325 │       60337
 60341 │ q_a_fkey   │ q1       │ p         │    60315 │       60336
 60342 │ q_a_fkey   │ q11      │ p         │    60315 │       60341
 60350 │ q_a_fkey3  │ q        │ p2        │    60348 │       60336 <=
 60353 │ q11_a_fkey │ q11      │ p2        │    60348 │       60342 <=
(7 rows)

There are 2 pg_constraint rows both targeting p2, whereas there should've
been only 1 (that is, one for q referencing p2).  However, if one adds the
foreign constraint on q *after* creating p2, there is only 1 row, which is
correct.

alter table q drop constraint q_a_fkey;
alter table q add foreign key (a) references p;

select oid, conname, conrelid::regclass, confrelid::regclass, conindid,
conparentid from pg_constraint where conname like '%fkey%';
  oid  │  conname  │ conrelid │ confrelid │ conindid │ conparentid
───────┼───────────┼──────────┼───────────┼──────────┼─────────────
 60356 │ q_a_fkey  │ q        │ p         │    60315 │           0
 60357 │ q_a_fkey1 │ q        │ p1        │    60320 │       60356
 60358 │ q_a_fkey2 │ q        │ p11       │    60325 │       60357
 60361 │ q_a_fkey3 │ q        │ p2        │    60348 │       60356 <=
 60364 │ q_a_fkey  │ q1       │ p         │    60315 │       60356
 60365 │ q_a_fkey  │ q11      │ p         │    60315 │       60364
(6 rows)

This appears to be a bug of CloneFkReferenced().  Specifically, the way it
builds the list of foreign key constraint to be cloned.  Attached a fix
(CloneFkReferenced-fix.patch).

> As I said before, I'm thinking of getting rid of the whole business of
> checking partitions on the referenced side of an FK at DROP time, and
> instead jut forbid the DROP completely if any FKs reference an ancestor
> of that partition.

Will that allow `DROP TABLE parted_pk CASCADE` to succeed even if
partitions still contain referenced data?  I suppose that's the example
you cited upthread as a bug that remains to be solved.

Thanks,
Amit

Attachment

Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi Amit,

On 3/26/19 2:06 AM, Amit Langote wrote:
> Wouldn't you get the same numbers on HEAD too?  IOW, I'm not sure how the
> patch here, which seems mostly about getting DDL in order to support
> foreign keys on partitioned tables, would have affected the result of this
> benchmark.  Can you clarify your intention of running this benchmark
> against these patches?
> 

Yeah, you are right. As I'm also seeing this issue with a test case of

-- ddl.sql --
CREATE TABLE t1 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);
CREATE TABLE t2 (i1 INT PRIMARY KEY, i2 INT NOT NULL);

\o /dev/null
SELECT 'CREATE TABLE t1_p' || x::text || ' PARTITION OF t1
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

ALTER TABLE t1 ADD CONSTRAINT fk_t1_i2_t2_i1 FOREIGN KEY (i2) REFERENCES 
t2(i1);

ANALYZE;

we shouldn't focus on it in this thread.

Best regards,
  Jesper


Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi,

On 3/26/19 5:39 PM, Alvaro Herrera wrote:
>>> As I said before, I'm thinking of getting rid of the whole business of
>>> checking partitions on the referenced side of an FK at DROP time, and
>>> instead jut forbid the DROP completely if any FKs reference an ancestor
>>> of that partition.
>>
>> Will that allow `DROP TABLE parted_pk CASCADE` to succeed even if
>> partitions still contain referenced data?  I suppose that's the example
>> you cited upthread as a bug that remains to be solved.
> 
> That's the idea, yes, it should do that: only allow a DROP of a
> partition referenced by an FK if the topmost constraint is also being
> dropped.  Maybe this means I need to get rid of 0002 completely.  But I
> haven't got to doing that yet.
> 

I think that is ok, if doc/src/sgml/ref/drop_table.sgml is updated with 
a description of how CASCADE works in this scenario.

Best regards,
  Jesper



Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi,

On 2019/03/27 8:27, Alvaro Herrera wrote:
> On 2019-Mar-26, Alvaro Herrera wrote:
> 
>> Thanks for the thorough testing and bug analysis!  It was spot-on.  I've
>> applied your two proposed fixes, as well as added a new test setup that
>> covers both these bugs.  The attached set is rebased on 7c366ac969ce.
> 
> Attached is rebased on 126d63122232.  No further changes.

Noticed something -- I was expecting that your earlier commit 1af25ca0c2d9
would've taken care of this, but apparently it didn't.

-- pk
create table p (a int primary key) partition by list (a);
create table p1 partition of p for values in (1) partition by list (a);
create table p11 partition of p1 for values in (1);

-- fk (partitioned)
create table q (a int references p) partition by list (a);
create table q1 partition of q for values in (1) partition by list (a);
create table q11 partition of q1 for values in (1);

-- prints only the top-level constraint as expected
\d q
           Partitioned table "public.q"
 Column │  Type   │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
 a      │ integer │           │          │
Partition key: LIST (a)
Foreign-key constraints:
    "q_a_fkey" FOREIGN KEY (a) REFERENCES p(a)
Number of partitions: 1 (Use \d+ to list them.)

-- fk (regular table)
create table r (a int references p);

-- this prints all, which seems unintentional
\d r
                 Table "public.r"
 Column │  Type   │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
 a      │ integer │           │          │
Foreign-key constraints:
    "r_a_fkey" FOREIGN KEY (a) REFERENCES p(a)
    "r_a_fkey1" FOREIGN KEY (a) REFERENCES p1(a)
    "r_a_fkey2" FOREIGN KEY (a) REFERENCES p11(a)

So, 0002 needs to include some psql tweaks.

Thanks,
Amit




Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-18, Alvaro Herrera wrote:

> A pretty silly bug remains here.  Watch:
> 
> create table pk (a int primary key) partition by list (a);
> create table pk1 partition of pk for values in (1);
> create table fk (a int references pk);
> insert into pk values (1);
> insert into fk values (1);
> drop table pk cascade;
> 
> Note that the DROP of the main PK table is pretty aggressive: since it
> cascades, you want it to drop the constraint on the FK side.

I ended up revising the dependencies that we give to the constraint in
the partition -- instead of giving it partition-type dependencies, we
give it an INTERNAL dependency.  Now when you request to drop the
partition, it says this:

create table pk (a int primary key) partition by list (a);
create table fk (a int references pk);
create table pk1 partition of pk for values in (1);

alvherre=# drop table pk1;
ERROR:  cannot drop table pk1 because other objects depend on it
DETAIL:  constraint fk_a_fkey on table fk depends on table pk1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

If you do say CASCADE, the constraint is dropped.  Not really ideal (I
would prefer that the drop is prevented completely), but at least it's
not completely bogus.  If you do "DROP TABLE pk", it works sanely.
Also, if you DETACH the partition that pg_depend row goes away, so a
subsequent drop of the partition works sanely.

Fixed the psql issue pointed out by Amit L too.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Robert Haas
Date:
On Wed, Mar 20, 2019 at 11:58 AM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> constraint is dropped.  I can only think of ugly data structure to
> support this, and adding enough hooks in dependency.c to support this is
> going to be badly received.

I don't know why dependency.c doesn't handle this internally.  If I
say that I want to delete a list of objects, some of which can't be
dropped without dropping certain other things, dependency.c ought to
be able to suspend judgement on what the problems are until it's
looked over that whole list.  It seems to me that we've had to fight
with this problem before, and I don't know why every individual bit of
code that calls into that file has to be responsible for working
around it individually.

> I think a better idea is to prevent DROP of a partition that's
> referenced by a foreign key, period.  We would only allow ALTER..DETACH
> (they can drop the partition once it's detached).  It seems to me that
> the check can be done more naturally during detach than during drop,
> because we're not constrained to do it within the dependency.c system.

IMHO, that sucks a lot.  Maybe we should live with it anyway, but it does suck.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: partitioned tables referenced by FKs

From
Robert Haas
Date:
On Thu, Mar 28, 2019 at 2:59 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I ended up revising the dependencies that we give to the constraint in
> the partition -- instead of giving it partition-type dependencies, we
> give it an INTERNAL dependency.  Now when you request to drop the
> partition, it says this:
>
> create table pk (a int primary key) partition by list (a);
> create table fk (a int references pk);
> create table pk1 partition of pk for values in (1);
>
> alvherre=# drop table pk1;
> ERROR:  cannot drop table pk1 because other objects depend on it
> DETAIL:  constraint fk_a_fkey on table fk depends on table pk1
> HINT:  Use DROP ... CASCADE to drop the dependent objects too.

Hmm.  I guess that's strictly better than blocking the drop
completely, but definitely not ideal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-29, Robert Haas wrote:

> On Wed, Mar 20, 2019 at 11:58 AM Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > constraint is dropped.  I can only think of ugly data structure to
> > support this, and adding enough hooks in dependency.c to support this is
> > going to be badly received.
> 
> I don't know why dependency.c doesn't handle this internally.  If I
> say that I want to delete a list of objects, some of which can't be
> dropped without dropping certain other things, dependency.c ought to
> be able to suspend judgement on what the problems are until it's
> looked over that whole list.  It seems to me that we've had to fight
> with this problem before, and I don't know why every individual bit of
> code that calls into that file has to be responsible for working
> around it individually.

Hmm, if you think this kind of situation is more widespread than this
particular case, then maybe we can add the hooks I was talking about,
and simplify those other cases while fixing this problem.  Do you have
it in your mind what cases are those?  Nothing comes to mind, but I'll
search around and see if I can find anything.

Note that there are two separate problems here.  One is how to run
arbitrary code to see whether the DROP is allowed (in this case, the
arbitrary code to run is RI_Final_Check which searches for referenced
rows and throws ERROR if there are any).  My proposed hook system seems
to work OK for that.

The other problem is how to decide (in dependency.c terms) that the
partition can be dropped.  We have these scenarios to support:

1. ALTER TABLE fk DROP CONSTRAINT        -- should work cleanly
2. DROP referenced-partition            -- allow only if no referenced rows
3. ALTER referenced-parent-table DETACH partition -- allow only if no referenced rows
4. DROP referenced-parent-table CASCADE        -- drop the FK and partition

The patch I posted in v8 allowed cases 1, 2 and 3 but failed 4.

One possibility is to pass the whole list of objects to drop to the
hooks; so if we see (in the pg_class-specific hook) that the FK
constraint is listed as being dropped, we just don't execute
RI_Final_Check.


BTW, RI_Final_Check looks a bit like a silly name, now that I think
about it; it was only a temporary name I picked by opposition to
RI_Initial_Check.  Maybe RI_PartitionRemove_Check would be more apropos?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi Alvaro,

On 3/28/19 2:59 PM, Alvaro Herrera wrote:
> I ended up revising the dependencies that we give to the constraint in
> the partition -- instead of giving it partition-type dependencies, we
> give it an INTERNAL dependency.  Now when you request to drop the
> partition, it says this:
> 
> create table pk (a int primary key) partition by list (a);
> create table fk (a int references pk);
> create table pk1 partition of pk for values in (1);
> 
> alvherre=# drop table pk1;
> ERROR:  cannot drop table pk1 because other objects depend on it
> DETAIL:  constraint fk_a_fkey on table fk depends on table pk1
> HINT:  Use DROP ... CASCADE to drop the dependent objects too.
> 
> If you do say CASCADE, the constraint is dropped.  Not really ideal (I
> would prefer that the drop is prevented completely), but at least it's
> not completely bogus.  If you do "DROP TABLE pk", it works sanely.
> Also, if you DETACH the partition that pg_depend row goes away, so a
> subsequent drop of the partition works sanely.
> 
> Fixed the psql issue pointed out by Amit L too.
> 

Could expand a bit on the change to DEPENDENCY_INTERNAL instead of 
DEPENDENCY_PARTITION_PRI / DEPENDENCY_PARTITION_SEC ?

If you run "DROP TABLE t2_p32 CASCADE" the foreign key constraint is 
removed from all of t1.

-- ddl.sql --
CREATE TABLE t1 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);
CREATE TABLE t2 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);

\o /dev/null
SELECT 'CREATE TABLE t1_p' || x::text || ' PARTITION OF t1
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

\o /dev/null
SELECT 'CREATE TABLE t2_p' || x::text || ' PARTITION OF t2
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

ALTER TABLE t1 ADD CONSTRAINT fk_t1_i2_t2_i1 FOREIGN KEY (i2) REFERENCES 
t2(i1);

INSERT INTO t2 (SELECT i, i FROM generate_series(1, 1000) AS i);
INSERT INTO t1 (SELECT i, i FROM generate_series(1, 1000) AS i);

ANALYZE;
-- ddl.sql --

Detaching the partition for DROP seems safer to me.

Thanks in advance !

Best regards,
  Jesper



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-29, Jesper Pedersen wrote:

> Could expand a bit on the change to DEPENDENCY_INTERNAL instead of
> DEPENDENCY_PARTITION_PRI / DEPENDENCY_PARTITION_SEC ?

The PARTITION dependencies work in a way that doesn't do what we want.
Admittedly, neither does INTERNAL, but at least it's less bad.

> If you run "DROP TABLE t2_p32 CASCADE" the foreign key constraint is removed
> from all of t1.

Yes.  CASCADE is always a dangerous tool; if you run the DROP partition
without cascade, it explicitly lists that the constraint is going to be
dropped.

If you get in the habit of added CASCADE to all your drops, you're going
to lose data pretty quickly.  In this case, no data is lost, only a
constraint.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi,

On 3/29/19 11:22 AM, Alvaro Herrera wrote:
> On 2019-Mar-29, Jesper Pedersen wrote:
> 
>> Could expand a bit on the change to DEPENDENCY_INTERNAL instead of
>> DEPENDENCY_PARTITION_PRI / DEPENDENCY_PARTITION_SEC ?
> 
> The PARTITION dependencies work in a way that doesn't do what we want.
> Admittedly, neither does INTERNAL, but at least it's less bad.
> 
>> If you run "DROP TABLE t2_p32 CASCADE" the foreign key constraint is removed
>> from all of t1.
> 
> Yes.  CASCADE is always a dangerous tool; if you run the DROP partition
> without cascade, it explicitly lists that the constraint is going to be
> dropped.
> 
> If you get in the habit of added CASCADE to all your drops, you're going
> to lose data pretty quickly.  In this case, no data is lost, only a
> constraint.
> 

Thanks !

Maybe the "(" / ")" in the CASCADE description should be removed from 
ref/drop_table.sgml as part of this patch.

Should catalogs.sgml be updated for this case ?

Best regards,
  Jesper



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-29, Jesper Pedersen wrote:


> Thanks !
> 
> Maybe the "(" / ")" in the CASCADE description should be removed from
> ref/drop_table.sgml as part of this patch.

I'm not sure what text you propose to remove?

> Should catalogs.sgml be updated for this case ?

I'm not adding any new dependency type, nor modifying any existing one.
Are you looking at anything that needs changing specifically?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi,

On 3/29/19 12:29 PM, Alvaro Herrera wrote:
> On 2019-Mar-29, Jesper Pedersen wrote:
>> Maybe the "(" / ")" in the CASCADE description should be removed from
>> ref/drop_table.sgml as part of this patch.
> 
> I'm not sure what text you propose to remove?
>

Just the attached.

>> Should catalogs.sgml be updated for this case ?
> 
> I'm not adding any new dependency type, nor modifying any existing one.
> Are you looking at anything that needs changing specifically?
> 

No, I just mentioned it if it was a requirement that the precise 
use-case for each dependency type were documented.

Best regards,
  Jesper

Attachment

Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi Alvaro,

On 3/28/19 2:59 PM, Alvaro Herrera wrote:
> I ended up revising the dependencies that we give to the constraint in
> the partition -- instead of giving it partition-type dependencies, we
> give it an INTERNAL dependency.  Now when you request to drop the
> partition, it says this:
> 
> create table pk (a int primary key) partition by list (a);
> create table fk (a int references pk);
> create table pk1 partition of pk for values in (1);
> 
> alvherre=# drop table pk1;
> ERROR:  cannot drop table pk1 because other objects depend on it
> DETAIL:  constraint fk_a_fkey on table fk depends on table pk1
> HINT:  Use DROP ... CASCADE to drop the dependent objects too.
> 
> If you do say CASCADE, the constraint is dropped.  Not really ideal (I
> would prefer that the drop is prevented completely), but at least it's
> not completely bogus.  If you do "DROP TABLE pk", it works sanely.
> Also, if you DETACH the partition that pg_depend row goes away, so a
> subsequent drop of the partition works sanely.
> 
> Fixed the psql issue pointed out by Amit L too.
> 

I ran my test cases for this feature, and havn't seen any issues.

Therefore I'm marking 1877 as Ready for Committer. If others have 
additional feedback feel free to switch it back.

Best regards,
  Jesper



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
On 2019-Mar-29, Jesper Pedersen wrote:

> I ran my test cases for this feature, and havn't seen any issues.
> 
> Therefore I'm marking 1877 as Ready for Committer. If others have additional
> feedback feel free to switch it back.

Thanks!

I found two issues today.  One, server side, is that during cloning for
partition attach we were not checking for concurrent deletion of
referenced tuples in partitions.  I added an isolation spec test for
this.  To fix the bug, added a find_all_inheritors() to lock all
partitions with ShareRowExclusiveLock.

Another is that psql's \d failed for versions < 12, because we were
inconditionally adding an "AND conparentid = 0" clause.

I also reworked CloneForeignKeyConstraints.  The previous style was
being forced by the old recursing method; now we can make it a lot
simpler -- it's now just two subroutine calls.

I'm satisfied with this patch now, so I intend to push early tomorrow.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
Hi Alvaro,

On 2019/04/02 5:03, Alvaro Herrera wrote:
> On 2019-Mar-29, Jesper Pedersen wrote:
> 
>> I ran my test cases for this feature, and havn't seen any issues.
>>
>> Therefore I'm marking 1877 as Ready for Committer. If others have additional
>> feedback feel free to switch it back.
> 
> Thanks!
> 
> I found two issues today.  One, server side, is that during cloning for
> partition attach we were not checking for concurrent deletion of
> referenced tuples in partitions.  I added an isolation spec test for
> this.  To fix the bug, added a find_all_inheritors() to lock all
> partitions with ShareRowExclusiveLock.
> 
> Another is that psql's \d failed for versions < 12, because we were
> inconditionally adding an "AND conparentid = 0" clause.
> 
> I also reworked CloneForeignKeyConstraints.  The previous style was
> being forced by the old recursing method; now we can make it a lot
> simpler -- it's now just two subroutine calls.
> 
> I'm satisfied with this patch now, so I intend to push early tomorrow.

I tried the latest patch and found no problems as far as I tried.

I guess it's OK for the time being that users will not be able to drop a
partition while a foreign key is pointing to it, given the troubles with
that pre-DROP check.  (Of course, someone may want to improve dependency.c
in the future so that we can allow the DROP where possible.)

Thanks for working on this.

Off-topic:

I did some performance testing, which has no relation to the code being
changed by this patch, but thought the numbers hint at a need for
improving other areas that affect the scalability (in terms of number of
partitions) of the foreign key checks.

* Table setup:

Create a hash partitioned table with N partitions, followed by a regular
table referencing the aforementioned partitioned table.  Insert keys
1-1000 into ht, the primary key table.

N: 2..8192

drop table ht cascade;
create table ht (a int primary key, b int, c int) partition by hash (a);
select 'create table ht' || x::text || ' partition of ht for values with
(MODULUS N, REMAINDER ' || (x)::text || ');' from generate_series(0, N-1) x;
\gexec
drop table foo cascade;
create table foo (a int references ht);
truncate ht cascade;
insert into ht select generate_series(1, 1000);

* pgbench script (insert-foo.sql)

\set a random(1, 1000)
insert into foo values (:a);

* Run pgbench

pgbench -n -T 120 -f insert-foo.sql

* TPS with plan_cache_mode = auto (default) or force_custom_plan

2    2382.779742 <=
8    2392.514952
32    2392.682178
128    2387.828361
512    2358.325865
1024    2234.699889
4096    2030.610062
8192    1740.633117

* TPS with plan_cache_mode = force_generic_plan

2    2924.030727 <=
8    2854.460937
32    2378.630989
128    1550.235166
512    642.980148
1024    330.142430
4096    71.739272
8192    32.620403

It'd be good to work on improving the scalability of generic plan
execution in the future, because not having to re-plan the query used by
RI_FKey_check would always be better, as seen by comparing execution times
for 2 partitions (2382 tps with custom plan vs. 2924 tps with generic
plan.).  Having run-time pruning already helps quite a lot, but some other
ideas are needed, such as one proposed by David recently, but withdrawn
for PG 12 [1].

Anyway, nothing here that prevents this patch from being committed. :)

Thanks for reading.

Regards,
Amit

[1] https://commitfest.postgresql.org/22/1897/




Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
Hi,

On 4/1/19 4:03 PM, Alvaro Herrera wrote:
> I'm satisfied with this patch now, so I intend to push early tomorrow.
> 

Passes check-world, and my tests.

Best regards,
  Jesper



Re: partitioned tables referenced by FKs

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Mar-29, Robert Haas wrote:
>> I don't know why dependency.c doesn't handle this internally.  If I
>> say that I want to delete a list of objects, some of which can't be
>> dropped without dropping certain other things, dependency.c ought to
>> be able to suspend judgement on what the problems are until it's
>> looked over that whole list.  It seems to me that we've had to fight
>> with this problem before, and I don't know why every individual bit of
>> code that calls into that file has to be responsible for working
>> around it individually.

> Hmm, if you think this kind of situation is more widespread than this
> particular case, then maybe we can add the hooks I was talking about,
> and simplify those other cases while fixing this problem.  Do you have
> it in your mind what cases are those?  Nothing comes to mind, but I'll
> search around and see if I can find anything.

FWIW, I think that the dependency mechanism is designed around the idea
that whether it's okay to drop a *database object* depends only on what
other *database objects* rely on it, and that you can always make a DROP
valid by dropping all the dependent objects.  That's not an unreasonable
design assumption, considering that the SQL standard embeds the same
assumption in its RESTRICT/CASCADE syntax.

The problem this patch is running into is that we'd like to make the
validity of dropping a table partition depend on whether there are
individual rows in some other table that FK-reference rows in the target
partition.  That's just completely outside the object-dependency model,
and I'm not sure why Robert finds that surprising.

I think the stopgap solution of requiring a separate DETACH step first
is probably fine, although I don't see any documentation explaining that
in the v10 patch?  Also the test cases in the v10 patch don't seem to
square with that reality, or at least their comments need work.

In the long run I wouldn't necessarily object to having some sort of
hooks in dependency.c to allow this type of use-case to be addressed.
But it'd likely be a good idea to have at least one other concrete
example in mind before trying to design a mechanism for that.

            regards, tom lane



Re: partitioned tables referenced by FKs

From
Robert Haas
Date:
On Tue, Apr 2, 2019 at 11:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The problem this patch is running into is that we'd like to make the
> validity of dropping a table partition depend on whether there are
> individual rows in some other table that FK-reference rows in the target
> partition.  That's just completely outside the object-dependency model,
> and I'm not sure why Robert finds that surprising.

I think it's because I was misunderstanding the problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: partitioned tables referenced by FKs

From
Alvaro Herrera
Date:
Pushed, many thanks Amit and Jesper for reviewing.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: partitioned tables referenced by FKs

From
Jesper Pedersen
Date:
On 4/3/19 1:52 PM, Alvaro Herrera wrote:
> Pushed, many thanks Amit and Jesper for reviewing.
> 

Thank you for working on this feature.

Best regards,
  Jesper




Re: partitioned tables referenced by FKs

From
Amit Langote
Date:
On 2019/04/04 4:45, Jesper Pedersen wrote:
> On 4/3/19 1:52 PM, Alvaro Herrera wrote:
>> Pushed, many thanks Amit and Jesper for reviewing.
>>
> 
> Thank you for working on this feature.

+1, thank you.

Regards,
Amit