Thread: pg_restore causing deadlocks on partitioned tables

pg_restore causing deadlocks on partitioned tables

From
Domagoj Smoljanovic
Date:

Hi all.

I tried searching for the response to this but couldn’t find any. Tried also posting to general but got no love there.

I have pg_restore running in parallel (3 or more) and processing large amount of data that is in partitioned tables. However it seems that sometime deadlock appears when one process is trying to process primary key on parent table while data still hasn’t been loaded into partitions. And acquires Exclusive Lock on the whole table. Then another process comes and tries to load one of the partitions with SharedLock but it fails.

This of course doesn’t happen always; depending on the course of actions of the pg_restore. But often enough to cause frustration.

Process 15858 waits for AccessShareLock on relation 233358134 of database 233346697; blocked by process 15861.
Process 15861 waits for AccessExclusiveLock on relation 233374757 of database 233346697; blocked by process 15858.
Process 15858: TRUNCATE TABLE ONLY myschema."myTable:2020-09-01";
Process 15861: ALTER TABLE ONLY myschema."myTable" ADD CONSTRAINT "pk_myTable" PRIMARY KEY ("ID", date);

Should this be treated as a bug or am I doing something wrong?

Disclamer: --load-via-partition-root was NOT used. Meaning that warning from the pg_dump documentation should not be applicable 😊

Thanx,
Domagoj

 

Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Domagoj Smoljanovic <domagoj.smoljanovic@oradian.com> writes:
> I have pg_restore running in parallel (3 or more) and processing large amount of data that is in partitioned tables.
Howeverit seems that sometime deadlock appears when one process is trying to process primary key on parent table while
datastill hasn’t been loaded into partitions. And acquires Exclusive Lock on the whole table. Then another process
comesand tries to load one of the partitions with SharedLock but it fails. 

> This of course doesn’t happen always; depending on the course of actions of the pg_restore. But often enough to cause
frustration.

> Process 15858 waits for AccessShareLock on relation 233358134 of database 233346697; blocked by process 15861.
> Process 15861 waits for AccessExclusiveLock on relation 233374757 of database 233346697; blocked by process 15858.
> Process 15858: TRUNCATE TABLE ONLY myschema."myTable:2020-09-01";
> Process 15861: ALTER TABLE ONLY myschema."myTable" ADD CONSTRAINT "pk_myTable" PRIMARY KEY ("ID", date);

Hm, this seems related to 2ba5b2db7, but not the same thing.
Alvaro, any thoughts?

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Alvaro Herrera
Date:
On 2020-Sep-14, Tom Lane wrote:

> Domagoj Smoljanovic <domagoj.smoljanovic@oradian.com> writes:
> > I have pg_restore running in parallel (3 or more) and processing large amount of data that is in partitioned
tables.However it seems that sometime deadlock appears when one process is trying to process primary key on parent
tablewhile data still hasn’t been loaded into partitions. And acquires Exclusive Lock on the whole table. Then another
processcomes and tries to load one of the partitions with SharedLock but it fails.
 
> 
> > This of course doesn’t happen always; depending on the course of actions of the pg_restore. But often enough to
causefrustration.
 
> 
> > Process 15858 waits for AccessShareLock on relation 233358134 of database 233346697; blocked by process 15861.
> > Process 15861 waits for AccessExclusiveLock on relation 233374757 of database 233346697; blocked by process 15858.
> > Process 15858: TRUNCATE TABLE ONLY myschema."myTable:2020-09-01";
> > Process 15861: ALTER TABLE ONLY myschema."myTable" ADD CONSTRAINT "pk_myTable" PRIMARY KEY ("ID", date);
> 
> Hm, this seems related to 2ba5b2db7, but not the same thing.
> Alvaro, any thoughts?

So apparently when we go to restore the table data for the partition,
the TRUNCATE deadlocks with the PK addition ... that's pretty odd;
shouldn't the constraint restore have waited until the data had been
fully loaded?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: pg_restore causing deadlocks on partitioned tables

From
Domagoj Smoljanovic
Date:
Forgot to mention the versions:
pg_restore (PostgreSQL) 12.4
source/ destination databases also 12.4

D.

-----Original Message-----
From: Alvaro Herrera <alvherre@2ndquadrant.com> 
Sent: 14. rujna 2020. 16:40
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Domagoj Smoljanovic <domagoj.smoljanovic@oradian.com>; pgsql-hackers@postgresql.org
Subject: Re: pg_restore causing deadlocks on partitioned tables

On 2020-Sep-14, Tom Lane wrote:

> Domagoj Smoljanovic <domagoj.smoljanovic@oradian.com> writes:
> > I have pg_restore running in parallel (3 or more) and processing large amount of data that is in partitioned
tables.However it seems that sometime deadlock appears when one process is trying to process primary key on parent
tablewhile data still hasn’t been loaded into partitions. And acquires Exclusive Lock on the whole table. Then another
processcomes and tries to load one of the partitions with SharedLock but it fails.
 
> 
> > This of course doesn’t happen always; depending on the course of actions of the pg_restore. But often enough to
causefrustration.
 
> 
> > Process 15858 waits for AccessShareLock on relation 233358134 of database 233346697; blocked by process 15861.
> > Process 15861 waits for AccessExclusiveLock on relation 233374757 of database 233346697; blocked by process 15858.
> > Process 15858: TRUNCATE TABLE ONLY myschema."myTable:2020-09-01"; 
> > Process 15861: ALTER TABLE ONLY myschema."myTable" ADD CONSTRAINT 
> > "pk_myTable" PRIMARY KEY ("ID", date);
> 
> Hm, this seems related to 2ba5b2db7, but not the same thing.
> Alvaro, any thoughts?

So apparently when we go to restore the table data for the partition, the TRUNCATE deadlocks with the PK addition ...
that'spretty odd; shouldn't the constraint restore have waited until the data had been fully loaded?
 

-- 
Álvaro Herrera
https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.2ndquadrant.com%2F&data=01%7C01%7Cdomagoj.smoljanovic%40oradian.com%7Cf9054c64e75a49adac3308d858bc1423%7Cc3d7e30ad09240c8b35c54a27682c60d%7C0&sdata=9pphCt1EzkEzrCuCg8CLdRywknjNiG6WLfRhR4T7qPQ%3D&reserved=0
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Sep-14, Tom Lane wrote:
>> Hm, this seems related to 2ba5b2db7, but not the same thing.
>> Alvaro, any thoughts?

> So apparently when we go to restore the table data for the partition,
> the TRUNCATE deadlocks with the PK addition ... that's pretty odd;
> shouldn't the constraint restore have waited until the data had been
> fully loaded?

Yeah, that's certainly the design expectation.  Missing dependency?

If memory serves, which it may not given my undercaffeinated state,
we would not expect there to be a direct dependency link between the
constraint and the table data "object".  What there should be is
dependencies forcing the data to be restored before the post-data
boundary pseudo-object, and the constraint after the boundary.
I'm half guessing that that's being mucked up for partitioned tables.

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
I wrote:
> If memory serves, which it may not given my undercaffeinated state,
> we would not expect there to be a direct dependency link between the
> constraint and the table data "object".  What there should be is
> dependencies forcing the data to be restored before the post-data
> boundary pseudo-object, and the constraint after the boundary.

No, that's wrong: the boundary objects only exist inside pg_dump.

Looking more closely, we have a deadlock between data restore
for a partition:

Process 15858: TRUNCATE TABLE ONLY myschema."myTable:2020-09-01";

and adding a PK to what I assume is its parent partitioned table:

Process 15861: ALTER TABLE ONLY myschema."myTable" ADD CONSTRAINT "pk_myTable" PRIMARY KEY ("ID", date);

Since that's an ALTER TABLE ONLY, it shouldn't be trying to touch the
child partitions at all; while the TRUNCATE should only be trying to touch
the child partition.  At least, that's what pg_dump is expecting.

However, the deadlock report suggests, and manual experimentation
confirms, that

(1) TRUNCATE on a partition tries to get AccessShareLock on the parent;

(2) ALTER TABLE ONLY ... ADD CONSTRAINT on a partition root tries to get
AccessExclusiveLock on all child partitions, despite the ONLY.

Each of these facts violates pg_dump's expectations about what can be
done in parallel with what.  There's no obvious reason why we need such
concurrency-killing locks for these operations, either.  So I think
what we have here are two distinct backend bugs, not a pg_dump bug.

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
I wrote:
> However, the deadlock report suggests, and manual experimentation
> confirms, that

> (1) TRUNCATE on a partition tries to get AccessShareLock on the parent;

The reason for this is that

(a) ExecuteTruncateGuts calls InitResultRelInfo, because it might
need that to fire TRUNCATE triggers for the child relation.

(b) InitResultRelInfo calls RelationGetPartitionQual, which
of course(?) must access the parent table.

AFAICS, it is utterly silly for InitResultRelInfo to be forcing
a partition qual to be computed when we might not need it.
We could flush ResultRelInfo.ri_PartitionCheck altogether and
have anything that was reading it instead do
RelationGetPartitionQual(ResultRelInfo.ri_RelationDesc).

Actually it looks like most of the places reading it are
just interested in non-nullness; can't those be nuked from
orbit in favor of testing rel->rd_rel->relispartition?
There's no such thing as a partition with an empty partition
qual is there?  (Or even if it's possible, do we care about
optimizing the case?)

> (2) ALTER TABLE ONLY ... ADD CONSTRAINT on a partition root tries to get
> AccessExclusiveLock on all child partitions, despite the ONLY.

The cause of this seems to be that ATPrepSetNotNull is too dumb to
avoid recursing to all the child tables when the parent is already
attnotnull.  Or is there a reason we have to recurse anyway?

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Alvaro Herrera
Date:
On 2020-Sep-14, Tom Lane wrote:

> > (1) TRUNCATE on a partition tries to get AccessShareLock on the parent;
> 
> The reason for this is that
> 
> (a) ExecuteTruncateGuts calls InitResultRelInfo, because it might
> need that to fire TRUNCATE triggers for the child relation.

Hmm, this seems legitimate, but of course we don't need the partition
qual.  So the reported bug would be solved with just the change to avoid
loading ri_PartitionExpr until needed.

> AFAICS, it is utterly silly for InitResultRelInfo to be forcing
> a partition qual to be computed when we might not need it.
> We could flush ResultRelInfo.ri_PartitionCheck altogether and
> have anything that was reading it instead do
> RelationGetPartitionQual(ResultRelInfo.ri_RelationDesc).

Hmm, but I presume we don't want to compute it every time.  I suggest we
would still have it, but we'd only computed it when first used.

> Actually it looks like most of the places reading it are
> just interested in non-nullness; can't those be nuked from
> orbit in favor of testing rel->rd_rel->relispartition?
> There's no such thing as a partition with an empty partition
> qual is there?  (Or even if it's possible, do we care about
> optimizing the case?)

Actually, there is one such case -- when the default partition is the
only partition, its constraint is empty.  This has caused at least one
bug.  Maybe it'd be better if we used something like constant true
instead ... since we're not likely to care much about the performance of
that case.  But I don't think that would change this patch any.

> > (2) ALTER TABLE ONLY ... ADD CONSTRAINT on a partition root tries to get
> > AccessExclusiveLock on all child partitions, despite the ONLY.
> 
> The cause of this seems to be that ATPrepSetNotNull is too dumb to
> avoid recursing to all the child tables when the parent is already
> attnotnull.  Or is there a reason we have to recurse anyway?

Hmm, looking at ATExecSetNotNull, we invoke the PostAlter hook even when
there's no change, so if we supressed the recursion early, that would
change.  But I doubt we actually care.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
I wrote:
>> (2) ALTER TABLE ONLY ... ADD CONSTRAINT on a partition root tries to get
>> AccessExclusiveLock on all child partitions, despite the ONLY.

> The cause of this seems to be that ATPrepSetNotNull is too dumb to
> avoid recursing to all the child tables when the parent is already
> attnotnull.  Or is there a reason we have to recurse anyway?

I wrote a quick patch for this part.  It seems pretty safe and probably
could be back-patched without fear.  (I also noticed that
ATSimpleRecursion is being unnecessarily stupid: instead of the
demonstrably not-future-proof relkind check, it could test relhassubclass,
which is not only simpler and less likely to need future changes, but
is able to save a scan of pg_inherits in a lot more cases.)

As far as I can tell in some quick testing, this fix is sufficient to
resolve the complained-of deadlock.  It'd still be a good idea to fix the
TRUNCATE side of things as well.  But that would be hard to back-patch
because removing ri_PartitionCheck, or even just failing to fill it,
seems like a potential ABI break for extensions.  So my proposal is
to back-patch this, but address the ResultRelInfo change only in HEAD.

            regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c21a309f04..4a51d79d09 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -5681,14 +5681,10 @@ ATSimpleRecursion(List **wqueue, Relation rel,
                   AlterTableUtilityContext *context)
 {
     /*
-     * Propagate to children if desired.  Only plain tables, foreign tables
-     * and partitioned tables have children, so no need to search for other
-     * relkinds.
+     * Propagate to children, if desired and if there are (or might be) any
+     * children.
      */
-    if (recurse &&
-        (rel->rd_rel->relkind == RELKIND_RELATION ||
-         rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE ||
-         rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE))
+    if (recurse && rel->rd_rel->relhassubclass)
     {
         Oid            relid = RelationGetRelid(rel);
         ListCell   *child;
@@ -6698,6 +6694,35 @@ ATPrepSetNotNull(List **wqueue, Relation rel,
     if (recursing)
         return;

+    /*
+     * If the target column is already marked NOT NULL, we can skip recursing
+     * to children, because their columns should already be marked NOT NULL as
+     * well.  But there's no point in checking here unless the relation has
+     * some children; else we can just wait till execution to check.  (If it
+     * does have children, however, this can save taking per-child locks
+     * unnecessarily.  This greatly improves concurrency in some parallel
+     * restore scenarios.)
+     */
+    if (rel->rd_rel->relhassubclass)
+    {
+        HeapTuple    tuple;
+        bool        attnotnull;
+
+        tuple = SearchSysCacheAttName(RelationGetRelid(rel), cmd->name);
+
+        /* Might as well throw the error now, if name is bad */
+        if (!HeapTupleIsValid(tuple))
+            ereport(ERROR,
+                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                     errmsg("column \"%s\" of relation \"%s\" does not exist",
+                            cmd->name, RelationGetRelationName(rel))));
+
+        attnotnull = ((Form_pg_attribute) GETSTRUCT(tuple))->attnotnull;
+        ReleaseSysCache(tuple);
+        if (attnotnull)
+            return;
+    }
+
     /*
      * If we have ALTER TABLE ONLY ... SET NOT NULL on a partitioned table,
      * apply ALTER TABLE ... CHECK NOT NULL to every child.  Otherwise, use

Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Sep-14, Tom Lane wrote:
>> AFAICS, it is utterly silly for InitResultRelInfo to be forcing
>> a partition qual to be computed when we might not need it.
>> We could flush ResultRelInfo.ri_PartitionCheck altogether and
>> have anything that was reading it instead do
>> RelationGetPartitionQual(ResultRelInfo.ri_RelationDesc).

> Hmm, but I presume we don't want to compute it every time.  I suggest we
> would still have it, but we'd only computed it when first used.

RelationGetPartitionQual already does that caching.

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Amit Langote
Date:
On Tue, Sep 15, 2020 at 9:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> >> (2) ALTER TABLE ONLY ... ADD CONSTRAINT on a partition root tries to get
> >> AccessExclusiveLock on all child partitions, despite the ONLY.
>
> > The cause of this seems to be that ATPrepSetNotNull is too dumb to
> > avoid recursing to all the child tables when the parent is already
> > attnotnull.  Or is there a reason we have to recurse anyway?
>
> I wrote a quick patch for this part.  It seems pretty safe and probably
> could be back-patched without fear.

The patch changes existing behavior in one case as shown below:

drop table if exists foo cascade;
create table foo (a int not null);
create table child () inherits (foo);
alter table child alter a drop not null;
alter table foo alter a set not null;
insert into child select null;

Currently, the last statement gives a "not null violated" error, but
no longer with the patch, because the alter statement before that now
finishes without setting NOT NULL in child.

The patch's theory that if the parent column has NOT NULL set then it
must be set in child tables too does not actually hold for plain
inheritance cases, because as shown above, NOT NULL can be dropped in
children independently of the parent.  With partitioning, dropping NOT
NULL from child partitions is prevented:

drop table if exists bar cascade;
create table bar (a int) partition by list (a);
create table bar1 partition of bar (a not null) for values in (1);
alter table bar1 alter a drop not null;
ERROR:  column "a" is marked NOT NULL in parent table

>  (I also noticed that
> ATSimpleRecursion is being unnecessarily stupid: instead of the
> demonstrably not-future-proof relkind check, it could test relhassubclass,
> which is not only simpler and less likely to need future changes, but
> is able to save a scan of pg_inherits in a lot more cases.)

+1

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: pg_restore causing deadlocks on partitioned tables

From
Amit Langote
Date:
On Tue, Sep 15, 2020 at 7:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > However, the deadlock report suggests, and manual experimentation
> > confirms, that
>
> > (1) TRUNCATE on a partition tries to get AccessShareLock on the parent;
>
> The reason for this is that
>
> (a) ExecuteTruncateGuts calls InitResultRelInfo, because it might
> need that to fire TRUNCATE triggers for the child relation.
>
> (b) InitResultRelInfo calls RelationGetPartitionQual, which
> of course(?) must access the parent table.
>
> AFAICS, it is utterly silly for InitResultRelInfo to be forcing
> a partition qual to be computed when we might not need it.
> We could flush ResultRelInfo.ri_PartitionCheck altogether and
> have anything that was reading it instead do
> RelationGetPartitionQual(ResultRelInfo.ri_RelationDesc).
>
> Actually it looks like most of the places reading it are
> just interested in non-nullness; can't those be nuked from
> orbit in favor of testing rel->rd_rel->relispartition?

Yeah, makes sense.  Please see attached a patch to do that.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Tue, Sep 15, 2020 at 9:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I wrote a quick patch for this part.  It seems pretty safe and probably
>> could be back-patched without fear.

> The patch's theory that if the parent column has NOT NULL set then it
> must be set in child tables too does not actually hold for plain
> inheritance cases, because as shown above, NOT NULL can be dropped in
> children independently of the parent.

Ah, right.  That seems like a bug but we have not attempted to fix it.
But we could restrict the optimization to partitioned tables, where
the assumption does hold, no?

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Amit Langote
Date:
On Tue, Sep 15, 2020 at 10:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Tue, Sep 15, 2020 at 9:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I wrote a quick patch for this part.  It seems pretty safe and probably
> >> could be back-patched without fear.
>
> > The patch's theory that if the parent column has NOT NULL set then it
> > must be set in child tables too does not actually hold for plain
> > inheritance cases, because as shown above, NOT NULL can be dropped in
> > children independently of the parent.
>
> Ah, right.  That seems like a bug but we have not attempted to fix it.

IIRC, when this behavior was brought up as a bug in past discussions,
it was decided that it will be fixed when NOT NULL constraints are
represented using pg_constraint entries.

> But we could restrict the optimization to partitioned tables, where
> the assumption does hold, no?

Yeah, seems safe in their case.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Tue, Sep 15, 2020 at 10:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ah, right.  That seems like a bug but we have not attempted to fix it.

> IIRC, when this behavior was brought up as a bug in past discussions,
> it was decided that it will be fixed when NOT NULL constraints are
> represented using pg_constraint entries.

Yeah, that matches my recollection too.

>> But we could restrict the optimization to partitioned tables, where
>> the assumption does hold, no?

> Yeah, seems safe in their case.

And that's sufficient to cover pg_restore's issue, since it's
only going to be trying to do this for partitioned tables.

I'll wait till tomorrow to push this, since we're still in
freeze mode for the v13 branch.

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Tue, Sep 15, 2020 at 7:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> AFAICS, it is utterly silly for InitResultRelInfo to be forcing
>> a partition qual to be computed when we might not need it.
>> We could flush ResultRelInfo.ri_PartitionCheck altogether and
>> have anything that was reading it instead do
>> RelationGetPartitionQual(ResultRelInfo.ri_RelationDesc).

> Yeah, makes sense.  Please see attached a patch to do that.

Just eyeballing this, this bit seems bogus:

@@ -1904,7 +1903,7 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
     Bitmapset  *insertedCols;
     Bitmapset  *updatedCols;
 
-    Assert(constr || resultRelInfo->ri_PartitionCheck);
+    Assert(constr);
 
     if (constr && constr->has_not_null)
     {

It does look like all the call sites check for the rel having constraints
before calling, so the modified Assert may not be failing ... but why
are we asserting and then also making a run-time test?

My inclination is to just drop the Assert as useless.  There's no
particular reason for this function to make it a hard requirement
that callers optimize away unnecessary calls.

I'm suspicious of the business in ExecPartitionCheck about constructing
a constant-true expression.  I think executing that is likely to add
more cycles than you save by not running through this code each time;
once relcache has cached the knowledge that the partition expression
is empty, all the steps here are pretty darn cheap ... which no doubt
is why there wasn't a comparable optimization already.  If you're
really concerned about that it'd be better to add a separate
"bool ri_PartitionCheckExprValid" flag.  (Perhaps that's worth doing
to avoid impacts from relcache flushes; though I remain unconvinced
that optimizing for the empty-expression case is useful.)

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Amit Langote
Date:
On Wed, Sep 16, 2020 at 2:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Tue, Sep 15, 2020 at 7:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> AFAICS, it is utterly silly for InitResultRelInfo to be forcing
> >> a partition qual to be computed when we might not need it.
> >> We could flush ResultRelInfo.ri_PartitionCheck altogether and
> >> have anything that was reading it instead do
> >> RelationGetPartitionQual(ResultRelInfo.ri_RelationDesc).
>
> > Yeah, makes sense.  Please see attached a patch to do that.
>
> Just eyeballing this, this bit seems bogus:
>
> @@ -1904,7 +1903,7 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
>         Bitmapset  *insertedCols;
>         Bitmapset  *updatedCols;
>
> -       Assert(constr || resultRelInfo->ri_PartitionCheck);
> +       Assert(constr);
>
>         if (constr && constr->has_not_null)
>         {
>
> It does look like all the call sites check for the rel having constraints
> before calling, so the modified Assert may not be failing ... but why
> are we asserting and then also making a run-time test?
>
> My inclination is to just drop the Assert as useless.  There's no
> particular reason for this function to make it a hard requirement
> that callers optimize away unnecessary calls.

Yeah, the Assert seems pretty pointless at this point.

> I'm suspicious of the business in ExecPartitionCheck about constructing
> a constant-true expression.  I think executing that is likely to add
> more cycles than you save by not running through this code each time;
> once relcache has cached the knowledge that the partition expression
> is empty, all the steps here are pretty darn cheap ... which no doubt
> is why there wasn't a comparable optimization already.

Ah, you're right.

>  If you're
> really concerned about that it'd be better to add a separate
> "bool ri_PartitionCheckExprValid" flag.  (Perhaps that's worth doing
> to avoid impacts from relcache flushes; though I remain unconvinced
> that optimizing for the empty-expression case is useful.)

Agreed that it's not really necessary to optimize that case.

Updated patch attached.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> Updated patch attached.

Pushed with a little bit of fooling about.  After looking at the
git history, I saw that the Assert we were wondering about used to
be just "Assert(constr)", and there were not run-time checks on
whether constr is null.  That was changed when f0e44751d added
partition constraint checking into ExecConstraints' responsibilities.
At some later point that code was removed from ExecConstraints,
but we failed to undo the other changes in ExecConstraints, leaving
it looking pretty silly.  So I reverted this to the way it was,
with just an Assert and no regular checks.

I also did a bit more work on the comments.  (Speaking of which,
is there a better place to put the commentary you removed from
InitResultRelInfo?  It was surely wildly out of place there,
but I'm wondering if maybe we have a README that should cover it.)

I pushed this to HEAD only, and the other patch as far back as
v12, so we will have a solution to the deadlock problem in v12.

            regards, tom lane



RE: pg_restore causing deadlocks on partitioned tables

From
Domagoj Smoljanovic
Date:
Thanx Tom and Amit for the effort.

Looking forward to try it out.
D.

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: 16. rujna 2020. 20:41
To: Amit Langote <amitlangote09@gmail.com>
Cc: Alvaro Herrera <alvherre@2ndquadrant.com>; Domagoj Smoljanovic <domagoj.smoljanovic@oradian.com>;
pgsql-hackers@postgresql.org
Subject: Re: pg_restore causing deadlocks on partitioned tables

Amit Langote <amitlangote09@gmail.com> writes:
> Updated patch attached.

Pushed with a little bit of fooling about.  After looking at the git history, I saw that the Assert we were wondering
aboutused to be just "Assert(constr)", and there were not run-time checks on whether constr is null.  That was changed
whenf0e44751d added partition constraint checking into ExecConstraints' responsibilities. 
At some later point that code was removed from ExecConstraints, but we failed to undo the other changes in
ExecConstraints,leaving it looking pretty silly.  So I reverted this to the way it was, with just an Assert and no
regularchecks. 

I also did a bit more work on the comments.  (Speaking of which, is there a better place to put the commentary you
removedfrom InitResultRelInfo?  It was surely wildly out of place there, but I'm wondering if maybe we have a README
thatshould cover it.) 

I pushed this to HEAD only, and the other patch as far back as v12, so we will have a solution to the deadlock problem
inv12. 

            regards, tom lane



Re: pg_restore causing deadlocks on partitioned tables

From
Amit Langote
Date:
On Thu, Sep 17, 2020 at 3:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > Updated patch attached.
>
> Pushed with a little bit of fooling about.

Thank you.

>  After looking at the
> git history, I saw that the Assert we were wondering about used to
> be just "Assert(constr)", and there were not run-time checks on
> whether constr is null.  That was changed when f0e44751d added
> partition constraint checking into ExecConstraints' responsibilities.
> At some later point that code was removed from ExecConstraints,
> but we failed to undo the other changes in ExecConstraints, leaving
> it looking pretty silly.  So I reverted this to the way it was,
> with just an Assert and no regular checks.
>
> I also did a bit more work on the comments.  (Speaking of which,
> is there a better place to put the commentary you removed from
> InitResultRelInfo?  It was surely wildly out of place there,
> but I'm wondering if maybe we have a README that should cover it.)

Actually, the two points of interest in that now removed comment,
which was this:

-    * Partition constraint, which also includes the partition constraint of
-    * all the ancestors that are partitions.  Note that it will be checked
-    * even in the case of tuple-routing where this table is the target leaf
-    * partition, if there any BR triggers defined on the table.  Although
-    * tuple-routing implicitly preserves the partition constraint of the
-    * target partition for a given row, the BR triggers may change the row
-    * such that the constraint is no longer satisfied, which we must fail for
-    * by checking it explicitly.
-    *
-    * If this is a partitioned table, the partition constraint (if any) of a
-    * given row will be checked just before performing tuple-routing.

are also mentioned, although in less words, where they are relevant:

In ExecInsert():

        /*
         * Also check the tuple against the partition constraint, if there is
         * one; except that if we got here via tuple-routing, we don't need to
         * if there's no BR trigger defined on the partition.
         */
        if (resultRelationDesc->rd_rel->relispartition &&
            (resultRelInfo->ri_PartitionRoot == NULL ||
             (resultRelInfo->ri_TrigDesc &&
              resultRelInfo->ri_TrigDesc->trig_insert_before_row)))
            ExecPartitionCheck(resultRelInfo, slot, estate, true);

In ExecFindPartition():

    /*
     * First check the root table's partition constraint, if any.  No point in
     * routing the tuple if it doesn't belong in the root table itself.
     */
    if (rootResultRelInfo->ri_RelationDesc->rd_rel->relispartition)
        ExecPartitionCheck(rootResultRelInfo, slot, estate, true);

Maybe that's enough?

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: pg_restore causing deadlocks on partitioned tables

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Thu, Sep 17, 2020 at 3:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I also did a bit more work on the comments.  (Speaking of which,
>> is there a better place to put the commentary you removed from
>> InitResultRelInfo?  It was surely wildly out of place there,
>> but I'm wondering if maybe we have a README that should cover it.)

> Actually, the two points of interest in that now removed comment,
> which was this:
> ...
> are also mentioned, although in less words, where they are relevant:

Ah, good.

> Maybe that's enough?

Works for me.  Thanks for checking that.

            regards, tom lane