Thread: [HACKERS] BEFORE trigger can cause undetected partition constraint violation
I just discovered that a BEFORE trigger can allow data into a partition that violates the relevant partition constraint. This is bad. Here is an example: rhaas=# create or replace function t() returns trigger as $$begin new.a := 2; return new; end$$ language plpgsql; CREATE FUNCTION rhaas=# create table foo (a int, b text) partition by list (a); CREATE TABLE rhaas=# create table foo1 partition of foo for values in (1); CREATE TABLE rhaas=# create trigger x before insert on foo1 for each row execute procedure t(); CREATE TRIGGER rhaas=# insert into foo values (1, 'hi there'); INSERT 0 1 rhaas=# select tableoid::regclass, * from foo;tableoid | a | b ----------+---+----------foo1 | 2 | hi there (1 row) That row violates the partition constraint, which requires that a = 1. You can see that by trying to insert the same row into the partition directly: rhaas=# insert into foo1 values (2, 'hi there'); ERROR: new row for relation "foo1" violates partition constraint DETAIL: Failing row contains (2, hi there). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Jeevan Ladhe
Date:
I tried to debug this, and I see that while the target partition index is correctly
found in ExecInsert(), somehow the resultRelInfo->ri_PartitionCheck is NIL, this
is extracted from array mstate->mt_partitions.
This prevents execution of constraints in following code snippet in ExecInsert():
/*
* Check the constraints of the tuple
*/
if (resultRelationDesc->rd_att->constr || resultRelInfo->ri_PartitionCheck)
ExecConstraints(resultRelInfo, slot, estate);
I couldn't debug it further today.
Regards,
Jeevan Ladhe
On Fri, Jun 2, 2017 at 1:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I just discovered that a BEFORE trigger can allow data into a
partition that violates the relevant partition constraint. This is
bad.
Here is an example:
rhaas=# create or replace function t() returns trigger as $$begin
new.a := 2; return new; end$$ language plpgsql;
CREATE FUNCTION
rhaas=# create table foo (a int, b text) partition by list (a);
CREATE TABLE
rhaas=# create table foo1 partition of foo for values in (1);
CREATE TABLE
rhaas=# create trigger x before insert on foo1 for each row execute
procedure t();
CREATE TRIGGER
rhaas=# insert into foo values (1, 'hi there');
INSERT 0 1
rhaas=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+----------
foo1 | 2 | hi there
(1 row)
That row violates the partition constraint, which requires that a = 1.
You can see that by trying to insert the same row into the partition
directly:
rhaas=# insert into foo1 values (2, 'hi there');
ERROR: new row for relation "foo1" violates partition constraint
DETAIL: Failing row contains (2, hi there).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BEFORE trigger can cause undetected partition constraint violation
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > I just discovered that a BEFORE trigger can allow data into a > partition that violates the relevant partition constraint. This is > bad. Without having checked the code, I imagine the reason for this is that BEFORE triggers are fired after tuple routing occurs. Re-ordering that seems problematic, because what if you have different triggers on different partitions? We'd have to forbid that, probably by saying that only the parent table's BEFORE ROW triggers are fired, but that seems not very nice. The alternative is to forbid BEFORE triggers on partitions to alter the routing result, probably by rechecking the partition constraint after trigger firing. We have always checked ordinary CHECK constraints after BEFORE triggers fire, precisely because a trigger might change the data to make it fail (or pass!) a constraint. I take it somebody decided that wasn't necessary for partition constraints. Penny wise and pound foolish? regards, tom lane
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Robert Haas
Date:
On Thu, Jun 1, 2017 at 6:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Without having checked the code, I imagine the reason for this is > that BEFORE triggers are fired after tuple routing occurs. Yep. > Re-ordering that seems problematic, because what if you have different > triggers on different partitions? We'd have to forbid that, probably > by saying that only the parent table's BEFORE ROW triggers are fired, > but that seems not very nice. The parent hasn't got any triggers; that's forbidden. > The alternative is to forbid BEFORE triggers on partitions to alter > the routing result, probably by rechecking the partition constraint > after trigger firing. That's what we need to do. Until we do tuple routing, we don't know which partition we're addressing, so we don't know which triggers we're firing. So the only way to prevent this is to recheck. Which I think is supposed to work already, but apparently doesn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Amit Langote
Date:
On 2017/06/02 10:36, Robert Haas wrote: > On Thu, Jun 1, 2017 at 6:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Without having checked the code, I imagine the reason for this is >> that BEFORE triggers are fired after tuple routing occurs. > > Yep. > >> Re-ordering that seems problematic, because what if you have different >> triggers on different partitions? We'd have to forbid that, probably >> by saying that only the parent table's BEFORE ROW triggers are fired, >> but that seems not very nice. > > The parent hasn't got any triggers; that's forbidden. > >> The alternative is to forbid BEFORE triggers on partitions to alter >> the routing result, probably by rechecking the partition constraint >> after trigger firing. > > That's what we need to do. Until we do tuple routing, we don't know > which partition we're addressing, so we don't know which triggers > we're firing. So the only way to prevent this is to recheck. Which I > think is supposed to work already, but apparently doesn't. That has to do with the assumption written down in the following portion of a comment in InitResultRelInfo(): /* * If partition_root has been specified, that means we are building the * ResultRelInfo for one of its leaf partitions. In that case, we need * *not* initialize the leaf partition's constraint, but rather the * partition_root's (if any). which, as it turns out, is wrong. It completely disregards the fact that BR triggers are executed after tuple-routing and can change the row. Attached patch makes InitResultRelInfo() *always* initialize the partition's constraint, that is, regardless of whether insert/copy is through the parent or directly on the partition. I'm wondering if ExecInsert() and CopyFrom() should check if it actually needs to execute the constraint? I mean it's needed if there exists BR insert triggers which may change the row, but not otherwise. The patch currently does not implement that check. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Robert Haas
Date:
On Fri, Jun 2, 2017 at 12:51 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Attached patch makes InitResultRelInfo() *always* initialize the > partition's constraint, that is, regardless of whether insert/copy is > through the parent or directly on the partition. I'm wondering if > ExecInsert() and CopyFrom() should check if it actually needs to execute > the constraint? I mean it's needed if there exists BR insert triggers > which may change the row, but not otherwise. The patch currently does not > implement that check. I think it should. I mean, otherwise we're leaving a probably-material amount of performance on the table. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Amit Langote
Date:
On 2017/06/03 1:56, Robert Haas wrote: > On Fri, Jun 2, 2017 at 12:51 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Attached patch makes InitResultRelInfo() *always* initialize the >> partition's constraint, that is, regardless of whether insert/copy is >> through the parent or directly on the partition. I'm wondering if >> ExecInsert() and CopyFrom() should check if it actually needs to execute >> the constraint? I mean it's needed if there exists BR insert triggers >> which may change the row, but not otherwise. The patch currently does not >> implement that check. > > I think it should. I mean, otherwise we're leaving a > probably-material amount of performance on the table. I agree. Updated the patch to implement the check. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Robert Haas
Date:
On Mon, Jun 5, 2017 at 1:02 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/06/03 1:56, Robert Haas wrote: >> On Fri, Jun 2, 2017 at 12:51 AM, Amit Langote >> <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>> Attached patch makes InitResultRelInfo() *always* initialize the >>> partition's constraint, that is, regardless of whether insert/copy is >>> through the parent or directly on the partition. I'm wondering if >>> ExecInsert() and CopyFrom() should check if it actually needs to execute >>> the constraint? I mean it's needed if there exists BR insert triggers >>> which may change the row, but not otherwise. The patch currently does not >>> implement that check. >> >> I think it should. I mean, otherwise we're leaving a >> probably-material amount of performance on the table. > > I agree. Updated the patch to implement the check. OK, so this isn't quite right. Consider the following test case: rhaas=# create table x (a int, b int, c int) partition by list (a); CREATE TABLE rhaas=# create table y partition of x for values in (1) partition by list (b); CREATE TABLE rhaas=# create table z partition of y for values in (1); CREATE TABLE rhaas=# insert into y values (2,1,1); INSERT 0 1 The absence of the before-trigger is treated as evidence that z need not revalidate the partition constraint, but actually it (or something) still needs to enforce the part of it that originates from y's ancestors. In short, this patch would reintroduce the bug that was fixed in 39162b2030fb0a35a6bb28dc636b5a71b8df8d1c, and would do so by removing the exact same code that was added (by you!) in that patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Amit Langote
Date:
On 2017/06/07 1:19, Robert Haas wrote: > On Mon, Jun 5, 2017 at 1:02 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> On 2017/06/03 1:56, Robert Haas wrote: >>> On Fri, Jun 2, 2017 at 12:51 AM, Amit Langote >>> <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>>> Attached patch makes InitResultRelInfo() *always* initialize the >>>> partition's constraint, that is, regardless of whether insert/copy is >>>> through the parent or directly on the partition. I'm wondering if >>>> ExecInsert() and CopyFrom() should check if it actually needs to execute >>>> the constraint? I mean it's needed if there exists BR insert triggers >>>> which may change the row, but not otherwise. The patch currently does not >>>> implement that check. >>> >>> I think it should. I mean, otherwise we're leaving a >>> probably-material amount of performance on the table. >> >> I agree. Updated the patch to implement the check. > > OK, so this isn't quite right. Consider the following test case: > > rhaas=# create table x (a int, b int, c int) partition by list (a); > CREATE TABLE > rhaas=# create table y partition of x for values in (1) partition by list (b); > CREATE TABLE > rhaas=# create table z partition of y for values in (1); > CREATE TABLE > rhaas=# insert into y values (2,1,1); > INSERT 0 1 Gah! > The absence of the before-trigger is treated as evidence that z need > not revalidate the partition constraint, but actually it (or > something) still needs to enforce the part of it that originates from > y's ancestors. In short, this patch would reintroduce the bug that > was fixed in 39162b2030fb0a35a6bb28dc636b5a71b8df8d1c, and would do so > by removing the exact same code that was added (by you!) in that > patch. I think we will have to go for the "or something" here, which is the way I should have originally proposed it (I mean the patch that went in as 39162b2030fb0a35a6bb28dc636b5a71b8df8d1c). :-( How about we export ExecPartitionCheck() out of execMain.c and call it just before ExecFindPartition() using the root table's ResultRelInfo? If the root table is a partition, its ResultRelInfo.ri_PartitionCheck must have been initialized, which ExecPartitionCheck will check. Since there cannot be any BR triggers on the root table (because partitioned), we can safely do that. After tuple-routing, if the target leaf partition has BR triggers, any violating changes they might make will be checked by ExecConstraints() using the leaf partition's ResultRelInfo, whose ri_PartitionCheck consists of its own partition constraints plus that of any of its ancestors that are partitions. If the leaf partition does not have any BR triggers we need not check any partition constraints just as the patch does. (Remember that we already checked the root table's partition constraint before we began routing the tuple, as the proposed updated patch will do, and we don't need to worry about any of intermediate ancestors, because if the tuple does not satisfy the constraint of any one of those, tuple-routing will fail anyway.) I proposed a similar thing in the hash partitioning thread [1], where Dilip was complaining about name of the table that appears in the "violates partition constraint" error message. Even if the tuple failed an ancestor table's partition constraint, since the ResultRelInfo passed to ExecConstraints() is the leaf partition's, the name shown is also leaf partition's. ISTM, showing the leaf partition's name is fine as long as it's a NOT NULL or a CHECK constraint failing, because they are explicitly attached to the leaf table, but maybe not fine when an implicitly inherited internal partition constraint fails. Attached updated patch that implements the change described above, in addition to fixing the originally reported bug. With the patch: create table x (a int, b int, c int) partition by list (a); create table y partition of x for values in (1) partition by list (b); create table z partition of y for values in (1); insert into y values (2,1,1); ERROR: new row for relation "y" violates partition constraint DETAIL: Failing row contains (2, 1, 1). -- whereas on HEAD, it shows the leaf partition's name insert into y values (2,1,1); ERROR: new row for relation "z" violates partition constraint DETAIL: Failing row contains (2, 1, 1). Thoughts? Thanks, Amit [1] https://www.postgresql.org/message-id/0ded7a50-0b35-1805-232b-f8edcf4cbadb%40lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Amit Langote
Date:
On 2017/06/07 11:57, Amit Langote wrote: > How about we export ExecPartitionCheck() out of execMain.c and call it > just before ExecFindPartition() using the root table's ResultRelInfo? Turns out there wasn't a need to export ExecPartitionCheck after all. Instead of calling it from execModifyTable.c and copy.c, it's better to call it at the beginning of ExecFindPartition() itself. That way, there is no need to add the same code both in CopyFrom() and ExecInsert(), nor is there need to make ExecPartitionCheck() public. That's how the patch attached with the previous email does it anyway. Thanks, Amit
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Robert Haas
Date:
On Wed, Jun 7, 2017 at 1:23 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/06/07 11:57, Amit Langote wrote: >> How about we export ExecPartitionCheck() out of execMain.c and call it >> just before ExecFindPartition() using the root table's ResultRelInfo? > > Turns out there wasn't a need to export ExecPartitionCheck after all. > Instead of calling it from execModifyTable.c and copy.c, it's better to > call it at the beginning of ExecFindPartition() itself. That way, there > is no need to add the same code both in CopyFrom() and ExecInsert(), nor > is there need to make ExecPartitionCheck() public. That's how the patch > attached with the previous email does it anyway. Cool. I think this is a sensible approach, and have committed the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] BEFORE trigger can cause undetected partitionconstraint violation
From
Amit Langote
Date:
On 2017/06/08 2:07, Robert Haas wrote: > On Wed, Jun 7, 2017 at 1:23 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> On 2017/06/07 11:57, Amit Langote wrote: >>> How about we export ExecPartitionCheck() out of execMain.c and call it >>> just before ExecFindPartition() using the root table's ResultRelInfo? >> >> Turns out there wasn't a need to export ExecPartitionCheck after all. >> Instead of calling it from execModifyTable.c and copy.c, it's better to >> call it at the beginning of ExecFindPartition() itself. That way, there >> is no need to add the same code both in CopyFrom() and ExecInsert(), nor >> is there need to make ExecPartitionCheck() public. That's how the patch >> attached with the previous email does it anyway. > > Cool. I think this is a sensible approach, and have committed the patch. Thanks a lot. Regards, Amit