Thread: Segfault on logical replication to partitioned table with foreign children

Segfault on logical replication to partitioned table with foreign children

From
ilya.v.gladyshev@gmail.com
Date:
Hi,

Right now it is possible to add a partitioned table with foreign tables
as its children as a target of a subscription. It can lead to an assert
(or a segfault, if compiled without asserts) on a logical replication
worker when the worker attempts to insert the data received via
replication into the foreign table. Reproduce with caution, the worker
is going to crash and restart indefinitely. The setup:

Publisher on 5432 port:

CREATE TABLE parent (id int, num int);
CREATE PUBLICATION parent_pub FOR TABLE parent;

Subscriber on 5433 port:

CREATE EXTENSION postgres_fdw;
CREATE SERVER loopback foreign data wrapper postgres_fdw options (host
'127.0.0.1', port '5433', dbname 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE TABLE parent (id int, num int) partition by range (id);
CREATE FOREIGN TABLE p1 PARTITION OF parent DEFAULT SERVER loopback;
CREATE TABLE p1_loc(id int, num int);
CREATE SUBSCRIPTION parent_sub CONNECTION 'host=127.0.0.1 port=5432
dbname=postgres' PUBLICATION parent_pub;

Then run an insert on the publisher: INSERT INTO parent VALUES (1, 1);

This will cause a segfault or raise an assert, because inserting into
foreign tables via logical replication is not possible. The solution I
propose is to add recursive checks of relkind for children of a target,
if the target is a partitioned table. I have attached a patch for this
and managed to reproduce this on REL_14_STABLE as well, not sure if a
patch for that version is also needed.

Kind Regards,
Ilya Gladyshev


Attachment

Re: Segfault on logical replication to partitioned table with foreign children

From
Dilip Kumar
Date:
On Sat, Oct 29, 2022 at 1:01 AM <ilya.v.gladyshev@gmail.com> wrote:
>
> Right now it is possible to add a partitioned table with foreign tables
> as its children as a target of a subscription. It can lead to an assert
> (or a segfault, if compiled without asserts) on a logical replication
> worker when the worker attempts to insert the data received via
> replication into the foreign table. Reproduce with caution, the worker
> is going to crash and restart indefinitely. The setup:

Yes, this looks like a bug and your fix seems correct to me.  It would
be nice to add a test case for this scenario.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Dilip Kumar <dilipbalaut@gmail.com> writes:
> Yes, this looks like a bug and your fix seems correct to me.  It would
> be nice to add a test case for this scenario.

A test case doesn't seem that exciting to me.  If we were trying to
make it actually work, then yeah, but throwing an error isn't that
useful to test.  The code will be exercised by replication to a
regular partitioned table (I assume we do have tests for that).

What I'm wondering about is whether we can refactor this code
to avoid so many usually-useless catalog lookups.  Pulling the
namespace name, in particular, is expensive and we generally
are not going to need the result.  In the child-rel case it'd
be much better to pass the opened relation and let the error-check
subroutine work from that.  Maybe we should just do it like that
at the start of the recursion, too?  Or pass the relid and let
the subroutine look up the names only in the error case.

A completely different line of thought is that this doesn't seem
like a terribly bulletproof fix, since children could get added to
a partitioned table after we look.  Maybe it'd be better to check
the relkind at the last moment before we do something that depends
on a child table being a relation.

            regards, tom lane



Re: Segfault on logical replication to partitioned table with foreign children

From
Alvaro Herrera
Date:
On 2022-Oct-28, ilya.v.gladyshev@gmail.com wrote:

> This will cause a segfault or raise an assert, because inserting into
> foreign tables via logical replication is not possible. The solution I
> propose is to add recursive checks of relkind for children of a target,
> if the target is a partitioned table.

However, I imagine that the only reason we don't support this is that
the code hasn't been written yet. I think it would be better to write
that code, so that we don't have to raise any error at all (unless the
foreign table is something that doesn't support DML, in which case we
would have to raise an error).  Of course, we still have to fix it in
backbranches, but we can just do it as a targeted check at the moment of
insert/update, not at the moment of subscription create/alter.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)



RE: Segfault on logical replication to partitioned table with foreign children

From
"shiy.fnst@fujitsu.com"
Date:
On Sun, Oct 30, 2022 9:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> What I'm wondering about is whether we can refactor this code
> to avoid so many usually-useless catalog lookups.  Pulling the
> namespace name, in particular, is expensive and we generally
> are not going to need the result.  In the child-rel case it'd
> be much better to pass the opened relation and let the error-check
> subroutine work from that.  Maybe we should just do it like that
> at the start of the recursion, too?  Or pass the relid and let
> the subroutine look up the names only in the error case.
>
> A completely different line of thought is that this doesn't seem
> like a terribly bulletproof fix, since children could get added to
> a partitioned table after we look.  Maybe it'd be better to check
> the relkind at the last moment before we do something that depends
> on a child table being a relation.
>

I agree. So maybe we can add this check in apply_handle_tuple_routing().

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5250ae7f54..e941b68e4b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -2176,6 +2176,10 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
        Assert(partrelinfo != NULL);
        partrel = partrelinfo->ri_RelationDesc;

+       /* Check for supported relkind. */
+       CheckSubscriptionRelkind(partrel->rd_rel->relkind,
+                                                        relmapentry->remoterel.nspname,
relmapentry->remoterel.relname);
+
        /*
         * To perform any of the operations below, the tuple must match the
         * partition's rowtype. Convert if needed or just copy, using a dedicated


Regards,
Shi yu



Re: Segfault on logical replication to partitioned table with foreign children

From
Dilip Kumar
Date:
On Sun, Oct 30, 2022 at 7:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Dilip Kumar <dilipbalaut@gmail.com> writes:
> > Yes, this looks like a bug and your fix seems correct to me.  It would
> > be nice to add a test case for this scenario.
>
> A test case doesn't seem that exciting to me.  If we were trying to
> make it actually work, then yeah, but throwing an error isn't that
> useful to test.  The code will be exercised by replication to a
> regular partitioned table (I assume we do have tests for that).

That's true, but we missed this case because of the absence of the
test case so I thought at least we can add it now to catch any future
bug in case of any behavior change.

> A completely different line of thought is that this doesn't seem
> like a terribly bulletproof fix, since children could get added to
> a partitioned table after we look.  Maybe it'd be better to check
> the relkind at the last moment before we do something that depends
> on a child table being a relation.

+1

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Segfault on logical replication to partitioned table with foreign children

From
ilya.v.gladyshev@gmail.com
Date:
On Sun, 2022-10-30 at 09:39 -0400, Tom Lane wrote:
> Dilip Kumar <dilipbalaut@gmail.com> writes:
> > Yes, this looks like a bug and your fix seems correct to me.  It
> > would
> > be nice to add a test case for this scenario.
> 
> A test case doesn't seem that exciting to me.  If we were trying to
> make it actually work, then yeah, but throwing an error isn't that
> useful to test.  The code will be exercised by replication to a
> regular partitioned table (I assume we do have tests for that).
> 
> What I'm wondering about is whether we can refactor this code
> to avoid so many usually-useless catalog lookups.  Pulling the
> namespace name, in particular, is expensive and we generally
> are not going to need the result.  In the child-rel case it'd
> be much better to pass the opened relation and let the error-check
> subroutine work from that.  Maybe we should just do it like that
> at the start of the recursion, too?  Or pass the relid and let
> the subroutine look up the names only in the error case.

Sure, I think passing in the opened relation is a good idea.

> A completely different line of thought is that this doesn't seem
> like a terribly bulletproof fix, since children could get added to
> a partitioned table after we look.  Maybe it'd be better to check
> the relkind at the last moment before we do something that depends
> on a child table being a relation.

These checks are run both on subscription DDL commands, which is good
to get some early feedback, and inside logical_rel_open(), right before
something useful is about to get done to the relation, so we should be
good here. I think some tests would actually be nice to verify this,
but I don't really have a strong opinion about it.

I'll refactor the patch and post a bit later.





Re: Segfault on logical replication to partitioned table with foreign children

From
Ilya Gladyshev
Date:
On Sun, 2022-10-30 at 16:52 +0100, Alvaro Herrera wrote:
> On 2022-Oct-28, ilya.v.gladyshev@gmail.com wrote:
> 
> > This will cause a segfault or raise an assert, because inserting
> > into
> > foreign tables via logical replication is not possible. The
> > solution I
> > propose is to add recursive checks of relkind for children of a
> > target,
> > if the target is a partitioned table.
> 
> However, I imagine that the only reason we don't support this is that
> the code hasn't been written yet. I think it would be better to write
> that code, so that we don't have to raise any error at all (unless
> the
> foreign table is something that doesn't support DML, in which case we
> would have to raise an error).  Of course, we still have to fix it in
> backbranches, but we can just do it as a targeted check at the moment
> of
> insert/update, not at the moment of subscription create/alter.
> 

Sure, this patch is just a quick fix. A proper implementation of
logical replication into foreign tables would be a much more difficult
undertaking. I think this patch is simple enough, the checks in the
patch are performed both on subscription DDL and when the relation is
opened for logical replication, so it gives both early feedback and
last-minute checks as well. All the code infrastructure for these kinds
of checks is already in place, so I think it's a good idea to use it.

P.S. sorry, duplicating the message, forgot to cc the mailing list the
first time




Re: Segfault on logical replication to partitioned table with foreign children

From
Ilya Gladyshev
Date:
On Mon, 2022-10-31 at 03:20 +0000, shiy.fnst@fujitsu.com wrote:
> On Sun, Oct 30, 2022 9:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > 
> > What I'm wondering about is whether we can refactor this code
> > to avoid so many usually-useless catalog lookups.  Pulling the
> > namespace name, in particular, is expensive and we generally
> > are not going to need the result.  In the child-rel case it'd
> > be much better to pass the opened relation and let the error-check
> > subroutine work from that.  Maybe we should just do it like that
> > at the start of the recursion, too?  Or pass the relid and let
> > the subroutine look up the names only in the error case.
> > 
> > A completely different line of thought is that this doesn't seem
> > like a terribly bulletproof fix, since children could get added to
> > a partitioned table after we look.  Maybe it'd be better to check
> > the relkind at the last moment before we do something that depends
> > on a child table being a relation.
> > 
> 
> I agree. So maybe we can add this check in
> apply_handle_tuple_routing().
> 
> diff --git a/src/backend/replication/logical/worker.c
> b/src/backend/replication/logical/worker.c
> index 5250ae7f54..e941b68e4b 100644
> --- a/src/backend/replication/logical/worker.c
> +++ b/src/backend/replication/logical/worker.c
> @@ -2176,6 +2176,10 @@ apply_handle_tuple_routing(ApplyExecutionData
> *edata,
>         Assert(partrelinfo != NULL);
>         partrel = partrelinfo->ri_RelationDesc;
> 
> +       /* Check for supported relkind. */
> +       CheckSubscriptionRelkind(partrel->rd_rel->relkind,
> +                                                        relmapentry-
> >remoterel.nspname, relmapentry->remoterel.relname);
> +
>         /*
>          * To perform any of the operations below, the tuple must
> match the
>          * partition's rowtype. Convert if needed or just copy, using
> a dedicated
> 
> 
> Regards,
> Shi yu

I have verified that the current patch handles the attaching of new
partitions to the target partitioned table by throwing an error on
attempt to insert into a foreign table inside the logical replication
worker. I have refactored the code to minimize cache lookups, but I am
yet to write the tests for this. See the attached patch for the
refactored version.



Attachment
Ilya Gladyshev <ilya.v.gladyshev@gmail.com> writes:
> [ v2-0001-check-relkind-of-subscription-target-recursively.patch ]

Hmm.  I like Shi yu's way better (formal patch attached).  Checking
at CREATE/ALTER SUBSCRIPTION is much more complicated, and it's really
insufficient, because what if someone adds a new partition after
setting up the subscription?

I get the argument about it being a useful check for simple mistakes,
but I don't entirely buy that argument, because I think there are
potential use-cases that it'd disallow needlessly.  Imagine a
partitioned table that receives replication updates, but only into
the "current" partition; older partitions are basically static.
Now suppose you'd like to offload some of that old seldom-used data
to another server, and make those partitions into foreign tables
so you can still access it at need.  Such a setup will work perfectly
fine today, but this patch would break it.

So I think what we want is to check when we identify the partition.
Maybe Shi yu missed a place or two to check, but I verified that the
attached stops the crash.

There'd still be value in refactoring to avoid premature lookup
of the namespace name, but that's just micro-optimization.

            regards, tom lane

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5250ae7f54..17b9c42ecf 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -2176,6 +2176,15 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
     Assert(partrelinfo != NULL);
     partrel = partrelinfo->ri_RelationDesc;

+    /*
+     * Check for supported relkind.  We need this since partitions might be of
+     * unsupported relkinds; and the set of partitions can change, so checking
+     * at CREATE/ALTER SUBSCRIPTION would be insufficient.
+     */
+    CheckSubscriptionRelkind(partrel->rd_rel->relkind,
+                             get_namespace_name(RelationGetNamespace(partrel)),
+                             RelationGetRelationName(partrel));
+
     /*
      * To perform any of the operations below, the tuple must match the
      * partition's rowtype. Convert if needed or just copy, using a dedicated

Since we're getting pretty close to the next set of back-branch releases,
I went ahead and pushed a minimal fix along the lines suggested by Shi Yu.
(I realized that there's a second ExecFindPartition call in worker.c that
also needs a check.)  We still can at leisure think about refactoring
CheckSubscriptionRelkind to avoid unnecessary lookups.  I think that
is something we should do only in HEAD; it'll just be a marginal savings,
not worth the risks of API changes in stable branches.  The other loose
end is whether to worry about a regression test case.  I'm inclined not
to bother.  The only thing that isn't getting exercised is the actual
ereport, which probably isn't in great need of routine testing.

            regards, tom lane



Re: Segfault on logical replication to partitioned table with foreign children

From
Ilya Gladyshev
Date:
On Wed, 2022-11-02 at 12:37 -0400, Tom Lane wrote:
> Since we're getting pretty close to the next set of back-branch
> releases,
> I went ahead and pushed a minimal fix along the lines suggested by
> Shi Yu.
> (I realized that there's a second ExecFindPartition call in worker.c
> that
> also needs a check.)  We still can at leisure think about refactoring
> CheckSubscriptionRelkind to avoid unnecessary lookups.  I think that
> is something we should do only in HEAD; it'll just be a marginal
> savings,
> not worth the risks of API changes in stable branches.  The other
> loose
> end is whether to worry about a regression test case.  I'm inclined
> not
> to bother.  The only thing that isn't getting exercised is the actual
> ereport, which probably isn't in great need of routine testing.
> 
>                         regards, tom lane

I agree that early checks limit some of the functionality that was
available before, so I guess the only way to preserve it is to do only
the last-minute checks after routing, fair enough. As for the
refactoring of the premature lookup, I have done some work on that in
the previous patch, I think we can just use it. Attached a separate
patch with the refactoring.

Attachment