Re: [HACKERS] Logical replication and inheritance - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [HACKERS] Logical replication and inheritance
Date
Msg-id 295a7e00-0038-ea5a-fff1-69588fb725ea@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] Logical replication and inheritance  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] Logical replication and inheritance  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On 2017/04/14 21:44, Petr Jelinek wrote:
> On 14/04/17 06:14, Amit Langote wrote:
>> On 2017/04/14 10:57, Petr Jelinek wrote:
>>> For me the current behavior with inherited tables is correct.
>>
>> By the way, what do you think about the pg_dump example/issue I mentioned?
>>  Is that a pg_dump problem or backend?  To reiterate, if you add an
>> inheritance parent to a publication, dump the database, and restore it
>> into another, an error occurs.  Why?  Because every child table is added
>> *twice* because of the way publication tables are dumped.  Once by itself
>> and again via inheritance expansion when the parent is added.  Adding a
>> table again to the same publication is currently an error, which I was
>> wondering if it could be made a no-op instead.
>>
> 
> That's good question. I think it's fair to treat membership of table in
> publication is "soft object" or "property" rather than real object where
> we would enforce error on ADD of something that's already there. So I am
> not against changing it to no-op (like doing alter sequence owned by to
> column which is the current owner already).

The patch committed by Peter should be enough for the pg_dump issue I was
talking about (a pg_dump fix after all).  It seems that at least Tom and
Robert are not too excited about making duplicate membership addition a
no-op, so I don't intend to push for it.

>>> What I would like partitioned tables support to look like is that if we
>>> add partitioned table, the data decoded from any of the partitions would
>>> be sent as if the change was for that partitioned table so that the
>>> partitioning scheme on subscriber does not need to be same as on
>>> publisher. That's nontrivial to do though probably.
>>
>> I agree that it'd be nontrivial.  I'm not sure if you're also implying
>> that a row decoded from a partition is *never* published as having been
>> inserted into the partition itself.  A row can end up in a partition via
>> both the inserts into the partitioned table and the partition itself.
>> Also, AFAICT, obviously the output pluggin would have to have a dedicated
>> logic to determine which table to publish a given row as coming from
>> (possibly the root partitioned table), since nothing decode-able from WAL
>> is going to have that information.
>>
> 
> Well, yes that what I mean by nontrivial, IMHO the hard part is defining
> behavior (the coding is the easy part here). I think there are more or
> less 3 options, a) either partitions can't be added to publication
> individually or b) they will always publish their data as their main
> partitioned table (which for output plugin means same thing, ie we'll
> only see the rows as changes to partitioned tables) or alternatively c)
> if partition is added and partitioned table is added we publish changes
> twice, but that seems like quite bad option to me.

Note that (a) will amount to reversing what v10 will support, that is, can
publish individual leaf partitions, but not the partitioned tables.

About (b): sounds good, but not sure of the interface.

About (c): agreed that a bad option

> This was BTW the reason why I was saying in the original partitioning
> thread that it's unclear to me from documentation what is general
> guiding principle in terms of threating partitions as individual objects
> or not. Currently it's mixed bag, structure is treated as global for
> whole partitioned tree, but things like indexes can be defined
> separately on individual partitions. Also existing tables retain some of
> their differences when they are being added as partitions but other
> differences are strictly checked and will result in error. I don't quite
> understand if this is current implementation limitation and we'll
> eventually "lock down" the differences (when we have global indexes and
> such) or if it's intended long term to allow differences between
> partitions and what will be the rules for what's allowed and what not.
> 
>> Also, with the current state of partitioning, if a row decoded and
>> published as coming from the partitioned table had no corresponding
>> partition defined on the destination server, an error will occur in the
>> subscription worker I'd guess.  Or may be we don't assume anything about
>> whether the table on the subscription end is partitioned or not.
>>
>> Anyway, that perhaps also means that for time being, we might need to go
>> with the following option that Robert mentioned (I suppose strictly in the
>> context of partitioned tables, not general inheritance):
>>
>> (1) That's an error; the user should publish the partitions instead.
>>
> 
> Yes I agree with Robert's statement and that's how it should behave already.

Yes, it does.

> 
>> That is, we should make adding a partitioned table to a publication a user
>> error (feature not supported).
>>
> 
> It already should be error no? (Unless some change was committed that I
> missed, I definitely wrote it as error in original patch).

Oh, you're right.  Although, the error message could be spelled a bit
differently, IMHO, which currently is:

create table p (a int, b char) partition by list (a);
create publication mypub for table p;
ERROR:  "p" is not a table
DETAIL:  Only tables can be added to publications.

We could be more explicit here and say the following instead:

create publication mypub for table p;
ERROR:  "p" is a partitioned table
DETAIL:  Adding partitioned tables to publications is not supported.

Thoughts?  (a patch is attached for consideration)

I think it could be argued that quite a few instances "%s is not a table"
could be preceded by such explicit check for if partitioned, but that
seems like a topic for a different thread.  Only a subset of those sites
are such that a table's *being partitioned* prevents it from being
processed, for which it might make sense to add the explicit check.

Thanks,
Amit

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [HACKERS] Comments not allowed on partitioned table columns
Next
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS] Quorum commit for multiple synchronous replication.