Re: Logical Replication WIP - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: Logical Replication WIP
Date
Msg-id 17b4b439-74e3-8a00-ac14-d02ee3a761e6@2ndquadrant.com
Whole thread Raw
In response to Re: Logical Replication WIP  (Stas Kelvich <s.kelvich@postgrespro.ru>)
Responses Re: Logical Replication WIP  (Stas Kelvich <s.kelvich@postgrespro.ru>)
List pgsql-hackers
Hi,

On 11/08/16 13:34, Stas Kelvich wrote:
>
> * max_logical_replication_workers mentioned everywhere in docs, but guc.c defines
> variable called max_logical_replication_processes for postgresql.conf
>

Ah changed it in code but not in docs, will fix.

> * Since pg_subscription already shared across the cluster, it can be also handy to
> share pg_publications too and allow publication of tables from different databases. That
> is rare scenarios but quite important for virtual hosting use case — tons of small databases
> in a single postgres cluster.

You can't decode changes from multiple databases in one slot so I don't 
see the usefulness there. The pg_subscription is currently shared 
because it's technical necessity (as in I don't see how to solve the 
need to access the catalog from launcher in any other way) not because I 
think it's great design :)

>
> * There is no way to see attachecd tables/schemas to publication through \drp
>

That's mostly intentional as publications for table are visible in \d, 
but I am not against adding it to \drp.

> * As far as I understand there is no way to add table/tablespace right in CREATE
> PUBLICATION and one need explicitly do ALTER PUBLICATION right after creation.
> May be add something like WITH TABLE/TABLESPACE to CREATE?
>

Yes, as I said to Masahiko Sawada, it's just not there yet but I plan to 
have that.

> * So binary protocol goes into core. Is it still possible to use it as decoding plugin for
> manually created walsender? May be also include json as it was in pglogical? While
> i’m not arguing that it should be done, i’m interested about your opinion on that.
>

Well the plugin is bit more integrated into the publication infra so if 
somebody would want to use it directly they'd have to use that part as 
well. OTOH the protocol itself is provided as API so it's reusable by 
other plugins if needed.

JSON plugin is something that would be nice to have in core as well, but 
I don't think it's part of this patch.

> * Also I’ve noted that you got rid of reserved byte (flags) in protocol comparing to
> pglogical_native. It was very handy to use it for two phase tx decoding (0 — usual
> commit, 1 — prepare, 2 — commit prepared), because both prepare and commit
> prepared generates commit record in xlog.

Hmm maybe commit message could get it back. PGLogical has them sprinkled 
all around the protocol which I don't really like so I want to limit 
them to the places where they are actually useful.

>
>> On 05 Aug 2016, at 18:00, Petr Jelinek <petr@2ndquadrant.com> wrote:
>>
>> - DDL, I see several approaches we could do here for 10.0. a) don't
>>   deal with DDL at all yet, b) provide function which pushes the DDL
>>   into replication queue and then executes on downstream (like
>>   londiste, slony, pglogical do), c) capture the DDL query as text
>>   and allow user defined function to be called with that DDL text on
>>   the subscriber
>
> * Since here DDL is mostly ALTER / CREATE / DROP TABLE (or am I wrong?) may be
> we can add something like WITH SUBSCRIBERS to statements?
>

Not sure I follow. How does that help?

> * Talking about exact mechanism of DDL replication I like you variant b), but since we
> have transactional DDL, we can do two phase commit here. That will require two phase
> decoding and some logic about catching prepare responses through logical messages. If that
> approach sounds interesting i can describe proposal in more details and create a patch.
>

I'd think that such approach is somewhat more interesting with c) 
honestly. The difference between b) and c) is mostly about explicit vs 
implicit. I definitely would like to see the 2PC patch updated to work 
with this. But maybe it's wise to wait a while until the core of the 
patch stabilizes during the discussion.

> * Also I wasn’t able actually to run replication itself =) While regression tests passes, TAP
> tests and manual run stuck. pg_subscription_rel.substate never becomes ‘r’. I’ll investigate
> that more and write again.

Interesting, please keep me posted. It's possible for tables to stay in 
's' state for some time if there is nothing happening on the server, but 
that should not mean anything is stuck.

>
> * As far as I understand sync starts automatically on enabling publication. May be split that
> logic into a different command with some options? Like don’t sync at all for example.
>

I think SYNC should be option of subscription creation just like 
INITIALLY ENABLED/DISABLED is. And then there should be interface to 
resync a table manually (like pglogical has). Not yet sure how that 
interface should look like in terms of DDL though.

> * When I’m trying to create subscription to non-existent publication, CREATE SUBSRITION
> creates replication slot and do not destroys it:
>
> # create subscription sub connection 'host=127.0.0.1 dbname=postgres' publication mypub;
> NOTICE:  created replication slot "sub" on provider
> ERROR:  could not receive list of replicated tables from the provider: ERROR:  cache lookup failed for publication 0
> CONTEXT:  slot "sub", output plugin "pgoutput", in the list_tables callback
>
> after that:
>
> postgres=# drop subscription sub;
> ERROR:  subscription "sub" does not exist
> postgres=# create subscription sub connection 'host=127.0.0.1 dbname=postgres' publication pub;
> ERROR:  could not crate replication slot "sub": ERROR:  replication slot "sub" already exists
>

See the TODO in CreateSubscription function :)

> * Also can’t drop subscription:
>
> postgres=# \drs
>                           List of subscriptions
>  Name | Database | Enabled | Publication |            Conninfo
> ------+----------+---------+-------------+--------------------------------
>  sub  | postgres | t       | {mypub}     | host=127.0.0.1 dbname=postgres
> (1 row)
>
> postgres=# drop subscription sub;
> ERROR:  unrecognized object class: 6102

Yes that has been already reported.

>
> * Several time i’ve run in a situation where provider's postmaster ignores Ctrl-C until subscribed
> node is switched off.
>

Hmm I guess there is bug in signal processing code somewhere.


> * Patch with small typos fixed attached.
>
> I’ll do more testing, just want to share what i have so far.
>

Thanks for both.

--   Petr Jelinek                  http://www.2ndQuadrant.com/  PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Shay Rojansky
Date:
Subject: Re: Slowness of extended protocol
Next
From: Tobias Bussmann
Date:
Subject: Re: extract text from XML