Re: Logical Replication WIP - Mailing list pgsql-hackers
| From | Petr Jelinek |
|---|---|
| Subject | Re: Logical Replication WIP |
| Date | |
| Msg-id | e48834c5-1db9-b381-3b7e-8f1ecb04dddd@2ndquadrant.com |
| In response to | Re: Logical Replication WIP (Steve Singer <steve@ssinger.info>) |
| List | pgsql-hackers |
On 05/09/16 23:35, Steve Singer wrote:
> On 09/05/2016 03:58 PM, Steve Singer wrote:
>> On 08/31/2016 04:51 PM, Petr Jelinek wrote:
>>> Hi,
>>>
>>> and one more version with bug fixes, improved code docs and couple
>>> more tests, some general cleanup and also rebased on current master
>>> for the start of CF.
>>>
>>>
>>>
>>
>
> A few more things I noticed when playing with the patches
>
> 1. Creating a subscription to yourself ends pretty badly,
> the 'CREATE SUBSCRIPTION' command seems to get stuck, and you can't kill
> it. The background process seems to be waiting for a transaction to
> commit (I assume the create subscription command). I had to kill -9 the
> various processes to get things to stop. Getting confused about
> hostnames and ports is a common operator error.
>
Hmm, I guess there is a missing interrupts check; I will look. It would be
great to detect this properly, but I am not really sure how to do that, as
AFAIK there is no accurate way to detect that the connection is to yourself.
> 2. Failures during the initial subscription aren't recoverable
>
> For example
>
> on db1
> create table a(id serial4 primary key,b text);
> insert into a(b) values ('1');
> create publication testpub for table a;
>
> on db2
> create table a(id serial4 primary key,b text);
> insert into a(b) values ('1');
> create subscription testsub connection 'host=localhost port=5440
> dbname=test' publication testpub;
>
> I then get in my db2 log
>
> ERROR: duplicate key value violates unique constraint "a_pkey"
> DETAIL: Key (id)=(1) already exists.
> LOG: worker process: logical replication worker 16396 sync 16387 (PID
> 10583) exited with exit code 1
> LOG: logical replication sync for subscription testsub, table a started
> ERROR: could not crate replication slot "testsub_sync_a": ERROR:
> replication slot "testsub_sync_a" already exists
>
>
> LOG: worker process: logical replication worker 16396 sync 16387 (PID
> 10585) exited with exit code 1
> LOG: logical replication sync for subscription testsub, table a started
> ERROR: could not crate replication slot "testsub_sync_a": ERROR:
> replication slot "testsub_sync_a" already exists
>
>
> and it keeps looping.
> If I then truncate "a" on db2 it doesn't help. (I'd expect at that point
> the initial subscription to work)
Hmm, looks like the error case does not clean up after itself correctly.
>
> If I then do on db2
> drop subscription testsub cascade;
>
> I still see a slot in use on db1
>
> select * FROM pg_replication_slots ;
>    slot_name    |  plugin  | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
> ----------------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
>  testsub_sync_a | pgoutput | logical   |  16384 | test     | f      |            |      |         1173 | 0/1566E08   | 0/1566E40
>
Same as above.
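Until that cleanup is fixed, the leftover slot can probably be removed by hand on db1. The following is only a sketch of that manual workaround, not part of the patch. It assumes the slot name shown in the output above and a session connected to the database the slot was created in ("test" here). pg_drop_replication_slot() refuses to drop a slot that is still active, which should not be a problem given active = f above.

```sql
-- Run on db1, connected to the database that owns the slot ("test" above).
-- Check that the leftover sync slot exists and is inactive.
SELECT slot_name, active
  FROM pg_replication_slots
 WHERE slot_name = 'testsub_sync_a';

-- Drop it; this raises an error if the slot is still active.
SELECT pg_drop_replication_slot('testsub_sync_a');
```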
--
Petr Jelinek                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services