Re: Logical Replication WIP - Mailing list pgsql-hackers

From Steve Singer
Subject Re: Logical Replication WIP
Date
Msg-id 57CDE524.3050607@ssinger.info
In response to Re: Logical Replication WIP  (Steve Singer <steve@ssinger.info>)
Responses Re: Logical Replication WIP
List pgsql-hackers
On 09/05/2016 03:58 PM, Steve Singer wrote:
> On 08/31/2016 04:51 PM, Petr Jelinek wrote:
>> Hi,
>>
>> and one more version with bug fixes, improved code docs and couple 
>> more tests, some general cleanup and also rebased on current master 
>> for the start of CF.
>>
>>
>>
>

A few more things I noticed when playing with the patches:

1. Creating a subscription to yourself ends pretty badly: the
'CREATE SUBSCRIPTION' command seems to get stuck, and you can't kill
it.  The background process seems to be waiting for a transaction to
commit (I assume the create subscription command).  I had to kill -9 the
various processes to get things to stop.  Getting confused about
hostnames and ports is a common operator error.

2. Failures during the initial subscription aren't recoverable.

For example:

on db1
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create publication testpub for table a;

on db2
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create subscription testsub connection 'host=localhost port=5440 dbname=test' publication testpub;

I then get in my db2 log

ERROR:  duplicate key value violates unique constraint "a_pkey"
DETAIL:  Key (id)=(1) already exists.
LOG:  worker process: logical replication worker 16396 sync 16387 (PID 10583) exited with exit code 1
LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR:  replication slot "testsub_sync_a" already exists
LOG:  worker process: logical replication worker 16396 sync 16387 (PID 10585) exited with exit code 1
LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR:  replication slot "testsub_sync_a" already exists


and it keeps looping.
If I then truncate "a" on db2, it doesn't help (I'd expect the initial
subscription to work at that point).

If I then run on db2

  drop subscription testsub cascade;

I still see a slot in use on db1:

select * FROM pg_replication_slots ;
   slot_name    |  plugin  | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
----------------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 testsub_sync_a | pgoutput | logical   |  16384 | test     | f      |            |      |         1173 | 0/1566E08   | 0/1566E40
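
Until the drop cleans up after itself, the leftover slot can presumably
be removed by hand on db1. A minimal sketch, assuming the slot is still
named testsub_sync_a and is inactive (active = f, as in the output
above); this is a manual workaround, not something the patch does:

```sql
-- Run on db1 (the publisher). Assumes the orphaned slot keeps the name
-- shown above and that no walsender is currently attached to it.
SELECT slot_name, active
  FROM pg_replication_slots
 WHERE slot_name = 'testsub_sync_a';

-- pg_drop_replication_slot() raises an error if the slot is still active.
SELECT pg_drop_replication_slot('testsub_sync_a');
```

An orphaned logical slot is worth cleaning up promptly, since it holds
back catalog_xmin and keeps WAL from being recycled on the publisher.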






