Re: [HACKERS] Logical replication existing data copy - Mailing list pgsql-hackers

From Erik Rijkers
Subject Re: [HACKERS] Logical replication existing data copy
Date
Msg-id 479dceb8c0acc8026ced81caf9cce750@xs4all.nl
Whole thread Raw
In response to Re: [HACKERS] Logical replication existing data copy  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] Logical replication existing data copy
Re: [HACKERS] Logical replication existing data copy
List pgsql-hackers
On 2017-02-08 23:25, Petr Jelinek wrote:

> 0001-Use-asynchronous-connect-API-in-libpqwalreceiver-v2.patch
> 0002-Always-initialize-stringinfo-buffers-in-walsender-v2.patch
> 0003-Fix-after-trigger-execution-in-logical-replication-v2.patch
> 0004-Add-RENAME-support-for-PUBLICATIONs-and-SUBSCRIPTION-v2.patch
> 0001-Logical-replication-support-for-initial-data-copy-v4.patch

Apart from the failing one make check test (test 'object_address') which 
I reported earlier, I find it is easy to 'confuse' the replication.

I attach a script that intends to test the default COPY DATA.   There 
are two instances, initially without any replication.  The script inits 
pgbench on the master, adds a serial column to pgbench_history, and 
dump-restores the 4 pgbench-tables to the future replica.  It then 
empties the 4 pgbench-tables on the 'replica'.  The idea is that when 
logrep is initiated, data will be replicated from master, with the end 
result being that there are 4 identical tables on master and replica.

This often works but it also fails far too often (in my hands).  I test 
whether the tables are identical by comparing an md5 from an ordered 
resultset, from both replica and master.  I estimate that 1 in 5 tries 
fail; 'fail'  being a somewhat different table on replica (compared to 
mater), most often pgbench_accounts (typically there are 10-30 differing 
rows).  No errors or warnings in either logfile.   I'm not sure but I 
think testing on faster machines seem to be doing somewhat better 
('better' being less replication error).

Another, probably unrelated, problem occurs (but much more rarely) when 
executing 'DROP SUBSCRIPTION sub1' on the replica (see the beginning of 
the script).  Sometimes that command hangs, and refuses to accept 
shutdown of the server.  I don't know how to recover from this -- I just 
have to kill the replica server (master server still obeys normal 
shutdown) and restart the instances.

The script accepts 2 parameters, scale and clients (used in pgbench -s 
resp. -c)

I don't think I've managed to successfully run the script with more than 
1 client yet.

Can you have a look whether this is reproducible elsewhere?

thanks,

Erik Rijkers







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Next
From: Ryan Murphy
Date:
Subject: [HACKERS] Access inside pg_node_tree from query?