Re: Sync Rep v19 - Mailing list pgsql-hackers

From Yeb Havinga
Subject Re: Sync Rep v19
Msg-id 4D70CBD6.2020004@gmail.com
In response to Sync Rep v19  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 2011-03-03 11:53, Simon Riggs wrote:
> Latest version of Sync Rep, which includes substantial internal changes
> and simplifications from previous version. (25-30 changes).
Testing more with the post v19 version from github with HEAD

commit 009875662e1b47012e1f4b7d30eb9e238d1937f6
Author: Simon Riggs <simon@2ndquadrant.com>
Date:   Fri Mar 4 06:13:43 2011 +0000
    Allow SIGTERM messages in ProcessInterrupts() even when interrupts are held, if WaitingForSyncRep


1) unexpected behaviour
- master has synchronous_standby_names = 'standby1,standby2,standby3'
- standby with 'standby2' connects first.
- LOG:  00000: standby "standby2" is now the synchronous standby with 
priority 2

I'm still confused by the priority numbers. At first I thought that 
priority 1 meant: this is the one that is currently waited for. Now I'm 
not sure if this is the first potential standby that is not used, or 
that it is actually the one waited for.
What I expected was that it would be connected with priority 1. And then 
if standby1 connects, it would become the one with prio 1 and standby2 
would get prio 2.

2) unexpected behaviour
- continued from above
- standby with 'asyncone' name connects next
- no log message on master

I expected a log message along the lines of 'standby "asyncone" is now an 
asynchronous standby'

3) more about log messages
- didn't get a log message that the asyncone standby stopped
- didn't get a log message that standby1 connected with priority 1
- after stop / start of the master, again only got a log message that 
standby2 connected with priority 2
- pg_stat_replication showed both standby1 and standby2 with correct prio#

4) More about the priority stuff. At this point I figured out prio 2 can 
also be 'the real sync'. Still, I'd prefer a boolean in 
pg_stat_replication that clearly shows 'this is the one', with a source 
that is intimately connected to the syncrep implementation, instead of a 
different implementation of 'if lowest connected priority and > 0, then 
sync is true'. If there are two different implementations, there is room 
for differences, which doesn't feel right.
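
To make the rule in point 4 concrete, here is a minimal Python sketch of how I currently understand it. This is only my reading of the observed behaviour, not the patch's actual code: the function names are made up, and priority 0 for a standby that is not listed in synchronous_standby_names is an assumption based on the '> 0' part of the rule.

```python
def assign_priorities(standby_names, connected):
    """Give each connected standby its 1-based position in the
    synchronous_standby_names list, or 0 if it is not listed
    (assumption: 0 means asynchronous)."""
    names = [n.strip() for n in standby_names.split(",")]
    return {s: (names.index(s) + 1 if s in names else 0) for s in connected}


def current_sync_standby(priorities):
    """Return the connected standby with the lowest priority > 0,
    or None if no listed standby is connected."""
    candidates = {s: p for s, p in priorities.items() if p > 0}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


names = "standby1,standby2,standby3"

# standby2 and asyncone are connected, standby1 is not (points 1 and 2)
prios = assign_priorities(names, ["standby2", "asyncone"])
print(prios, current_sync_standby(prios))

# standby1 connects later and takes over as the one waited for
prios = assign_priorities(names, ["standby2", "asyncone", "standby1"])
print(prios, current_sync_standby(prios))
```

With only standby2 and asyncone connected, standby2 keeps priority 2 and is still the one waited for, which matches the log message from point 1. Deriving this outside the server is exactly the second implementation I would like to avoid.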

5) performance.
Seems to have dropped a few dozen percent. With v17 I earlier got ~650 tps 
and after some more tuning over 900 tps. Now with roughly the same setup 
I get ~ 550 tps. Both versions on the same hardware, both compiled 
without debugging, and I used the same postgresql.conf start config.

I'm currently thinking about a failure test that would check if a commit 
has really waited for the standby. What's the worst thing to do to a 
master server? Ideas are welcome :-)

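#!/bin/sh
# Sketch only: table name and row count are placeholders.
# Do a large write, then force an immediate reboot through magic SysRq
# (Linux only, needs root; no sync, no clean shutdown).
psql -c "CREATE TABLE big AS SELECT generate_series(1, 10000000) AS i"
echo 1 > /proc/sys/kernel/sysrq ; echo b > /proc/sysrq-trigger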

regards,
Yeb Havinga


