Re: Check the number of potential synchronous standbys - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Check the number of potential synchronous standbys
Date
Msg-id 17703.1566852805@sss.pgh.pa.us
In response to Check the number of potential synchronous standbys  ("张文杰" <757634191@qq.com>)
List pgsql-hackers
"张文杰" <757634191@qq.com> writes:
> When the number of potential synchronous standbys is smaller than num_sync, such as 'FIRST 3 (1,2)', 'ANY 4 (1,2,3)'
> in the synchronous_standby_names, the processes will wait for synchronous replication forever.
> Obviously, it's not expected. I think return false and an error message may be better. And attached is a patch that
> implements the simple check.

Well, it's not *that* simple; this patch rejects cases like "ANY 2(*)"
which need to be accepted.  That causes the src/test/recovery tests
to fail (you should have tried check-world).
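
To make the problem concrete, here is a minimal Python sketch, not the
actual C code and not the submitted patch. It assumes the check simply
compares num_sync with the number of listed names; the function names
and the list-of-strings representation are hypothetical.

```python
def naive_check(num_sync, names):
    """Hypothetical parse-time check: accept the setting only if
    num_sync does not exceed the number of listed standby names."""
    return num_sync <= len(names)


def wildcard_aware_check(num_sync, names):
    """Same idea, but never reject a list containing "*", because "*"
    can match any number of standbys (assumption for illustration)."""
    if "*" in names:
        return True
    return num_sync <= len(names)


print(naive_check(3, ["1", "2"]))        # the case the patch wants to reject
print(naive_check(2, ["*"]))             # "ANY 2(*)" gets rejected as well
print(wildcard_aware_check(2, ["*"]))    # accepted once "*" is special-cased
```

Even the wildcard-aware variant only sidesteps the "*" problem; it says
nothing about whether enough standbys are actually connected.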

I also observe that there's a test case in 007_sync_rep.pl which is
actually exercising the case you want to reject:

# Check that sync_state of each standby is determined correctly
# when num_sync exceeds the number of names of potential sync standbys
# specified in synchronous_standby_names.
test_sync_state(
    $node_master, qq(standby1|0|async
standby2|4|sync
standby3|3|sync
standby4|1|sync),
    'num_sync exceeds the num of potential sync standbys',
    '6(standby4,standby0,standby3,standby2)');

So it can't be said that nobody thought about this at all.

Now, I'm not convinced that this represents a useful use-case as-is.
However, because we can't know how many standbys may match "*",
it's clear that the code has to do something other than just
abort when the situation happens.  Conceivably we could fail at
runtime (not GUC parse time) if the number of required standbys
exceeds the number available, rather than waiting indefinitely.
However, if standbys can come online dynamically, a wait involving
"*" might be satisfiable after a while even if it isn't immediately.

On the whole, given the fuzziness around "*", I'm not sure that
it's easy to make this much better.
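
For what it's worth, the runtime variant discussed above could look
roughly like the following Python sketch. It is only an illustration
under stated assumptions: the function name is hypothetical, standby
names are assumed to be unique, and "connected" stands in for whatever
the server knows about currently attached standbys.

```python
def can_satisfy_now(num_sync, names, connected):
    """Return True if enough currently connected standbys match the
    configured list. Assumes unique standby names (hypothetical model)."""
    if "*" in names:
        # "*" matches every connected standby.
        matching = set(connected)
    else:
        matching = set(connected) & set(names)
    return len(matching) >= num_sync


# With "*", the answer can change as standbys come online.
print(can_satisfy_now(2, ["*"], ["s1"]))
print(can_satisfy_now(2, ["*"], ["s1", "s2"]))

# With an explicit list shorter than num_sync, it stays False in this
# model no matter which standbys connect.
print(can_satisfy_now(3, ["1", "2"], ["1", "2", "3"]))
```

The first two calls show why failing immediately is questionable for
"*": the same setting is unsatisfiable at one moment and satisfiable
the next.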

            regards, tom lane


