
Re: Improve handling of parameter differences in physical replication - Mailing list pgsql-hackers

From	Fujii Masao
Subject	Re: Improve handling of parameter differences in physical replication
Date	February 27, 2020 13:13:54
Msg-id	27894a76-d498-f3fd-d77f-7c03140fbfe9@oss.nttdata.com
In response to	Improve handling of parameter differences in physical replication (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses	Re: Improve handling of parameter differences in physical replication (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List	pgsql-hackers


On 2020/02/27 17:23, Peter Eisentraut wrote:
> When certain parameters are changed on a physical replication primary,
> this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
> record.  The standby then checks whether its own settings are at least
> as big as the ones on the primary.  If not, the standby shuts down with
> a fatal error.
> 
> The correspondence of settings between primary and standby is required
> because those settings influence certain shared memory sizings that are
> required for processing WAL records that the primary might send.  For
> example, if the primary sends a prepared transaction, the standby must
> have had max_prepared_transactions set appropriately or it won't be
> able to process those WAL records.
> 
> However, fatally shutting down the standby immediately upon receipt of
> the parameter change record might be a bit of an overreaction.  The
> resources related to those settings are not required immediately at
> that point, and might never be required if the activity on the primary
> does not exhaust all those resources.  An extreme example is raising
> max_prepared_transactions on the primary but never actually using
> prepared transactions.
> 
> Where this becomes a serious problem is if you have many standbys and
> you do a failover.  If the newly promoted standby happens to have a
> higher setting for one of the relevant parameters, all the other
> standbys that have followed it then shut down immediately and won't be
> able to continue until you change all their settings.
> 
> If we didn't do the hard shutdown and we just let the standby roll on
> with recovery, nothing bad will happen and it will eventually produce
> an appropriate error when those resources are required (e.g., "maximum
> number of prepared transactions reached").
> 
> So I think there are better ways to handle this.  It might be
> reasonable to provide options.  The attached patch doesn't do that but
> it would be pretty easy.  What the attached patch does is:
> 
> Upon receipt of XLOG_PARAMETER_CHANGE, we still check the settings but
> only issue a warning and set a global flag if there is a problem.  Then
> when we actually hit the resource issue and the flag was set, we issue
> another warning message with relevant information.  Additionally, at
> that point we pause recovery instead of shutting down, so a hot standby
> remains usable.  (That could certainly be configurable.)

+1
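
For readers skimming the thread, the difference between the current and the proposed behavior can be sketched as a toy model. This is Python for illustration only, not PostgreSQL source and not the attached patch; the class, method names and log strings are all made up, and the fallback when the flag is not set is an assumption.

```python
from dataclasses import dataclass, field


class StandbyShutdown(Exception):
    """Models the standby stopping with a fatal error."""


@dataclass
class Standby:
    settings: dict                   # the standby's own parameter values
    insufficient: bool = False       # stands in for the "global flag"
    recovery_paused: bool = False
    log: list = field(default_factory=list)

    def _too_small(self, primary):
        # Names of parameters where the standby value is below the primary's.
        return [k for k, v in primary.items() if self.settings.get(k, 0) < v]

    def on_parameter_change_current(self, primary):
        # Current behavior: any shortfall is fatal as soon as the record arrives.
        short = self._too_small(primary)
        if short:
            raise StandbyShutdown("insufficient settings: " + ", ".join(short))

    def on_parameter_change_proposed(self, primary):
        # Proposed behavior: warn, remember the problem, keep replaying WAL.
        short = self._too_small(primary)
        if short:
            self.insufficient = True
            self.log.append("WARNING: insufficient settings: " + ", ".join(short))

    def on_resource_exhausted(self, what):
        # Called only when replay actually runs out of a resource.
        if self.insufficient:
            self.log.append(f"WARNING: {what}; pausing recovery")
            self.recovery_paused = True
        else:
            # Assumption: without the flag, the existing error path applies.
            raise StandbyShutdown(what)
```

In this model a standby with max_prepared_transactions = 0 following a primary with 10 would stop at once under the current rule, but under the proposed rule it keeps going and only pauses recovery if a prepared transaction actually needs a slot it doesn't have.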
> Btw., I think the current setup is slightly buggy.  The MaxBackends
> value that is used to size shared memory is computed as
> MaxConnections + autovacuum_max_workers + 1 + max_worker_processes +
> max_wal_senders, but we don't track autovacuum_max_workers in WAL.

Maybe this is because autovacuum doesn't work during recovery?
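
To make the arithmetic in the quoted formula concrete, here is a small Python sketch. It only restates the sum as quoted; the function name and the sample values are made up for illustration.

```python
def max_backends(max_connections, autovacuum_max_workers,
                 max_worker_processes, max_wal_senders):
    # Sum as quoted above; the "+ 1" accounts for the autovacuum launcher.
    return (max_connections + autovacuum_max_workers + 1
            + max_worker_processes + max_wal_senders)


# Example values only. The three WAL-tracked parameters match on both
# nodes; only autovacuum_max_workers differs.
primary = max_backends(100, 10, 8, 10)
standby = max_backends(100, 3, 8, 10)
print(primary, standby)  # 129 122
```

So two nodes can agree on every parameter that is tracked in WAL and still end up with different MaxBackends values.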

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
