Re: Sync Rep Design - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: Sync Rep Design
Date
Msg-id 4D1CEEAF.8050209@kaltenbrunner.cc
Whole thread Raw
In response to Re: Sync Rep Design  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Sync Rep Design
Re: Sync Rep Design
List pgsql-hackers
On 12/30/2010 08:04 PM, Simon Riggs wrote:
> On Thu, 2010-12-30 at 18:42 +0100, Stefan Kaltenbrunner wrote:
>
>> it would help if this would just be a simple text-only description of
>> the design that people can actually comment on inline. I don't think
>> sending technical design proposals as a pdf (which seems to be written
>> in doc-style as well) is a good idea to encourage discussion on -hackers :(
>
> 25.2.6. Synchronous Replication
> Streaming replication is by default asynchronous. Transactions on the
> primary server write commit records to WAL, yet do not know whether or
> when a standby has received and processed those changes. So with
> asynchronous replication, if the primary crashes, transactions committed
> on the primary might not have been received by any standby. As a result,
> failover from primary to standby could cause data loss because
> transaction completions are absent, relative to the primary. The amount
> of data loss is proportional to the replication delay at the time of
> failover.
>
> Synchronous replication offers the ability to guarantee that all changes
> made by a transaction have been transferred to at least one remote
> standby server. This is an extension to the standard level of durability
> offered by a transaction commit. This is referred to as semi-synchronous
> replication.
>
> When synchronous replication is requested, the commit of a write
> transaction will wait until confirmation that the commit record has been
> transferred successfully to at least one standby server. Waiting for
> confirmation increases the user's confidence that the changes will not
> be lost in the event of server crashes but it also necessarily increases
> the response time for the requesting transaction. The minimum wait time
> is the roundtrip time from primary to standby.

hmm this is one of the main problems I see with the proposed "master is 
sometimes aware of the standby"(as in the feedback mode) concept this 
proposal has. If it waits for only one of the standbys there is some 
issue with the terminology. As a DBA I would expect the master to only 
return if ALL of the "sync replication" declared nodes replied ok.


>
> Read only transactions and transaction rollbacks need not wait for
> replies from standby servers. Subtransaction commits do not wait for
> responses from standby servers, only final top-level commits. Long
> running actions such as data loading or index building do not wait until
> the very final commit message.
>
>
> 25.2.6.1. Basic Configuration
> Synchronous replication must be enabled on both the primary and at least
> one standby server. If synchronous replication is disabled on the
> master, or enabled on the primary but not enabled on any slaves, the
> primary will use asynchronous replication by default.
>
> We use a single parameter to enable synchronous replication, set in
> postgresql.conf on both primary and standby servers:

this reads as if you can only set it there

>
> synchronous_replication = off (default) | on
>
> On the primary, synchronous_replication can be set for particular users
> or databases, or dynamically by applications programs.

this says otherwise

>
> If more than one standby server specifies synchronous_replication, then
> whichever standby replies first will release waiting commits.

see above for why I think this violates the configuration promise - if I 
say "this is a sync standby" I better expect it to be...

>
> Turning this setting off for a standby allows the administrator to
> exclude certain standby servers from releasing waiting transactions.
> This is useful if not all standby servers are designated as potential
> future primary servers. On the standby, this parameter only takes effect
> at server start.
>
>
> 25.2.6.2. Planning for Performance
> Synchronous replication usually requires carefully planned and placed
> standby servers to ensure applications perform acceptably. Waiting
> doesn't utilise system resources, but transaction locks continue to be
> held until the transfer is confirmed. As a result, incautious use of
> synchronous replication will reduce performance for database
> applications because of increased response times and higher contention.
>
> PostgreSQL allows the application developer to specify the durability
> level required via replication. This can be specified for the system
> overall, though it can also be specified for specific users or
> connections, or even individual transactions.
>
> For example, an application workload might consist of: 10% of changes
> are important customer details, while 90% of changes are less important
> data that the business can more easily survive if it is lost, such as
> chat messages between users.
>
> With synchronous replication options specified at the application level
> (on the master) we can offer sync rep for the most important changes,
> without slowing down the bulk of the total workload. Application level
> options are an important and practical tool for allowing the benefits of
> synchronous replication for high performance applications. This feature
> is unique to PostgreSQL.

that seems to be a bit too much marketing for a reference level document

>
>
> 25.2.6.3. Planning for High Availability
> The easiest and safest method of gaining High Availability using
> synchronous replication is to configure at least two standby servers. To
> understand why, we need to examine what can happen when you lose all
> standby servers.
>
> Commits made when synchronous_replication is set will wait until at
> least one standby responds. The response may never occur if the last, or
> only, standby should crash or the network drops. What should we do in
> that situation?
>
> Sitting and waiting will typically cause operational problems because it
> is an effective outage of the primary server. Allowing the primary
> server to continue processing in the absence of a standby puts those
> latest data changes at risk. How we handle this situation is controlled
> by allow_standalone_primary. The default setting is on, allowing
> processing to continue, though there is no recommended setting. Choosing
> the best setting for allow_standalone_primary is a difficult decision
> and best left to those with combined business responsibility for both
> data and applications. The difficulty of this choice is the reason why
> we recommend that you reduce the possibility of this situation occurring
> by using multiple standby servers.

if there is no recommended setting what will be the default?

[...]
>
> 25.5.2. Handling query conflicts
> ….
>
>
> Remedial possibilities exist if the number of standby-query
> cancellations is found to be unacceptable. Typically the best option is
> to enable hot_standby_feedback. This prevents VACUUM from removing
> recently-dead rows and so cleanup conflicts do not occur. If you do
> this, you should note that this will delay cleanup of dead rows on the
> primary, which may result in undesirable table bloat. However, the
> cleanup situation will be no worse than if the standby queries were
> running directly on the primary server. You are still getting the
> benefit of off-loading execution onto the standby and the query may
> complete faster than it would have done on the primary server.
> max_standby_archive_delay must be kept large in this case, because
> delayed WAL files might already contain entries that conflict with the
> desired standby queries.
>
>
> …
>
>
> 18.5.6. Standby Servers
> These settings control the behavior of a standby server that is to
> receive replication data.
>
>
> hot_standby (boolean)
>          Specifies whether or not you can connect and run queries during
>          recovery, as described in Section 25.5. The default value is
>          off. This parameter can only be set at server start. It only has
>          effect during archive recovery or in standby mode.
> hot_standby_feedback (boolean)
>          Specifies whether or not a hot standby will send feedback to the
>          primary about queries currently executing on the standby. This
>          parameter can be used to eliminate query cancels caused by
>          cleanup records, though it can cause database bloat on the
>          primary for some workloads. The default value is off. This
>          parameter can only be set at server start. It only has effect if
>          hot_standby is enabled.

so if this is enabled - suddenly the master becomes (kinda) aware of the 
specifics of a given standby - but what happens when one of the standby 
is offline for a while how does the master know that?
What I'm really missing with that proposal is how people expect that 
solution to be managed - given there is only sometimes a feedback 
channel into the master you can't do the monitoring.
Even if you could (which we really need!) there is nothing in the 
proposal yet that will help to determine on what the most recent standby 
(in the case of more >1 sync standby) might be.
It also does not address the more general (not sync rep specific) 
problem of how to deal with max_keep_segments which is a wart and I was 
hoping we could get rid of in 9.1 - but it would require a real standby 
registration or at least standby management possibility on the master 
not a halfway done one - so do we really need hot_standby_feedback as 
part of the inital sync-rep patch?


Stefan


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Sync Rep Design
Next
From: Aidan Van Dyk
Date:
Subject: Re: Sync Rep Design