Re: Transaction-controlled robustness for replication - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Transaction-controlled robustness for replication
Date
Msg-id 4886CAD2.7020103@bluegap.ch
In response to Re: Transaction-controlled robustness for replication  (Jens-Wolfhard Schicke <drahflow@gmx.de>)
List pgsql-hackers
Hi,

Jens-Wolfhard Schicke wrote:
>> * Does WAL get forced to disk on primary at commit time?
>> * Does WAL get forced across link to standby at commit time?
>> * Does WAL get forced to disk on standby at commit time?
> * Does WAL get applied [and synced] to disk on standby at commit time?

I think that's what Simon means by his question no. 3. It wouldn't make 
much sense to me otherwise.

I'm assuming the standby node has its own physical format, so the 
changes from the remote WAL need to be transformed into a local WAL, 
which then needs to be written to disk. For Postgres, this pretty much 
means applying the changes and committing them. You never need to store 
the remote WAL on physical storage; what would that be good for?
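The four questions quoted above form a ladder of increasingly strong commit-time guarantees. A minimal sketch (names are purely illustrative, not PostgreSQL code) of how a commit acknowledgement could depend on the level reached:

```python
from enum import IntEnum

# Hypothetical enumeration of the commit-time guarantees discussed above;
# each level implies all of the weaker levels below it.
class RobustnessLevel(IntEnum):
    LOCAL_FSYNC = 1   # WAL forced to disk on the primary
    SHIP = 2          # WAL forced across the link to the standby
    REMOTE_FSYNC = 3  # WAL forced to disk on the standby
    REMOTE_APPLY = 4  # changes applied (and synced) on the standby

def commit_acknowledged(reached: RobustnessLevel,
                        required: RobustnessLevel) -> bool:
    """The commit may be confirmed once the required level is reached."""
    return reached >= required

# Shipping alone is not enough if the user asked for remote apply:
print(commit_acknowledged(RobustnessLevel.SHIP,
                          RobustnessLevel.REMOTE_APPLY))  # False
```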

> I think that questions 2 and 3 are trivially bundled together. Once the
> user can specify 2, implementing 3 should be trivial and vice versa.

That might well be, yes. The code to collect changes from a transaction 
and then apply them remotely is pretty much the same, no matter when it 
is being executed. But it certainly makes a difference in the balance 
between performance and availability, which is a decision the user 
should be able to make for his specific application (or even better, per 
transaction, as proposed here and in Postgres-R).

> I am not even convinced that these need to be two different parameters.

Consider a standby heavily loaded (i/o) with some OLAP queries. Why 
should the master wait until the standby has written anything to disk 
for him?

> Also please note that an answer of "yes" to 3 means that 2 must also
> be answered "yes".

Agreed. There's no 'AS' mode possible, only 'SS', 'SA' and 'AA'.
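The constraint can be written down directly: taking the answers to questions 2 (ship) and 3 (apply) as 'S'ynchronous or 'A'synchronous, a synchronous apply implies a synchronous ship. An illustrative check (not Postgres code):

```python
# Valid (ship, apply) combinations per the discussion above: a standby
# cannot synchronously apply WAL it has not yet synchronously received,
# so 'AS' is ruled out and only SS, SA and AA remain.
def valid_mode(ship: str, apply: str) -> bool:
    assert ship in "SA" and apply in "SA"
    return not (ship == "A" and apply == "S")

assert valid_mode("S", "S")
assert valid_mode("S", "A")
assert valid_mode("A", "A")
assert not valid_mode("A", "S")  # the impossible 'AS' mode
```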

> How about creating named modes? This would give the user the ability to
> define more fine-grained control especially in larger clusters of fail-over/read-only
> servers without totally clogging the parameter space and application code.
> Whether this should be done SQL-style or in some config file is not so clear to me,
> although I'd prefer SQL-style like
> 
> CREATE SYNCHRONIZING MODE immediate_readonly AS
>   LOCAL        SYNCHRONOUS APPLY
>   192.168.0.10 SYNCHRONOUS APPLY        -- read-only slave
>   192.168.0.11 SYNCHRONOUS APPLY        -- read-only slave
>   192.168.0.20 SYNCHRONOUS SHIP         -- backup-server
>   192.168.0.21 SYNCHRONOUS SHIP         -- backup-server
> 192.168.0.30 SYNCHRONOUS FSYNC        -- backup-server with fast disks
> ;

Hm.. that's an interesting idea. Especially considering the number of 
options that arise with more than two or three nodes, where you maybe 
also want to specify how many nodes must have written the changes to 
disk before confirming the commit.

In Postgres-R, I've added a TRANSACTION REPLICATION LEVEL, which can be 
either SYNC, EAGER or LAZY. Maybe that's not quite sufficient. On the 
other hand, I don't think any other option here makes any sense. (Above, 
you yourself doubt that sync is different enough from eager).

Regards

Markus


