Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node
Msg-id 4FE17043.60403@enterprisedb.com
In response to Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 20.06.2012 01:27, Kevin Grittner wrote:
> Andres Freund<andres@2ndquadrant.com>  wrote:
>
>> Yes, thats definitely a valid use-case. But that doesn't preclude
>> the other - also not uncommon - use-case where you want to have
>> different master which all contain up2date data.
>
> I agree.  I was just saying that while one requires an origin_id,
> the other doesn't.  And those not doing MM replication definitely
> don't need it.

I think it would be helpful to list a few concrete examples of 
this. The stereotypical multi-master scenario is that you have a single 
table that's replicated to two servers, and you can insert/update/delete 
on either server. Conflict resolution strategies vary.

The reason we need an origin id in this scenario is that otherwise this 
will happen:

1. A row is updated on node A
2. Node B receives the WAL record from A, and updates the corresponding 
row in B. This generates a new WAL record.
3. Node A receives the WAL record from B, and updates the row again. 
This again generates a new WAL record, which is replicated to B, and you 
loop indefinitely.

If each WAL record carries an origin id, node A can use it to refrain 
from applying the WAL record it receives from B, which breaks the loop.
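
To make the loop concrete, here is a toy simulation in Python. It is only a sketch under assumptions: the Node class, the dict-shaped "WAL record" and the max_hops cap are invented for illustration and have nothing to do with the actual patch.

```python
class Node:
    """Toy replication node holding a single row (hypothetical model)."""

    def __init__(self, name, use_origin_id):
        self.name = name
        self.use_origin_id = use_origin_id
        self.row = None
        self.applied = 0  # number of remote records applied

    def local_update(self, value):
        # A local change produces a record stamped with this node as origin.
        self.row = value
        return {"origin": self.name, "value": value}

    def receive(self, record):
        # With origin ids, refuse records that originally came from us.
        if self.use_origin_id and record["origin"] == self.name:
            return None
        self.row = record["value"]
        self.applied += 1
        # Applying the change generates a new record that flows to the
        # peer; it keeps the origin of the change that caused it.
        return {"origin": record["origin"], "value": record["value"]}


def simulate(use_origin_id, max_hops=10):
    """Update a row on A and count how many times it gets re-applied."""
    a, b = Node("A", use_origin_id), Node("B", use_origin_id)
    record = a.local_update(42)
    target, other = b, a
    hops = 0
    while record is not None and hops < max_hops:
        record = target.receive(record)
        target, other = other, target
        hops += 1
    return a.applied + b.applied


with_origin = simulate(use_origin_id=True)
without_origin = simulate(use_origin_id=False)
print(with_origin, without_origin)
```

With the origin check the change is applied once, on B, and A drops the echo. Without it the record bounces until the artificial max_hops cap stops the simulation.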

However, note that in this simple scenario, if the logical log replay / 
conflict resolution is smart enough to recognize that the row has 
already been updated, because the old and the new rows are identical, 
the loop is broken at step 3 even without the origin id. That works for 
the newest-update-wins and similar strategies. So the origin id is not 
absolutely necessary in this case.
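
The same three steps with an "already identical" check instead of an origin id could look roughly like this. Again a sketch: apply_if_changed and the dict rows are made-up names, and a real implementation would compare tuples rather than Python dicts.

```python
def apply_if_changed(local_row, incoming_row):
    """Apply a replicated row; report whether a new WAL record results.

    If the incoming row equals what we already have, nothing is written
    and therefore nothing is sent back to the peer.
    """
    if local_row == incoming_row:
        return local_row, False
    return dict(incoming_row), True


row_a = {"id": 1, "qty": 5}
row_b = {"id": 1, "qty": 5}

# 1. The row is updated on node A.
row_a = {"id": 1, "qty": 7}

# 2. Node B applies A's change; the row differs, so B writes WAL.
row_b, b_wrote_wal = apply_if_changed(row_b, row_a)

# 3. Node A receives B's record; the row is already identical, so A
#    writes nothing and the loop stops here.
row_a, a_wrote_wal = apply_if_changed(row_a, row_b)

print(b_wrote_wal, a_wrote_wal)
```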

Another interesting scenario is that you maintain a global counter, like 
in an inventory system, and conflicts are resolved by accumulating the 
updates. For example, if you do "UPDATE SET counter = counter + 1" 
simultaneously on two nodes, the result is that the counter is 
incremented by two. The avoid-update-if-already-identical optimization 
doesn't help in this case; the origin id is necessary.
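
A small sketch of why accumulating updates need the origin id. The record format (an origin plus a delta) and the delivery cap are assumptions made for the example; the point is only that a replayed "+1" never makes the row look identical, so nothing but the origin check stops the echo.

```python
from collections import deque


def replicate_increments(filter_by_origin, max_deliveries=10):
    """Both nodes run 'counter = counter + 1' at the same time."""
    counters = {"A": 0, "B": 0}
    peer = {"A": "B", "B": "A"}
    queue = deque()

    # The two simultaneous local updates, each shipped to the other node.
    for node in ("A", "B"):
        counters[node] += 1
        queue.append((peer[node], {"origin": node, "delta": 1}))

    deliveries = 0
    while queue and deliveries < max_deliveries:
        dest, record = queue.popleft()
        deliveries += 1
        if filter_by_origin and record["origin"] == dest:
            continue  # our own change coming back: do not apply again
        counters[dest] += record["delta"]
        # Applying generates WAL, which is shipped back to the peer.
        queue.append((peer[dest], record))
    return counters


converged = replicate_increments(filter_by_origin=True)
runaway = replicate_increments(filter_by_origin=False)
print(converged, runaway)
```

With the origin filter both nodes settle at 2. Without it the counters keep growing until the artificial delivery cap is reached.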

Now, let's take the inventory system example further. There are actually 
two ways to update a counter. One is when an item is checked in or out 
of the warehouse, i.e. "UPDATE counter = counter + 1". Those updates 
should accumulate. But another operation resets the counter to a 
specific value, "UPDATE counter = 10", like when taking an inventory. 
That should not accumulate with other changes, but should be 
newest-update-wins. The origin id is not enough for that, because by 
looking at the WAL record and the origin id, you don't know which type 
of update it was.
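
To illustrate the ambiguity: assume the replication system sees only the origin plus the old and new value of the row (RowChange below is a made-up stand-in, not the real WAL record format). An increment from 9 and a reset to 10 then produce exactly the same record, yet the two resolution strategies give different results on the receiving node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowChange:
    """Hypothetical view of an update as seen by the replication system."""
    origin: str
    old_value: int
    new_value: int


def record_for_increment(origin, current):
    # "UPDATE counter = counter + 1"
    return RowChange(origin, current, current + 1)


def record_for_reset(origin, current, target):
    # "UPDATE counter = 10"
    return RowChange(origin, current, target)


def apply_accumulating(local_value, change):
    # Treat the change as a delta and add it to the local value.
    return local_value + (change.new_value - change.old_value)


def apply_newest_wins(local_value, change):
    # Overwrite the local value with the incoming one.
    return change.new_value


increment = record_for_increment("A", 9)
reset = record_for_reset("A", 9, 10)
records_identical = increment == reset

# Node B has meanwhile moved its own counter to 12.
as_delta = apply_accumulating(12, increment)
as_overwrite = apply_newest_wins(12, reset)
print(records_identical, as_delta, as_overwrite)
```

The receiver has to pick one of the two apply functions, and nothing in the record tells it which.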

So, I don't like the idea of adding the origin id to the record header. 
It's only required on some occasions, and on some record types. And I'm 
worried it might not even be enough in more complicated scenarios.

Perhaps we need a more generic WAL record annotation system, where a 
plugin can tack arbitrary information to WAL records. The extra 
information could be stored in the WAL record after the rmgr payload, 
similar to how backup blocks are stored. WAL replay could just ignore 
the annotations, but a replication system could use it to store the 
origin id or whatever extra information it needs.
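
Something along these lines, as a rough sketch. The byte layout, the tag numbers and all function names here are invented for illustration and do not correspond to the actual WAL format; the idea is just that the annotations sit after the rmgr payload, so replay can read the payload and never look at them.

```python
import struct

# Hypothetical layout: header, rmgr payload, then the annotation area.
HEADER = struct.Struct("<II")  # payload length, annotation area length
ANNOT = struct.Struct("<HH")   # annotation tag, annotation data length

TAG_ORIGIN_ID = 1    # made-up tag numbers a plugin might register
TAG_UPDATE_KIND = 2


def build_record(payload, annotations):
    """Serialize a record with annotations appended after the payload."""
    area = b"".join(
        ANNOT.pack(tag, len(data)) + data for tag, data in annotations.items()
    )
    return HEADER.pack(len(payload), len(area)) + payload + area


def replay_payload(record):
    """What WAL replay would do: read the payload, skip annotations."""
    payload_len, _ = HEADER.unpack_from(record, 0)
    return record[HEADER.size:HEADER.size + payload_len]


def read_annotations(record):
    """What a replication plugin would do: parse the annotation area."""
    payload_len, area_len = HEADER.unpack_from(record, 0)
    pos = HEADER.size + payload_len
    end = pos + area_len
    found = {}
    while pos < end:
        tag, length = ANNOT.unpack_from(record, pos)
        pos += ANNOT.size
        found[tag] = record[pos:pos + length]
        pos += length
    return found


record = build_record(
    b"heap-update",
    {TAG_ORIGIN_ID: b"A", TAG_UPDATE_KIND: b"delta"},
)
print(replay_payload(record), read_annotations(record))
```

A record without annotations costs only the empty area length in this layout, and the "which type of update" information from the inventory example fits in as just another tag.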

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com