Re: Exposing the Xact commit order to the user - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Exposing the Xact commit order to the user
Date
Msg-id 4BFD8575.30406@Yahoo.com
In response to Re: Exposing the Xact commit order to the user  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Exposing the Xact commit order to the user
List pgsql-hackers
On 5/26/2010 3:16 PM, Heikki Linnakangas wrote:
> On 26/05/10 21:43, Jan Wieck wrote:
>> On 5/26/2010 1:17 PM, Heikki Linnakangas wrote:
>>> It would not get called during recovery, but I believe that would be
>>> sufficient for Slony. You could always batch commits that you don't
>>> know when they committed as if they committed simultaneously.
>>
>> Here you are mistaken. If the origin crashes but can recover not yet
>> flushed to xlog-commit-order transactions, then the consumer has no idea
>> about the order of those commits, which throws us back to the point
>> where we require a non cacheable global sequence to replay the
>> individual actions of those "now batched" transactions in an agreeable
>> order.
>>
>> The commit order data needs to be covered by crash recovery.
> 
> Perhaps I'm missing something, 

Apparently. More about that at the end.

> I'm thinking that the commit-order log would contain two kinds of records:
> 
> a) Transaction with XID X committed
> b) All transactions with XID < X committed

If that were true, then long-running transactions would delay all commits
of transactions that started after them. Do they?

> 
> During normal operation we write the 1st kind of record at every commit. 
> After crash recovery (perhaps at the first commit after recovery or when 
> the slon daemon first polls the server, as there's no hook for 
> end-of-recovery), we write the 2nd kind of record.

I think the callback is also called during backend startup, which means
that it could record the first XID yet to be assigned, which is known from
the control file. In that case, all XIDs below it are either committed or
aborted.
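
To make the two record kinds concrete, here is a rough sketch of how a
consumer might turn such a commit-order log into replay batches. Everything
in it is an assumption for illustration only: the record classes, the
batches() helper and the set of XIDs with logged actions are hypothetical
and not part of any existing PostgreSQL or Slony interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Committed:
    """Record kind (a): the transaction with this XID committed."""
    xid: int


@dataclass(frozen=True)
class AllBelowResolved:
    """Record kind (b): every XID below this one is committed or aborted."""
    xid: int


def batches(records, xids_with_logged_actions):
    """Group XIDs into replay batches, in commit-order-log order.

    xids_with_logged_actions is a hypothetical set of XIDs whose row
    actions the consumer can see in the replication log.
    """
    seen = set()
    result = []
    for rec in records:
        if isinstance(rec, Committed):
            if rec.xid not in seen:
                result.append([rec.xid])
                seen.add(rec.xid)
        else:
            # Commit order of these XIDs is unknown, so they are lumped
            # together as if they had committed simultaneously.
            unknown = sorted(
                x for x in xids_with_logged_actions
                if x < rec.xid and x not in seen
            )
            if unknown:
                result.append(unknown)
                seen.update(unknown)
    return result


# XIDs 11 and 13 committed, but their kind (a) records were lost in a crash.
records = [Committed(10), Committed(12), AllBelowResolved(15), Committed(15)]
print(batches(records, {10, 11, 12, 13, 15}))
```

The batch holding 11 and 13 carries no information about which of the two
committed first.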

Which leads us to the piece you are missing above: the need for the global
non-cacheable sequence.

Consider two transactions A and B that, due to transaction batching
between snapshots, get applied together. Let the order of actions be:

1. A starts
2. B starts
3. B selects a row for update, then updates the row
4. A tries to do the same and blocks
5. B commits
6. A gets the lock, the row, does the update
7. A commits

If Slony (or Londiste) did not record the exact order of those
individual row actions, it would have no way of knowing that within that
batch the action of B (higher XID) actually came first. Without that
knowledge there is a 50/50 chance of getting your replica out of sync
on that simple conflict.
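
The scenario above can be played through in a few lines. This is only a toy
model, not how Slony stores or applies its log: the RowAction fields, the
XID values and the replay() helper are made up for illustration, and
action_seq stands in for a value drawn from a global non-cacheable sequence
at the time each row action is logged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowAction:
    xid: int         # A started first, so A has the lower XID
    action_seq: int  # hypothetical global, non-cached sequence value
    key: str
    new_value: str


def replay(actions, order_key):
    """Apply the batched row actions to an empty table in the given order."""
    table = {}
    for action in sorted(actions, key=order_key):
        table[action.key] = action.new_value
    return table


XID_A, XID_B = 100, 101

# B updates the row first (step 3); A blocks, then updates it (step 6).
log = [
    RowAction(xid=XID_B, action_seq=1, key="row1", new_value="written by B"),
    RowAction(xid=XID_A, action_seq=2, key="row1", new_value="written by A"),
]

origin = replay(log, lambda a: a.action_seq)  # order the origin really used
by_xid = replay(log, lambda a: a.xid)         # guess based on XID alone

print(origin)
print(by_xid)
```

Replaying by action_seq ends with A's value, as on the origin. Replaying by
XID applies A before B and ends with B's value, so the replica diverges.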


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

