Re: Hot standby, slot ids and stuff - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Hot standby, slot ids and stuff |
Date | |
Msg-id | 1231580456.18005.608.camel@ebony.2ndQuadrant |
In response to | Hot standby, slot ids and stuff (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Hot standby, slot ids and stuff
Re: Hot standby, slot ids and stuff Re: Hot standby, slot ids and stuff |
List | pgsql-hackers |
On Thu, 2009-01-08 at 20:30 +0200, Heikki Linnakangas wrote:

> Since I've been whining about that for some time, I figured I have to
> put my money where my mouth is, so here's a patch based on v6a that
> eliminates the concept of slotids, as well as the new xl_info2 field in
> XLogRecord. This seems much simpler to me. I haven't given it much
> testing, but seems to work. There's a whole bunch of comments marked
> with XXX that need resolving, though.

There's a confusion in the patch between top level xid and parent xid. The xl field is named parentxid but actually contains the top level xid. That distinction is important because the code now uses the top level xid to locate the recovery proc, formerly the role of the slotid.

This leads to an error when we call SubTransSetParent(child_xid, top_xid), since this assumes that top_xid is the parent, which it is not. Mostly you wouldn't notice unless you were looking up the subtrans status for an xid that had committed but was the child of an aborted subtransaction, with the top level xid having > 64 subtransactions.

It's possible the confusion leads to other bugs in UnobservedXid processing, but I didn't look too hard at that.

AFAICS we need both parent and top xids. Or put another way, we need the parent xid and other info that allows us to uniquely determine the proc we need to update. Now the "other info..." could be top xid or it could also be slotid, which then avoids later zig-zagging to look up the proc.

I wasn't looking for ways to reintroduce slotid, but it seems more logical to keep slotid in light of the above. However, you will probably view this as intransigence, so I will await your input.

I'm very happy that GetStandbyInfoForTransaction() and all the XLR2 flags have bitten the dust and will sleep for eternity.

For xl_rel_lock you add a field called xid and then ask /* xid of the *parent* transaction. XXX why parent? */. You've done this because it replaced slotid. But looking at that, I think the 6a patch had a bug there: a subtransaction abort record would release locks for the whole top level xact. So we need to pass both the top level xid (or slotid) and the xid for each lock, then release using the actual xid only.

You also ask: Shouldn't we call StartupSUBTRANS() and the other startup functions like we do below, before letting anyone in?

My answer is that the current role of StartupSUBTRANS and friends is not appropriate at that point, since they zero out those structures. I left those routines in place thinking "startup" meant "moving to normal running". If we did have a "startupsubtrans" at the point you note, it would currently be empty: we don't keep track of the latest page during recovery. Perhaps we should, but then we'd need to do the equivalent of ExtendSubtrans etc, which it seemed easier to avoid.

-- 
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
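
To illustrate the parent vs top level xid point above, here is a minimal toy sketch in C. It is not PostgreSQL code: the Toy* types, the toy_* functions, the xid values and the status rules are all hypothetical and only model the idea that a sub-committed xid takes its fate from the chain of parents recorded for it. Recording the top level xid as if it were the parent skips the aborted intermediate subtransaction, so the lookup gives the wrong answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical toy types; none of these names exist in PostgreSQL. */
typedef unsigned int ToyXid;
enum { TOY_INVALID_XID = 0, TOY_MAX_XID = 128 };

typedef enum
{
    TOY_IN_PROGRESS = 0,
    TOY_SUB_COMMITTED,   /* outcome depends on the parent */
    TOY_COMMITTED,
    TOY_ABORTED
} ToyStatus;

static ToyXid    toy_parent[TOY_MAX_XID];  /* child xid -> recorded parent */
static ToyStatus toy_status[TOY_MAX_XID];

static void toy_reset(void)
{
    memset(toy_parent, 0, sizeof toy_parent);
    memset(toy_status, 0, sizeof toy_status);
}

/* Record "parent" as the parent of "child" (toy stand-in for a subtrans map). */
static void toy_set_parent(ToyXid child, ToyXid parent)
{
    assert(child < TOY_MAX_XID && parent < TOY_MAX_XID);
    toy_parent[child] = parent;
}

/* Walk up the recorded parent chain until a final status is found. */
static bool toy_did_commit(ToyXid xid)
{
    while (xid != TOY_INVALID_XID && xid < TOY_MAX_XID)
    {
        switch (toy_status[xid])
        {
            case TOY_COMMITTED:
                return true;
            case TOY_ABORTED:
            case TOY_IN_PROGRESS:
                return false;
            case TOY_SUB_COMMITTED:
                xid = toy_parent[xid];  /* defer to the recorded parent */
                break;
        }
    }
    return false;
}

/*
 * Tree: 100 (top, committed) -> 101 (aborted) -> 102 (sub-committed).
 * If record_top_as_parent is true, 102 is linked straight to 100,
 * which is the mistake described in the mail.
 */
static bool toy_scenario(bool record_top_as_parent)
{
    const ToyXid top = 100, sub = 101, subsub = 102;

    toy_reset();
    toy_set_parent(sub, top);
    toy_set_parent(subsub, record_top_as_parent ? top : sub);

    toy_status[top] = TOY_COMMITTED;
    toy_status[sub] = TOY_ABORTED;
    toy_status[subsub] = TOY_SUB_COMMITTED;

    return toy_did_commit(subsub);
}
```

In this toy model, toy_scenario(false) walks 102 -> 101 and stops at the abort, while toy_scenario(true) walks 102 -> 100 and never sees it. That is why the record needs to carry the real parent xid, plus something separate (top xid or slotid) to find the proc.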