Thread: Hot standby, running xacts, subtransactions
When we take the snapshot of running transactions in the master, in GetRunningTransactionData(), it only includes top-level xids and those subxids that are in the subxid caches. Overflowed subxids are not included. Isn't that a problem? When the standby initializes the recovery procs using the running xacts information, pg_subtrans isn't set for the overflowed xids, because that information is not included in the WAL record. If you're lucky, the information is there already, but we don't generally guarantee pg_subtrans to survive crash or restart.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Wed, 2009-02-25 at 22:39 +0200, Heikki Linnakangas wrote:
> When we take the snapshot of running transactions in the master, in
> GetRunningTransactionData(), it only includes top-level xids and those
> subxids that are in the subxid caches. Overflowed subxids are not
> included. Isn't that a problem? When the standby initializes the
> recovery procs using the running xacts information, pg_subtrans
> isn't set for the overflowed xids, because that information is not
> included in the WAL record. If you're lucky, the information is there
> already, but we don't generally guarantee pg_subtrans to survive crash
> or restart.

That is exactly the reason why we don't treat an overflowed snapshot as a valid starting point.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs wrote:
> On Wed, 2009-02-25 at 22:39 +0200, Heikki Linnakangas wrote:
>
>> When we take the snapshot of running transactions in the master, in
>> GetRunningTransactionData(), it only includes top-level xids and those
>> subxids that are in the subxid caches. Overflowed subxids are not
>> included. Isn't that a problem? When the standby initializes the
>> recovery procs using the running xacts information, pg_subtrans
>> isn't set for the overflowed xids, because that information is not
>> included in the WAL record. If you're lucky, the information is there
>> already, but we don't generally guarantee pg_subtrans to survive crash
>> or restart.
>
> That is exactly the reason why we don't treat an overflowed snapshot as
> a valid starting point.

We don't? I don't see anything stopping it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Wed, 2009-02-25 at 23:08 +0200, Heikki Linnakangas wrote:
> >
> > That is exactly the reason why we don't treat an overflowed snapshot as
> > a valid starting point.
>
> We don't? I don't see anything stopping it.

In GetRunningTransactionData() we explicitly set latestRunningXid to InvalidTransactionId if the snapshot is overflowed. That prevents the snapshot from being used to initialise the recovery procs. I'll document that better.

You raised that as an annoyance previously because it means that connection in hot standby mode may be delayed in cases of heavy, repeated use of significant numbers of subtransactions. My answer was that there is a way to avoid that but it complicates things and I'm trying my best to avoid complexity in the first release, yet still have it work (this decade :-))

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
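The guard Simon describes can be pictured with a minimal C sketch. This is not the patch's code: `BackendSketch`, `RunningXactsSketch`, `collect_running_xacts()` and `usable_as_starting_point()` are hypothetical names invented for illustration, and only `latestRunningXid` and `InvalidTransactionId` come from the message above. It assumes each backend exposes a flag saying whether its subxid cache has overflowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical, simplified view of one backend's transaction state. */
typedef struct BackendSketch
{
    TransactionId xid;                /* InvalidTransactionId if idle */
    bool          subxids_overflowed; /* subxid cache overflowed? */
} BackendSketch;

/* Hypothetical, simplified stand-in for the running-xacts data. */
typedef struct RunningXactsSketch
{
    TransactionId latestRunningXid;   /* Invalid => not a valid start point */
    int           xcnt;               /* number of running top-level xacts */
} RunningXactsSketch;

/*
 * Collect running transactions. If any backend has overflowed its subxid
 * cache, the result is deliberately marked unusable by resetting
 * latestRunningXid, mirroring the behaviour described in the thread.
 */
static RunningXactsSketch
collect_running_xacts(const BackendSketch *backends, int nbackends,
                      TransactionId latest_xid)
{
    RunningXactsSketch result = { latest_xid, 0 };
    bool overflowed = false;

    assert(nbackends >= 0 && latest_xid != InvalidTransactionId);

    for (int i = 0; i < nbackends; i++)
    {
        if (backends[i].xid == InvalidTransactionId)
            continue;               /* idle backend, nothing to record */
        result.xcnt++;
        if (backends[i].subxids_overflowed)
            overflowed = true;      /* some subxids are missing from the data */
    }

    if (overflowed)
        result.latestRunningXid = InvalidTransactionId;

    return result;
}

/* The standby-side check: only a non-overflowed snapshot may be used. */
static bool
usable_as_starting_point(const RunningXactsSketch *r)
{
    return r->latestRunningXid != InvalidTransactionId;
}
```

In this model the standby would simply skip any record for which `usable_as_starting_point()` is false and wait for a later one, which is where the possible connection delay comes from.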
> You raised that as an annoyance previously because it means that
> connection in hot standby mode may be delayed in cases of heavy,
> repeated use of significant numbers of subtransactions.

While most users still don't use explicit subtransactions at all, wouldn't this also affect users who use large numbers of stored procedures?

--Josh Berkus
On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:
> > You raised that as an annoyance previously because it means that
> > connection in hot standby mode may be delayed in cases of heavy,
> > repeated use of significant numbers of subtransactions.
>
> While most users still don't use explicit subtransactions at all,
> wouldn't this also affect users who use large numbers of stored procedures?

If they regularly use more than 64 levels of nested EXCEPTION clauses *and* they start their base backups during heavy usage of those stored procedures, then yes.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote:
> On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:
> > > You raised that as an annoyance previously because it means that
> > > connection in hot standby mode may be delayed in cases of heavy,
> > > repeated use of significant numbers of subtransactions.
> >
> > While most users still don't use explicit subtransactions at all,
> > wouldn't this also affect users who use large numbers of stored
> > procedures?
>
> If they regularly use more than 64 levels of nested EXCEPTION clauses
> *and* they start their base backups during heavy usage of those stored
> procedures, then yes.
>

We have stored procedures that loop over thousands of records, with begin...exception blocks in that loop, so I think we do that. AFAICT there's no way to tell if you have it wrong until you fire up the standby (i.e. you can't tell at the time you make your base backup), right?

--
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com
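Robert's loop case can be illustrated with a toy C model of a per-backend subxid cache. This is a sketch under stated assumptions, not PostgreSQL code: `SubxidCacheSketch`, `cache_add_subxid()` and `loop_overflows()` are hypothetical names, the limit of 64 is taken from the figure mentioned in the thread, and the model assumes that every begin...exception block in the loop is assigned its own subxid and that those subxids stay in the cache until the top-level transaction ends. Under those assumptions, a loop does not need deep nesting to overflow.

```c
#include <assert.h>
#include <stdbool.h>

/* Cache size per backend; 64 is the figure mentioned in the thread. */
#define MAX_CACHED_SUBXIDS 64

/* Hypothetical, simplified per-backend subxid cache. */
typedef struct SubxidCacheSketch
{
    int  nxids;       /* subxids currently remembered */
    bool overflowed;  /* set once a subxid could not be cached */
} SubxidCacheSketch;

/* Record one more subxid, or mark the cache overflowed if it is full. */
static void
cache_add_subxid(SubxidCacheSketch *cache)
{
    if (cache->nxids < MAX_CACHED_SUBXIDS)
        cache->nxids++;
    else
        cache->overflowed = true;
}

/*
 * Model one top-level transaction that runs `iterations` loop passes,
 * each entering a begin...exception block that is assumed to get its
 * own subxid. Returns true if the cache overflows before commit.
 */
static bool
loop_overflows(int iterations)
{
    SubxidCacheSketch cache = { 0, false };

    assert(iterations >= 0);

    for (int i = 0; i < iterations; i++)
        cache_add_subxid(&cache);

    return cache.overflowed;
}
```

With this model, `loop_overflows(64)` is false while `loop_overflows(65)` and `loop_overflows(5000)` are true, so a procedure looping over thousands of records would fall into the overflowed case if the assumptions hold.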
On Mon, 2009-03-02 at 21:11 -0500, Robert Treat wrote:
> On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote:
> > On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:
> > > > You raised that as an annoyance previously because it means that
> > > > connection in hot standby mode may be delayed in cases of heavy,
> > > > repeated use of significant numbers of subtransactions.
> > >
> > > While most users still don't use explicit subtransactions at all,
> > > wouldn't this also affect users who use large numbers of stored
> > > procedures?
> >
> > If they regularly use more than 64 levels of nested EXCEPTION clauses
> > *and* they start their base backups during heavy usage of those stored
> > procedures, then yes.
> >
>
> We have stored procedures that loop over thousands of records, with
> begin...exception blocks in that loop, so I think we do that. AFAICT there's
> no way to tell if you have it wrong until you fire up the standby (i.e. you
> can't tell at the time you make your base backup), right?

That was supposed to be a simplification for phase one, not a barrier for all time.

I'm changing that now, though the effect will be that in some cases we take longer before we accept connections. The initialisation requirements are that we have full knowledge of transactions in progress before we allow snapshots to be taken.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Tuesday 03 March 2009 03:22:30 Simon Riggs wrote:
> On Mon, 2009-03-02 at 21:11 -0500, Robert Treat wrote:
> > On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote:
> > > On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:
> > > > > You raised that as an annoyance previously because it means that
> > > > > connection in hot standby mode may be delayed in cases of heavy,
> > > > > repeated use of significant numbers of subtransactions.
> > > >
> > > > While most users still don't use explicit subtransactions at all,
> > > > wouldn't this also affect users who use large numbers of stored
> > > > procedures?
> > >
> > > If they regularly use more than 64 levels of nested EXCEPTION clauses
> > > *and* they start their base backups during heavy usage of those stored
> > > procedures, then yes.
> >
> > We have stored procedures that loop over thousands of records, with
> > begin...exception blocks in that loop, so I think we do that. AFAICT
> > there's no way to tell if you have it wrong until you fire up the standby
> > (i.e. you can't tell at the time you make your base backup), right?
>
> That was supposed to be a simplification for phase one, not a barrier
> for all time.
>

Understood; I only mention it because it's usually good to know how quickly we run into some of these cases that we don't think will be common.

> I'm changing that now, though the effect will be that in some cases we
> take longer before we accept connections. The initialisation
> requirements are that we have full knowledge of transactions in progress
> before we allow snapshots to be taken.
>

That seems pretty reasonable; hopefully people aren't setting up hot standby machines as an emergency scaling technique :-)

--
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com