Thread: Hot standby, recent changes

Hot standby, recent changes

From
Heikki Linnakangas
Date:
1. The XLogFlush() call you added to dbase_redo doesn't help where it
is. You need to call XLogFlush() after the *commit* record of the DROP
DATABASE. The idea is minimize the window where the files have already
been deleted but the entry in pg_database is still visible, if someone
kills the standby and starts it up as a new master. This isn't really
hot standby's fault, you have the same window with PITR, so I think it
would be better to handle this as a separate patch. Also, please handle
all the commands that use ForceSyncCommit().

2. "Allow RULEs ON INSERT, ON UPDATE and ON DELETE iff they generate
only SELECT statements that act INSTEAD OF the actual event." This
affects any read-only transaction, not just hot standby, so I think this
should be discussed and changed as separate patch.

3. The "Out of lock mem killer" in StandbyAcquireAccessExclusiveLock is
quite harsh. It aborts all read-only transactions. It should be enough
to kill just one random one, or maybe the one that's holding most locks.
Also, if there still isn't enough shared memory after killing all
backends, it goes into an infinite loop. I guess that shouldn't happen,
but it seems a bit squishy anyway. It would be nice to differentiate
between "out of shared mem" and "someone else is holding the lock" more
accurately. Maybe add a new return value to LockAcquire() for "out of
shared mem".

4. Need to handle the case where master is started up with
wal_standby_info=true, shut down, and restarted with
wal_standby_info=false, while the standby server runs continuously. And
the code in StartupXLog() to initialize recovery snapshot from a
shutdown checkpoint needs to check that too.

5. You removed this comment from KnownAssignedXidsAdd:

-   /*
-    * XXX: We should check that we don't exceed maxKnownAssignedXids.
-    * Even though the hash table might hold a few more entries than that,
-    * we use fixed-size arrays of that size elsewhere and expected all
-    * entries in the hash table to fit.
-    */

I think the issue still exists. The comment refers to
KnownAssignedXidsGet, which takes as an argument an array that has to be
large enough to hold all entries in the known-assigned hash table. If
there's more entries than expected in the hash table,
KnownAssignedXidsGet will overrun the array. Perhaps
KnownAssignedXidsGet should return a palloc'd array instead or
something, but I don't think it's fine as it is.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:

> 2. "Allow RULEs ON INSERT, ON UPDATE and ON DELETE iff they generate
> only SELECT statements that act INSTEAD OF the actual event." This
> affects any read-only transaction, not just hot standby, so I think this
> should be discussed and changed as separate patch.

OK

> 4. Need to handle the case where master is started up with
> wal_standby_info=true, shut down, and restarted with
> wal_standby_info=false, while the standby server runs continuously. And
> the code in StartupXLog() to initialize recovery snapshot from a
> shutdown checkpoint needs to check that too.

I don't really understand the use case for shutting down the server and
then using it as a HS base backup. Why would anyone do that? Why would
they have their server down for potentially hours, when they can take
the backup while the server is up? If the server is idle, it can be
up-and-idle just as easily as down-and-idle, in which case we wouldn't
need to support this at all. Adding yards of code for this capability
isn't important to me. I'd rather strip the whole lot out than keep
fiddling with a low priority area. Please justify this as a real world
solution before we continue trying to support it.

> 5. You removed this comment from KnownAssignedXidsAdd:
> 
> -   /*
> -    * XXX: We should check that we don't exceed maxKnownAssignedXids.
> -    * Even though the hash table might hold a few more entries than that,
> -    * we use fixed-size arrays of that size elsewhere and expected all
> -    * entries in the hash table to fit.
> -    */
> 
> I think the issue still exists. The comment refers to
> KnownAssignedXidsGet, which takes as an argument an array that has to be
> large enough to hold all entries in the known-assigned hash table. If
> there's more entries than expected in the hash table,
> KnownAssignedXidsGet will overrun the array. Perhaps
> KnownAssignedXidsGet should return a palloc'd array instead or
> something, but I don't think it's fine as it is.

Same issue perhaps, different place.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:
> 1. The XLogFlush() call you added to dbase_redo doesn't help where it
> is. You need to call XLogFlush() after the *commit* record of the DROP
> DATABASE. The idea is minimize the window where the files have already
> been deleted but the entry in pg_database is still visible, if someone
> kills the standby and starts it up as a new master. This isn't really
> hot standby's fault, you have the same window with PITR, so I think it
> would be better to handle this as a separate patch. Also, please handle
> all the commands that use ForceSyncCommit().

I think I'll just add a flag to the commit record to do this. That way
anybody that sets ForceSyncCommit will do as you propose.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:

> 3. The "Out of lock mem killer" in StandbyAcquireAccessExclusiveLock is
> quite harsh. It aborts all read-only transactions. It should be enough
> to kill just one random one, or maybe the one that's holding most locks.
> Also, if there still isn't enough shared memory after killing all
> backends, it goes into an infinite loop. I guess that shouldn't happen,
> but it seems a bit squishy anyway. It would be nice to differentiate
> between "out of shared mem" and "someone else is holding the lock" more
> accurately. Maybe add a new return value to LockAcquire() for "out of
> shared mem".

OK, will abort infinite loop by adding new return value.

If people don't like having everything killed, they can add more lock
memory. It isn't worth adding more code because its hard to test and
unlikely to be an issue in real usage, and it has a clear workaround
that is already mentioned in a hint.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Sun, 2009-12-06 at 11:20 +0000, Simon Riggs wrote:
> On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:
> 
> > 3. The "Out of lock mem killer" in StandbyAcquireAccessExclusiveLock is
> > quite harsh. It aborts all read-only transactions. It should be enough
> > to kill just one random one, or maybe the one that's holding most locks.
> > Also, if there still isn't enough shared memory after killing all
> > backends, it goes into an infinite loop. I guess that shouldn't happen,
> > but it seems a bit squishy anyway. It would be nice to differentiate
> > between "out of shared mem" and "someone else is holding the lock" more
> > accurately. Maybe add a new return value to LockAcquire() for "out of
> > shared mem".
> 
> OK, will abort infinite loop by adding new return value.

You had me for a minute, but there is no infinite loop. Once we start
killing people it reverts to throwing an ERROR on an out of memory
condition.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Sun, 2009-12-06 at 10:51 +0000, Simon Riggs wrote:

> > 5. You removed this comment from KnownAssignedXidsAdd:
> > 
> > -   /*
> > -    * XXX: We should check that we don't exceed maxKnownAssignedXids.
> > -    * Even though the hash table might hold a few more entries than that,
> > -    * we use fixed-size arrays of that size elsewhere and expected all
> > -    * entries in the hash table to fit.
> > -    */
> > 
> > I think the issue still exists. The comment refers to
> > KnownAssignedXidsGet, which takes as an argument an array that has to be
> > large enough to hold all entries in the known-assigned hash table. If
> > there's more entries than expected in the hash table,
> > KnownAssignedXidsGet will overrun the array. Perhaps
> > KnownAssignedXidsGet should return a palloc'd array instead or
> > something, but I don't think it's fine as it is.
> 
> Same issue perhaps, different place.

Well, its not same issue. One was about entering things into a shared
memory hash table, one is about reading values into an locally malloc'd
array. The array is sized as the max size of KnownAssignedXids, so there
is no danger that there would ever be an overrun.

One caller does not set the max size explicitly, so I will add a
comment/code changes to make that clearer.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, recent changes

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:
>> 4. Need to handle the case where master is started up with
>> wal_standby_info=true, shut down, and restarted with
>> wal_standby_info=false, while the standby server runs continuously. And
>> the code in StartupXLog() to initialize recovery snapshot from a
>> shutdown checkpoint needs to check that too.
> 
> I don't really understand the use case for shutting down the server and
> then using it as a HS base backup. Why would anyone do that? Why would
> they have their server down for potentially hours, when they can take
> the backup while the server is up? If the server is idle, it can be
> up-and-idle just as easily as down-and-idle, in which case we wouldn't
> need to support this at all. Adding yards of code for this capability
> isn't important to me. I'd rather strip the whole lot out than keep
> fiddling with a low priority area. Please justify this as a real world
> solution before we continue trying to support it.

This affects using a shutdown checkpoint as a recovery start point
(restore point) in general, not just a fresh backup taken from a shut
down database.

Consider this scenario:

0. You have a master and a standby configured properly, and up and running.
1. You shut down master for some reason.
2. You restart standby. For some reason. Maybe by accident, or you want
to upgrade minor version or whatever.
3. Standby won't accept connections until the master is started too.
Admin says "WTF?"

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Sun, 2009-12-06 at 20:32 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:
> >> 4. Need to handle the case where master is started up with
> >> wal_standby_info=true, shut down, and restarted with
> >> wal_standby_info=false, while the standby server runs continuously. And
> >> the code in StartupXLog() to initialize recovery snapshot from a
> >> shutdown checkpoint needs to check that too.
> > 
> > I don't really understand the use case for shutting down the server and
> > then using it as a HS base backup. Why would anyone do that? Why would
> > they have their server down for potentially hours, when they can take
> > the backup while the server is up? If the server is idle, it can be
> > up-and-idle just as easily as down-and-idle, in which case we wouldn't
> > need to support this at all. Adding yards of code for this capability
> > isn't important to me. I'd rather strip the whole lot out than keep
> > fiddling with a low priority area. Please justify this as a real world
> > solution before we continue trying to support it.
> 
> This affects using a shutdown checkpoint as a recovery start point
> (restore point) in general, not just a fresh backup taken from a shut
> down database.
> 
> Consider this scenario:
> 
> 0. You have a master and a standby configured properly, and up and running.
> 1. You shut down master for some reason.
> 2. You restart standby. For some reason. Maybe by accident, or you want
> to upgrade minor version or whatever.
> 3. Standby won't accept connections until the master is started too.
> Admin says "WTF?"

OK, I accept that as a possible scenario.

I'm concerned that the number of possible scenarios we are trying to
support in the first version is too high, increasing the likelihood that
some of them do not work correctly because the test scenarios didn't
re-create them exactly.

In the above scenario, if it is of serious concern the admin can
failover to the standby and then access the standby for queries. If the
master was down for any length of time that's exactly what we'd expect
to happen anyway. 

So I don't see the scenario as very likely; I'm sure it will happen to
somebody. In any case, it is in the opposite direction to the main
feature, which is a High Availability server, so any admin that argued
that he wanted to have his HA standby stay as a standby in this case
would likely be in a minority.

I would rather document it as a known caveat and be done.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, recent changes

From
Robert Haas
Date:
On Sun, Dec 6, 2009 at 3:06 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Sun, 2009-12-06 at 20:32 +0200, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>> > On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:
>> >> 4. Need to handle the case where master is started up with
>> >> wal_standby_info=true, shut down, and restarted with
>> >> wal_standby_info=false, while the standby server runs continuously. And
>> >> the code in StartupXLog() to initialize recovery snapshot from a
>> >> shutdown checkpoint needs to check that too.
>> >
>> > I don't really understand the use case for shutting down the server and
>> > then using it as a HS base backup. Why would anyone do that? Why would
>> > they have their server down for potentially hours, when they can take
>> > the backup while the server is up? If the server is idle, it can be
>> > up-and-idle just as easily as down-and-idle, in which case we wouldn't
>> > need to support this at all. Adding yards of code for this capability
>> > isn't important to me. I'd rather strip the whole lot out than keep
>> > fiddling with a low priority area. Please justify this as a real world
>> > solution before we continue trying to support it.
>>
>> This affects using a shutdown checkpoint as a recovery start point
>> (restore point) in general, not just a fresh backup taken from a shut
>> down database.
>>
>> Consider this scenario:
>>
>> 0. You have a master and a standby configured properly, and up and running.
>> 1. You shut down master for some reason.
>> 2. You restart standby. For some reason. Maybe by accident, or you want
>> to upgrade minor version or whatever.
>> 3. Standby won't accept connections until the master is started too.
>> Admin says "WTF?"
>
> OK, I accept that as a possible scenario.
>
> I'm concerned that the number of possible scenarios we are trying to
> support in the first version is too high, increasing the likelihood that
> some of them do not work correctly because the test scenarios didn't
> re-create them exactly.
>
> In the above scenario, if it is of serious concern the admin can
> failover to the standby and then access the standby for queries. If the
> master was down for any length of time that's exactly what we'd expect
> to happen anyway.
>
> So I don't see the scenario as very likely; I'm sure it will happen to
> somebody. In any case, it is in the opposite direction to the main
> feature, which is a High Availability server, so any admin that argued
> that he wanted to have his HA standby stay as a standby in this case
> would likely be in a minority.
>
> I would rather document it as a known caveat and be done.

For what it's worth, this doesn't seem particularly unlikely or
unusual to me.  But on the other hand, I do really want to get this
committed soon, and I don't have time to help at the moment, so maybe
I should keep my mouth shut.

...Robert


Re: Hot standby, recent changes

From
Dimitri Fontaine
Date:
Le 6 déc. 2009 à 23:26, Robert Haas a écrit :
>>> Consider this scenario:
>>>
>>> 0. You have a master and a standby configured properly, and up and running.
>>> 1. You shut down master for some reason.
>>> 2. You restart standby. For some reason. Maybe by accident, or you want
>>> to upgrade minor version or whatever.
>>> 3. Standby won't accept connections until the master is started too.
>>> Admin says "WTF?"
>>
>> I would rather document it as a known caveat and be done.

+1

> For what it's worth, this doesn't seem particularly unlikely or
> unusual to me.

I'm sorry to have to disagree here. Shutting down the master in my book means you're upgrading it, minor or major or
thebox under it. It's not a frequent event. More frequent than a crash but still. 

Now the master is offline, you have a standby, and you're restarting it too, but you don't mean that as a switchover.
I'mwith Simon here, the WTF is "are you in search of trouble?" more than anything else. I don't think shutting down the
standbywhile the master is offline is considered as a good practice if your goal is HA... 

Regards,
--
dim

Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Sun, 2009-12-06 at 17:26 -0500, Robert Haas wrote:

> For what it's worth, this doesn't seem particularly unlikely or
> unusual to me.

I don't know many people who shutdown both nodes of a highly available
application at the same time. If they did, I wouldn't expect them to
complain they couldn't run queries on the standby when an two obvious
and simple workarounds exist to allow them to access their data: start
the master again, or make the standby switchover, both of which are part
of standard operating procedures.

It doesn't seem very high up the list of additional features, at least.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, recent changes

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> On Sun, 2009-12-06 at 17:26 -0500, Robert Haas wrote:
> 
>> For what it's worth, this doesn't seem particularly unlikely or
>> unusual to me.
> 
> I don't know many people who shutdown both nodes of a highly available
> application at the same time. If they did, I wouldn't expect them to
> complain they couldn't run queries on the standby when an two obvious
> and simple workarounds exist to allow them to access their data: start
> the master again, or make the standby switchover, both of which are part
> of standard operating procedures.

It might not be common or expected in a typical HA installation, but it
would be a very strange limitation in my mind. It might well happen e.g
in a standby used for reporting, or when you do PITR.

> It doesn't seem very high up the list of additional features, at least.

Well, it's in the patch now. I'm just asking you to not break it.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Hot standby, recent changes

From
Simon Riggs
Date:
On Mon, 2009-12-07 at 10:02 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Sun, 2009-12-06 at 17:26 -0500, Robert Haas wrote:
> > 
> >> For what it's worth, this doesn't seem particularly unlikely or
> >> unusual to me.
> > 
> > I don't know many people who shutdown both nodes of a highly available
> > application at the same time. If they did, I wouldn't expect them to
> > complain they couldn't run queries on the standby when an two obvious
> > and simple workarounds exist to allow them to access their data: start
> > the master again, or make the standby switchover, both of which are part
> > of standard operating procedures.
> 
> It might not be common or expected in a typical HA installation, but it
> would be a very strange limitation in my mind. It might well happen e.g
> in a standby used for reporting, or when you do PITR.

Yes, its possible that the standby can be shutdown while disconnected
from a master. If that occurs, all we are saying is that we cannot use
shutdown checkpoints as starting points. If the primary was set up in
default mode, then wal_standby_info = on and so there will be
running_xact WAL records immediately after each checkpoint to use as
starting points. We don't need shutdown checkpoints as well.

So adding shutdown checkpoints as a valid starting place does not help
in this case, it just makes the code harder to test.

> > It doesn't seem very high up the list of additional features, at least.
> 
> Well, it's in the patch now. I'm just asking you to not break it.

There's been many things in the patch that have been removed. This is by
far the least important of the things removed in the name of getting
this done as soon as possible, with good quality. I see no reason for
your eagerness to include this feature. I have removed it; as you say,
its in the repo if someone wants to add it in later.

-- Simon Riggs           www.2ndQuadrant.com