Thread: recovery_connections cannot start (was Re: master in standby mode croaks)

recovery_connections cannot start (was Re: master in standby mode croaks)

From
Robert Haas
Date:
On Sat, Apr 17, 2010 at 6:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Apr 17, 2010 at 6:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Sat, 2010-04-17 at 17:44 -0400, Robert Haas wrote:
>>
>>> > I will change the error message.
>>>
>>> I gave a good deal of thought to trying to figure out a cleaner
>>> solution to this problem than just changing the error message and
>>> failed.  So let's change the error message.  Of course I'm not quite
>>> sure what we should change it TO, given that the situation is the
>>> result of an interaction between three different GUCs and we have no
>>> way to distinguish which one(s) are the problem.
>>
>> "You need all three" covers it.
>
> Actually you need recovery_connections and either archive_mode=on or
> max_wal_senders>0, I think.

One way we could fix this is to use 2 bits rather than 1 for
XLogStandbyInfoMode.  One bit could indicate that either
archive_mode=on or max_wal_senders>0, and the second bit could
indicate that recovery_connections=on.  If the second bit is unset, we
could emit the existing complaint:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

recovery connections cannot start because archive_mode=off and
max_wal_senders=0 on the WAL source server

If we don't want to use two bits there, it's hard to really describe
all the possibilities in a reasonable number of characters.  The only
thing I can think of is to print a message and a hint:

recovery_connections cannot start due to incorrect settings on the WAL
source server
HINT: make sure recovery_connections=on and either archive_mode=on or
max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on
the standby, but presumably we could make that be the case if it's not
already.

I think the first way is better because it gives the user more
specific information about what they need to fix.  Thinking about how
each case might happen, since the default for recovery_connections is
'on', it seems that recovery_connections=off will likely only be an
issue if the user has explicitly turned it off.  The other case, where
archive_mode=off and max_wal_senders=0, will likely only occur if
someone takes a snapshot of the master without first setting up
archiving or SR.  Both of these will probably happen relatively
rarely, but since we're burning a whole byte for XLogStandbyInfoMode
(plus 3 more bytes of padding?), it seems like we might as well snag
one more bit for clarity.
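
To make the two-bit idea concrete, here is a minimal Python sketch. It
is not PostgreSQL code: the flag names WAL_SHIPPABLE and RECOVERY_CONNS
and the helpers encode() and complaint() are hypothetical, and only the
two message texts are taken from the proposal above.

```python
from enum import IntFlag
from typing import Optional


class StandbyInfo(IntFlag):
    """Hypothetical two-bit encoding of the state recorded in WAL."""
    NONE = 0
    WAL_SHIPPABLE = 1   # archive_mode=on or max_wal_senders>0
    RECOVERY_CONNS = 2  # recovery_connections=on


def encode(archive_mode: bool, max_wal_senders: int,
           recovery_connections: bool) -> StandbyInfo:
    """Compute the two bits from the settings on the WAL source server."""
    flags = StandbyInfo.NONE
    if archive_mode or max_wal_senders > 0:
        flags |= StandbyInfo.WAL_SHIPPABLE
    if recovery_connections:
        flags |= StandbyInfo.RECOVERY_CONNS
    return flags


def complaint(flags: StandbyInfo) -> Optional[str]:
    """Return the message the standby would log, or None if all is well.

    If both bits are unset, this sketch reports the recovery_connections
    problem first; that ordering is an assumption.
    """
    if not flags & StandbyInfo.RECOVERY_CONNS:
        return ("recovery connections cannot start because the "
                "recovery_connections parameter is disabled on the "
                "WAL source server")
    if not flags & StandbyInfo.WAL_SHIPPABLE:
        return ("recovery connections cannot start because "
                "archive_mode=off and max_wal_senders=0 on the "
                "WAL source server")
    return None
```

With one bit, the standby can only tell that something is missing; with
two, each message names the setting the user actually has to change.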

Thoughts?

...Robert


On Fri, Apr 23, 2010 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> One way we could fix this is to use 2 bits rather than 1 for
> XLogStandbyInfoMode.  One bit could indicate that either
> archive_mode=on or max_wal_senders>0, and the second bit could
> indicate that recovery_connections=on.  If the second bit is unset, we
> could emit the existing complaint:
>
> recovery connections cannot start because the recovery_connections
> parameter is disabled on the WAL source server
>
> If the other bit is unset, then we could instead complain:
>
> recovery connections cannot start because archive_mode=off and
> max_wal_senders=0 on the WAL source server
>
> If we don't want to use two bits there, it's hard to really describe
> all the possibilities in a reasonable number of characters.  The only
> thing I can think of is to print a message and a hint:
>
> recovery_connections cannot start due to incorrect settings on the WAL
> source server
> HINT: make sure recovery_connections=on and either archive_mode=on or
> max_wal_senders>0
>
> I haven't checked whether the hint would be displayed in the log on
> the standby, but presumably we could make that be the case if it's not
> already.
>
> I think the first way is better because it gives the user more
> specific information about what they need to fix.  Thinking about how
> each case might happen, since the default for recovery_connections is
> 'on', it seems that recovery_connections=off will likely only be an
> issue if the user has explicitly turned it off.  The other case, where
> archive_mode=off and max_wal_senders=0, will likely only occur if
> someone takes a snapshot of the master without first setting up
> archiving or SR.  Both of these will probably happen relatively
> rarely, but since we're burning a whole byte for XLogStandbyInfoMode
> (plus 3 more bytes of padding?), it seems like we might as well snag
> one more bit for clarity.
>
> Thoughts?

I like the second choice since it's simpler and enough for me.
But I have no objection to the first.

When we encounter the error, we would need to not only change
those parameter values but also take a fresh base backup and
restart the standby using it. The description of this required
procedure needs to be in the document or error message, I think.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Heikki Linnakangas
Date:
Fujii Masao wrote:
> On Fri, Apr 23, 2010 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> One way we could fix this is to use 2 bits rather than 1 for
>> XLogStandbyInfoMode.  One bit could indicate that either
>> archive_mode=on or max_wal_senders>0, and the second bit could
>> indicate that recovery_connections=on.  If the second bit is unset, we
>> could emit the existing complaint:
>>
>> recovery connections cannot start because the recovery_connections
>> parameter is disabled on the WAL source server
>>
>> If the other bit is unset, then we could instead complain:
>>
>> recovery connections cannot start because archive_mode=off and
>> max_wal_senders=0 on the WAL source server
>>
>> If we don't want to use two bits there, it's hard to really describe
>> all the possibilities in a reasonable number of characters.  The only
>> thing I can think of is to print a message and a hint:
>>
>> recovery_connections cannot start due to incorrect settings on the WAL
>> source server
>> HINT: make sure recovery_connections=on and either archive_mode=on or
>> max_wal_senders>0
>>
>> I haven't checked whether the hint would be displayed in the log on
>> the standby, but presumably we could make that be the case if it's not
>> already.
>>
>> I think the first way is better because it gives the user more
>> specific information about what they need to fix.  Thinking about how
>> each case might happen, since the default for recovery_connections is
>> 'on', it seems that recovery_connections=off will likely only be an
>> issue if the user has explicitly turned it off.  The other case, where
>> archive_mode=off and max_wal_senders=0, will likely only occur if
>> someone takes a snapshot of the master without first setting up
>> archiving or SR.  Both of these will probably happen relatively
>> rarely, but since we're burning a whole byte for XLogStandbyInfoMode
>> (plus 3 more bytes of padding?), it seems like we might as well snag
>> one more bit for clarity.
>>
>> Thoughts?
> 
> I like the second choice since it's simpler and enough for me.
> But I have no objection to the first.
> 
> When we encounter the error, we would need to not only change
> those parameter values but also take a fresh base backup and
> restart the standby using it. The description of this required
> procedure needs to be in the document or error message, I think.

I quite liked Robert's proposal to add an explicit GUC to control what
extra information is logged
(http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
is quite difficult to explain the current behavior, a simple explicit
wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
setting the system up, you currently need to set archive_mode='on'
anyway to enable archiving. You would just set wal_mode='archive' or
wal_mode='standby' instead, depending on what you want to do with the WAL.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Fujii Masao wrote:
>> On Fri, Apr 23, 2010 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> One way we could fix this is to use 2 bits rather than 1 for
>>> XLogStandbyInfoMode.  One bit could indicate that either
>>> archive_mode=on or max_wal_senders>0, and the second bit could
>>> indicate that recovery_connections=on.  If the second bit is unset, we
>>> could emit the existing complaint:
>>>
>>> recovery connections cannot start because the recovery_connections
>>> parameter is disabled on the WAL source server
>>>
>>> If the other bit is unset, then we could instead complain:
>>>
>>> recovery connections cannot start because archive_mode=off and
>>> max_wal_senders=0 on the WAL source server
>>>
>>> If we don't want to use two bits there, it's hard to really describe
>>> all the possibilities in a reasonable number of characters.  The only
>>> thing I can think of is to print a message and a hint:
>>>
>>> recovery_connections cannot start due to incorrect settings on the WAL
>>> source server
>>> HINT: make sure recovery_connections=on and either archive_mode=on or
>>> max_wal_senders>0
>>>
>>> I haven't checked whether the hint would be displayed in the log on
>>> the standby, but presumably we could make that be the case if it's not
>>> already.
>>>
>>> I think the first way is better because it gives the user more
>>> specific information about what they need to fix.  Thinking about how
>>> each case might happen, since the default for recovery_connections is
>>> 'on', it seems that recovery_connections=off will likely only be an
>>> issue if the user has explicitly turned it off.  The other case, where
>>> archive_mode=off and max_wal_senders=0, will likely only occur if
>>> someone takes a snapshot of the master without first setting up
>>> archiving or SR.  Both of these will probably happen relatively
>>> rarely, but since we're burning a whole byte for XLogStandbyInfoMode
>>> (plus 3 more bytes of padding?), it seems like we might as well snag
>>> one more bit for clarity.
>>>
>>> Thoughts?
>>
>> I like the second choice since it's simpler and enough for me.
>> But I have no objection to the first.
>>
>> When we encounter the error, we would need to not only change
>> those parameter values but also take a fresh base backup and
>> restart the standby using it. The description of this required
>> procedure needs to be in the document or error message, I think.
>
> I quite liked Robert's proposal to add an explicit GUC to control what
> extra information is logged
> (http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
> is quite difficult to explain the current behavior, a simple explicit
> wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
> setting the system up, you currently need to set archive_mode='on'
> anyway to enable archiving. You would just set wal_mode='archive' or
> wal_mode='standby' instead, depending on what you want to do with the WAL.

I liked it, too, but I sort of decided it didn't buy much.  There are
three separate sets of things that need to be controlled:

1. What WAL to emit - (a) just enough for crash recovery, (b) enough
for log shipping, (c) enough for log shipping with recovery
connections.

2. Whether to run the archiver.

3. Whether to allow streaming replication connections (and if so, how many).

If the answer to (1) is "just enough for crash recovery", then (2) and
(3) must be "no".  But if (1) is either of the other two options, then
any combination of answers for (2) and (3) is seemingly sensible,
though having both (2) and (3) as no is probably of limited utility.
But at a minimum, you could certainly have:

crash recovery/no archiver/no SR
log shipping/archiver/no SR
log shipping/no archiver/SR
log shipping/archiver/SR
recovery connections/archiver/no SR
recovery connections/no archiver/SR
recovery connections/archiver/SR

I don't see any reasonable way to package all of that up in a single
GUC.  Thoughts?
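
As a rough illustration of why one GUC struggles to cover this, the
rule stated above can be written as a predicate over three independent
choices. This is a Python sketch only; the names WAL_LEVELS and
sensible() are made up for the example.

```python
from itertools import product

# The three answers to question (1), in increasing order of WAL volume.
WAL_LEVELS = ("crash recovery", "log shipping", "recovery connections")


def sensible(level: str, archiver: bool, sr: bool) -> bool:
    """Apply the rule from the message above.

    With crash-recovery-only WAL, neither the archiver nor SR may be on.
    For the other two levels any mix is allowed, but "neither" is left
    out here because it is of limited utility.
    """
    if level == "crash recovery":
        return not archiver and not sr
    return archiver or sr


combos = [
    (level, archiver, sr)
    for level, archiver, sr in product(WAL_LEVELS, (True, False), (True, False))
    if sensible(level, archiver, sr)
]

for level, archiver, sr in combos:
    print(level,
          "archiver" if archiver else "no archiver",
          "SR" if sr else "no SR",
          sep="/")
```

The filter keeps the same seven combinations as the list above, though
it prints them in a different order. Since the three inputs vary
independently, a single enumerated setting would need one value per
surviving row.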

...Robert


Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Heikki Linnakangas
Date:
Robert Haas wrote:
> On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> I quite liked Robert's proposal to add an explicit GUC to control what
>> extra information is logged
>> (http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
>> is quite difficult to explain the current behavior, a simple explicit
>> wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
>> setting the system up, you currently need to set archive_mode='on'
>> anyway to enable archiving. You would just set wal_mode='archive' or
>> wal_mode='standby' instead, depending on what you want to do with the WAL.
> 
> I liked it, too, but I sort of decided it didn't buy much.  There are
> three separate sets of things that need to be controlled:
> 
> 1. What WAL to emit - (a) just enough for crash recovery, (b) enough
> for log shipping, (c) enough for log shipping with recovery
> connections.
> 
> 2. Whether to run the archiver.
> 
> 3. Whether to allow streaming replication connections (and if so, how many).

Streaming replication needs the same information in the WAL as archiving
does, there's no difference between 2 and 3. (the "how many" aspect of 3
is controlled by max_wal_senders).

Let's have these three settings:

wal_mode = crash/archive/standby (replaces archive_mode)
archive_command
max_wal_senders

If wal_mode is set to 'crash', you can't set archive_command or
max_wal_senders>0. If it's set to 'archive', you can set archive_command
and/or max_wal_senders for archiving and streaming replication, but the
standby server won't allow queries. If you set it to 'standby', it will
(assuming you've set recovery_connections=on in the standby).

Note that "wal_mode=standby" replaces "recovery_connections=on" in the
primary.

I think this would be much easier to understand than the current
situation. I'm not wedded to the GUC name or values, though, maybe it
should be archive_mode=off/on/standby, or wal_mode=minimal/archive/full.
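
The constraints in this proposal are simple enough to sketch as a
validation routine. The Python below is illustrative only, not a patch:
validate_primary() and standby_allows_queries() are hypothetical names,
and the error wording is invented.

```python
WAL_MODES = ("crash", "archive", "standby")


def validate_primary(wal_mode: str, archive_command: str = "",
                     max_wal_senders: int = 0) -> None:
    """Raise ValueError for combinations the proposal would reject."""
    if wal_mode not in WAL_MODES:
        raise ValueError("wal_mode must be one of crash/archive/standby")
    # 'crash' writes only crash-recovery WAL, so nothing may ship it.
    if wal_mode == "crash" and (archive_command or max_wal_senders > 0):
        raise ValueError("wal_mode=crash cannot be combined with "
                         "archive_command or max_wal_senders>0")


def standby_allows_queries(primary_wal_mode: str,
                           standby_recovery_connections: bool) -> bool:
    """Queries need wal_mode=standby on the primary AND
    recovery_connections=on in the standby."""
    return primary_wal_mode == "standby" and standby_recovery_connections
```

For example, validate_primary("archive", "cp %p /archive/%f", 2)
passes, while validate_primary("crash", max_wal_senders=1) raises. The
archive_command value here is just a placeholder.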

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Robert Haas wrote:
>> On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> I quite liked Robert's proposal to add an explicit GUC to control what
>>> extra information is logged
>>> (http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
>>> is quite difficult to explain the current behavior, a simple explicit
>>> wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
>>> setting the system up, you currently need to set archive_mode='on'
>>> anyway to enable archiving. You would just set wal_mode='archive' or
>>> wal_mode='standby' instead, depending on what you want to do with the WAL.
>>
>> I liked it, too, but I sort of decided it didn't buy much.  There are
>> three separate sets of things that need to be controlled:
>>
>> 1. What WAL to emit - (a) just enough for crash recovery, (b) enough
>> for log shipping, (c) enough for log shipping with recovery
>> connections.
>>
>> 2. Whether to run the archiver.
>>
>> 3. Whether to allow streaming replication connections (and if so, how many).
>
> Streaming replication needs the same information in the WAL as archiving
> does,

True.

> there's no difference between 2 and 3. (the "how many" aspect of 3
> is controlled by max_wal_senders).

False.

I thought what you think too, but discovered otherwise when I read the
code.  Some uses of archive_mode are used to control what WAL is
generated, but others control a *process* called the archiver.

...Robert


Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Heikki Linnakangas
Date:
Robert Haas wrote:
> On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Robert Haas wrote:
>>> On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
>>> <heikki.linnakangas@enterprisedb.com> wrote:
>>>> I quite liked Robert's proposal to add an explicit GUC to control what
>>>> extra information is logged
>>>> (http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
>>>> is quite difficult to explain the current behavior, a simple explicit
>>>> wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
>>>> setting the system up, you currently need to set archive_mode='on'
>>>> anyway to enable archiving. You would just set wal_mode='archive' or
>>>> wal_mode='standby' instead, depending on what you want to do with the WAL.
>>> I liked it, too, but I sort of decided it didn't buy much.  There are
>>> three separate sets of things that need to be controlled:
>>>
>>> 1. What WAL to emit - (a) just enough for crash recovery, (b) enough
>>> for log shipping, (c) enough for log shipping with recovery
>>> connections.
>>>
>>> 2. Whether to run the archiver.
>>>
>>> 3. Whether to allow streaming replication connections (and if so, how many).
>> Streaming replication needs the same information in the WAL as archiving
>> does,
> 
> True.
> 
>> there's no difference between 2 and 3. (the "how many" aspect of 3
>> is controlled by max_wal_senders).
> 
> False.
> 
> I thought what you think too, but discovered otherwise when I read the
> code.  Some uses of archive_mode are used to control what WAL is
> generated, but others control a *process* called the archiver.

Hmm, never mind the archiver process, we could just launch it always and
it would just sit idle if archive_command was not set. But a more
serious concern is that if you set "archive_mode=on", and
"archive_command=''", we retain all WAL indefinitely, because it's not
being archived, until you set archive_command to something that succeeds
again. You're right, with the wal_mode='crash/archive/standby' there
would be no way to distinguish "archiving is temporarily disabled, keep
all accumulated WAL around" and "we're not archiving, but
wal_mode='archive' to enable streaming replication".

Ok, that brings us back to square one. We could still add the wal_mode
GUC to explicitly control how much WAL is written (replacing
recovery_connections in the primary), I think it would still make the
system easier to explain. But it would add an extra hurdle to enabling
archiving, you'd have to set wal_mode='archive', archive_mode='on', and
archive_command. I'm not sure if that would be better or worse than the
current situation.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Florian Pflug
Date:
On Apr 23, 2010, at 13:12 , Heikki Linnakangas wrote:
> Let's have these three settings:
>
> wal_mode = crash/archive/standby (replaces archive_mode)
> archive_command
> max_wal_senders
>
> If wal_mode is set to 'crash', you can't set archive_command or
> max_wal_senders>0. If it's set to 'archive', you can set archive_command
> and/or max_wal_senders for archiving and streaming replication, but the
> standby server won't allow queries. If you set it to 'standby', it will
> (assuming you've set recovery_connections=on in the standby).
>
> Note that "wal_mode=standby" replaces "recovery_connections=on" in the
> primary.
>
> I think this would be much easier to understand than the current
> situation. I'm not wedded to the GUC name or values, though, maybe it
> should be archive_mode=off/on/standby, or wal_mode=minimal/archive/full.

Hm, but that would preclude the possibility of running master and
(log-shipping) slave off the same configuration, since one would need
wal_mode=standby and the other recovery_connections=on.

Whereas with the current GUCs, "archive_mode=on,
recovery_connections=on, archive_command=..." should be a valid
configuration for both master and slave, no?

best regards,
Florian Pflug



On Fri, Apr 23, 2010 at 7:40 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Ok, that brings us back to square one. We could still add the wal_mode
> GUC to explicitly control how much WAL is written (replacing
> recovery_connections in the primary), I think it would still make the
> system easier to explain. But it would add an extra hurdle to enabling
> archiving, you'd have to set wal_mode='archive', archive_mode='on', and
> archive_command. I'm not sure if that would be better or worse than the
> current situation.

I wasn't either, that's why I gave up.  It didn't seem worth doing a
major GUC reorganization on the eve of beta unless there was a clear
win.  I think there may be a way to improve this but I don't think
we should take the time now to figure out what it is.  Let's
revisit it for 9.1, and just improve the error reporting for now.

...Robert


On Fri, Apr 23, 2010 at 8:54 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Apr 23, 2010 at 7:40 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Ok, that brings us back to square one. We could still add the wal_mode
>> GUC to explicitly control how much WAL is written (replacing
>> recovery_connections in the primary), I think it would still make the
>> system easier to explain. But it would add an extra hurdle to enabling
>> archiving, you'd have to set wal_mode='archive', archive_mode='on', and
>> archive_command. I'm not sure if that would be better or worse than the
>> current situation.
>
> I wasn't either, that's why I gave up.  It didn't seem worth doing a
> major GUC reorganization on the eve of beta unless there was a clear
> win.  I think there may be a way to improve this but I don't think
> we should take the time now to figure out what it is.  Let's
> revisit it for 9.1, and just improve the error reporting for now.

+1

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Streaming replication needs the same information in the WAL as archiving
>> does,

> True.

FWIW, I still don't believe that claim, and I think it's complete folly
to set the assumption in stone by choosing a user-visible GUC API that
depends on it being true.
        regards, tom lane


On Fri, 2010-04-23 at 07:54 -0400, Robert Haas wrote:
> Let's
> revisit it for 9.1, and just improve the error reporting for now.

+1

-- Simon Riggs           www.2ndQuadrant.com



On Fri, Apr 23, 2010 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> Streaming replication needs the same information in the WAL as archiving
>>> does,
>
>> True.
>
> FWIW, I still don't believe that claim, and I think it's complete folly
> to set the assumption in stone by choosing a user-visible GUC API that
> depends on it being true.

Huh?   We're clearly talking about two different things here, because
that doesn't make any sense.  Archiving and streaming replication are
just two means of transporting WAL records from point A to point B.
By definition, any two manners of moving a byte stream around are
isomorphic and can't possibly affect what that byte stream does or
does not need to contain.  What affects the WAL that must be emitted
is the purpose for which it is to be used.  As to that, I believe
everyone (including the code) is in agreement that a minimum amount of
WAL is always needed for crash recovery, plus if we want to do archive
recovery on another server there are some additional bits that must be
emitted (XLogIsNeeded) and plus if we further want to process queries on
the standby then there are a few more bits beyond that
(XLogStandbyInfoActive).
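
The three cumulative levels can be modeled in a few lines of Python.
This follows the conditions as described in this thread (archive_mode
or max_wal_senders>0, plus recovery_connections); it is not copied from
the PostgreSQL source, and the function names merely echo the macro
names mentioned above.

```python
def xlog_is_needed(archive_mode: bool, max_wal_senders: int) -> bool:
    """WAL must carry enough for archive recovery on another server."""
    return archive_mode or max_wal_senders > 0


def xlog_standby_info_active(archive_mode: bool, max_wal_senders: int,
                             recovery_connections: bool) -> bool:
    """WAL must also carry the extra bits for queries on the standby."""
    return recovery_connections and xlog_is_needed(archive_mode,
                                                   max_wal_senders)


def wal_contents(archive_mode: bool, max_wal_senders: int,
                 recovery_connections: bool) -> str:
    """Name the level of WAL emitted; each level includes the previous."""
    if xlog_standby_info_active(archive_mode, max_wal_senders,
                                recovery_connections):
        return "crash recovery + archive recovery + standby queries"
    if xlog_is_needed(archive_mode, max_wal_senders):
        return "crash recovery + archive recovery"
    return "crash recovery only"
```

In this model the level depends on why the WAL is wanted, not on
whether it travels by archiving or by streaming.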

...Robert


Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2010 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> FWIW, I still don't believe that claim, and I think it's complete folly
>> to set the assumption in stone by choosing a user-visible GUC API that
>> depends on it being true.

> Huh?   We're clearly talking about two different things here, because
> that doesn't make any sense.  Archiving and streaming replication are
> just two means of transporting WAL records from point A to point B.

Sorry, not enough caffeine.  What I should have said was that Hot
Standby could put stronger requirements on what gets put into WAL than
archiving for recovery does.  Heikki's proposal upthread was
wal_mode='standby' versus wal_mode='archive' (versus 'off'), which
seemed sensible to me.

We realized some time ago that it was a good idea to separate
archive_mode (what to put in WAL) from archive_command (whether we are
actually archiving right now).  If we fail to apply that same principle
to Hot Standby, I think we'll come to regret it.
        regards, tom lane


Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> We realized some time ago that it was a good idea to separate
> archive_mode (what to put in WAL) from archive_command (whether we are
> actually archiving right now).  If we fail to apply that same principle
> to Hot Standby, I think we'll come to regret it.

The recovery_connections GUC does that. If you enable it, the extra
information required for hot standby is written to the WAL, otherwise
it's not.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Tom Lane wrote:
>> We realized some time ago that it was a good idea to separate
>> archive_mode (what to put in WAL) from archive_command (whether we are
>> actually archiving right now).  If we fail to apply that same principle
>> to Hot Standby, I think we'll come to regret it.

> The recovery_connections GUC does that. If you enable it, the extra
> information required for hot standby is written to the WAL, otherwise
> it's not.

No, driving it off recovery_connections is exactly NOT that.  It's
confusing the transport mechanism with the desired WAL contents.
I maintain that this design is exactly isomorphic to our original PITR
GUC design wherein what got written to WAL was determined by the current
state of archive_command.  We eventually realized that was a bad idea.
So is this.

As a concrete example, there is nothing logically wrong with driving
a hot standby slave from WAL records shipped via old-style pg_standby.
Or how about wanting to turn off recovery_connections temporarily, but
not wanting the archived WAL to be unable to support HS?
        regards, tom lane


On Fri, Apr 23, 2010 at 2:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Tom Lane wrote:
>>> We realized some time ago that it was a good idea to separate
>>> archive_mode (what to put in WAL) from archive_command (whether we are
>>> actually archiving right now).  If we fail to apply that same principle
>>> to Hot Standby, I think we'll come to regret it.
>
>> The recovery_connections GUC does that. If you enable it, the extra
>> information required for hot standby is written to the WAL, otherwise
>> it's not.
>
> No, driving it off recovery_connections is exactly NOT that.  It's
> confusing the transport mechanism with the desired WAL contents.
> I maintain that this design is exactly isomorphic to our original PITR
> GUC design wherein what got written to WAL was determined by the current
> state of archive_command.  We eventually realized that was a bad idea.
> So is this.
>
> As a concrete example, there is nothing logically wrong with driving
> a hot standby slave from WAL records shipped via old-style pg_standby.
> Or how about wanting to turn off recovery_connections temporarily, but
> not wanting the archived WAL to be unable to support HS?

You're all confused about what the different GUCs actually do.  Which
is probably not a good sign for their usability.  But yeah, that's one
of the things that concerned me, too.  If you turn off
max_wal_senders, it doesn't just make it so that no WAL senders can
connect: it actually changes what gets WAL-logged.

...Robert


Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> As a concrete example, there is nothing logically wrong with
> driving a hot standby slave from WAL records shipped via old-style
> pg_standby.  Or how about wanting to turn off recovery_connections
> temporarily, but not wanting the archived WAL to be unable to
> support HS?

As one more concrete example, we are likely to find SR beneficial if
it can feed into a warm standby, but only if we can also do
traditional WAL file archiving from the same source at the same
time.  The extra logging for HS would be useless for us in any
event.

+1 for *not* tying WAL contents to the transport mechanism.

-Kevin


On Fri, 2010-04-23 at 13:45 -0400, Robert Haas wrote:

> Archiving and streaming replication are
> just two means of transporting WAL records from point A to point B.

> By definition, any two manners of moving a byte stream around are
> isomorphic and can't possibly affect what that byte stream does or
> does not need to contain.

It is currently true, but there is no benefit in us constraining future
implementation routes without good reason.

-- Simon Riggs           www.2ndQuadrant.com



On Fri, Apr 23, 2010 at 2:43 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> As a concrete example, there is nothing logically wrong with
>> driving a hot standby slave from WAL records shipped via old-style
>> pg_standby.  Or how about wanting to turn off recovery_connections
>> temporarily, but not wanting the archived WAL to be unable to
>> support HS?
>
> As one more concrete example, we are likely to find SR beneficial if
> it can feed into a warm standby, but only if we can also do
> traditional WAL file archiving from the same source at the same
> time.  The extra logging for HS would be useless for us in any
> event.
>
> +1 for *not* tying WAL contents to the transport mechanism.

OK.  Well, it's a shame we didn't get this settled last week when I
first brought it up, but it's not too late to try to straighten it out
if we have a consensus behind changing it, which it's starting to
sound like we do.

...Robert


On Fri, 2010-04-23 at 15:05 -0400, Robert Haas wrote:
> we have a consensus behind changing it, which it's starting to
> sound like we do.

I think you misread the +1s from Masao and myself.

Those confusing things are options and I want them to remain optional,
not compressed into a potentially too simple model based upon how the
world looks right now.

-- Simon Riggs           www.2ndQuadrant.com



On Fri, Apr 23, 2010 at 3:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, 2010-04-23 at 15:05 -0400, Robert Haas wrote:
>> we have a consensus behind changing it, which it's starting to
>> sound like we do.
>
> I think you misread the +1s from Masao and myself.
>
> Those confusing things are options and I want them to remain optional,
> not compressed into a potentially too simple model based upon how the
> world looks right now.

I didn't, but Heikki, Kevin and Tom seem to be on the other side, so
we at least have to consider where to go with it.  We're going to need
a bunch of GUCs any way we slice it.  The issue is whether there's a
way to slice it that involves fewer AND and OR operators that have to
be understood by users.  I'm still unconvinced of our ability to come
up with a solid design in the time we have, but I think it would make
sense to listen to proposals people want to make.  I poked some holes
in Heikki's design from this morning (which was, more or less, my
design from last week) but that doesn't mean they can't be plugged.

...Robert


Simon Riggs <simon@2ndQuadrant.com> writes:
> Those confusing things are options and I want them to remain optional,
> not compressed into a potentially too simple model based upon how the
> world looks right now.

What are you arguing is too simple?  What *I* think is too simple is
what we have got now, namely a GUC that controls both the availability
of replication connections and the contents of WAL.
        regards, tom lane


Robert Haas <robertmhaas@gmail.com> writes:
> ...  I'm still unconvinced of our ability to come
> up with a solid design in the time we have, but I think it would make
> sense to listen to proposals people want to make.  I poked some holes
> in Heikki's design from this morning (which was, more or less, my
> design from last week) but that doesn't mean they can't be plugged.

The only hole I saw poked was the one about how archive_mode is used to
decide whether to start the archiver process.  I think we could
reasonably deal with that by starting the archiver iff wal_mode > 'crash'.
There's no point in archiving otherwise, and the overhead of an idle
archiver is small enough that we can live with the corner cases where
you're starting an archiver you don't really need.
        regards, tom lane
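
A minimal sketch of the rule proposed above, assuming the three wal_mode values from this thread form an ordered scale. The numeric values and the function name should_start_archiver are hypothetical, chosen only to illustrate the comparison:

```python
from enum import IntEnum


class WalMode(IntEnum):
    # Names follow the wal_mode=crash/archive/standby proposal in this
    # thread; the numeric values are an assumption that only provides
    # the ordering crash < archive < standby.
    CRASH = 0
    ARCHIVE = 1
    STANDBY = 2


def should_start_archiver(wal_mode: WalMode) -> bool:
    """Start the archiver iff wal_mode > 'crash' (hypothetical helper)."""
    return wal_mode > WalMode.CRASH
```

Under this rule an archiver would also run for wal_mode=standby with no archive_command set, which is the idle-archiver corner case mentioned above.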


On Fri, 2010-04-23 at 15:18 -0400, Robert Haas wrote:

> We're going to need
> a bunch of GUCs any way we slice it.  The issue is whether there's a
> way to slice it that involves fewer AND and OR operators that have to
> be understood by users.

So we're proposing adding parameters to simplify things for users? I
don't think fiddling is going to improve things significantly from a
usability perspective, especially at the last minute. 

I'm guessing this conversation has more to do with the situation that
some very clever people have a little time on their hands after a long
period of hard work. I see no problem that needs to be solved, not
alongside this water cooler at least. Smells like beta time.

-- Simon Riggs           www.2ndQuadrant.com



Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
"Kevin Grittner"
Date:
Simon Riggs <simon@2ndQuadrant.com> wrote:

> So we're proposing adding parameters to simplify things for users?

I think it's a matter of having parameters which do simple, clear
things; rather than magically interacting to guess what the user
wants.  What do you want to log?  How many connections do you want
to allow for streaming it?  What's your script for sending it in
archive file format?  Is archiving turned on at the moment?  Let's
have a GUC for each question, rather than having to work backwards
from what you want to which combination of GUC settings gets you to
that, or at least as close as the magic interpretation allows.

> I don't think fiddling is going to improve things significantly
> from a usability perspective, especially at the last minute.

If it involves changing the internal variables in a dangerous way,
perhaps we should settle for whatever we have at the moment.  If
it's a matter of how they get set from the GUCs, that doesn't sound
very risky to me.  Perhaps there are combinations which were
previously disallowed which would need to be tested, but are there
any other risks?

> [ad hominem digression]

Please, can we keep it to the merits?  It sounds like there are
several reasonable use-cases which could be handled by HS/SR except
for how our GUCs are set up for it.  Why limit the uses to a subset
of where it can be useful?  I'm extraordinarily busy right now,
which is why my skimming of these threads didn't alert me to the
problem sooner.  For that I apologize.

-Kevin


Simon Riggs <simon@2ndQuadrant.com> writes:
> So we're proposing adding parameters to simplify things for users?

Not so much "simplify" as "make understandable"; although flexibility
is a concern too.

> I'm guessing this conversation has more to do with the situation that
> some very clever people have a little time on their hands after a long
> period of hard work. I see no problem that needs to be solved, not
> alongside this water cooler at least. Smells like beta time.

[ shrug... ]  I'm just trying to learn from history and not repeat
a previous mistake.
        regards, tom lane


On Fri, Apr 23, 2010 at 3:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ...  I'm still unconvinced of our ability to come
>> up with a solid design in the time we have, but I think it would make
>> sense to listen to proposals people want to make.  I poked some holes
>> in Heikki's design from this morning (which was, more or less, my
>> design from last week) but that doesn't mean they can't be plugged.
>
> The only hole I saw poked was the one about how archive_mode is used to
> decide whether to start the archiver process.  I think we could
> reasonably deal with that by starting the archiver iff wal_mode > 'crash'.
> There's no point in archiving otherwise, and the overhead of an idle
> archiver is small enough that we can live with the corner cases where
> you're starting an archiver you don't really need.

Well, I think the real hole is that turning archive_mode=on results in
WAL never being deleted unless it's successfully archived.

But we might be able to handle that like this:

wal_mode={standby|archive|crash}   # or whatever
wal_segments_always=<integer>      # keep this many segments always, for
                                   # SR - like current wal_keep_segments
wal_segments_unarchived=<integer>  # keep this many unarchived
                                   # segments, -1 for infinite
max_wal_senders=<integer>          # same as now
archive_command=<string>           # same as now

So we always retain wal_segments_always segments, but if we have
trouble with archiving we'll retain up to wal_segments_unarchived.

...Robert


On Fri, 2010-04-23 at 14:56 -0500, Kevin Grittner wrote:
> Simon Riggs <simon@2ndQuadrant.com> wrote:
>  
> > So we're proposing adding parameters to simplify things for users?
>  
> I think it's a matter of having parameters which do simple, clear
> things; rather than magically interacting to guess what the user
> wants.  What do you want to log?  How many connections do you want
> to allow for streaming it?  What's your script for sending it in
> archive file format?  Is archiving turned on at the moment?  Let's
> have GUC for each question, rather than having to work backwards
> from what you want to which combination of GUC settings gets you to
> that, or at least as close as the magic interpretation allows.

I've just committed a change to make Hot Standby depend only upon
the setting "recovery_connections = on" on the master. That makes it
clear that there is one lever, not lots of confusing ones.

That might forestall further changes, because the correct way of doing
this was already as simple as people wanted it to be. The previous
requirement was actually a bug: the method of WAL delivery has nothing
at all to do with Hot Standby (currently).

Not intended to stop further debate, if people wish.

-- Simon Riggs           www.2ndQuadrant.com



Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ...  I'm still unconvinced of our ability to come
>> up with a solid design in the time we have, but I think it would make
>> sense to listen to proposals people want to make.  I poked some holes
>> in Heikki's design from this morning (which was, more or less, my
>> design from last week) but that doesn't mean they can't be plugged.
> 
> The only hole I saw poked was the one about how archive_mode is used to
> decide whether to start the archiver process.  I think we could
> reasonably deal with that by starting the archiver iff wal_mode > 'crash'.
> There's no point in archiving otherwise, and the overhead of an idle
> archiver is small enough that we can live with the corner cases where
> you're starting an archiver you don't really need.

Agreed, but a more serious hole is what I pointed out at
http://archives.postgresql.org/message-id/4BD18722.3090608@enterprisedb.com.
That is, if you do:

wal_mode=standby
archive_command=''
max_wal_senders=5

That would be a valid configuration for enabling streaming replication
without archiving (which is possible and reasonable if you set the new
wal_keep_segments setting high enough). But as things stand, WAL
segments would be readied for archiving (.ready files would be created),
but they're never archived and will accumulate indefinitely in the
master. You could work around that with archive_command='/usr/bin/true',
but that's not user-friendly.

So my proposal would be:

wal_mode=crash/archive/standby
archive_mode=on/off        # if on, wal_mode must be >= 'archive'
archive_command='command'
max_wal_senders=<integer>    # if > 0, wal_mode must be >= 'archive'

recovery_connections is not needed on the master anymore; on the
standby it enables/disables hot standby. It is ignored on the master, to
allow the same configuration file to be used on master and standby.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com
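
A minimal sketch of the cross-checks implied by the comments in the proposal above, assuming the ordering crash < archive < standby. The function name check_settings and the wording of the returned messages are hypothetical, not existing PostgreSQL behavior:

```python
# Assumed ordering of the proposed wal_mode values.
WAL_MODE_LEVELS = {"crash": 0, "archive": 1, "standby": 2}


def check_settings(wal_mode: str, archive_mode: bool, max_wal_senders: int) -> list:
    """Return a list of violated constraints (empty if the combination is valid)."""
    problems = []
    level = WAL_MODE_LEVELS[wal_mode]
    needed = WAL_MODE_LEVELS["archive"]
    # "if on, wal_mode must be >= 'archive'"
    if archive_mode and level < needed:
        problems.append("archive_mode=on requires wal_mode >= 'archive'")
    # "if > 0, wal_mode must be >= 'archive'"
    if max_wal_senders > 0 and level < needed:
        problems.append("max_wal_senders > 0 requires wal_mode >= 'archive'")
    return problems
```

With this shape, the streaming-only configuration from the message (wal_mode=standby, archive_mode off, max_wal_senders=5) passes, because archiving is switched off explicitly instead of being inferred from an empty archive_command.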


Robert Haas <robertmhaas@gmail.com> writes:
> Well, I think the real hole is that turning archive_mode=on results in
> WAL never being deleted unless it's successfully archived.

Hm, good point.  And at least in principle you could have SR setups
that don't care about having a backing WAL archive.

> But we might be able to handle that like this:

> wal_mode={standby|archive|crash}  # or whatever
> wal_segments_always=<integer>   # keep this many segments always, for
> SR - like current wal_keep_segments
> wal_segments_unarchived=<integer> # keep this many unarchived
> segments, -1 for infinite
> max_wal_senders=<integer>          # same as now
> archive_command=<string>            # same as now

> So we always retain wal_segments_always segments, but if we have
> trouble with archiving we'll retain up to wal_segments_unarchived.

And when that limit is reached, what happens?  Panic shutdown?
Silently drop unarchived data?  Neither one sounds very good.

I think either you want your WAL archived or you don't.  "Archive
if it's convenient" doesn't sound like a useful operating mode.
So maybe we do indeed need to keep archive_mode as a separate toggle.
        regards, tom lane


On Fri, 2010-04-23 at 23:10 +0300, Heikki Linnakangas wrote:
> So my proposal would be:
> 
> wal_mode=crash/archive/standby

OK, I agree to change in this area.

I definitely don't like the word "crash", which may scare and confuse
people. I don't think I would ever set any parameter to a word like
"crash" since it isn't clear whether it allows that event or protects
against it. Also, I don't like the word "standby" on its own, since that
has already been used for Warm Standby for some time, which corresponds
to the "archive" setting and is therefore confusing.

How about something like

wal_additional_info = none | archive | connect

Then it's easy to understand that things slow down when you request
additional information in the WAL, and also clear that Hot Standby
requires slightly more info on top of that. It's also clear that this
has nothing at all to do with the delivery mechanism.

-- Simon Riggs           www.2ndQuadrant.com



Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
"Kevin Grittner"
Date:
Simon Riggs <simon@2ndQuadrant.com> wrote:
> On Fri, 2010-04-23 at 23:10 +0300, Heikki Linnakangas wrote:
>> So my proposal would be:
>> 
>> wal_mode=crash/archive/standby
>
> I definitely don't like the word "crash", which may scare and
> confuse people. I don't think I would ever set any parameter to a
> word like "crash" since it isn't clear whether it allows that
> event or protects against it. Also, I don't like the word
> "standby" on its own, since that has already been used for Warm
> Standby for some time, which corresponds to the "archive" setting
> and is therefore confusing.

Good points, although "recovery" instead of "crash" would seem to
cover that.

> How about something like
> 
> wal_additional_info = none | archive | connect
> 
> Then it's easy to understand that things slow down when you request
> additional information in the WAL, and also clear that Hot Standby
> requires slightly more info on top of that. It's also clear that
> this has nothing at all to do with the delivery mechanism.

Are we going to support running warm standby through SR?  If so,
"connect" seems confusing for the level to support hot standby.
Perhaps "live"?:

wal_mode=recovery/archive/live

-Kevin


Simon Riggs <simon@2ndQuadrant.com> writes:
> How about something like

> wal_additional_info = none | archive | connect

"connect" seems like a completely inappropriate word here.  It is
not obviously related to HS slaves and it could be taken to refer
to ordinary database connections (sessions).

Personally I agree with your objection to "crash" but not with the
objection to "standby".  Maybe this would be appropriate:
wal_mode = minimal | archive | hot_standby
        regards, tom lane


On Fri, Apr 23, 2010 at 4:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> How about something like
>
>> wal_additional_info = none | archive | connect
>
> "connect" seems like a completely inappropriate word here.  It is
> not obviously related to HS slaves and it could be taken to refer
> to ordinary database connections (sessions).
>
> Personally I agree with your objection to "crash" but not with the
> objection to "standby".  Maybe this would be appropriate:
>
>        wal_mode = minimal | archive | hot_standby

I was thinking maybe "log_shipping" instead of "archive", since we're
conflating the technology (log shipping) with the technology used to
implement it (archiving or streaming).

Possibly "crash_recovery" rather than just "crash" where you have "minimal".

I don't love "hot_standby" either but it might be the least of evils.

...Robert


On Fri, 2010-04-23 at 16:50 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > How about something like
> 
> > wal_additional_info = none | archive | connect
> 
> "connect" seems like a completely inappropriate word here.  It is
> not obviously related to HS slaves and it could be taken to refer
> to ordinary database connections (sessions).
> 
> Personally I agree with your objection to "crash" but not with the
> objection to "standby".  Maybe this would be appropriate:
> 
>     wal_mode = minimal | archive | hot_standby

Sounds good, I'll go for that.


In my understanding this means that archive_mode goes away completely and
max_wal_senders does not affect WAL contents?

Does that mean that wal_mode can be SIGHUP now? It would be good. I
think this is how to do that: 
At the start of every WAL-avoiding operation we could take a copy of
wal_mode for the server and store in MyProc->wal_mode. At transaction
start we would set that to "not set". We could then make
pg_start_backup() wait for all transactions with wal_mode set to
complete before we continue.

-- Simon Riggs           www.2ndQuadrant.com
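
A toy model of the bookkeeping described above, assuming one record per backend. The class Backend, its methods, and backends_blocking_backup are hypothetical names; clearing the copy at transaction end is also an assumption, and the sketch ignores locking and any other synchronization a real implementation would need:

```python
class Backend:
    """Stands in for a per-backend MyProc entry (hypothetical model)."""

    def __init__(self):
        self.wal_mode = None  # None plays the role of "not set"

    def begin_transaction(self):
        # At transaction start the copy is reset to "not set".
        self.wal_mode = None

    def begin_wal_avoiding_operation(self, server_wal_mode):
        # Snapshot the server's wal_mode when a WAL-avoiding operation starts.
        self.wal_mode = server_wal_mode

    def end_transaction(self):
        # Assumption: completing the transaction clears the snapshot.
        self.wal_mode = None


def backends_blocking_backup(backends):
    """Backends that a pg_start_backup()-style call would have to wait for."""
    return [b for b in backends if b.wal_mode is not None]
```

A backup would proceed only once backends_blocking_backup() returns an empty list.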



On Fri, 2010-04-23 at 17:29 -0400, Robert Haas wrote:
> Possibly "crash_recovery" rather than just "crash" where you have
> "minimal".

Minimal is good because it is a performance option also, which is an
aspect "crash_recovery" does not convey. 

(Plus we use the word crash again, which is too scary to use)

-- Simon Riggs           www.2ndQuadrant.com



On Fri, Apr 23, 2010 at 4:10 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> So my proposal would be:
>
> wal_mode=crash/archive/standby
> archive_mode=on/off             # if on, wal_mode must be >= 'archive'
> archive_command='command'
> max_wal_senders=<integer>       # if > 0, wal_mode must be >= 'archive'

As a general design comment, I think we should avoid still having an
archive_mode GUC but having it do something different.  If we're going
to change the semantics, we should also change the name, maybe to
"archiving".

...Robert


On Fri, 2010-04-23 at 17:43 -0400, Robert Haas wrote:
> On Fri, Apr 23, 2010 at 4:10 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
> > So my proposal would be:
> >
> > wal_mode=crash/archive/standby
> > archive_mode=on/off             # if on, wal_mode must be >= 'archive'
> > archive_command='command'
> > max_wal_senders=<integer>       # if > 0, wal_mode must be >= 'archive'
> 
> As a general design comment, I think we should avoid still having an
> archive_mode GUC but having it do something different.  If we're going
> to change the semantics, we should also change the name, maybe to
> "archiving".

We don't need *both* wal_mode and archive_mode, since archive_mode
exists only to ensure that full WAL is written even when archive_command
= '' momentarily.

Should do this

> > wal_mode=crash/archive/standby
> > archive_command='command'
> > max_wal_senders=<integer>       # if > 0, wal_mode must be >= 'archive'

and make wal_mode SIGHUP

-- Simon Riggs           www.2ndQuadrant.com



Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2010 at 4:10 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> So my proposal would be:
>> 
>> wal_mode=crash/archive/standby
>> archive_mode=on/off             # if on, wal_mode must be >= 'archive'
>> archive_command='command'
>> max_wal_senders=<integer>       # if > 0, wal_mode must be >= 'archive'

> As a general design comment, I think we should avoid still having an
> archive_mode GUC but having it do something different.  If we're going
> to change the semantics, we should also change the name, maybe to
> "archiving".

Agreed on the general point, but AFAICS that proposal keeps the meaning
of archive_mode the same as it was.
        regards, tom lane


Simon Riggs <simon@2ndQuadrant.com> writes:
> In my understanding this means that archive_mode goes away completely and
> max_wal_senders does not affect WAL contents?

I think we'd concluded that we have to keep archive_mode as a separate
boolean.  (Or we could use Heikki's idea of a max number of unarchived
segments to hold onto, but I maintain that there are only two useful
values and so we might as well leave it as the existing boolean.)

> Does that mean that wal_mode can be SIGHUP now? It would be good. I
> think this is how to do that: 
> At the start of every WAL-avoiding operation we could take a copy of
> wal_mode for the server and store in MyProc->wal_mode. At transaction
> start we would set that to "not set". We could then make
> pg_start_backup() wait for all transactions with wal_mode set to
> complete before we continue.

I think that there are probably more synchronization issues than that,
and in any case now is not the time to be trying to implement that
feature.  Maybe we can make it work in 9.1.
        regards, tom lane


Simon Riggs <simon@2ndQuadrant.com> writes:
> We don't need *both* wal_mode and archive_mode, since archive_mode
> exists only to ensure that full WAL is written even when archive_command
> = '' momentarily.

No, you missed the point of the upthread discussion: archive_mode
controls whether to start the archiver *and whether to hold onto
not-yet-archived segments*.  We could maybe finesse the first point
but it's much harder to deal with the latter.  The only workable
alternative I can see to keeping archive_mode is to tell people to
set archive_command to something like /usr/bin/true ... which is not
simpler, especially not on Windows.
        regards, tom lane


On Fri, Apr 23, 2010 at 6:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Apr 23, 2010 at 4:10 PM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> So my proposal would be:
>>>
>>> wal_mode=crash/archive/standby
>>> archive_mode=on/off             # if on, wal_mode must be >= 'archive'
>>> archive_command='command'
>>> max_wal_senders=<integer>       # if > 0, wal_mode must be >= 'archive'
>
>> As a general design comment, I think we should avoid still having an
>> archive_mode GUC but having it do something different.  If we're going
>> to change the semantics, we should also change the name, maybe to
>> "archiving".
>
> Agreed on the general point, but AFAICS that proposal keeps the meaning
> of archive_mode the same as it was.

Well, clearly it doesn't.  Someone who thinks they can simply turn
archive_mode=on and set archive_command is going to be sadly
disappointed.  Before, archive_mode arguably switched the server
between two "modes", with a whole set of behaviors associated with it:
type of WAL logging, whether the archiver runs, number of WAL segments
maintained.  Under any of the proposals on the table (other than
"just adjust the error message", which still seems tempting) its new
purview will be more limited.

...Robert


Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2010 at 6:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Agreed on the general point, but AFAICS that proposal keeps the meaning
>> of archive_mode the same as it was.

> Well, clearly it doesn't.  Someone who thinks they can simply turn
> archive_mode=on and set archive_command is going to be sadly
> disappointed.

Well, there is another variable that they'll have to adjust as well,
but ISTM that archive_mode still does what it did before, ie, determine
whether we attempt to archive WAL segments.
        regards, tom lane


On Fri, Apr 23, 2010 at 7:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Apr 23, 2010 at 6:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Agreed on the general point, but AFAICS that proposal keeps the meaning
>>> of archive_mode the same as it was.
>
>> Well, clearly it doesn't.  Someone who thinks they can simply turn
>> archive_mode=on and set archive_command is going to be sadly
>> disappointed.
>
> Well, there is another variable that they'll have to adjust as well,
> but ISTM that archive_mode still does what it did before, ie, determine
> whether we attempt to archive WAL segments.

But it doesn't do EVERYTHING that it did before.  Changing the name
would make that a lot more clear.  Of course I just work here.

...Robert


Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2010 at 7:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, there is another variable that they'll have to adjust as well,
>> but ISTM that archive_mode still does what it did before, ie, determine
>> whether we attempt to archive WAL segments.

> But it doesn't do EVERYTHING that it did before.  Changing the name
> would make that a lot more clear.  Of course I just work here.

I think from the user's point of view it does what it did before.
The fact that the actual content of WAL changed was an implementation
detail that users weren't aware of.  Now that we have two interacting
features that affect WAL contents, it's getting too hard to hide that
from users --- but I see no need to rename archive_mode.
        regards, tom lane


On Fri, Apr 23, 2010 at 7:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Apr 23, 2010 at 7:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Well, there is another variable that they'll have to adjust as well,
>>> but ISTM that archive_mode still does what it did before, ie, determine
>>> whether we attempt to archive WAL segments.
>
>> But it doesn't do EVERYTHING that it did before.  Changing the name
>> would make that a lot more clear.  Of course I just work here.
>
> I think from the user's point of view it does what it did before.
> The fact that the actual content of WAL changed was an implementation
> detail that users weren't aware of.  Now that we have two interacting
> features that affect WAL contents, it's getting too hard to hide that
> from users --- but I see no need to rename archive_mode.

Well, when people use their same settings that they used for 8.4 and
it doesn't work, you can field those reports...

...Robert


Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2010 at 7:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think from the user's point of view it does what it did before.
>> The fact that the actual content of WAL changed was an implementation
>> detail that users weren't aware of.  Now that we have two interacting
>> features that affect WAL contents, it's getting too hard to hide that
>> from users --- but I see no need to rename archive_mode.

> Well, when people use their same settings that they used for 8.4 and
> it doesn't work, you can field those reports...

I would expect that they'll get an error message that makes it clear
enough what to do ;-).  In any case, changing the name is hardly going
to fix things so that 8.4 settings will still work, so why are you
giving that case as an argument for it?
        regards, tom lane


On Fri, Apr 23, 2010 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Apr 23, 2010 at 7:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think from the user's point of view it does what it did before.
>>> The fact that the actual content of WAL changed was an implementation
>>> detail that users weren't aware of.  Now that we have two interacting
>>> features that affect WAL contents, it's getting too hard to hide that
>>> from users --- but I see no need to rename archive_mode.
>
>> Well, when people use their same settings that they used for 8.4 and
>> it doesn't work, you can field those reports...
>
> I would expect that they'll get an error message that makes it clear
> enough what to do ;-).  In any case, changing the name is hardly going
> to fix things so that 8.4 settings will still work, so why are you
> giving that case as an argument for it?

Principle of obvious breakage.

...Robert


Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2010 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I would expect that they'll get an error message that makes it clear
>> enough what to do ;-).  In any case, changing the name is hardly going
>> to fix things so that 8.4 settings will still work, so why are you
>> giving that case as an argument for it?

> Principle of obvious breakage.

And?  If we do it by adding the new variable while not renaming
archive_mode, then I'd expect an 8.4 configuration to yield an error
along the lines of

ERROR: invalid combination of configuration parameters
HINT: To turn on archive_mode, you must set wal_mode to "archive" or "hot_standby".

(precise wording open to debate, but clearly we can do at least this
well) whereas if we rename archive_mode, it's unlikely we can do better
than

ERROR: unrecognized parameter "archive_mode"

Do you really think the second one is going to make any user happier
than the first?
        regards, tom lane
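
A small sketch contrasting the two outcomes described above for an 8.4-style configuration. The two error texts are taken from the message; the function load_config, the parameter sets, and the assumed default of wal_mode=minimal are hypothetical:

```python
CROSS_CHECK_ERROR = (
    "ERROR: invalid combination of configuration parameters\n"
    'HINT: To turn on archive_mode, you must set wal_mode to "archive" or "hot_standby".'
)


def load_config(config: dict, known_params: set) -> str:
    """Return "OK" or the error a user would see (illustrative only)."""
    for name in config:
        if name not in known_params:
            return f'ERROR: unrecognized parameter "{name}"'
    # Assumption: wal_mode defaults to "minimal" when it is not set.
    if config.get("archive_mode") == "on" and config.get("wal_mode", "minimal") == "minimal":
        return CROSS_CHECK_ERROR
    return "OK"


# An 8.4-style configuration that only knows about archiving.
old_config = {"archive_mode": "on", "archive_command": "cp %p /archive/%f"}

# Option 1: keep the name archive_mode and add wal_mode.
keep_name = {"wal_mode", "archive_mode", "archive_command", "max_wal_senders"}
# Option 2: rename archive_mode to "archiving".
renamed = {"wal_mode", "archiving", "archive_command", "max_wal_senders"}
```

Calling load_config(old_config, keep_name) yields the message with the hint, while load_config(old_config, renamed) can only report the unrecognized parameter.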


On Fri, Apr 23, 2010 at 8:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Apr 23, 2010 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I would expect that they'll get an error message that makes it clear
>>> enough what to do ;-).  In any case, changing the name is hardly going
>>> to fix things so that 8.4 settings will still work, so why are you
>>> giving that case as an argument for it?
>
>> Principle of obvious breakage.
>
> And?  If we do it by adding the new variable while not renaming
> archive_mode, then I'd expect an 8.4 configuration to yield an error
> along the lines of
>
> ERROR: invalid combination of configuration parameters
> HINT: To turn on archive_mode, you must set wal_mode to "archive" or "hot_standby".
>
> (precise wording open to debate, but clearly we can do at least this
> well) whereas if we rename archive_mode, it's unlikely we can do better
> than
>
> ERROR: unrecognized parameter "archive_mode"
>
> Do you really think the second one is going to make any user happier
> than the first?

OK, good point.  I overlooked the fact that we could cross-check the
parameter settings on the master - I was imagining the error showing
up on the standby.  Guess I'm a little slow today...

...Robert


On Fri, 2010-04-23 at 19:33 -0400, Robert Haas wrote:

> Principle of obvious breakage.

That is a good principle. It can be applied both ways here.

Changing user interfaces (or indeed, anything) for very little obvious
gain is a considerable annoyance to users. IIABDFI

We need to be aware of the timing issues on the project. Changing
something that has been the same for years is just annoying to existing
users and makes upgrading to our brand new shiny software much harder
than we ourselves would like that to be. But deferring solutions
to user problems for vague reasons also needs to be avoided, because
waiting until the next release moves the time to fix from about 6 months to
about 18 months on average, which crosses the patience threshold. So in
general, I seek to speed up necessary change and slow down unnecessary
change requests. I think we're improving on both.

-- Simon Riggs           www.2ndQuadrant.com



Re: recovery_connections cannot start

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
>   The only workable
> alternative I can see to keeping archive_mode is to tell people to
> set archive_command to something like /usr/bin/true ... which is not
> simpler, especially not on Windows.

Would it be possible to have "internal" commands there, as for example
cd is in my shell, or test, or time, or some more ?

That would allow for providing a portable /usr/bin/true command as far
as archiving is concerned (say, pg_archive_bypass), and will allow for
providing a default archiving command in the future, like "pg_archive_cp
/location" or something.

Regards,
--
dim


Re: recovery_connections cannot start

From
Simon Riggs
Date:
On Mon, 2010-04-26 at 10:41 +0200, Dimitri Fontaine wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> >   The only workable
> > alternative I can see to keeping archive_mode is to tell people to
> > set archive_command to something like /usr/bin/true ... which is not
> > simpler, especially not on Windows.
> 
> Would it be possible to have "internal" commands there, as for example
> cd is in my shell, or test, or time, or some more ?
> 
> That would allow for providing a portable /usr/bin/true command as far
> as archiving is concerned (say, pg_archive_bypass), and will allow for
> providing a default archiving command in the future, like "pg_archive_cp
> /location" or something. 

I think making a special case here is OK. 

If command string == 'true' then we don't bother to call system(3) at
all, we just assume it worked fine. 

That way we have a simple route on all platforms.

-- Simon Riggs           www.2ndQuadrant.com
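
A minimal sketch of the special case suggested above, written in Python rather than the server's C. The function run_archive_command and its injectable runner argument are hypothetical; the default runner uses the standard library's subprocess.call as a stand-in for system(3):

```python
import subprocess


def run_archive_command(command: str, runner=None) -> bool:
    """Return True if the segment should be treated as archived."""
    if command == "true":
        # Special case: never spawn a shell, just assume success.
        return True
    if runner is None:
        # Stand-in for system(3); exit status 0 means success.
        runner = lambda cmd: subprocess.call(cmd, shell=True)
    return runner(command) == 0
```

Because the literal string 'true' is matched before any process is spawned, the same setting would behave identically on platforms that have no /usr/bin/true.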



On Fri, Apr 23, 2010 at 4:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, I think the real hole is that turning archive_mode=on results in
>> WAL never being deleted unless it's successfully archived.
>
> Hm, good point.  And at least in principle you could have SR setups
> that don't care about having a backing WAL archive.
>
>> But we might be able to handle that like this:
>
>> wal_mode={standby|archive|crash}  # or whatever
>> wal_segments_always=<integer>   # keep this many segments always, for
>> SR - like current wal_keep_segments
>> wal_segments_unarchived=<integer> # keep this many unarchived
>> segments, -1 for infinite
>> max_wal_senders=<integer>          # same as now
>> archive_command=<string>            # same as now
>
>> So we always retain wal_segments_always segments, but if we have
>> trouble with archiving we'll retain up to wal_segments_unarchived.
>
> And when that limit is reached, what happens?  Panic shutdown?
> Silently drop unarchived data?  Neither one sounds very good.

Silently drop unarchived data.  I agree that isn't very good, but
think about it this way: if archive_command is failing, then our log
shipping slave is not going to work.  But letting the disk fill up on
the primary does not make it any better.  It just makes the primary
stop working, too.  Obviously, all of this stuff needs to be monitored
or you're playing with fire, but I don't think having a safety valve
on the primary is a stupid idea.
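
The retention rule being argued for could look roughly like the sketch below. It is only an illustration: can_remove_segment() is a made-up name, the two limits are the proposed (not existing) settings, and for simplicity both limits are treated as a distance in segments behind the current WAL position.

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of the proposed safety valve, not existing code.
 *
 * age:                      how many segments behind the current one
 * archived:                 whether archive_command already succeeded
 * wal_segments_always:      proposed setting, always keep this many
 * wal_segments_unarchived:  proposed setting, -1 means keep forever
 */
static bool
can_remove_segment(int age, bool archived,
                   int wal_segments_always, int wal_segments_unarchived)
{
    /* Recent segments are always retained, e.g. for streaming standbys. */
    if (age < wal_segments_always)
        return false;

    /* Older segments that were archived successfully can go. */
    if (archived)
        return true;

    /* Unarchived and no limit configured: keep until archiving succeeds. */
    if (wal_segments_unarchived < 0)
        return false;

    /* Safety valve: beyond the limit, unarchived data is dropped. */
    return age >= wal_segments_unarchived;
}
```

The last branch is the contested one: it is where unarchived WAL gets dropped so that the primary's disk does not fill up.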

...Robert


Re: recovery_connections cannot start

From
Robert Haas
Date:
On Mon, Apr 26, 2010 at 6:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-04-26 at 10:41 +0200, Dimitri Fontaine wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> >   The only workable
>> > alternative I can see to keeping archive_mode is to tell people to
>> > set archive_command to something like /usr/bin/true ... which is not
>> > simpler, especially not on Windows.
>>
>> Would it be possible to have "internal" commands there, as for example
>> cd is in my shell, or test, or time, or some more ?
>>
>> That would allow for providing a portable /usr/bin/true command as far
>> as archiving is concerned (say, pg_archive_bypass), and will allow for
>> providing a default archiving command in the future, like "pg_archive_cp
>> /location" or something.
>
> I think making a special case here is OK.
>
> If command string == 'true' then we don't bother to call system(3) at
> all, we just assume it worked fine.
>
> That way we have a simple route on all platforms.

Separating wal_mode and archive_mode, as we recently discussed, might
eliminate the need for this kludge, if archive_mode can then be made
changeable without a restart.

...Robert


Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Personally I agree with your objection to "crash" but not with the
> objection to "standby".  Maybe this would be appropriate:
>
>     wal_mode = minimal | archive | hot_standby

Ok, here's a patch implementing this proposal. It adds a new wal_mode
setting, leaving archive_mode as it is. If you try to enable
archive_mode when wal_mode is 'minimal', you get a warning and
archive_mode is silently ignored. Likewise streaming replication
connections are not allowed if wal_mode is 'minimal'.
recovery_connections now does nothing in the master.

A bit more bikeshedding before I commit this:

* Should an invalid combination throw an ERROR and refuse to start,
instead of just warning?

* How about naming the parameter wal_level instead of wal_mode? That
would better convey that the higher levels add stuff on top of the lower
levels, instead of having different modes that are somehow mutually
exclusive.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index eb5765a..6c6a504 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -689,8 +689,7 @@ archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/ser
    </para>

    <para>
-    When <varname>archive_mode</> is <literal>off</> and <xref
-    linkend="guc-max-wal-senders"> is zero some SQL commands
+    When <varname>wal_mode</> is <literal>minimal</> some SQL commands
     are optimized to avoid WAL logging, as described in <xref
     linkend="populate-pitr">.  If archiving or streaming replication were
     turned on during execution of one of these statements, WAL would not
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c5692ba..63ca749 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1353,6 +1353,43 @@ SET ENABLE_SEQSCAN TO OFF;
      <title>Settings</title>
      <variablelist>

+     <varlistentry id="guc-wal-mode" xreflabel="wal_mode">
+      <term><varname>wal_mode</varname> (<type>enum</type>)</term>
+      <indexterm>
+       <primary><varname>wal_mode</> configuration parameter</primary>
+      </indexterm>
+      <listitem>
+       <para>
+        <varname>wal_mode</> determines how much information is written
+        to the WAL. The default value is <literal>minimal</>, which writes
+        only minimal information needed to recover from a crash or immediate
+        shutdown. <literal>archive</> adds logging required for WAL archiving,
+        and <literal>hot_standby</> further adds extra information about
+        running transactions required to run read-only queries on a standby
+        server.
+        This parameter can only be set at server start.
+       </para>
+       <para>
+        In <literal>minimal</> mode, WAL-logging of some bulk operations, like
+        <command>CREATE INDEX</>, <command>CLUSTER</> and <command>COPY</> on
+        a table that was created or truncated in the same transaction can be
+        safely skipped, which can make those operations much faster, but
+        minimal WAL does not contain enough information to reconstruct the
+        data from a base backup and the WAL logs, so at least
+        <literal>archive</> level must be used to enable WAL archiving
+        (<xref linkend="guc-archive-mode">) and streaming replication. See
+        also <xref linkend="populate-pitr">.
+       </para>
+       <para>
+        In <literal>hot_standby</> mode, the same information is logged as
+        in <literal>archive</> mode, plus information needed to reconstruct
+        the status of running transactions from the WAL. To enable read-only
+        queries on a standby server, <varname>wal_mode</> must be set to
+        <literal>hot_standby</> on the primary.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-fsync" xreflabel="fsync">
       <indexterm>
        <primary><varname>fsync</> configuration parameter</primary>
@@ -1726,7 +1763,9 @@ SET ENABLE_SEQSCAN TO OFF;
         <varname>archive_mode</> and <varname>archive_command</> are
         separate variables so that <varname>archive_command</> can be
         changed without leaving archiving mode.
-        This parameter can only be set at server start.
+        This parameter can only be set at server start. It is ignored
+        unless <varname>wal_mode</> is set to <literal>archive</> or
+        <literal>hot_standby</>.
        </para>
       </listitem>
      </varlistentry>
@@ -1884,16 +1923,14 @@ SET ENABLE_SEQSCAN TO OFF;
       </indexterm>
       <listitem>
        <para>
-        Parameter has two roles. During recovery, specifies whether or not
-        you can connect and run queries to enable <xref linkend="hot-standby">.
-        During normal running, specifies whether additional information is written
-        to WAL to allow recovery connections on a standby server that reads
-        WAL data generated by this server. The default value is
+        During recovery, specifies whether or not you can connect and run
+        queries to enable <xref linkend="hot-standby">. The default value is
         <literal>on</literal>.  It is thought that there is little
         measurable difference in performance from using this feature, so
         feedback is welcome if any production impacts are noticeable.
         It is likely that this parameter will be removed in later releases.
-        This parameter can only be set at server start.
+        This parameter can only be set at server start. It is ignored when
+        not in standby mode.
        </para>
       </listitem>
      </varlistentry>
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index d69f2ea..7fa0817 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1589,9 +1589,9 @@ LOG:  database system is ready to accept read only connections
 </programlisting>

     Consistency information is recorded once per checkpoint on the primary, as long
-    as <varname>recovery_connections</> is enabled on the primary.  It is not possible
+    as <varname>wal_mode</> is set to <literal>hot_standby</> on the primary.  It is not possible
     to enable recovery connections on the standby when reading WAL written during the
-    period that <varname>recovery_connections</> was disabled on the primary.
+    period that <varname>wal_mode</> was not set to <literal>hot_standby</> on the primary.
     Reaching a consistent state can also be delayed in the presence
     of both of these conditions:

@@ -1838,7 +1838,7 @@ LOG:  database system is ready to accept read only connections
    </para>

    <para>
-    On the primary, parameters <varname>recovery_connections</> and
+    On the primary, parameters <varname>wal_mode</> and
     <varname>vacuum_defer_cleanup_age</> can be used.
     <varname>max_standby_delay</> has no effect if set on the primary.
    </para>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index b00e69f..a493348 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -835,10 +835,9 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
     <command>TRUNCATE</command> command. In such cases no WAL
     needs to be written, because in case of an error, the files
     containing the newly loaded data will be removed anyway.
-    However, this consideration does not apply when
-    <xref linkend="guc-archive-mode"> is on or streaming replication
-    is allowed (i.e., <xref linkend="guc-max-wal-senders"> is more
-    than or equal to one), as all commands must write WAL in that case.
+    However, this consideration only applies when
+    <xref linkend="guc-wal-mode"> is <literal>minimal</> as all commands
+    must write WAL otherwise.
    </para>

   </sect2>
@@ -910,18 +909,16 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
   </sect2>

   <sect2 id="populate-pitr">
-   <title>Turn off <varname>archive_mode</varname> and streaming replication</title>
+   <title>Disable WAL archival and streaming replication</title>

    <para>
     When loading large amounts of data into an installation that uses
     WAL archiving or streaming replication, you might want to disable
-    archiving (turn off the <xref linkend="guc-archive-mode">
-    configuration variable) and replication (zero the
-    <xref linkend="guc-max-wal-senders"> configuration variable)
-    while loading.  It might be
+    archiving and replication by setting the <xref linkend="guc-wal-mode">
+    configuration variable to <literal>minimal</> while loading.  It might be
     faster to take a new base backup after the load has completed
     than to process a large amount of incremental WAL data.
-    But note that changing either of these variables requires
+    But note that changing <varname>wal_mode</> requires
     a server restart.
    </para>

@@ -929,10 +926,9 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
     Aside from avoiding the time for the archiver or WAL sender to
     process the WAL data,
     doing this will actually make certain commands faster, because they
-    are designed not to write WAL at all if <varname>archive_mode</varname>
-    is off and <varname>max_wal_senders</varname> is zero.  (They can
-    guarantee crash safety more cheaply by doing an
-    <function>fsync</> at the end than by writing WAL.)
+    are designed not to write WAL at all if <varname>wal_mode</varname>
+    is <literal>minimal</>.  (They can guarantee crash safety more cheaply
+    by doing an <function>fsync</> at the end than by writing WAL.)
     This applies to the following commands:
     <itemizedlist>
      <listitem>
@@ -1015,9 +1011,10 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
      <listitem>
       <para>
        If using WAL archiving, consider disabling it during the restore.
-       To do that, turn off <varname>archive_mode</varname> before loading the
-       dump script, and afterwards turn it back on
-       and take a fresh base backup.
+       To do that, set <varname>wal_mode</varname> to <literal>minimal</>
+       before loading the dump script, and afterwards set it back to
+       <literal>archive</> or <literal>hot_standby</> and take a fresh
+       base backup.
       </para>
      </listitem>
      <listitem>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7647f4e..0879b05 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -76,6 +76,7 @@ int            MaxStandbyDelay = 30;
 bool        fullPageWrites = true;
 bool        log_checkpoints = false;
 int            sync_method = DEFAULT_SYNC_METHOD;
+int            wal_mode = WAL_MODE_MINIMAL;

 #ifdef WAL_DEBUG
 bool        XLOG_DEBUG = false;
@@ -97,6 +98,13 @@ bool        XLOG_DEBUG = false;
 /*
  * GUC support
  */
+const struct config_enum_entry wal_mode_options[] = {
+    {"minimal", WAL_MODE_MINIMAL, false},
+    {"archive", WAL_MODE_ARCHIVE, false},
+    {"hot_standby", WAL_MODE_HOT_STANDBY, false},
+    {NULL, 0, false}
+};
+
 const struct config_enum_entry sync_method_options[] = {
     {"fsync", SYNC_METHOD_FSYNC, false},
 #ifdef HAVE_FSYNC_WRITETHROUGH
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 47f71bd..eeab0f3 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -728,6 +728,12 @@ PostmasterMain(int argc, char *argv[])
         write_stderr("%s: superuser_reserved_connections must be less than max_connections\n", progname);
         ExitPostmaster(1);
     }
+    if (XLogArchiveMode && wal_mode == WAL_MODE_MINIMAL)
+        ereport(WARNING,
+                (errmsg("archive_mode ignored because wal_mode is 'minimal'")));
+    if (max_wal_senders > 0 && wal_mode == WAL_MODE_MINIMAL)
+        ereport(WARNING,
+                (errmsg("WAL streaming connections not allowed because wal_mode is 'minimal'")));

     /*
      * Other one-time internal sanity checks can go here, if they are fast.
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 3838665..35bc772 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -253,6 +253,24 @@ WalSndHandshake(void)
                     {
                         StringInfoData buf;

+                        /*
+                         * Check that we're logging enough information in the
+                         * WAL for log-shipping.
+                         *
+                         * NOTE: This only checks the current value of
+                         * wal_mode. Even if the current setting is not
+                         * 'minimal', there can be old WAL in the pg_xlog
+                         * directory that was created with 'minimal'.
+                         * So this is not bulletproof, the purpose is
+                         * just to give a user-friendly error message that
+                         * hints how to configure the system correctly.
+                         */
+                        if (wal_mode == WAL_MODE_MINIMAL)
+                            ereport(FATAL,
+                                    (errcode(ERRCODE_CANNOT_CONNECT_NOW),
+                                     errmsg("standby connections not allowed because wal_mode='minimal'")));
+
+
                         /* Send a CopyOutResponse message, and start streaming */
                         pq_beginmessage(&buf, 'H');
                         pq_sendbyte(&buf, 0);
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 51753d6..a92a874 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -256,7 +256,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
      */
     if (!TransactionIdIsValid(latestRemovedXid))
     {
-        elog(DEBUG1, "Invalid latestremovexXid reported, using latestcompletedxid instead");
+        elog(DEBUG1, "invalid latestremovexXid reported, using latestcompletedxid instead");

         LWLockAcquire(ProcArrayLock, LW_SHARED);
         latestRemovedXid = ShmemVariableCache->latestCompletedXid;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 2434cc0..2fb4090 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -340,6 +340,7 @@ static const struct config_enum_entry constraint_exclusion_options[] = {
 /*
  * Options for enum values stored in other modules
  */
+extern const struct config_enum_entry wal_mode_options[];
 extern const struct config_enum_entry sync_method_options[];

 /*
@@ -2785,6 +2786,15 @@ static struct config_enum ConfigureNamesEnum[] =
     },

     {
+        {"wal_mode", PGC_POSTMASTER, WAL_SETTINGS,
+            gettext_noop("Set the level of information written to the WAL."),
+            NULL
+        },
+        &wal_mode,
+        WAL_MODE_MINIMAL, wal_mode_options, NULL
+    },
+
+    {
         {"wal_sync_method", PGC_SIGHUP, WAL_SETTINGS,
             gettext_noop("Selects the method used for forcing WAL updates to disk."),
             NULL
@@ -7862,7 +7872,7 @@ pg_timezone_abbrev_initialize(void)
 static const char *
 show_archive_command(void)
 {
-    if (XLogArchiveMode)
+    if (XLogArchivingActive())
         return XLogArchiveCommand;
     else
         return "(disabled)";
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 92763eb..c9ee77c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -150,6 +150,7 @@

 # - Settings -

+#wal_mode = minimal            # minimal, archive, or hot_standby
 #fsync = on                # turns forced synchronization on or off
 #synchronous_commit = on        # immediate fsync at commit
 #wal_sync_method = fsync        # the default is the first option
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 6bfc7d5..fa209a5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -196,23 +196,23 @@ extern bool log_checkpoints;
 extern bool XLogRequestRecoveryConnections;
 extern int    MaxStandbyDelay;

-#define XLogArchivingActive()    (XLogArchiveMode)
-#define XLogArchiveCommandSet() (XLogArchiveCommand[0] != '\0')
+/* WAL modes */
+#define WAL_MODE_MINIMAL        0
+#define WAL_MODE_ARCHIVE        1
+#define WAL_MODE_HOT_STANDBY    2
+extern int    wal_mode;

-/*
- * This is in walsender.c, but declared here so that we don't need to include
- * walsender.h in all files that check XLogIsNeeded()
- */
-extern int    max_wal_senders;
+#define XLogArchivingActive()    (XLogArchiveMode && wal_mode >= WAL_MODE_ARCHIVE)
+#define XLogArchiveCommandSet() (XLogArchiveCommand[0] != '\0')

 /*
- * Is WAL-logging necessary? We need to log an XLOG record iff either
- * WAL archiving is enabled or XLOG streaming is allowed.
+ * Is WAL-logging necessary for archival or log-shipping, or can we skip
+ * WAL-logging if we fsync() the data before committing instead?
  */
-#define XLogIsNeeded() (XLogArchivingActive() || (max_wal_senders > 0))
+#define XLogIsNeeded() (wal_mode >= WAL_MODE_ARCHIVE)

 /* Do we need to WAL-log information required only for Hot Standby? */
-#define XLogStandbyInfoActive() (XLogRequestRecoveryConnections && XLogIsNeeded())
+#define XLogStandbyInfoActive() (wal_mode >= WAL_MODE_HOT_STANDBY)

 #ifdef WAL_DEBUG
 extern bool XLOG_DEBUG;
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index 6ad40a9..db64c88 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -39,6 +39,7 @@ extern bool am_walsender;

 /* user-settable parameters */
 extern int    WalSndDelay;
+extern int    max_wal_senders;

 extern int    WalSenderMain(void);
 extern void WalSndSignals(void);

Re: recovery_connections cannot start (was Re: master in standby mode croaks)

From
Stefan Kaltenbrunner
Date:
Robert Haas wrote:
> On Fri, Apr 23, 2010 at 4:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Well, I think the real hole is that turning archive_mode=on results in
>>> WAL never being deleted unless it's successfully archived.
>> Hm, good point.  And at least in principle you could have SR setups
>> that don't care about having a backing WAL archive.
>>
>>> But we might be able to handle that like this:
>>> wal_mode={standby|archive|crash}  # or whatever
>>> wal_segments_always=<integer>   # keep this many segments always, for
>>> SR - like current wal_keep_segments
>>> wal_segments_unarchived=<integer> # keep this many unarchived
>>> segments, -1 for infinite
>>> max_wal_senders=<integer>          # same as now
>>> archive_command=<string>            # same as now
>>> So we always retain wal_segments_always segments, but if we have
>>> trouble with archiving we'll retain up to wal_segments_unarchived.
>> And when that limit is reached, what happens?  Panic shutdown?
>> Silently drop unarchived data?  Neither one sounds very good.
> 
> Silently drop unarchived data.  I agree that isn't very good, but
> think about it this way: if archive_command is failing, then our log
> shipping slave is not going to work.  But letting the disk fill up on
> the primary does not make it any better.  It just makes the primary
> stop working, too.  Obviously, all of this stuff needs to be monitored
> or you're playing with fire, but I don't think having a safety valve
> on the primary is a stupid idea.

Hmm, not sure I agree - you need to monitor disk space usage in general on
a system for obvious reasons. I think dealing with that kind of stuff is
not really in our realm. We are a relational database and we need to
guard the data; silently dropping data is imho not a good idea.
Just picture the typical scenario of nighttime maintenance on the
standby done by a sysadmin, with some batch jobs running on the master
generating just enough WAL to exceed the limit, which will just cause
the sysadmin to call the DBA in.
In general the question really is "will people set this to something
sensible or rather to an absurdly high value just to avoid that their
replication will ever break" - I guess people will do the latter in
critical environments...


Stefan


On Mon, Apr 26, 2010 at 8:05 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Tom Lane wrote:
>> Personally I agree with your objection to "crash" but not with the
>> objection to "standby".  Maybe this would be appropriate:
>>
>>       wal_mode = minimal | archive | hot_standby
>
> Ok, here's a patch implementing this proposal. It adds a new wal_mode
> setting, leaving archive_mode as it is. If you try to enable
> archive_mode when wal_mode is 'minimal', you get a warning and
> archive_mode is silently ignored. Likewise streaming replication
> connections are not allowed if wal_mode is 'minimal'.
> recovery_connections now does nothing in the master.
>
> A bit more bikeshedding before I commit this:
>
> * Should an invalid combination throw an ERROR and refuse to start,
> instead of just warning?

I think so.  Otherwise silent breakage is a real possibility.
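
As a rough sketch of what refusing to start could look like, assuming a hypothetical check_wal_settings() helper (the name and the message wording are made up here; the WAL_MODE_* values are the ones from the patch earlier in this thread):

```c
#include <stdbool.h>
#include <stddef.h>

/* WAL modes, with the same values as in the patch. */
#define WAL_MODE_MINIMAL        0
#define WAL_MODE_ARCHIVE        1
#define WAL_MODE_HOT_STANDBY    2

/*
 * Hypothetical startup check: returns NULL if the combination is valid,
 * otherwise a message describing the first conflict found.  A caller
 * would report the message and exit instead of only warning.
 */
static const char *
check_wal_settings(int wal_mode, bool archive_mode, int max_wal_senders)
{
    if (archive_mode && wal_mode == WAL_MODE_MINIMAL)
        return "archive_mode requires wal_mode \"archive\" or \"hot_standby\"";
    if (max_wal_senders > 0 && wal_mode == WAL_MODE_MINIMAL)
        return "WAL streaming (max_wal_senders > 0) requires wal_mode \"archive\" or \"hot_standby\"";
    return NULL;
}
```

The two conditions are the same ones the patch currently reports with ereport(WARNING, ...) in PostmasterMain(); only the reaction would change.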

> * How about naming the parameter wal_level instead of wal_mode? That
> would better convey that the higher levels add stuff on top of the lower
> levels, instead of having different modes that are somehow mutually
> exclusive.

That works for me.

...Robert


Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Apr 26, 2010 at 8:05 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> * How about naming the parameter wal_level instead of wal_mode? That
>> would better convey that the higher levels add stuff on top of the lower
>> levels, instead of having different modes that are somehow mutually
>> exclusive.

> That works for me.

What happens in the future if we have more options and they don't fall
into a neat superset order?
        regards, tom lane
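
To make the concern concrete, here is a small sketch contrasting the two shapes. All names are illustrative: the WAL_LEVEL_* enum mirrors the proposed levels, while the WAL_FEATURE_* flags, including WAL_FEATURE_OTHER, are purely hypothetical and appear in no patch.

```c
#include <stdbool.h>

/* Ordered levels: each level implies everything below it. */
enum wal_level
{
    WAL_LEVEL_MINIMAL,
    WAL_LEVEL_ARCHIVE,
    WAL_LEVEL_HOT_STANDBY
};

/* With a strict superset order, a single comparison is enough. */
static bool
level_logs_standby_info(enum wal_level level)
{
    return level >= WAL_LEVEL_HOT_STANDBY;
}

/* Independent options (hypothetical): one bit per kind of extra logging. */
#define WAL_FEATURE_ARCHIVE  (1u << 0)
#define WAL_FEATURE_STANDBY  (1u << 1)
#define WAL_FEATURE_OTHER    (1u << 2)  /* some future, unrelated option */

/* Without an order, each feature has to be tested on its own. */
static bool
features_include(unsigned features, unsigned wanted)
{
    return (features & wanted) == wanted;
}
```

An ordered level cannot express "archive plus the future option, but no standby info", whereas the flag form can; that is the case the question is about.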


On Mon, Apr 26, 2010 at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Apr 26, 2010 at 8:05 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> * How about naming the parameter wal_level instead of wal_mode? That
>>> would better convey that the higher levels add stuff on top of the lower
>>> levels, instead of having different modes that are somehow mutually
>>> exclusive.
>
>> That works for me.
>
> What happens in the future if we have more options and they don't fall
> into a neat superset order?

We'll decide on the appropriate solution based on whatever our needs
are at that time?

...Robert


Folks,

(a) is this checked in yet?
(b) should we delay Beta to test it?

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


On Mon, Apr 26, 2010 at 2:15 PM, Josh Berkus <josh@agliodbs.com> wrote:
> (a) is this checked in yet?

No.

> (b) should we delay Beta to test it?

I suspect it's going to be checked in pretty soon, so that may not be
necessary.  Not my call, though.

...Robert


Re: recovery_connections cannot start

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, 2010-04-26 at 10:41 +0200, Dimitri Fontaine wrote:
>>> Would it be possible to have "internal" commands there, as for example
>>> cd is in my shell, or test, or time, or some more ?
>>>
>>> That would allow for providing a portable /usr/bin/true command as far
>>> as archiving is concerned (say, pg_archive_bypass), and will allow for
>>> providing a default archiving command in the future, like "pg_archive_cp
>>> /location" or something.
>
> Separating wal_mode and archive_mode, as we recently discussed, might
> eliminate the need for this kludge, if archive_mode can then be made
> changeable without a restart.

I don't see my proposal as anything like a kludge at all. Internal
commands are hugely practical and here would allow for PostgreSQL to
provide basic portable archive and restore commands for simple cases,
providing the necessary guarantees and error management. 

Bypassing the archiving is the most obvious flavor and in my mind shouldn't
require an external dependency. Make simple things simple and complex
ones possible, as they say. PostgreSQL is one of the best pieces of software
I've ever worked with on this point, but the WAL management is still in its
infancy there: whatever you want to set up, it's complex.

Having "internal" commands will not remove any feature we already
have. Users would still be able to hook-in their own solutions for more
complex or demanding environments.

Please do explain in what sense that proposal is a kludge; I'd like to
be able to understand your viewpoint. Or maybe it's just either bad
wording on your part or bad reading on mine; nonetheless I felt I
had to give some more details here. That's an important point in my
mind.

Dunno how much it's relevant for 9.0 though, maybe we'll be able to
reach a good enough solution without an internal bypass archive
command, but having (only this) one does not sound so complex that we
should not consider it at all.

Regards,
-- 
dim


Re: recovery_connections cannot start

From
Robert Haas
Date:
On Tue, Apr 27, 2010 at 4:07 AM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Mon, 2010-04-26 at 10:41 +0200, Dimitri Fontaine wrote:
>>>> Would it be possible to have "internal" commands there, as for example
>>>> cd is in my shell, or test, or time, or some more ?
>>>>
>>>> That would allow for providing a portable /usr/bin/true command as far
>>>> as archiving is concerned (say, pg_archive_bypass), and will allow for
>>>> providing a default archiving command in the future, like "pg_archive_cp
>>>> /location" or something.
>>
>> Separating wal_mode and archive_mode, as we recently discussed, might
>> eliminate the need for this kludge, if archive_mode can then be made
>> changeable without a restart.
>
> I don't see my proposal as anything like a kludge at all. Internal
> commands are hugely practical and here would allow for PostgreSQL to
> provide basic portable archive and restore commands for simple cases,
> providing the necessary guarantees and error management.

Treating the string "true" as a special case seems like a kludge to
me.  Maybe a robust set of internal commands wouldn't be a kludge, but
that's not what's being proposed here.  I guess it's just a matter of
opinion.

...Robert


Re: recovery_connections cannot start

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Treating the string "true" as a special case seems like a kludge to
> me.  Maybe a robust set of internal commands wouldn't be a kludge, but
> that's not what's being proposed here.  I guess it's just a matter of
> opinion.

I don't see how to have internal commands without having special cases
for the setting, and I did propose "pg_archive_bypass" as the name. I
guess the implementation would be what Simon was talking about, though.

I don't see "true" as meaningful in the context of an archive_command…

Regards,
--
dim


Re: recovery_connections cannot start

From
Simon Riggs
Date:
On Tue, 2010-04-27 at 15:10 +0200, Dimitri Fontaine wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Treating the string "true" as a special case seems like a kludge to
> > me.  Maybe a robust set of internal commands wouldn't be a kludge, but
> > that's not what's being proposed here.  I guess it's just a matter of
> > opinion.
> 
> I don't see how to have internal commands without having special cases
> for the setting, and I did propose "pg_archive_bypass" as the name. I
> guess the implementation would be what Simon was talking about, though.
> 
> I don't see "true" as meaningful in the context of an archive_command…

Saying "it's a kludge" doesn't really address the issue and goes nowhere
towards fixing it. If we don't like the proposal, fine, then what is the
alternative solution?

-- Simon Riggs           www.2ndQuadrant.com



Re: recovery_connections cannot start

From
Robert Haas
Date:
On Apr 27, 2010, at 9:20 AM, Simon Riggs <simon@2ndQuadrant.com> wrote:
> On Tue, 2010-04-27 at 15:10 +0200, Dimitri Fontaine wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Treating the string "true" as a special case seems like a kludge to
>>> me.  Maybe a robust set of internal commands wouldn't be a kludge, but
>>> that's not what's being proposed here.  I guess it's just a matter of
>>> opinion.
>>
>> I don't see how to have internal commands without having special cases
>> for the setting, and I did propose "pg_archive_bypass" as the name. I
>> guess the implementation would be what Simon was talking about, though.
>>
>> I don't see "true" as meaningful in the context of an archive_command…
>
> Saying "it's a kludge" doesn't really address the issue and goes nowhere
> towards fixing it. If we don't like the proposal, fine, then what is the
> alternative solution?

I proposed one upthread.

...Robert
