Thread: MultiXact member wraparound protections are now enabled

MultiXact member wraparound protections are now enabled

From: Peter Eisentraut
Why is this message logged by default in a fresh installation?  The
technicality of that message doesn't seem to match the kinds of messages
that we normally print at startup.



Re: MultiXact member wraparound protections are now enabled

From: Robert Haas
On Wed, Jul 22, 2015 at 4:11 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> Why is this message logged by default in a fresh installation?  The
> technicality of that message doesn't seem to match the kinds of messages
> that we normally print at startup.

It seems nobody likes that message.

I did it that way because I wanted to provide an easy way for users to
know whether they had those protections enabled.  If you don't display
the message when things are already OK at startup, users have to make
a negative inference, like this: let's see, I'm on a version that is
new enough that it would have printed a message if the protections had
not been enabled, so the absence of the message must mean things are
OK.

But it seemed to me that this could be rather confusing.  I thought it
would be better to be explicit about whether the protections are
enabled in all cases.  That way, (1) if you see the message saying
they are enabled, they are enabled; (2) if you see the message saying
they are disabled, they are disabled; and (3) if you see neither
message, your version does not have those protections.
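
The three cases above can be sketched as a small log check. This is only an illustration, not anything taken from the PostgreSQL source: the "enabled" wording comes from the subject line of this thread, while the "disabled" wording and the names `Protection` and `classify` are assumptions made for the example and should be checked against the server version in use.

```python
from enum import Enum

# Wording taken from the subject line of this thread.
ENABLED_MSG = "MultiXact member wraparound protections are now enabled"
# Assumed wording of the counterpart message; verify it against your server.
DISABLED_MSG = "MultiXact member wraparound protections are disabled"


class Protection(Enum):
    ENABLED = "enabled"                # case (1)
    DISABLED = "disabled"              # case (2)
    NOT_PRESENT = "not in this build"  # case (3): neither message seen


def classify(log_lines):
    """Return the state implied by the most recent matching log line."""
    state = Protection.NOT_PRESENT
    for line in log_lines:
        if ENABLED_MSG in line:
            state = Protection.ENABLED
        elif DISABLED_MSG in line:
            state = Protection.DISABLED
    return state
```

Because later lines override earlier ones, passing in a whole log file yields the most recently reported state.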

You are not the first person to dislike this, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: MultiXact member wraparound protections are now enabled

From: Peter Eisentraut
On 7/22/15 4:45 PM, Robert Haas wrote:
> But it seemed to me that this could be rather confusing.  I thought it
> would be better to be explicit about whether the protections are
> enabled in all cases.  That way, (1) if you see the message saying
> they are enabled, they are enabled; (2) if you see the message saying
> they are disabled, they are disabled; and (3) if you see neither
> message, your version does not have those protections.

But this is not documented, AFAICT, so I don't think anyone is going to
be able to follow that logic.  I don't see anything in the release notes
saying, look for this message to see how this applies to you, or whatever.



Re: MultiXact member wraparound protections are now enabled

From: Robert Haas
On Fri, Jul 24, 2015 at 9:14 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On 7/22/15 4:45 PM, Robert Haas wrote:
>> But it seemed to me that this could be rather confusing.  I thought it
>> would be better to be explicit about whether the protections are
>> enabled in all cases.  That way, (1) if you see the message saying
>> they are enabled, they are enabled; (2) if you see the message saying
>> they are disabled, they are disabled; and (3) if you see neither
>> message, your version does not have those protections.
>
> But this is not documented, AFAICT, so I don't think anyone is going to
> be able to follow that logic.  I don't see anything in the release notes
> saying, look for this message to see how this applies to you, or whatever.

Good point.  I can't tell you what the right thing to do is, and I'm
sure there is room for debate about that.  I'm only telling you why I
did what I did.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: MultiXact member wraparound protections are now enabled

From: Simon Riggs
On 22 July 2015 at 21:45, Robert Haas <robertmhaas@gmail.com> wrote:
> But it seemed to me that this could be rather confusing.  I thought it
> would be better to be explicit about whether the protections are
> enabled in all cases.  That way, (1) if you see the message saying
> they are enabled, they are enabled; (2) if you see the message saying
> they are disabled, they are disabled; and (3) if you see neither
> message, your version does not have those protections.

(3) would imply that we can't ever remove the message, in case people think they are unprotected.

If we display (1) and then we find a further bug, where does that leave us? Do we put a second "really, really fixed" message?

AIUI this refers to a bug fix, it's not like we've invented some anti-virus mode to actively prevent or even scan for further error. I'm not sure why we need a message to say a bug fix has been applied; that is what the release notes are for.

If something is disabled, we should say so, but otherwise silence means safety and success.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: MultiXact member wraparound protections are now enabled

From: Noah Misch
On Fri, Jul 24, 2015 at 09:14:09PM -0400, Peter Eisentraut wrote:
> On 7/22/15 4:45 PM, Robert Haas wrote:
> > But it seemed to me that this could be rather confusing.  I thought it
> > would be better to be explicit about whether the protections are
> > enabled in all cases.  That way, (1) if you see the message saying
> > they are enabled, they are enabled; (2) if you see the message saying
> > they are disabled, they are disabled; and (3) if you see neither
> > message, your version does not have those protections.
> 
> But this is not documented, AFAICT, so I don't think anyone is going to
> be able to follow that logic.  I don't see anything in the release notes
> saying, look for this message to see how this applies to you, or whatever.

I supported inclusion of the message, because it has good potential to help
experts studying historical logs to find the root cause of data corruption.
The complex histories of clusters showing corruption from this series of bugs
have brought great expense to the task of debugging new reports.  Given a
cluster having full mxact wraparound protections since last corruption-free
backup (or since initdb), one can rule out some causes.



Re: MultiXact member wraparound protections are now enabled

From: Simon Riggs
On 26 July 2015 at 20:15, Noah Misch <noah@leadboat.com> wrote:
> On Fri, Jul 24, 2015 at 09:14:09PM -0400, Peter Eisentraut wrote:
> > On 7/22/15 4:45 PM, Robert Haas wrote:
> > > But it seemed to me that this could be rather confusing.  I thought it
> > > would be better to be explicit about whether the protections are
> > > enabled in all cases.  That way, (1) if you see the message saying
> > > they are enabled, they are enabled; (2) if you see the message saying
> > > they are disabled, they are disabled; and (3) if you see neither
> > > message, your version does not have those protections.
> >
> > But this is not documented, AFAICT, so I don't think anyone is going to
> > be able to follow that logic.  I don't see anything in the release notes
> > saying, look for this message to see how this applies to you, or whatever.
>
> I supported inclusion of the message, because it has good potential to help
> experts studying historical logs to find the root cause of data corruption.
> The complex histories of clusters showing corruption from this series of bugs
> have brought great expense to the task of debugging new reports.  Given a
> cluster having full mxact wraparound protections since last corruption-free
> backup (or since initdb), one can rule out some causes.

Would it be better to replace it with a less specific and more generally useful message?

For example, Server started with release X.y.z
from which we could infer various useful things.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: MultiXact member wraparound protections are now enabled

From: Noah Misch
On Mon, Jul 27, 2015 at 07:59:40AM +0100, Simon Riggs wrote:
> On 26 July 2015 at 20:15, Noah Misch <noah@leadboat.com> wrote:
> > On Fri, Jul 24, 2015 at 09:14:09PM -0400, Peter Eisentraut wrote:
> > > On 7/22/15 4:45 PM, Robert Haas wrote:
> > > > But it seemed to me that this could be rather confusing.  I thought it
> > > > would be better to be explicit about whether the protections are
> > > > enabled in all cases.  That way, (1) if you see the message saying
> > > > they are enabled, they are enabled; (2) if you see the message saying
> > > > they are disabled, they are disabled; and (3) if you see neither
> > > > message, your version does not have those protections.
> > >
> > > But this is not documented, AFAICT, so I don't think anyone is going to
> > > be able to follow that logic.  I don't see anything in the release notes
> > > saying, look for this message to see how this applies to you, or whatever.
> >
> > I supported inclusion of the message, because it has good potential to help
> > experts studying historical logs to find the root cause of data corruption.
> > The complex histories of clusters showing corruption from this series of bugs
> > have brought great expense to the task of debugging new reports.  Given a
> > cluster having full mxact wraparound protections since last corruption-free
> > backup (or since initdb), one can rule out some causes.
> 
> 
> Would it be better to replace it with a less specific and more generally
> useful message?
> 
> For example, Server started with release X.y.z
> from which we could infer various useful things.

That message does sound generally useful, but we couldn't infer $subject from
it.  While the $subject message appears at startup in simple cases, autovacuum
prerequisite work can delay it indefinitely.



Re: MultiXact member wraparound protections are now enabled

From: Robert Haas
On Sat, Jul 25, 2015 at 4:11 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 22 July 2015 at 21:45, Robert Haas <robertmhaas@gmail.com> wrote:
>> But it seemed to me that this could be rather confusing.  I thought it
>> would be better to be explicit about whether the protections are
>> enabled in all cases.  That way, (1) if you see the message saying
>> they are enabled, they are enabled; (2) if you see the message saying
>> they are disabled, they are disabled; and (3) if you see neither
>> message, your version does not have those protections.
>
> (3) would imply that we can't ever remove the message, in case people think
> they are unprotected.
>
> If we display (1) and then we find a further bug, where does that leave us?
> Do we put a second "really, really fixed" message?
>
> AIUI this refers to a bug fix, it's not like we've invented some anti-virus
> mode to actively prevent or even scan for further error. I'm not sure why we
> need a message to say a bug fix has been applied; that is what the release
> notes are for.
>
> If something is disabled, we should say so, but otherwise silence means
> safety and success.

Well, I think that we can eventually downgrade or remove the message
once (1) we've actually fixed all of the known multixact bugs and (2)
a couple of years have gone by and most people are in the clear.  But
right now, we've still got significant bugs unfixed.

https://wiki.postgresql.org/wiki/MultiXact_Bugs

Therefore, in my opinion, anything that might make it harder to debug
problems with the MultiXact system is premature at this point.  The
detective work that it took to figure out the chain of events that led
to the problem fixed in 068cfadf9e2190bdd50a30d19efc7c9f0b825b5e was
difficult; I wanted to make sure that future debugging would be
easier, not harder.  I still think that's the right decision, but I
recognize that not everyone agrees.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: MultiXact member wraparound protections are now enabled

From: Simon Riggs
On 28 July 2015 at 14:20, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Jul 25, 2015 at 4:11 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On 22 July 2015 at 21:45, Robert Haas <robertmhaas@gmail.com> wrote:
> >> But it seemed to me that this could be rather confusing.  I thought it
> >> would be better to be explicit about whether the protections are
> >> enabled in all cases.  That way, (1) if you see the message saying
> >> they are enabled, they are enabled; (2) if you see the message saying
> >> they are disabled, they are disabled; and (3) if you see neither
> >> message, your version does not have those protections.
> >
> > (3) would imply that we can't ever remove the message, in case people think
> > they are unprotected.
> >
> > If we display (1) and then we find a further bug, where does that leave us?
> > Do we put a second "really, really fixed" message?
> >
> > AIUI this refers to a bug fix, it's not like we've invented some anti-virus
> > mode to actively prevent or even scan for further error. I'm not sure why we
> > need a message to say a bug fix has been applied; that is what the release
> > notes are for.
> >
> > If something is disabled, we should say so, but otherwise silence means
> > safety and success.
>
> Well, I think that we can eventually downgrade or remove the message
> once (1) we've actually fixed all of the known multixact bugs and (2)
> a couple of years have gone by and most people are in the clear.  But
> right now, we've still got significant bugs unfixed.
>
> https://wiki.postgresql.org/wiki/MultiXact_Bugs
>
> Therefore, in my opinion, anything that might make it harder to debug
> problems with the MultiXact system is premature at this point.  The
> detective work that it took to figure out the chain of events that led
> to the problem fixed in 068cfadf9e2190bdd50a30d19efc7c9f0b825b5e was
> difficult; I wanted to make sure that future debugging would be
> easier, not harder.  I still think that's the right decision, but I
> recognize that not everyone agrees.

I do now, thanks for explaining.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: MultiXact member wraparound protections are now enabled

From: Peter Eisentraut
On 7/28/15 9:20 AM, Robert Haas wrote:
> Well, I think that we can eventually downgrade or remove the message
> once (1) we've actually fixed all of the known multixact bugs and (2)
> a couple of years have gone by and most people are in the clear.

Fair enough.  But we should document this better in the future.