Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 - Mailing list pgsql-general

From Andres Freund
Subject Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date
Msg-id 20150602201924.GV30287@alap3.anarazel.de
Whole thread Raw
In response to Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-general
On 2015-06-01 14:22:32 -0400, Robert Haas wrote:

> commit d33b4eb0167f465edb00bd6c0e1bcaa67dd69fe9
> Author: Robert Haas <rhaas@postgresql.org>
> Date:   Fri May 29 14:35:53 2015 -0400
>
>     foo

Hehe!

> diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
> index 9568ff1..aca829d 100644
> --- a/src/backend/access/transam/multixact.c
> +++ b/src/backend/access/transam/multixact.c
> @@ -199,8 +199,9 @@ typedef struct MultiXactStateData
>      MultiXactOffset nextOffset;
>
>      /*
> -     * Oldest multixact that is still on disk.  Anything older than this
> -     * should not be consulted.  These values are updated by vacuum.
> +     * Oldest multixact that may still be referenced from a relation.
> +     * Anything older than this should not be consulted.  These values are
> +     * updated by vacuum.
>       */
>      MultiXactId oldestMultiXactId;
>      Oid            oldestMultiXactDB;
> @@ -213,6 +214,18 @@ typedef struct MultiXactStateData
>       */
>      MultiXactId lastCheckpointedOldest;
>
> +    /*
> +     * This is the oldest file that actually exist on the disk.  This value
> +     * is initialized by scanning pg_multixact/offsets, and subsequently
> +     * updated each time we complete a truncation.  We need a flag to
> +     * indicate whether this has been initialized yet.
> +     */
> +    MultiXactId oldestMultiXactOnDisk;
> +    bool        oldestMultiXactOnDiskValid;
> +
> +    /* Has TrimMultiXact been called yet? */
> +    bool        didTrimMultiXact;

I'm not really convinced tying things closer to having done trimming is
easier to understand than tying things to recovery having finished.

E.g.
    if (did_trim)
        oldestOffset = GetOldestReferencedOffset(oldest_datminmxid);
imo is harder to understand than if (!InRecovery).

Maybe we could just name it finishedStartup and rename the functions accordingly?

> @@ -1956,12 +1971,6 @@ StartupMultiXact(void)
>       */
>      pageno = MXOffsetToMemberPage(offset);
>      MultiXactMemberCtl->shared->latest_page_number = pageno;
> -
> -    /*
> -     * compute the oldest member we need to keep around to avoid old member
> -     * data overrun.
> -     */
> -    DetermineSafeOldestOffset(MultiXactState->oldestMultiXactId);
>  }

Maybe it's worthwhile to add a 'NB: At this stage the data directory is
not yet necessarily consistent' StartupMultiXact's comments, to avoid
reintroducing problems like this?

>      /*
> +     * We can read this without a lock, because it only changes when nothing
> +     * else is running.
> +     */
> +    did_trim = MultiXactState->didTrimMultiXact;

Err, Hot Standby? It might be ok to not lock, but the comment is
definitely wrong. I'm inclined to simply use locking, this doesn't look
sufficiently critical performancewise.

> +static MultiXactOffset
> +GetOldestReferencedOffset(MultiXactId oldestMXact)
> +{
> +    MultiXactId        earliest;
> +    MultiXactOffset oldestOffset;
> +
> +    /*
> +     * Because of bugs in early 9.3.X and 9.4.X releases (see comments in
> +     * TrimMultiXact for details), oldest_datminmxid might point to a
> +     * nonexistent multixact.  If so, use the oldest one that actually
> +     * exists.  Anything before this can't be successfully used anyway.
> +     */
> +    earliest = GetOldestMultiXactOnDisk();
> +    if (MultiXactIdPrecedes(oldestMXact, earliest))
> +        oldestMXact = earliest;

Hm. If GetOldestMultiXactOnDisk() gets the starting point by scanning
the disk it'll always get one at a segment boundary, right? I'm not sure
that's actually ok; because the value at the beginning of the segment
can very well end up being a 0, as MaybeExtendOffsetSlru() will have
filled the page with zeros.

I think that should be harmless, the worst that can happen is that
oldestOffset errorneously is 0, which should be correct, even though
possibly overly conservative, in these cases.

Greetings,

Andres Freund


pgsql-general by date:

Previous
From: Melvin Davidson
Date:
Subject: Re: TRIGGER TRUNCATE -- CASCADE or RESTRICT
Next
From: Andreas Ulbrich
Date:
Subject: Re: TRIGGER TRUNCATE -- CASCADE or RESTRICT