Re: MVCC catalog access - Mailing list pgsql-hackers

From Andres Freund
Subject Re: MVCC catalog access
Date
Msg-id 20130606093019.GI28067@alap2.anarazel.de
Whole thread Raw
In response to Re: MVCC catalog access  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: MVCC catalog access
List pgsql-hackers
Hi Robert,

Took a quick look through the patch to understand what your current
revision is actually doing and to facilitate thinking about possible
pain points.

Here are the notes I made during my reading:

On 2013-06-03 14:57:12 -0400, Robert Haas wrote:
> +++ b/src/backend/catalog/catalog.c
> @@ -232,6 +232,10 @@ IsReservedName(const char *name)
>   * know if it's shared.  Fortunately, the set of shared relations is
>   * fairly static, so a hand-maintained list of their OIDs isn't completely
>   * impractical.
> + *
> + * XXX: Now that we have MVCC catalog access, the reasoning above is no longer
> + * true.  Are there other good reasons to hard-code this, or should we revisit
> + * that decision?
>   */

We could just the function by looking in the shared
relmapper. Everything that can be mapped via it is shared.

> --- a/src/backend/commands/cluster.c
> +++ b/src/backend/commands/cluster.c
> @@ -480,6 +480,11 @@ check_index_is_clusterable(Relation OldHeap, Oid indexOid, bool recheck, LOCKMOD
>   * against concurrent SnapshotNow scans of pg_index.  Therefore this is unsafe
>   * to execute with less than full exclusive lock on the parent table;
>   * otherwise concurrent executions of RelationGetIndexList could miss indexes.
> + *
> + * XXX: Now that we have MVCC catalog access, SnapshotNow scans of pg_index
> + * shouldn't be common enough to worry about.  The above comment needs
> + * to be updated, and it may be possible to simplify the logic here in other
> + * ways also.
>   */

You're right, the comment needs to be changed, but I don't think the
effect can. A non-inplace upgrade changes the xmin of the row which is
relevant for indcheckxmin.
(In fact, isn't this update possibly causing problems like delaying the
use of such an index already)


> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -2738,7 +2738,7 @@ AlterTableGetLockLevel(List *cmds)
>       * multiple DDL operations occur in a stream against frequently accessed
>       * tables.
>       *
> -     * 1. Catalog tables are read using SnapshotNow, which has a race bug that
> +     * 1. Catalog tables were read using SnapshotNow, which has a race bug that

Heh.

> --- a/src/backend/utils/time/snapmgr.c
> +++ b/src/backend/utils/time/snapmgr.c
> @@ -207,6 +207,19 @@ GetLatestSnapshot(void)
>  /*
> + * GetInstantSnapshot
> + *        Get a snapshot that is up-to-date as of the current instant,
> + *        but don't set the transaction snapshot.
> + */
> +Snapshot
> +GetInstantSnapshot(void)
> +{
> +    SecondarySnapshot = GetSnapshotData(&SecondarySnapshotData);
> +
> +    return SecondarySnapshot;
> +}

Hm. Looks like this should also change the description of SecondarySnapshot:

/** CurrentSnapshot points to the only snapshot taken in transaction-snapshot* mode, and to the latest one taken in a
read-committedtransaction.* SecondarySnapshot is a snapshot that's always up-to-date as of the current* instant, even
intransaction-snapshot mode.    It should only be used for* special-purpose code (say, RI checking.)*
 
and/* * Checking SecondarySnapshot is probably useless here, but it seems * better to be sure. */

Also looks like BuildEventTriggerCache() in evtcache.c should use
GetInstanSnapshot() now.

I actually wonder if we shouldn't just abolish GetLatestSnapshot(). None
of the callers seem to rely on it's behaviour from a quick look and it
seems rather confusing to have both.

> --- a/src/bin/pg_dump/pg_dump.c
> +++ b/src/bin/pg_dump/pg_dump.c
> @@ -14,13 +14,13 @@
>   *    Note that pg_dump runs in a transaction-snapshot mode transaction,
>   *    so it sees a consistent snapshot of the database including system
>   *    catalogs. However, it relies in part on various specialized backend
> - *    functions like pg_get_indexdef(), and those things tend to run on
> - *    SnapshotNow time, ie they look at the currently committed state.  So
> - *    it is possible to get 'cache lookup failed' error if someone
> - *    performs DDL changes while a dump is happening. The window for this
> - *    sort of thing is from the acquisition of the transaction snapshot to
> - *    getSchemaData() (when pg_dump acquires AccessShareLock on every
> - *    table it intends to dump). It isn't very large, but it can happen.
> + *    functions like pg_get_indexdef(), and those things tend to look at
> + *    the currently committed state.  So it is possible to get 'cache
> + *    lookup failed' error if someone performs DDL changes while a dump is
> + *    happening. The window for this sort of thing is from the acquisition
> + *    of the transaction snapshot to getSchemaData() (when pg_dump acquires
> + *    AccessShareLock on every table it intends to dump). It isn't very large,
> + *    but it can happen.

I think we need another term for what SnapshotNow used to express
here... Imo this description got less clear with this change.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Redesigning checkpoint_segments
Next
From: Dmitriy Igrishin
Date:
Subject: Re: About large objects asynchronous and non-blocking support