Thread: pgsql: Create a "relation mapping" infrastructure to support changing

pgsql: Create a "relation mapping" infrastructure to support changing

From
tgl@postgresql.org (Tom Lane)
Date:
Log Message:
-----------
Create a "relation mapping" infrastructure to support changing the relfilenodes
of shared or nailed system catalogs.  This has two key benefits:

* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.

* We no longer have to use an unsafe reindex-in-place approach for reindexing
  shared catalogs.

CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.

Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.

This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch.  As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.

Modified Files:
--------------
    pgsql/contrib/oid2name:
        oid2name.c (r1.36 -> r1.37)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/oid2name/oid2name.c?r1=1.36&r2=1.37)
    pgsql/doc/src/sgml:
        catalogs.sgml (r2.220 -> r2.221)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/catalogs.sgml?r1=2.220&r2=2.221)
        diskusage.sgml (r1.19 -> r1.20)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/diskusage.sgml?r1=1.19&r2=1.20)
        func.sgml (r1.500 -> r1.501)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/func.sgml?r1=1.500&r2=1.501)
        pgbuffercache.sgml (r2.5 -> r2.6)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/pgbuffercache.sgml?r1=2.5&r2=2.6)
        storage.sgml (r1.30 -> r1.31)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/storage.sgml?r1=1.30&r2=1.31)
    pgsql/doc/src/sgml/ref:
        cluster.sgml (r1.47 -> r1.48)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/ref/cluster.sgml?r1=1.47&r2=1.48)
        reindex.sgml (r1.38 -> r1.39)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/ref/reindex.sgml?r1=1.38&r2=1.39)
    pgsql/src/backend/access/index:
        genam.c (r1.79 -> r1.80)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/index/genam.c?r1=1.79&r2=1.80)
    pgsql/src/backend/access/transam:
        rmgr.c (r1.28 -> r1.29)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/rmgr.c?r1=1.28&r2=1.29)
        xact.c (r1.282 -> r1.283)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/xact.c?r1=1.282&r2=1.283)
        xlog.c (r1.366 -> r1.367)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/xlog.c?r1=1.366&r2=1.367)
    pgsql/src/backend/bootstrap:
        bootparse.y (r1.104 -> r1.105)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/bootstrap/bootparse.y?r1=1.104&r2=1.105)
        bootstrap.c (r1.258 -> r1.259)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/bootstrap/bootstrap.c?r1=1.258&r2=1.259)
    pgsql/src/backend/catalog:
        catalog.c (r1.87 -> r1.88)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/catalog/catalog.c?r1=1.87&r2=1.88)
        heap.c (r1.369 -> r1.370)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/catalog/heap.c?r1=1.369&r2=1.370)
        index.c (r1.332 -> r1.333)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/catalog/index.c?r1=1.332&r2=1.333)
        storage.c (r1.7 -> r1.8)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/catalog/storage.c?r1=1.7&r2=1.8)
        toasting.c (r1.29 -> r1.30)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/catalog/toasting.c?r1=1.29&r2=1.30)
    pgsql/src/backend/commands:
        cluster.c (r1.197 -> r1.198)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/commands/cluster.c?r1=1.197&r2=1.198)
        indexcmds.c (r1.190 -> r1.191)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/commands/indexcmds.c?r1=1.190&r2=1.191)
        tablecmds.c (r1.324 -> r1.325)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/commands/tablecmds.c?r1=1.324&r2=1.325)
        vacuum.c (r1.403 -> r1.404)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/commands/vacuum.c?r1=1.403&r2=1.404)
    pgsql/src/backend/executor:
        execMain.c (r1.344 -> r1.345)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/executor/execMain.c?r1=1.344&r2=1.345)
    pgsql/src/backend/parser:
        parse_clause.c (r1.195 -> r1.196)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/parser/parse_clause.c?r1=1.195&r2=1.196)
    pgsql/src/backend/utils/adt:
        dbsize.c (r1.28 -> r1.29)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/dbsize.c?r1=1.28&r2=1.29)
    pgsql/src/backend/utils/cache:
        Makefile (r1.25 -> r1.26)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/cache/Makefile?r1=1.25&r2=1.26)
        catcache.c (r1.148 -> r1.149)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/cache/catcache.c?r1=1.148&r2=1.149)
        inval.c (r1.93 -> r1.94)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/cache/inval.c?r1=1.93&r2=1.94)
        relcache.c (r1.302 -> r1.303)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/cache/relcache.c?r1=1.302&r2=1.303)
    pgsql/src/backend/utils/init:
        miscinit.c (r1.180 -> r1.181)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/init/miscinit.c?r1=1.180&r2=1.181)
    pgsql/src/bin/pg_dump:
        pg_dump.c (r1.569 -> r1.570)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/bin/pg_dump/pg_dump.c?r1=1.569&r2=1.570)
    pgsql/src/include/access:
        rmgr.h (r1.20 -> r1.21)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/access/rmgr.h?r1=1.20&r2=1.21)
    pgsql/src/include/catalog:
        catalog.h (r1.47 -> r1.48)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/catalog.h?r1=1.47&r2=1.48)
        catversion.h (r1.582 -> r1.583)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/catversion.h?r1=1.582&r2=1.583)
        heap.h (r1.96 -> r1.97)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/heap.h?r1=1.96&r2=1.97)
        index.h (r1.81 -> r1.82)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/index.h?r1=1.81&r2=1.82)
        pg_class.h (r1.120 -> r1.121)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/pg_class.h?r1=1.120&r2=1.121)
        pg_proc.h (r1.567 -> r1.568)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/pg_proc.h?r1=1.567&r2=1.568)
        storage.h (r1.4 -> r1.5)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/storage.h?r1=1.4&r2=1.5)
    pgsql/src/include/commands:
        cluster.h (r1.39 -> r1.40)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/commands/cluster.h?r1=1.39&r2=1.40)
    pgsql/src/include:
        miscadmin.h (r1.217 -> r1.218)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/miscadmin.h?r1=1.217&r2=1.218)
    pgsql/src/include/storage:
        lwlock.h (r1.43 -> r1.44)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/storage/lwlock.h?r1=1.43&r2=1.44)
        relfilenode.h (r1.24 -> r1.25)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/storage/relfilenode.h?r1=1.24&r2=1.25)
        sinval.h (r1.56 -> r1.57)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/storage/sinval.h?r1=1.56&r2=1.57)
    pgsql/src/include/utils:
        builtins.h (r1.346 -> r1.347)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/builtins.h?r1=1.346&r2=1.347)
        catcache.h (r1.69 -> r1.70)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/catcache.h?r1=1.69&r2=1.70)
        inval.h (r1.47 -> r1.48)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/inval.h?r1=1.47&r2=1.48)
        rel.h (r1.121 -> r1.122)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/rel.h?r1=1.121&r2=1.122)
        relcache.h (r1.67 -> r1.68)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/relcache.h?r1=1.67&r2=1.68)
    pgsql/src/test/regress/expected:
        vacuum.out (r1.3 -> r1.4)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/vacuum.out?r1=1.3&r2=1.4)
    pgsql/src/test/regress:
        parallel_schedule (r1.58 -> r1.59)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/parallel_schedule?r1=1.58&r2=1.59)

Added Files:
-----------
    pgsql/src/backend/utils/cache:
        relmapper.c (r1.1)

(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/cache/relmapper.c?rev=1.1&content-type=text/x-cvsweb-markup)
    pgsql/src/include/utils:
        relmapper.h (r1.1)

(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/utils/relmapper.h?rev=1.1&content-type=text/x-cvsweb-markup)

Re: pgsql: Create a "relation mapping" infrastructure to support changing

From
Simon Riggs
Date:
On Sun, 2010-02-07 at 20:48 +0000, Tom Lane wrote:
> Log Message:
> -----------
> Create a "relation mapping" infrastructure to support changing the relfilenodes
> of shared or nailed system catalogs.  This has two key benefits:
>
> * The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.
>
> * We no longer have to use an unsafe reindex-in-place approach for reindexing
>   shared catalogs.
>
> CLUSTER on nailed catalogs now works too, although I left it disabled on
> shared catalogs because the resulting pg_index.indisclustered update would
> only be visible in one database.
>
> Since reindexing shared system catalogs is now fully transactional and
> crash-safe, the former special cases in REINDEX behavior have been removed;
> shared catalogs are treated the same as non-shared.
>
> This commit does not do anything about the recently-discussed problem of
> deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
> concurrent queries; will address that in a separate patch.  As a stopgap,
> parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
> such failures during the regression tests.

Awesome!


Few questions:

We XLogFlush() in normal running, but not in recovery. Would it be
possible for the map changes to hit disk after the data changes in that
case?

What we're saying is if we process a relmap update WAL record then it
doesn't really matter whether or not we commit that transaction?

>         func.sgml (r1.500 -> r1.501)
>         (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/func.sgml?r1=1.500&r2=1.501)

It would be useful to mention that the return value is volatile and can
change across transactions or even within a transaction, just like ctid.

>         catalogs.sgml (r2.220 -> r2.221)
>         (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/catalogs.sgml?r1=2.220&r2=2.221)

No mention of pg_relation_filepath/filenode in there.

--
 Simon Riggs           www.2ndQuadrant.com


Re: pgsql: Create a "relation mapping" infrastructure to support changing

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> Few questions:

> We XLogFlush() in normal running, but not in recovery. Would it be
> possible for the map changes to hit disk after the data changes in that
> case?

No, because the map change is flushed to disk on sight of the WAL record
anyway.

> What we're saying is if we process a relmap update WAL record then it
> doesn't really matter whether or not we commit that transaction?

Right.  relmap update is committed independently of the surrounding
transaction.

> It would be useful to mention that the return value is volatile and can
> change across transactions or even within a transaction, just like ctid.

Hm.  I thought it was more reasonable to treat it as "stable", which is
the normal case.  You're right that there are some corner cases where it
could change without the caller's snapshot changing, but I'm not sure
it's worth worrying about.

            regards, tom lane

Re: pgsql: Create a "relation mapping" infrastructure to support changing

From
Simon Riggs
Date:
On Sun, 2010-02-07 at 17:52 -0500, Tom Lane wrote:
>
> > It would be useful to mention that the return value is volatile and
> can change across transactions or even within a transaction, just like
> ctid.
>
> Hm.  I thought it was more reasonable to treat it as "stable", which
> is the normal case.  You're right that there are some corner cases
> where it could change without the caller's snapshot changing, but I'm
> not sure it's worth worrying about.

OK, even if its stable I think we need to mention it because it's not
obvious that the file we keep catalog data in could change name. It's a
also a change from earlier releases where that would never happen.

--
 Simon Riggs           www.2ndQuadrant.com