pgsql: Improve the situation for parallel query versus temp relations. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Improve the situation for parallel query versus temp relations.
Date
Msg-id E1bBA7W-0002JO-53@gemulon.postgresql.org
List pgsql-committers
Improve the situation for parallel query versus temp relations.

Transmit the leader's temp-namespace state to workers.  This is important
because without it, the workers do not really have the same search path
as the leader.  For example, there is no good reason (and no extant code
either) to prevent a worker from executing a temp function that the
leader created previously; but as things stood it would fail to find the
temp function, and then either fail or execute the wrong function entirely.

We still prohibit a worker from creating a temp namespace on its own.
In effect, a worker can only see the session's temp namespace if the leader
had created it before starting the worker, which seems like the right
semantics.
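
The handoff described above can be illustrated with a small standalone model. This is only a sketch, not the code from the commit: the struct and function names (SessionNamespaceState, WorkerStartupState, export_temp_namespace and so on) are hypothetical stand-ins for the private state in namespace.c and the startup data passed through parallel.c.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical per-process view of the session's temp namespace. */
typedef struct SessionNamespaceState
{
    Oid  temp_namespace;        /* InvalidOid if none has been created */
    Oid  temp_toast_namespace;
    bool is_parallel_worker;
} SessionNamespaceState;

/* Hypothetical fixed-size state the leader hands to each worker at launch. */
typedef struct WorkerStartupState
{
    Oid temp_namespace;
    Oid temp_toast_namespace;
} WorkerStartupState;

/* Leader side: snapshot the temp-namespace state as it is at launch time. */
WorkerStartupState
export_temp_namespace(const SessionNamespaceState *leader)
{
    WorkerStartupState msg = { leader->temp_namespace,
                               leader->temp_toast_namespace };
    return msg;
}

/* Worker side: adopt whatever the leader had, including "nothing". */
void
import_temp_namespace(SessionNamespaceState *worker,
                      const WorkerStartupState *msg)
{
    assert(worker->is_parallel_worker);
    worker->temp_namespace = msg->temp_namespace;
    worker->temp_toast_namespace = msg->temp_toast_namespace;
}

/* Creating a temp namespace is refused in a worker (the real code would
 * raise an error rather than return false). */
bool
create_temp_namespace(SessionNamespaceState *s, Oid ns, Oid toast_ns)
{
    if (s->is_parallel_worker)
        return false;
    s->temp_namespace = ns;
    s->temp_toast_namespace = toast_ns;
    return true;
}

/* Temp objects are findable only if a temp namespace is known. */
bool
temp_namespace_visible(const SessionNamespaceState *s)
{
    return s->temp_namespace != InvalidOid;
}
```

In this model a worker launched before the leader creates its temp namespace never sees one, while a worker launched afterwards resolves temp objects the same way the leader does.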

Also, transmit the leader's BackendId to workers, and arrange for workers
to use that when determining the physical file path of a temp relation
belonging to their session.  While the original intent was to prevent such
accesses entirely, there were a number of holes in that, notably in places
like dbsize.c which assume they can safely access temp rels of other
sessions anyway.  We might as well get this right, as a small down payment
on someday allowing workers to access the leader's temp tables.  (With
this change, directly using "MyBackendId" as a relation or buffer backend
ID is deprecated; you should use BackendIdForTempRelations() instead.
I left a couple of such uses alone though, as they're not going to be
reachable in parallel workers until we do something about localbuf.c.)
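
As a rough standalone illustration of the idea (not a copy of the committed code): a worker substitutes the leader's backend ID when naming a temp relation's file. MyBackendId and BackendIdForTempRelations() are named in the message above; the variable name ParallelMasterBackendId, the helper temp_relation_path(), and the "base/<db>/t<backend>_<relfilenode>" layout used here are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int BackendId;
typedef unsigned int Oid;
#define InvalidBackendId (-1)

/* This process's own backend ID. */
BackendId MyBackendId = InvalidBackendId;

/* Assumed name: the leader's backend ID, set only in a parallel worker. */
BackendId ParallelMasterBackendId = InvalidBackendId;

/* Use the leader's ID in a worker, our own ID otherwise. */
#define BackendIdForTempRelations() \
    (ParallelMasterBackendId == InvalidBackendId ? \
     MyBackendId : ParallelMasterBackendId)

/* Hypothetical helper: format the path of a temp relation belonging to
 * this session.  Returns the snprintf() result. */
int
temp_relation_path(char *buf, size_t buflen, Oid db_oid, Oid relfilenode)
{
    return snprintf(buf, buflen, "base/%u/t%d_%u",
                    db_oid, BackendIdForTempRelations(), relfilenode);
}
```

With this arrangement the leader (backend 3, say) and any of its workers compute the same path for the same temp relation, whereas using MyBackendId directly in the worker would point at a file that does not exist.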

Move the thou-shalt-not-access-thy-leader's-temp-tables prohibition down
into localbuf.c, which is where it actually matters, instead of having it
in relation_open().  This amounts to recognizing that access to temp
tables' catalog entries is perfectly safe in a worker; it's only the data
in local buffers that is problematic.
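
A toy model of where the check now sits, under the assumption that the refusal happens when local buffers are first set up. All names here (is_parallel_worker, init_local_buffers, read_temp_relation_block and so on) are hypothetical and only mimic the division of labor between relation_open() and localbuf.c; the real code reports an error instead of returning a status.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum AccessResult
{
    ACCESS_OK,
    ACCESS_DENIED_IN_WORKER
} AccessResult;

/* Hypothetical process state. */
bool is_parallel_worker = false;
bool local_buffers_ready = false;

/* Catalog-level open of a temp relation: no local buffers are touched,
 * so it is allowed in leader and worker alike. */
AccessResult
open_temp_relation_entry(void)
{
    return ACCESS_OK;
}

/* First use of local buffers: this is the only place that refuses. */
AccessResult
init_local_buffers(void)
{
    if (is_parallel_worker)
        return ACCESS_DENIED_IN_WORKER;
    local_buffers_ready = true;
    return ACCESS_OK;
}

/* Reading temp-table data needs local buffers, so it inherits the check. */
AccessResult
read_temp_relation_block(void)
{
    if (!local_buffers_ready)
    {
        AccessResult r = init_local_buffers();

        if (r != ACCESS_OK)
            return r;
    }
    return ACCESS_OK;
}
```

Placing the test at the buffer layer means every path that could reach temp-table data hits it, while lookups that need only the catalog entry (a rowtype, for instance) still succeed in a worker.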

Having done all that, we can get rid of the test in has_parallel_hazard()
that says that use of a temp table's rowtype is unsafe in parallel workers.
That test was unduly expensive, and if we really did need such a
prohibition, that was not even close to being a bulletproof guard for it.
(For example, any user-defined function executed in a parallel worker
might have attempted such access.)

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/cae1c788b9b43887e4a4fa51a11c3a8ffa334939

Modified Files
--------------
src/backend/access/heap/heapam.c      | 12 --------
src/backend/access/transam/parallel.c | 12 ++++++++
src/backend/catalog/catalog.c         |  2 +-
src/backend/catalog/namespace.c       | 45 ++++++++++++++++++++++++++++
src/backend/catalog/storage.c         |  2 +-
src/backend/optimizer/util/clauses.c  | 55 -----------------------------------
src/backend/storage/buffer/localbuf.c | 14 +++++++++
src/backend/utils/adt/dbsize.c        |  2 +-
src/backend/utils/cache/relcache.c    |  4 +--
src/backend/utils/init/globals.c      |  2 ++
src/include/catalog/namespace.h       |  4 +++
src/include/storage/backendid.h       | 10 +++++++
12 files changed, 92 insertions(+), 72 deletions(-)

