Re: including PID or backend ID in relpath of temp rels - Mailing list pgsql-hackers

From Robert Haas
Subject Re: including PID or backend ID in relpath of temp rels
Date
Msg-id p2t603c8f071004271859hce160f31zafb6995812bd7da4@mail.gmail.com
In response to including PID or backend ID in relpath of temp rels  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: including PID or backend ID in relpath of temp rels  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: including PID or backend ID in relpath of temp rels  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sun, Apr 25, 2010 at 9:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> 4. We could add an additional 32-bit value to RelFileNode to identify
> the backend (or a sentinel value when not temp) and create a separate
> structure XLogRelFileNode or PermRelFileNode or somesuch for use in
> contexts where no temp rels are allowed.

I experimented with this approach and created LocalRelFileNode and
GlobalRelFileNode and, for use in the buffer headers,
BufferRelFileNode (same as GlobalRelFileNode, but named differently
for clarity).  LocalRelFileNode = GlobalRelFileNode + the ID of the
owning backend for temp rels, or InvalidBackendId when referencing a
non-temporary rel.  These might not be the greatest names, but I think
the concept is good, because it really breaks the things that need to
be adjusted quite thoroughly.  In the course of repairing the damage I
came across a couple of things I wasn't sure about:

[relcache.c] RelationInitPhysicalAddr can't initialize
relation->rd_node.backend properly for a non-local temporary relation,
because that information isn't available.  But I'm not clear on why we
would need to create a relcache entry for a non-local temporary
relation.  If we do need to, then we'll probably need to store the
backend ID in pg_class.  That seems like something that would be best
avoided, all things being equal, especially since I can't see how to
generalize it to global temporary tables.

[smgr.c,inval.c] Do we need to call CacheInvalidSmgr for temporary
relations?  I think the only backend that can have an smgr reference
to a temprel other than the owning backend is bgwriter, and AFAICS
bgwriter will only have such a reference if it's responding to a
request by the owning backend to unlink the associated files, in which
case (I think) the owning backend will have no reference.

[dbsize.c] As with relcache.c, there's a problem if we're asked for
the size of a temporary relation that is not our own: we can't call
relpath() without knowing the ID of the owning backend, and there's no
way to acquire that information from pg_class.  I guess we could just
refuse to answer the question in that case, but that doesn't seem real
cool.  Or we could physically scan the directory for files that match
a suitably constructed wildcard, I suppose.
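
A rough sketch of the directory-scan idea, in case it helps the
discussion.  The file naming scheme "t<backend>_<relfilenode>" with an
optional ".<segno>" suffix is purely an assumption for illustration, and
match_temp_relfile and temp_rel_size_by_scan are hypothetical helpers,
not existing functions.  It only looks at the main fork, and it sums
every match, so it assumes at most one backend owns files with the
given relfilenode.

```c
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

static bool
is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/*
 * Hypothetical: does "name" look like t<backend>_<relnode>[.<segno>]?
 * On a match, store the owning backend ID in *backend.
 */
static bool
match_temp_relfile(const char *name, unsigned long relnode, int *backend)
{
    char       *end;
    long        b;
    unsigned long r;

    assert(name != NULL && backend != NULL);

    if (name[0] != 't' || !is_digit(name[1]))
        return false;
    b = strtol(name + 1, &end, 10);
    if (*end != '_' || !is_digit(end[1]))
        return false;
    r = strtoul(end + 1, &end, 10);
    if (r != relnode)
        return false;

    if (*end == '.')
    {
        /* segment suffix: one or more digits, nothing else */
        const char *p = end + 1;

        if (*p == '\0')
            return false;
        for (; *p != '\0'; p++)
            if (!is_digit(*p))
                return false;
    }
    else if (*end != '\0')
        return false;           /* e.g. another fork; ignored here */

    *backend = (int) b;
    return true;
}

/* Sum the sizes of all matching files in dirpath; -1 if it can't be read. */
static long long
temp_rel_size_by_scan(const char *dirpath, unsigned long relnode)
{
    DIR        *dir = opendir(dirpath);
    struct dirent *de;
    long long   total = 0;

    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL)
    {
        int         backend;
        char        path[4096];
        struct stat st;

        if (!match_temp_relfile(de->d_name, relnode, &backend))
            continue;
        snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
        if (stat(path, &st) == 0)
            total += (long long) st.st_size;
    }
    closedir(dir);
    return total;
}
```

The cost is a full readdir() pass over the database directory per
call, which is part of why this feels like a fallback rather than a
real answer.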

[storage.c,xact.c,twophase.c] smgrGetPendingDeletes returns via an out
parameter (its second argument) a list of RelFileNodes pending delete,
which we then write to WAL or to the two-phase state file.  Of course,
if the backend ID (or pid, but I picked backend ID somewhat
arbitrarily) is part of the filename, then we need to write that to
WAL, too.  It seems somewhat unfortunate to have to WAL-log temprels
here; as best I can tell, this is the only case where it's necessary.
But if we implement a more general mechanism for cleaning up temp
files, then might the need to do this go away?  Not sure.

[syncscan.c] It seems we pursue this optimization even for temprels; I
can't think of why that would be useful in practice.  If it's useless
overhead, should we skip it?  This is really independent of this
project; just a side thought.

...Robert

