Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [Proposal] Global temporary tables
Date
Msg-id CA+TgmoZQmLhPdMGM1YA-if2=iYQdSEr9r5b=WWwmJS9FbJNx+Q@mail.gmail.com
Whole thread Raw
In response to Re: [Proposal] Global temporary tables  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: [Proposal] Global temporary tables  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
On Wed, Jan 29, 2020 at 10:30 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> But please consider two arguments:
>
> 1. Index for GTT in any case has to be initialized on demand. In case of
> regular tables index is initialized at the moment of its creation. In
> case of GTT it doesn't work.
> So we should somehow detect that accessed index is not initialized and
> perform lazy initialization of the index.
> The only difference with the approach proposed by Pavel  (allow index
> for empty GTT but prohibit it for GTT filled with data) is whether we
> also need to populate index with data or not.
> I can imagine that implicit initialization of index in read-only query
> (select) can be unsafe and cause some problems. I have not encountered
> such problems yet after performing many tests with GTTs, but certainly I
> have not covered all possible scenarios (not sure that it is possible at
> all).
> But I do not understand how populating  index with data can add some
> extra unsafety.
>
> So I can not prove that building index for GTT on demand is safe, but it
> is not more unsafe than initialization of index on demand which is
> required in any case.

I think that the idea of calling ambuild() on the fly is not going to
work, because, again, I don't think that calling that from random
places in the code is safe. What I expect we're going to need to do
here is model this on the approach used for unlogged tables. For an
unlogged table, each table and index has an init fork which contains
the correct initial contents for that relation - which is nothing at
all for a heap table, and a couple of boilerplate pages for an index.
In the case of an unlogged table, the init forks get copied over the
main forks after crash recovery, and then we have a brand-new, empty
relation with brand-new empty indexes which everyone can use. In the
case of global temporary tables, I think that we should do the same
kind of copying, but at the time when the session first tries to
access the table. There is some fuzziness in my mind about what
exactly constitutes accessing the table - it probably shouldn't be
when the relcache entry is built, because that seems too early, but
I'm not sure what is exactly right. In any event, it's easier to find
a context where copying some files on disk (that are certain not to be
changing) is safe than it is to find a context where index builds are
safe.

> 2. Actually I do not propose some completely new approach. I try to
> provide behavior with is compatible with regular tables.
> If you create index for regular table, then it can be used in all
> sessions, right?

Yes. :-)

> And all "various backend-local data structures in the relcache, the
> planner, and the executor that remember information about indexes"
> have to be properly updated.  It is done using invalidation mechanism.
> The same mechanism is used in case of DDL operations with GTT, because
> we change system catalog.

I mean, that's not really a valid argument. Invalidations can only
take effect at certain points in the code, and the whole argument here
is about which places in the code are safe for which operations, so
the fact that some things (like accepting invalidations) are safe at
some points in the code (like the places where we accept them) does
not prove that other things (like calling ambuild) are safe at other
points in the code (like wherever you are proposing to call it). In
particular, if you've got a relation open, there's currently no way
for another index to show up while you've still got that relation
open. That means that the planner and executor (which keep the
relevant relations open) don't ever have to worry about updating their
data structures, because it can never be necessary. It also means that
any code anywhere in the system that keeps a lock on a relation can
count on the list of indexes for that relation staying the same until
it releases the lock. In fact, it can hold on to pointers to data
allocated by the relcache and count on those pointers being stable for
as long as it holds the lock, and RelationClearRelation contain
specific code that aims to make sure that certain objects don't get
deallocated and reallocated at a different address precisely for that
reason. That code, however, only works as long as nothing actually
changes. The upshot is that it's entirely possible for changing
catalog entries in one backend with an inadequate lock level -- or at
unexpected point in the code -- to cause a core dump either in that
backend or in some other backend. This stuff is really subtle, and
super-easy to screw up.

I am speaking a bit generally here, because I haven't really studied
*exactly* what might go wrong in the relcache, or elsewhere, as a
result of creating an index on the fly. However, I'm very sure that a
general appeal to invalidation messages is not sufficient to make
something like what you want to do safe. Invalidation messages are a
complex, ancient, under-documented, fragile system for solving a very
specific problem that is not the one you are hoping they'll solve
here. They could justifiably be called magic, but it's not the sort of
magic where the fairy godmother waves her wand and solves all of your
problems; it's more like the kind where you go explore the forbidden
forest and are never seen or heard from again.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Enabling B-Tree deduplication by default
Next
From: Tom Lane
Date:
Subject: Re: Removing pg_pltemplate and creating "trustable" extensions