Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: [Proposal] Global temporary tables
Date
Msg-id d6eedce3-6b54-8167-7caa-b6b085156219@postgrespro.ru
Whole thread Raw
In response to Re: [Proposal] Global temporary tables  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [Proposal] Global temporary tables
List pgsql-hackers

On 05.02.2020 00:38, Robert Haas wrote:
>
> My guess it that the right time to do this work is just after we
> acquire locks, at the end of parse analysis. I think trying to do it
> during execution is too late, since the planner looks at indexes, and
> trying to do it in the planner instead of before we start planning
> seems more likely to cause bugs and has no real advantages. It's just
> better to do complicated things (like creating indexes) separately
> rather than in the middle of some other complicated thing (like
> planning). I could tie my shoelaces the first time they get tangled up
> with my break pedal but it's better to do it before I get in the car.
I have implemented this approach in my new patch

https://www.postgresql.org/message-id/3e88b59f-73e8-685e-4983-9026f94c57c5%40postgrespro.ru

I have added check whether index is initialized or not to plancat.c 
where optimizer checks if index is valid.
Now it should work for all kinds of indexes (B-Tree, hash, user defined 
access methods...).
>
> And I'm still inclined to do it by flat-copying files rather than
> calling ambuild. It will be slightly faster, but also importantly, it
> will guarantee that (1) every backend gets exactly the same initial
> state and (2) it has fewer ways to fail because it doesn't involve
> calling any user-defined code. Those seem like fairly compelling
> advantages, and I don't see what the disadvantages are. I think
> calling ambuild() at the point in time proposed in the preceding
> paragraph would be fairly safe and would probably work OK most of the
> time, but I can't think of any reason it would be better.

There is very important reason (from my point of view): allow other 
sessions to use created index and
so provide compatible behavior with regular tables (and with Oracle).
So we should be able to populate index with existed GTT data.
And ambuild will do it.

>
> Incidentally, what I'd be inclined to do is - if the session is
> running a query that does only read-only operations, let it continue
> to point to the "master" copy of the GTT and its indexes, which is
> stored in the relfilenodes indicated for those relations in pg_class.
> If it's going to acquire a lock heavier than AccessShareLock, then
> give it is own copies of the table and indexes, stored in a temporary
> relfilenode (tXXX_YYY) and redirect all future access to that GTT by
> this backend to there. Maybe there's some reason this won't work, but
> it seems nice to avoid saying that we've "attached" to the GTT if all
> we did is read the empty table.
>
Sorry, I do not understand the benefits of such optimization. It seems 
to be very rare situation when session will try to access temp table 
which was not previously filled with data. But even if it happen, 
keeping "master" copy will not safe much: we in any case have shared 
metadata and no data. Yes, with current approach, first access to GTT 
will cause creation of empty indexes. But It is just initialization of 
1-3 pages. I do not think that delaying index initialization can be 
really useful.

In any case, calling ambuild is the simplest and most universal 
approach, providing desired and compatible behavior.
I really do not understand why we should try yo invent some alternative 
solution.





pgsql-hackers by date:

Previous
From: Kasahara Tatsuhito
Date:
Subject: Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
Next
From: Amit Langote
Date:
Subject: Re: pg_stat_progress_basebackup - progress reporting forpg_basebackup, in the server side