Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: [Proposal] Global temporary tables
Date
Msg-id CAFj8pRANfAciHwDwNyH3-QkkNPFFDRuBfFamYGaXWZKsW4HCCQ@mail.gmail.com
Whole thread Raw
In response to Re: [Proposal] Global temporary tables  (Andres Freund <andres@anarazel.de>)
Responses Re: [Proposal] Global temporary tables  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers


ne 27. 2. 2022 v 5:13 odesílatel Andres Freund <andres@anarazel.de> napsal:
Hi,

On 2022-02-27 04:17:52 +0100, Pavel Stehule wrote:
> > You redirect stats from pg_class and pg_statistics to a local hash table.
> > This is pretty hairy :(

As is I think the patch is architecturally completely unacceptable. Having
code everywhere to redirect to manually written in-memory catalog table code
isn't maintainable.


> > I guess you'd also need to handle pg_statistic_ext and ext_data.
> > pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to
> > look
> > at pg_get_gtt_statistics.
>
> Without this, the GTT will be terribly slow like current temporary tables
> with a lot of problems with bloating of pg_class, pg_attribute and
> pg_depend tables.

I think it's not a great idea to solve multiple complicated problems at
once...

I thought about this issue for a very long time, and I didn't find any better (without more significant rewriting of pg storage). In a lot of projects, that I know, the temporary tables are strictly prohibited due possible devastating impact to system catalog bloat.  It is a serious problem. So any implementation of GTT should solve the questions: a) how to reduce catalog bloating, b) how to allow session related statistics for GTT. I agree so implementation of GTT like template based LTT (local temporary tables) can be very simple (it is possible by extension), but with the same unhappy performance impacts.

I don't say so current design should be accepted without any discussions and without changes. Maybe GTT based on LTT can be better than nothing (what we have now), and can be good enough for a lot of projects where the load is not too high (and almost all projects have low load). Unfortunately,it can be a trap for a lot of projects in future, so there should be discussion and proposed solutions for fix of related issues. The performance of GTT should be fixable, so any discussion about this topic should have part about protections against catalog bloat and about cost related to frequent catalog updates.

But anyway, I invite (and probably not just me) any discussion on how to implement this feature, how to solve performance issues, and how to divide implementation into smaller steps. I am sure so fast GTT  implementation can be used for fast implementation of LTT too, and maybe with all other temporary objects

Regards

Pavel


Greetings,

Andres Freund

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [Proposal] Global temporary tables
Next
From: Bharath Rupireddy
Date:
Subject: Add WAL recovery messages with log_wal_traffic GUC (was: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display)