Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [Proposal] Global temporary tables
Date
Msg-id 20220225074500.sfizxbmlrj2s6hp5@alap3.anarazel.de
Whole thread Raw
In response to Re: [Proposal] Global temporary tables  (Wenjing Zeng <wjzeng2012@gmail.com>)
Responses Re: [Proposal] Global temporary tables  (Wenjing Zeng <wjzeng2012@gmail.com>)
List pgsql-hackers
Hi,


This is a huge thread. Realistically reviewers and committers can't reread
it. I think there needs to be more of a description of how this works included
in the patchset and *why* it works that way. The readme does a bit of that,
but not particularly well.


On 2022-02-25 14:26:47 +0800, Wenjing Zeng wrote:
> +++ b/README.gtt.txt
> @@ -0,0 +1,172 @@
> +Global Temporary Table(GTT)
> +=========================================
> +
> +Feature description
> +-----------------------------------------
> +
> +Previously, temporary tables are defined once and automatically
> +exist (starting with empty contents) in every session before using them.

I think for a README "previously" etc isn't good language - if it were
commited, it'd not be understandable anymore. It makes more sense for commit
messages etc.


> +Main design ideas
> +-----------------------------------------
> +In general, GTT and LTT use the same storage and buffer design and
> +implementation. The storage files for both types of temporary tables are named
> +as t_backendid_relfilenode, and the local buffer is used to cache the data.

What does "named ast_backendid_relfilenode" mean?


> +The schema of GTTs is shared among sessions while their data are not. We build
> +a new mechanisms to manage those non-shared data and their statistics.
> +Here is the summary of changes:
> +
> +1) CATALOG
> +GTTs store session-specific data. The storage information of GTTs'data, their
> +transaction information, and their statistics are not stored in the catalog.
> +
> +2) STORAGE INFO & STATISTICS INFO & TRANSACTION INFO
> +In order to maintain durability and availability of GTTs'session-specific data,
> +their storage information, statistics, and transaction information is managed
> +in a local hash table tt_storage_local_hash.

"maintain durability"? Durable across what? In the context of databases it's
typically about crash safety, but that can't be the case here.


> +3) DDL
> +Currently, GTT supports almost all table'DDL except CLUSTER/VACUUM FULL.
> +Part of the DDL behavior is limited by shared definitions and multiple copies of
> +local data, and we added some structures to handle this.

> +A shared hash table active_gtt_shared_hash is added to track the state of the
> +GTT in a different session. This information is recorded in the hash table
> +during the DDL execution of the GTT.

> +The data stored in a GTT can only be modified or accessed by owning session.
> +The statements that only modify data in a GTT do not need a high level of
> +table locking. The operations making those changes include truncate GTT,
> +reindex GTT, and lock GTT.

I think you need to introduce a bit more terminology for any of this to make
sense. Sometimes GTT means the global catalog entity, sometimes, like here, it
appears to mean the session specific contents of a GTT.

What state of a GTT in a nother session?


How do GTTs handle something like BEGIN; TRUNCATE some_gtt_table; ROLLBACK;?


> +1.2 on commit clause
> +LTT's status associated with on commit DELETE ROWS and on commit PRESERVE ROWS
> +is not stored in catalog. Instead, GTTs need a bool value on_commit_delete_rows
> +in reloptions which is shared among sessions.

Why?



> +2.3 statistics info
> +1) relpages reltuples relallvisible relfilenode

?


> +3 DDL
> +3.1. active_gtt_shared_hash
> +This is the hash table created in shared memory to trace the GTT files initialized
> +in each session. Each hash entry contains a bitmap that records the backendid of
> +the initialized GTT file. With this hash table, we know which backend/session
> +is using this GTT. Such information is used during GTT's DDL operations.

So there's a separate locking protocol for GTTs that doesn't use the normal
locking infrastructure? Why?


> +3.7 CLUSTER GTT/VACUUM FULL GTT
> +The current version does not support.

Why?


> +4 MVCC commit log(clog) cleanup
> +
> +The GTT storage file contains transaction information. Queries for GTT data rely
> +on transaction information such as clog. The transaction information required by
> +each session may be completely different.

Why is transaction information different between sessions? Or does this just
mean that different transaction ids will be accessed?



0003-gtt-v67-implementation.patch
 71 files changed, 3167 insertions(+), 195 deletions(-)

This needs to be broken into smaller chunks to be reviewable.


> @@ -677,6 +678,14 @@ _bt_getrootheight(Relation rel)
>      {
>          Buffer        metabuf;
>  
> +        /*
> +         * If a global temporary table storage file is not initialized in the
> +         * this session, its index does not have a root page, just returns 0.
> +         */
> +        if (RELATION_IS_GLOBAL_TEMP(rel) &&
> +            !gtt_storage_attached(RelationGetRelid(rel)))
> +            return 0;
> +
>          metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
>          metad = _bt_getmeta(rel, metabuf);

Stuff like this seems not acceptable. Accesses would have to be prevented much
earlier. Otherwise each index method is going to need copies of this logic. I
also doubt that _bt_getrootheight() is the only place that'd need this.


>  static void
>  index_update_stats(Relation rel,
>                     bool hasindex,
> -                   double reltuples)
> +                   double reltuples,
> +                   bool isreindex)
>  {
>      Oid            relid = RelationGetRelid(rel);
>      Relation    pg_class;
> @@ -2797,6 +2824,13 @@ index_update_stats(Relation rel,
>      Form_pg_class rd_rel;
>      bool        dirty;
>  
> +    /*
> +     * Most of the global Temp table data is updated to the local hash, and reindex
> +     * does not refresh relcache, so call a separate function.
> +     */
> +    if (RELATION_IS_GLOBAL_TEMP(rel))
> +        return index_update_gtt_relstats(rel, hasindex, reltuples, isreindex);
> +

So basically every single place in the code that does catalog accesses is
going to need a completely separate implementation for GTTs? That seems
unmaintainable.



> +/*-------------------------------------------------------------------------
> + *
> + * storage_gtt.c
> + *      The body implementation of Global temparary table.
> + *
> + * IDENTIFICATION
> + *      src/backend/catalog/storage_gtt.c
> + *
> + *      See src/backend/catalog/GTT_README for Global temparary table's
> + *      requirements and design.
> + *
> + *-------------------------------------------------------------------------
> + */

I don't think that path to the readme is correct.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Add checkpoint and redo LSN to LogCheckpointEnd log message