Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: [Proposal] Global temporary tables
Date
Msg-id f9ec7bca-1e6d-cc12-fc7c-0a55a60a6a06@postgrespro.ru
Whole thread Raw
In response to Re: [Proposal] Global temporary tables  ("曾文旌(义从)" <wenjing.zwj@alibaba-inc.com>)
Responses Re: [Proposal] Global temporary tables
List pgsql-hackers


On 01.02.2020 19:14, 曾文旌(义从) wrote:


2020年1月27日 下午5:38,Konstantin Knizhnik <k.knizhnik@postgrespro.ru> 写道:



On 25.01.2020 18:15, 曾文旌(义从) wrote:
I wonder why do we need some special check for GTT here.
From my point of view cleanup at startup of local storage of temp tables should be performed in the same way for local and global temp tables.
After oom kill, In autovacuum, the Isolated local temp table will be cleaned like orphan temporary tables. The definition of local temp table is deleted with the storage file. 
But GTT can not do that. So we have the this implementation in my patch.
If you have other solutions, please let me know.

I wonder if it is possible that autovacuum or some other Postgres process is killed by OOM and postmaster is not noticing it can doens't restart Postgres instance?
as far as I know, crash of any process connected to Postgres shared memory (and autovacuum definitely has such connection) cause Postgres restart.
Postmaster will not restart after oom happen, but the startup process will. GTT data files are cleaned up in the startup process.

Yes, exactly.
But it is still not clear to me why do we need some special handling for GTT?
Shared memory is reinitialized and storage of temporary tables is removed.
It is true for both local and global temp tables.



In my design
1 Because different sessions have different transaction information, I choose to store the transaction information of GTT in MyProc,not catalog.
2 About the XID wraparound problem, the reason is the design of the temp table storage(local temp table and global temp table) that makes it can not to do vacuum by autovacuum. 
It should be completely solve at the storage level.


My point of view is that vacuuming of temp tables is common problem for local and global temp tables.
So it has to be addressed in the common way and so we should not try to fix this problem only for GTT.
I think I agree with you this point.
However, this does not mean that GTT transaction information stored in pg_class is correct.
If you keep it that way, like in global_private_temp-8.patch, It may cause data loss in GTT after aotuvauum.

In my patch autovacuum is prohibited for GTT.

IMHO forced terminated of client sessions is not acceptable solution.
And it is not an absolutely necessary requirement.
So from my point of view we should not add such limitations to GTT design.
This limitation makes it possible for the GTT to do all the DDL.
IMHO even oracle's GTT has similar limitations.

I have checked that Oracle is not preventing creation of index for GTT if there are some active sessions working with this table. And this index becomes visible for all this sessions.


As global_private_temp-8.patch, think about:
1 session X tale several hours doing some statistical work with the GTT A, which generated some data using transaction 100, The work is not over.
2 Then session Y vacuumed A, and the GTT's relfrozenxid (in pg_class) was updated to 1000 0000.
3 Then the aotuvacuum happened, the clog  before 1000 0000 was cleaned up.
4 The data in session A could be lost due to missing clog, The analysis task failed.

However This is likely to happen because you allowed the GTT do vacuum. 
And this is not a common problem, that not happen with local temp tables.
I feel uneasy about leaving such a question. We can improve it.


May be the easies solution is to prohibit explicit vacuum of GTT?

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: [Proposal] Global temporary tables
Next
From: Fujii Masao
Date:
Subject: Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)