Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: [Proposal] Global temporary tables
Date
Msg-id 745ef1fa-5a81-d30d-aa33-bce24a022a3a@postgrespro.ru
Whole thread Raw
In response to Re: [Proposal] Global temporary tables  ("曾文旌(义从)" <wenjing.zwj@alibaba-inc.com>)
List pgsql-hackers

On 08.11.2019 10:50, 曾文旌(义从) wrote:
> In my opinion, it is not a good idea to trigger a btbuild with a select or DML, the cost of which depends on the
amountof data in the GTT.
 
IMHO it is better than returning error.
Also index will be used only if cost of plan with index will be 
considered better than cost of plan without index. If you do not have 
index, then you have to scan the whole table.
Time of such scan is comparable with time of building index.

Yes, I agree that indexes for GTT are used to be created together with 
table itself before it is used by any application.
But if later DBA recognized that efficient execution of queries requires 
some more indexes,
it will be strange and dangerous to prevent him from adding such index 
until all clients which have accessed this table will drop their 
connections.
Also maintaining in shared memory information about attached backends 
seems to be overkill.

>>
>> This code initializes B-Tree and load data in it when GTT index is access and is not initialized yet.
>> It looks a little bit hacker but it works.
>>
>> I also wonder why you are keeping information about GTT in shared memory. Looks like the only information we really
needto share is table's metadata.
 
>> But it is already shared though catalog. All other GTT related information is private to backend so I do not see
reasonsto place it in shared memory.
 
> The shared hash structure tracks which backend has initialized the GTT storage in order to implement the DDL of the
GTT.
Sorry, I do not understand this argument.
DDL is performed on shared metadata present in global catalog.
Standard postgres invalidation mechanism is used to notify all backends 
about schema changes.
Why do we need to maintain some extra information in shared memory.
Can you give me example of DLL which does't work without such shared hash?

> As for GTT, there is only one definition(include index on GTT), but each backend may have one data.
> For the implementation of drop GTT, I assume that all data and definitions need to be deleted.

Data of dropped GTT is removed on normal backend termination or cleaned 
up at server restart in case of abnormal shutdown (as it is done for 
local temp tables).
I have not used any shared control structures for GTT in my 
implementation and that is why I wonder why do you need it and what are 
the expected problems with my
implementation?

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: SPI refactoring
Next
From: Grigory Smolkin
Date:
Subject: Re: [proposal] recovery_target "latest"