Thread: [Proposal] Global temporary tables

[Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
Dear Hackers,

This proposes a way to implement global temporary tables in PostgreSQL.

I noticed that there is an item "Allow temporary tables to exist as empty by default in all sessions" on the PostgreSQL TODO list.

In recent years, the PG community has had many discussions about global temp table (GTT) support. Previous discussions covered the following topics:
(1) The main benefit or function: GTT offers “persistent schema, ephemeral data”, which avoids catalog bloat and reduces catalog vacuum.
(2) Whether it should follow the ANSI concept of temporary tables.
(3) How to deal with statistics, a single copy of the schema definition, and the relcache.
(4) A recent implementation and design from Konstantin Knizhnik covered many GTT functions: https://www.postgresql.org/message-id/attachment/103265/global_private_temp-1.patch

However, as pointed out by Konstantin himself, the implementation still needs work on functions related to CLOG, vacuum, and MVCC visibility.

We developed GTT based on PG 11 and included most of the needed features, such as dealing with concurrent DDL and DML operations, handling vacuum and too-old relfrozenxids, and storing and accessing GTT statistics.

This design follows many suggestions from previous discussions in the community. Here are some examples:
“have a separate 'relpersistence' setting for global temp tables… by having the backend id in all filenames…” -- from Andres Freund
“Use session memory context to store information related to GTT.” -- from Pavel Stehule
“extend the relfilenode mapper to support a backend-local non-persistent relfilenode map that's used to track temp table and index relfilenodes…” -- from Craig Ringer

Our implementation creates one record in pg_class for a GTT's schema definition. When rows are first inserted into the GTT in a session, a session-specific file is created to store the GTT's data. Those files are removed when the session ends. We maintain the GTT's statistics in session-local memory. DDL operations, such as DROP TABLE or CREATE INDEX, can be executed on a GTT by one session only when no other session has inserted data into the GTT (or its data has already been truncated). This also avoids concurrency between DML and DDL operations on a GTT. We maintain a session-level oldest relfrozenxid for GTTs. This way, autovacuum or vacuum can truncate the CLOG and advance the global relfrozenxid based on all tables' relfrozenxids, including GTTs'.
The following summarizes the main design and implementation:
Syntax: ON COMMIT PRESERVE ROWS and ON COMMIT DELETE ROWS.
Data storage and buffering follow the same scheme as local temp tables, with a relfilenode that includes the session id.
A hash table (A) in shared memory is used to track sessions and their usage of GTTs and to serialize DDL and DML operations.
Another hash table (B) in session memory is introduced to record the storage files of GTTs and their indexes. When a session ends, those files are removed.
The same hash table (B) in session memory is used to record the relfrozenxid of each GTT. The oldest one is stored in MyProc so that autovacuum and vacuum can use it to determine the global oldest relfrozenxid and truncate the CLOG.
The same hash table (B) in session memory stores the GTT's session-level statistics. They are generated during vacuum and analyze operations and used by the SQL optimizer to create execution plans.
Some utility functions are added for DBAs to manage GTTs.
The TRUNCATE command on a GTT behaves differently from that on a normal table. The command deletes the data immediately but keeps the relfilenode, using a lower-level table lock, RowExclusiveLock, instead of AccessExclusiveLock.
Main limitations of this version and future improvements (we need suggestions from the community):
1. VACUUM FULL and CLUSTER are not supported; any operation that may change the relfilenode is disabled for GTT.
2. Sequence columns are not supported in GTT for now.
3. User-defined statistics are not supported.


Details:

Requirement
The feature list for global temp tables:
1. global temp table (when the ON COMMIT clause is omitted, the SQL standard specifies that the default behavior is ON COMMIT DELETE ROWS)
2. support ON COMMIT DELETE ROWS
3. support ON COMMIT PRESERVE ROWS
4. ON COMMIT DROP is not supported

Feature description
Global temp tables are defined just once and automatically exist (starting with empty contents) in every session that needs them.
With a global temp table, each session uses local buffers and reads or writes its own independent data files.
Use ON COMMIT DELETE ROWS for a transaction-specific global temp table. This is the default: the database will truncate the table (delete all its rows) after each commit.
Use ON COMMIT PRESERVE ROWS for a session-specific global temp table: the database will truncate the table (delete all its rows) when the session terminates.
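For example, a sketch of the intended usage (table names are illustrative only):

CREATE GLOBAL TEMPORARY TABLE gtt_txn (id int, note text)
    ON COMMIT DELETE ROWS;      -- rows vanish at each commit (the default)

CREATE GLOBAL TEMPORARY TABLE gtt_sess (id int, note text)
    ON COMMIT PRESERVE ROWS;    -- rows survive commits, dropped at session end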

Design
Global temp tables are designed on top of local temp tables (buffers and storage files).
Because the catalog of a global temp table is shared between sessions but the data is not, we need to build some new mechanisms to manage the non-shared data and its statistics.

1. catalog
1.1 relpersistence
Define RELPERSISTENCE_GLOBAL_TEMP 'g'.
Mark a global temp table in pg_class with relpersistence = 'g'. The relpersistence of any index created on the global temp table is also set to 'g'.

1.2 on commit clause
For local temp tables, ON COMMIT DELETE ROWS and ON COMMIT PRESERVE ROWS are not stored in the catalog, but a GTT needs this.
Store a boolean value, on_commit_delete_rows, in reloptions, only for GTT, and share it with other sessions.
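As a sketch, the flag should then be visible from any session through the catalog (assuming the reloption is named on_commit_delete_rows as described above):

SELECT relname, relpersistence, reloptions
FROM pg_class
WHERE relpersistence = 'g';
-- expected: gtt_txn | g | {on_commit_delete_rows=true}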

2. gram.y
Global temp tables already have a syntax tree. We just need to remove the warning message "GLOBAL is deprecated in temporary table creation" and set relpersistence = RELPERSISTENCE_GLOBAL_TEMP.

3. STORAGE
3.1. active_gtt_shared_hash
Create a hash table in shared memory to track the GTT files that are initialized in each session.
Each hash entry contains a bitmap that records the backend ids of the sessions that have initialized the GTT's file.
With this hash table, we know which backends/sessions are using a given GTT.
It is used by GTT's DDL.
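A minimal sketch of what such a hash entry might look like (names and layout are illustrative assumptions; the entry would be sized using MaxBackends when shared memory is created):

#include "postgres.h"

typedef struct gtt_shared_hash_entry
{
    Oid     relid;                      /* hash key: the GTT's OID */
    uint32  map[FLEXIBLE_ARRAY_MEMBER]; /* one bit per BackendId */
} gtt_shared_hash_entry;

/* Has backend 'id' (a 1-based BackendId) initialized a file for this GTT? */
static inline bool
gtt_backend_attached(gtt_shared_hash_entry *entry, int id)
{
    return (entry->map[(id - 1) / 32] & ((uint32) 1 << ((id - 1) % 32))) != 0;
}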

3.2. gtt_storage_local_hash
In each backend, create a local hash table, gtt_storage_local_hash, to track GTT storage files and statistics.
1) GTT storage file tracking
When a session inserts data into a GTT for the first time, this is recorded in the local hash table.
2) Normal cleanup of GTT files
Use before_shmem_exit to ensure that all of the session's GTT files are deleted when the session exits.
3) File cleanup in abnormal situations
When a backend exits abnormally (such as an OOM kill), the startup process runs recovery before accepting connections. The startup process checks for and removes all GTT files before replaying the WAL.
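A sketch of the normal-cleanup path under these assumptions (the entry layout and function names are illustrative, not necessarily what the patch uses):

#include "postgres.h"
#include "storage/ipc.h"
#include "storage/relfilenode.h"
#include "utils/hsearch.h"

typedef struct gtt_local_hash_entry
{
    Oid         relid;  /* hash key: the GTT's OID */
    RelFileNode node;   /* this session's private storage file */
    /* the session-level relfrozenxid and statistics would also live here */
} gtt_local_hash_entry;

static HTAB *gtt_storage_local_hash = NULL;

/* before_shmem_exit callback: remove every GTT file this session created */
static void
gtt_cleanup_session_files(int code, Datum arg)
{
    HASH_SEQ_STATUS status;
    gtt_local_hash_entry *entry;

    if (gtt_storage_local_hash == NULL)
        return;

    hash_seq_init(&status, gtt_storage_local_hash);
    while ((entry = hash_seq_search(&status)) != NULL)
    {
        /* drop local buffers and unlink entry->node via the smgr layer */
        (void) entry;
    }
}

/* registered once, when the first GTT file is created in the session */
void
gtt_register_cleanup(void)
{
    before_shmem_exit(gtt_cleanup_session_files, (Datum) 0);
}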

4 DDL
4.1 DROP GTT
A GTT may be dropped only when the current session is the only one using it. After acquiring AccessExclusiveLock on the GTT, use active_gtt_shared_hash to check and enforce that.
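The same check also guards the ALTER and index DDL cases below. A sketch, where gtt_attached_by_other_backend is an assumed helper over active_gtt_shared_hash:

#include "postgres.h"
#include "storage/lmgr.h"

extern bool gtt_attached_by_other_backend(Oid relid);   /* assumed helper */

/* Error out unless the current session is the only user of the GTT. */
static void
check_gtt_not_in_use(Oid relid)
{
    LockRelationOid(relid, AccessExclusiveLock);

    if (gtt_attached_by_other_backend(relid))
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_IN_USE),
                 errmsg("global temporary table is in use by other sessions")));
}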

4.2 ALTER GTT
Same as drop GTT.

4.3 CREATE INDEX ON GTT, DROP INDEX ON GTT
Same as drop GTT.

4.4 TRUNCATE GTT
TRUNCATE on a GTT uses RowExclusiveLock, not AccessExclusiveLock, because the truncate only cleans up the local data file and local buffers of the current session.
Also, truncate immediately deletes the data file without changing the relfilenode of the GTT. By the way, I'm not sure this implementation will be acceptable to the community.
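A sketch of the expected observable behavior:

BEGIN;
TRUNCATE gtt_sess;
-- Only this session's file and local buffers are cleared, so the lock
-- held should be RowExclusiveLock rather than AccessExclusiveLock:
SELECT mode FROM pg_locks
WHERE relation = 'gtt_sess'::regclass AND pid = pg_backend_pid();
COMMIT;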


4.5 OTHERS
Any table operation on a GTT that needs to change the relfilenode is disabled, such as VACUUM FULL/CLUSTER.

5. The statistics of GTT
1) relpages, reltuples, relallvisible, frozenxid, and minmulti from pg_class
2) The statistics for each column from pg_statistic
All of the above information is stored in gtt_storage_local_hash.
When a GTT is vacuumed or analyzed, its statistics are updated and the planner uses them. Of course, the statistics only cover data within the current session.
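For example, a sketch of the intended behavior, where each session plans with the statistics it gathered itself:

-- Session A: build a large, indexed data set and analyze it locally.
CREATE INDEX ON gtt_sess (id);
INSERT INTO gtt_sess SELECT g, 'x' FROM generate_series(1, 1000000) g;
ANALYZE gtt_sess;    -- statistics go to session-local memory, not pg_statistic
EXPLAIN SELECT * FROM gtt_sess WHERE id = 42;   -- can now pick an index scan

-- Session B: the same table is empty here, and B's planner sees its own
-- (empty) statistics, not session A's.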

5.1. Viewing global temp table statistics
Provide pg_gtt_att_statistic to get column statistics for a GTT. Provide pg_gtt_relstats to get relation statistics for a GTT.
These functions are implemented in a plug-in, without adding system views or functions.

6. autovacuum
Autovacuum skips all GTTs.

7. vacuum (advancing frozenxid, truncating the CLOG)
GTT data files contain transaction information. Queries on GTT data rely on transaction information such as the CLOG, which therefore cannot be truncated automatically by vacuum.
7.1 The session-level oldest frozenxid for GTT
When a GTT is created or removed, record the session-level oldest frozenxid and put it into MyProc.

7.2 vacuum
When vacuum advances the database's frozenxid (vac_update_datfrozenxid), it needs to take GTTs into account. It must compute the transactions still required by GTTs (by searching all PGPROCs) to avoid truncating CLOG that a GTT still needs.
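A sketch of that clamp (gtt_session_frozenxid is an assumed accessor for the per-PGPROC value described in 7.1):

#include "postgres.h"
#include "access/transam.h"
#include "storage/proc.h"

extern TransactionId gtt_session_frozenxid(PGPROC *proc);  /* assumed */

/* Clamp vacuum's proposed datfrozenxid so no session's GTT CLOG is lost. */
static TransactionId
clamp_frozenxid_for_gtt(TransactionId newFrozenXid)
{
    int i;

    for (i = 0; i < ProcGlobal->allProcCount; i++)
    {
        TransactionId xid = gtt_session_frozenxid(&ProcGlobal->allProcs[i]);

        if (TransactionIdIsNormal(xid) &&
            TransactionIdPrecedes(xid, newFrozenXid))
            newFrozenXid = xid;
    }
    return newFrozenXid;
}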

8. Parallel query
The planner does not produce parallel query plans for SQL involving global temp tables.

9. Operability
Provide pg_gtt_attached_pid to list all the pids that are using a given GTT. Provide pg_list_gtt_relfrozenxids to list the session-level oldest frozenxids of sessions using GTTs.
These functions are implemented in a plug-in, without adding system views or functions.
A DBA can use the above functions together with pg_terminate_backend to force the cleanup of "too old" GTT data and sessions.
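Hypothetical usage (the exact signatures of the plug-in functions are assumptions; pg_terminate_backend is the existing core function):

-- Which sessions hold data in this GTT, and how old are their xids?
SELECT * FROM pg_gtt_attached_pid('my_gtt'::regclass);
SELECT * FROM pg_list_gtt_relfrozenxids();

-- Force cleanup by terminating a session that pins too-old CLOG:
SELECT pg_terminate_backend(1234);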

10. Limitations and TODO list
10.1. ALTER GTT
10.2. pg_statistic_ext
10.3. Remove the limitation that a GTT's relfilenode cannot change:
support CLUSTER/VACUUM FULL, and optimize TRUNCATE on GTT.
10.4. SERIAL column type
Currently, GTTs in different sessions share one sequence (SERIAL type).
Each session needs to use the sequence independently.
10.5. Locking optimization for GTT.
10.6. Materialized views are not supported on GTT.


What do you think about this proposal?
Looking forward to your feedback.

Thanks!


regards

--
Zeng Wenjing
Alibaba Group-Database Products Business Unit


Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 11.10.2019 15:15, 曾文旌(义从) wrote:
Dear Hackers,

This proposes a way to implement global temporary tables in PostgreSQL.

I noticed that there is an item "Allow temporary tables to exist as empty by default in all sessions" on the PostgreSQL TODO list.

In recent years, the PG community has had many discussions about global temp table (GTT) support. Previous discussions covered the following topics:
(1) The main benefit or function: GTT offers “persistent schema, ephemeral data”, which avoids catalog bloat and reduces catalog vacuum.
(2) Whether it should follow the ANSI concept of temporary tables.
(3) How to deal with statistics, a single copy of the schema definition, and the relcache.
(4) A recent implementation and design from Konstantin Knizhnik covered many GTT functions: https://www.postgresql.org/message-id/attachment/103265/global_private_temp-1.patch

However, as pointed out by Konstantin himself, the implementation still needs work on functions related to CLOG, vacuum, and MVCC visibility.


Just to clarify.
I have now proposed several different solutions for GTT:

Shared vs. private buffers for GTT:
1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.

Access to GTT at replica:
1. Access is prohibited (as for original temp tables). No changes at all.
2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).

So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
If all of the following conditions are true:

1) GTT are used in joins
2) There are indexes defined for GTT
3) Size and histogram of GTT in different backends can significantly vary.
4) ANALYZE was explicitly called for GTT

then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
I also do not consider this problem a "show stopper" for adding GTT to Postgres.

I still do not understand the community's opinion on which GTT functionality is considered most important.
But the patch with local buffers and no replica support is small enough to be a good starting point.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Fri, Oct 11, 2019 at 15:50, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 11.10.2019 15:15, 曾文旌(义从) wrote:
Dear Hackers,

This proposes a way to implement global temporary tables in PostgreSQL.

I noticed that there is an item "Allow temporary tables to exist as empty by default in all sessions" on the PostgreSQL TODO list.

In recent years, the PG community has had many discussions about global temp table (GTT) support. Previous discussions covered the following topics:
(1) The main benefit or function: GTT offers “persistent schema, ephemeral data”, which avoids catalog bloat and reduces catalog vacuum.
(2) Whether it should follow the ANSI concept of temporary tables.
(3) How to deal with statistics, a single copy of the schema definition, and the relcache.
(4) A recent implementation and design from Konstantin Knizhnik covered many GTT functions: https://www.postgresql.org/message-id/attachment/103265/global_private_temp-1.patch

However, as pointed out by Konstantin himself, the implementation still needs work on functions related to CLOG, vacuum, and MVCC visibility.


Just to clarify.
I have now proposed several different solutions for GTT:

Shared vs. private buffers for GTT:
1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.

This is an important argument for using shared buffers. Maybe the best is a mix of both: store files in a temporary tablespace, but use shared buffers. Moreover, it could then be accessible to autovacuum.

Access to GTT at replica:
1. Access is prohibited (as for original temp tables). No changes at all.
2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).

So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
If all of the following conditions are true:

1) GTT are used in joins
2) There are indexes defined for GTT
3) Size and histogram of GTT in different backends can significantly vary.
4) ANALYZE was explicitly called for GTT

then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
I also do not consider this problem a "show stopper" for adding GTT to Postgres.

The last issue is a show stopper in my mind. It really depends on usage. There are situations where shared statistics are OK (and maybe a good solution), and other situations where shared statistics are just unusable.

Regards

Pavel



I still do not understand the community's opinion on which GTT functionality is considered most important.
But the patch with local buffers and no replica support is small enough to be a good starting point.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Oct 12, 2019, at 1:16 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Fri, Oct 11, 2019 at 15:50, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 11.10.2019 15:15, 曾文旌(义从) wrote:
Dear Hackers,

This proposes a way to implement global temporary tables in PostgreSQL.

I noticed that there is an item "Allow temporary tables to exist as empty by default in all sessions" on the PostgreSQL TODO list.

In recent years, the PG community has had many discussions about global temp table (GTT) support. Previous discussions covered the following topics:
(1) The main benefit or function: GTT offers “persistent schema, ephemeral data”, which avoids catalog bloat and reduces catalog vacuum.
(2) Whether it should follow the ANSI concept of temporary tables.
(3) How to deal with statistics, a single copy of the schema definition, and the relcache.
(4) A recent implementation and design from Konstantin Knizhnik covered many GTT functions: https://www.postgresql.org/message-id/attachment/103265/global_private_temp-1.patch

However, as pointed out by Konstantin himself, the implementation still needs work on functions related to CLOG, vacuum, and MVCC visibility.


Just to clarify.
I have now proposed several different solutions for GTT:

Shared vs. private buffers for GTT:
1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.

This is an important argument for using shared buffers. Maybe the best is a mix of both: store files in a temporary tablespace, but use shared buffers. Moreover, it could then be accessible to autovacuum.

Access to GTT at replica:
1. Access is prohibited (as for original temp tables). No changes at all.
2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).

So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
If all of the following conditions are true:

1) GTT are used in joins
2) There are indexes defined for GTT
3) Size and histogram of GTT in different backends can significantly vary. 
4) ANALYZE was explicitly called for GTT

then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
I also do not consider this problem a "show stopper" for adding GTT to Postgres.

The last issue is a show stopper in my mind. It really depends on usage. There are situations where shared statistics are OK (and maybe a good solution), and other situations where shared statistics are just unusable.
This proposal calculates and stores independent statistics (relpages, reltuples, and histograms) for the GTT data within each session, ensuring the optimizer gets accurate statistics.


Regards

Pavel



I still do not understand the community's opinion on which GTT functionality is considered most important.
But the patch with local buffers and no replica support is small enough to be a good starting point.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Oct 11, 2019, at 9:50 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 11.10.2019 15:15, 曾文旌(义从) wrote:
Dear Hackers,

This proposes a way to implement global temporary tables in PostgreSQL.

I noticed that there is an item "Allow temporary tables to exist as empty by default in all sessions" on the PostgreSQL TODO list.

In recent years, the PG community has had many discussions about global temp table (GTT) support. Previous discussions covered the following topics:
(1) The main benefit or function: GTT offers “persistent schema, ephemeral data”, which avoids catalog bloat and reduces catalog vacuum.
(2) Whether it should follow the ANSI concept of temporary tables.
(3) How to deal with statistics, a single copy of the schema definition, and the relcache.
(4) A recent implementation and design from Konstantin Knizhnik covered many GTT functions: https://www.postgresql.org/message-id/attachment/103265/global_private_temp-1.patch

However, as pointed out by Konstantin himself, the implementation still needs work on functions related to CLOG, vacuum, and MVCC visibility.


Just to clarify.
I have now proposed several different solutions for GTT:

Shared vs. private buffers for GTT:
1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.

Access to GTT at replica:
1. Access is prohibited (as for original temp tables). No changes at all.
2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).

So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
If all of the following conditions are true:

1) GTT are used in joins
2) There are indexes defined for GTT
3) Size and histogram of GTT in different backends can significantly vary.
4) ANALYZE was explicitly called for GTT

then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
I also do not consider this problem a "show stopper" for adding GTT to Postgres.
Suppose session A writes 10,000,000 rows to GTT X while session B also uses X, holding 100 rows of its own, different data. If B runs ANALYZE and updates the catalog with statistics for its 100 rows,
then session A will obviously get an inaccurate query plan for X based on the misaligned statistics. Session A may think that table X is too small to be worth an index scan, but it is not. Each session needs statistics on its own data to build its query plans.


I still do not understand the community's opinion on which GTT functionality is considered most important.
But the patch with local buffers and no replica support is small enough to be a good starting point.
Yes. As the first step, we will focus on completing the basic functions of GTT (DML, DDL, indexes on GTT (MVCC visibility rules), storage).
Abnormal statistics can cause problems with index selection on GTT, so indexes on GTT and accurate statistics are necessary.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Fri, Oct 11, 2019 at 9:50 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> Just to clarify.
> I have now proposed several different solutions for GTT:
>
> Shared vs. private buffers for GTT:
> 1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
> 2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.

I vote for #1. I think parallel query for temp objects may be a
desirable feature, but I don't think it should be the job of a patch
implementing GTTs to make it happen. In fact, I think it would be an
actively bad idea, because I suspect that if we do eventually support
temp relations for parallel query, we're going to want a solution that
is shared between regular temp tables and global temp tables, not
separate solutions for each.

> Access to GTT at replica:
> 1. Access is prohibited (as for original temp tables). No changes at all.
> 2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
> 3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
> and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).

I again vote for #1. A GTT is defined to allow data to be visible only
within one session -- so what does it even mean for the data to be
accessible on a replica?

> So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
> If all of the following conditions are true:
>
> 1) GTT are used in joins
> 2) There are indexes defined for GTT
> 3) Size and histogram of GTT in different backends can significantly vary.
> 4) ANALYZE was explicitly called for GTT
>
> then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
> I also do not consider this problem a "show stopper" for adding GTT to Postgres.

I think that's *definitely* a show stopper.

> I still do not understand the community's opinion on which GTT functionality is considered most important.
> But the patch with local buffers and no replica support is small enough to be a good starting point.

Well, it seems we now have two patches for this feature. I guess we
need to figure out which one is better, and whether it's possible for
the two efforts to be merged, rather than having two different teams
hacking on separate code bases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Fri, Oct 25, 2019 at 17:01, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Oct 11, 2019 at 9:50 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> Just to clarify.
> I have now proposed several different solutions for GTT:
>
> Shared vs. private buffers for GTT:
> 1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
> 2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.

I vote for #1. I think parallel query for temp objects may be a
desirable feature, but I don't think it should be the job of a patch
implementing GTTs to make it happen. In fact, I think it would be an
actively bad idea, because I suspect that if we do eventually support
temp relations for parallel query, we're going to want a solution that
is shared between regular temp tables and global temp tables, not
separate solutions for each.

> Access to GTT at replica:
> 1. Access is prohibited (as for original temp tables). No changes at all.
> 2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
> 3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
> and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).

I again vote for #1. A GTT is defined to allow data to be visible only
within one session -- so what does it even mean for the data to be
accessible on a replica?

Why not? There are lots of sessions on replica servers. One usage of temp tables is fixing estimation errors: you can create a temp table with a partial query result, run ANALYZE, and evaluate the remaining steps. Currently this is not possible on replica servers.

One motivation for GTT is decreasing porting costs from Oracle. But other motivations, like doing more complex calculations on a replica, are valid and valuable.



> So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
> If all of the following conditions are true:
>
> 1) GTT are used in joins
> 2) There are indexes defined for GTT
> 3) Size and histogram of GTT in different backends can significantly vary.
> 4) ANALYZE was explicitly called for GTT
>
> then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
> I also do not consider this problem a "show stopper" for adding GTT to Postgres.

I think that's *definitely* a show stopper.

> I still do not understand the community's opinion on which GTT functionality is considered most important.
> But the patch with local buffers and no replica support is small enough to be a good starting point.

Well, it seems we now have two patches for this feature. I guess we
need to figure out which one is better, and whether it's possible for
the two efforts to be merged, rather than having two different teams
hacking on separate code bases.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 25.10.2019 18:01, Robert Haas wrote:
> On Fri, Oct 11, 2019 at 9:50 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> Just to clarify.
>> I have now proposed several different solutions for GTT:
>>
>> Shared vs. private buffers for GTT:
>> 1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
>> 2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.
> I vote for #1. I think parallel query for temp objects may be a
> desirable feature, but I don't think it should be the job of a patch
> implementing GTTs to make it happen. In fact, I think it would be an
> actively bad idea, because I suspect that if we do eventually support
> temp relations for parallel query, we're going to want a solution that
> is shared between regular temp tables and global temp tables, not
> separate solutions for each.

Sorry, maybe I do not understand you.
It seems to me that there is only one thing preventing usage of
temporary tables in parallel plans: private buffers.
If global temporary tables are accessed like normal tables, through shared
buffers, then they can be used in parallel queries
and no extra support is required for it.
At least I have checked that parallel queries work correctly with
my implementation of GTT with shared buffers.
So I do not understand which "separate solutions" you are talking
about.

I can agree that private buffers may be a good starting point for a GTT
implementation, because it is less invasive and GTT access speed is
exactly the same as for normal temp tables.
But I do not understand your argument why it is an "actively bad idea".

>> Access to GTT at replica:
>> 1. Access is prohibited (as for original temp tables). No changes at all.
>> 2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
>> 3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
>> and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).
> I again vote for #1. A GTT is defined to allow data to be visible only
> within one session -- so what does it even mean for the data to be
> accessible on a replica?

There are sessions at the replica (in the case of hot standby), aren't there?

>
>> So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
>> If all of the following conditions are true:
>>
>> 1) GTT are used in joins
>> 2) There are indexes defined for GTT
>> 3) Size and histogram of GTT in different backends can significantly vary.
>> 4) ANALYZE was explicitly called for GTT
>>
>> then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
>> I also do not consider this problem a "show stopper" for adding GTT to Postgres.
> I think that's *definitely* a show stopper.
Well, if both you and Pavel think that it is really a "show stopper", then
this problem really has to be addressed.
I am slightly confused by this opinion, because Pavel has told me
himself that 99% of users never create indexes on temp tables
or run "analyze" on them. And without those, this problem is not a problem
at all.

>> I still do not understand the community's opinion on which GTT functionality is considered most important.
>> But the patch with local buffers and no replica support is small enough to be a good starting point.
> Well, it seems we now have two patches for this feature. I guess we
> need to figure out which one is better, and whether it's possible for
> the two efforts to be merged, rather than having two different teams
> hacking on separate code bases.

I am open to cooperation.
The source code of all my patches is available.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:

>
>> So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
>> If all of the following conditions are true:
>>
>> 1) GTT are used in joins
>> 2) There are indexes defined for GTT
>> 3) Size and histogram of GTT in different backends can significantly vary.
>> 4) ANALYZE was explicitly called for GTT
>>
>> then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
>> I also do not consider this problem a "show stopper" for adding GTT to Postgres.
> I think that's *definitely* a show stopper.
Well, if both you and Pavel think that it is really a "show stopper", then
this problem really has to be addressed.
I am slightly confused by this opinion, because Pavel has told me
himself that 99% of users never create indexes on temp tables
or run "analyze" on them. And without those, this problem is not a problem
at all.


Users don't run ANALYZE on temp tables in 99% of cases. It's true. But the second fact is that such users have lots of problems. It's very similar to wrong statistics on persistent tables. When the data is small, it is not a problem for users, although from my perspective it's not optimal. When the data is not small, the problem can be brutal. Temporary tables are not an exception. And users and developers are people: we only hear about fatal problems. There are lots of unoptimized queries, but because the problem is not fatal, it is not reported. And lots of people have no idea how fast databases can be. The knowledge of users and app developers is a sad book.

Pavel

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Oct 26, 2019, at 12:22 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>
>
> On 25.10.2019 18:01, Robert Haas wrote:
>> On Fri, Oct 11, 2019 at 9:50 AM Konstantin Knizhnik
>> <k.knizhnik@postgrespro.ru> wrote:
>>> Just to clarify.
>>> I have now proposed several different solutions for GTT:
>>>
>>> Shared vs. private buffers for GTT:
>>> 1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
>>> 2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.
>> I vote for #1. I think parallel query for temp objects may be a
>> desirable feature, but I don't think it should be the job of a patch
>> implementing GTTs to make it happen. In fact, I think it would be an
>> actively bad idea, because I suspect that if we do eventually support
>> temp relations for parallel query, we're going to want a solution that
>> is shared between regular temp tables and global temp tables, not
>> separate solutions for each.
>
> Sorry, maybe I do not understand you.
> It seems to me that there is only one thing preventing usage of temporary tables in parallel plans: private buffers.
> If global temporary tables are accessed like normal tables, through shared buffers, then they can be used in parallel queries
> and no extra support is required for it.
> At least I have checked that parallel queries work correctly with my implementation of GTT with shared buffers.
> So I do not understand which "separate solutions" you are talking about.
>
> I can agree that private buffers may be a good starting point for a GTT implementation, because it is less invasive and GTT access speed is exactly the same as for normal temp tables.
> But I do not understand your argument why it is an "actively bad idea".
>
>>> Access to GTT at replica:
>>> 1. Access is prohibited (as for original temp tables). No changes at all.
>>> 2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
>>> 3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
>>> and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).
>> I again vote for #1. A GTT is defined to allow data to be visible only
>> within one session -- so what does it even mean for the data to be
>> accessible on a replica?
>
> There are sessions at the replica (in the case of hot standby), aren't there?
>
>>
>>> So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
>>> If all of the following conditions are true:
>>>
>>> 1) GTT are used in joins
>>> 2) There are indexes defined for GTT
>>> 3) Size and histogram of GTT in different backends can significantly vary.
>>> 4) ANALYZE was explicitly called for GTT
>>>
>>> then a query execution plan built in one backend will also be used by other backends, where it can be inefficient.
>>> I also do not consider this problem a "show stopper" for adding GTT to Postgres.
>> I think that's *definitely* a show stopper.
> Well, if both you and Pavel think that it is really a "show stopper", then this problem really has to be addressed.
> I am slightly confused by this opinion, because Pavel has told me himself that 99% of users never create indexes on temp tables
> or run "analyze" on them. And without those, this problem is not a problem at all.
>
>>> I still do not understand the community's opinion on which GTT functionality is considered most important.
>>> But the patch with local buffers and no replica support is small enough to be a good starting point.
>> Well, it seems we now have two patches for this feature. I guess we
>> need to figure out which one is better, and whether it's possible for
>> the two efforts to be merged, rather than having two different teams
>> hacking on separate code bases.
>
> I am open to cooperation.
> The source code of all my patches is available.
We are also willing to cooperate to complete this feature.
Let me prepare the code (merge it onto PG 12) and post it to the community; then we can see how to work together.

> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>




Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Fri, Oct 25, 2019 at 11:14 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> > Access to GTT at replica:
>> > 1. Access is prohibited (as for original temp tables). No changes at all.
>> > 2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
>> > 3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
>> > and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).
>>
>> I again vote for #1. A GTT is defined to allow data to be visible only
>> within one session -- so what does it even mean for the data to be
>> accessible on a replica?
>
> Why not? There are lots of sessions on replica servers. One usage of temp tables is fixing estimation errors: you can create a temp table with a partial query result, run ANALYZE, and evaluate the remaining steps. Currently this is not possible on replica servers.
>
> One motivation for GTT is decreasing porting costs from Oracle. But other motivations, like doing more complex calculations on a replica, are valid and valuable.

Hmm, I think I was slightly confused when I wrote my previous
response. I now see that what was under discussion was not making data
from the master visible on the standbys, which really wouldn't make
any sense, but rather allowing standby sessions to also use the GTT,
each with its own local copy of the data. I don't think that's a bad
feature, but look how invasive the required changes are. Not allowing
rollbacks seems dead on arrival; an abort would be able to leave the
table and index mutually inconsistent.  A separate XID space would be
a real solution, perhaps, but it would be *extremely* complicated and
invasive to implement.

One thing that I've learned over and over again as a developer is that
you get a lot more done if you tackle one problem at a time. GTTs are
a sufficiently-large problem all by themselves; a major reworking of
the way XIDs work might be a good project to undertake at some point,
but it doesn't make any sense to incorporate that into the GTT
project, which is otherwise about a mostly-separate set of issues.
Let's not try to solve more problems at once than strictly necessary.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Fri, Oct 25, 2019 at 12:22 PM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> On 25.10.2019 18:01, Robert Haas wrote:
> > On Fri, Oct 11, 2019 at 9:50 AM Konstantin Knizhnik
> > <k.knizhnik@postgrespro.ru> wrote:
> >> Just to clarify.
> >> I have now proposed several different solutions for GTT:
> >>
> >> Shared vs. private buffers for GTT:
> >> 1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
> >> 2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.
> > I vote for #1. I think parallel query for temp objects may be a
> > desirable feature, but I don't think it should be the job of a patch
> > implementing GTTs to make it happen. In fact, I think it would be an
> > actively bad idea, because I suspect that if we do eventually support
> > temp relations for parallel query, we're going to want a solution that
> > is shared between regular temp tables and global temp tables, not
> > separate solutions for each.
>
> Sorry, maybe I do not understand you.
> It seems to me that there is only one thing preventing usage of
> temporary tables in parallel plans: private buffers.
> If global temporary tables are accessed like normal tables, through shared
> buffers, then they can be used in parallel queries
> and no extra support is required for it.
> At least I have checked that parallel queries work correctly with
> my implementation of GTT with shared buffers.
> So I do not understand which "separate solutions" you are talking
> about.
>
> I can agree that private buffers may be a good starting point for a GTT
> implementation, because it is less invasive and GTT access speed is
> exactly the same as for normal temp tables.
> But I do not understand your argument why it is an "actively bad idea".

Well, it sounds like you're talking about ending up in a situation
where local temporary tables are still in private buffers, but global
temporary table data is in shared buffers. I think that would be
inconsistent. And it would mean that when somebody wanted to make
local temporary tables accessible in parallel query, they'd have to
write a patch for that.  In other words, I don't support dividing the
patches like this:

Patch #1: Support global temporary tables + allow global temporary
tables to used by parallel query
Patch #2: Allow local temporary tables to be used by parallel query

I support dividing them like this:

Patch #1: Support global temporary tables
Patch #2: Allow (all kinds of) temporary tables to be used by parallel query

The second division looks a lot cleaner to me, although as always I
might be missing something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 28.10.2019 15:07, Robert Haas wrote:
> On Fri, Oct 25, 2019 at 11:14 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> Access to GTT at replica:
>>>> 1. Access is prohibited (as for original temp tables). No changes at all.
>>>> 2. Tuples of temp tables are marked with a frozen XID. Minimal changes; rollbacks are not possible.
>>>> 3. Providing special XIDs for GTT at the replica. No changes to the CLOG are required, but special MVCC visibility rules are used for GTT. Current limitation: the number of transactions accessing GTTs at the replica is limited to 2^32,
>>>> and a bitmap of the corresponding size has to be maintained (GTT tuples are not processed by vacuum and not frozen, so the XID horizon never moves).
>>> I again vote for #1. A GTT is defined to allow data to be visible only
>>> within one session -- so what does it even mean for the data to be
>>> accessible on a replica?
>> Why not? There are lots of sessions on replica servers. One usage of temp tables is fixing estimation errors: you can create a temp table with a partial query result, run ANALYZE, and evaluate the remaining steps. Currently this is not possible on replica servers.
>>
>> One motivation for GTT is decreasing porting costs from Oracle. But other motivations, like doing more complex calculations on a replica, are valid and valuable.
> Hmm, I think I was slightly confused when I wrote my previous
> response. I now see that what was under discussion was not making data
> from the master visible on the standbys, which really wouldn't make
> any sense, but rather allowing standby sessions to also use the GTT,
> each with its own local copy of the data. I don't think that's a bad
> feature, but look how invasive the required changes are. Not allowing
> rollbacks seems dead on arrival; an abort would be able to leave the
> table and index mutually inconsistent.  A separate XID space would be
> a real solution, perhaps, but it would be *extremely* complicated and
> invasive to implement.

Sorry, but both statements are not true.
As I mentioned before, I have implemented both solutions.

I am not sure how vital the lack of aborts is for transactions working with
GTT at the replica.
Some people said that there is no sense in aborts of read-only
transactions at the replica (despite the fact that they are saving
intermediate results in GTT).
Some people said something similar to your "dead on arrival".
But inconsistency is not possible: if such a transaction is really
aborted, then the backend is terminated and nobody can see the inconsistency.

Concerning the second alternative: you can check yourself that it is not
*extremely* complicated and invasive.
I extracted the changes related to handling transactions at the
replica and attached them to this mail.
It is just 500 lines (including diff context). Certainly there are some
limitations to this implementation: the number of transactions working with
GTT at the replica is limited to 2^32,
and since GTT tuples are not frozen, the analog of the GTT CLOG kept in memory
is never truncated.

>
> One thing that I've learned over and over again as a developer is that
> you get a lot more done if you tackle one problem at a time. GTTs are
> a sufficiently-large problem all by themselves; a major reworking of
> the way XIDs work might be a good project to undertake at some point,
> but it doesn't make any sense to incorporate that into the GTT
> project, which is otherwise about a mostly-separate set of issues.
> Let's not try to solve more problems at once than strictly necessary.
>
I agree with this and think that an implementation of GTT with private
buffers and no replica access is a good starting point.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 28.10.2019 15:13, Robert Haas wrote:
> On Fri, Oct 25, 2019 at 12:22 PM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> On 25.10.2019 18:01, Robert Haas wrote:
>>> On Fri, Oct 11, 2019 at 9:50 AM Konstantin Knizhnik
>>> <k.knizhnik@postgrespro.ru> wrote:
>>>> Just to clarify.
>>>> I have now proposed several different solutions for GTT:
>>>>
>>>> Shared vs. private buffers for GTT:
>>>> 1. Private buffers. This is the least invasive patch, requiring no changes to relfilenodes.
>>>> 2. Shared buffers. Requires changing relfilenodes, but supports parallel query execution for GTT.
>>> I vote for #1. I think parallel query for temp objects may be a
>>> desirable feature, but I don't think it should be the job of a patch
>>> implementing GTTs to make it happen. In fact, I think it would be an
>>> actively bad idea, because I suspect that if we do eventually support
>>> temp relations for parallel query, we're going to want a solution that
>>> is shared between regular temp tables and global temp tables, not
>>> separate solutions for each.
>> Sorry, maybe I do not understand you.
>> It seems to me that there is only one thing preventing usage of
>> temporary tables in parallel plans: private buffers.
>> If global temporary tables are accessed like normal tables, through shared
>> buffers, then they can be used in parallel queries
>> and no extra support is required for it.
>> At least I have checked that parallel queries work correctly with
>> my implementation of GTT with shared buffers.
>> So I do not understand which "separate solutions" you are talking
>> about.
>>
>> I can agree that private buffers may be a good starting point for a GTT
>> implementation, because it is less invasive and GTT access speed is
>> exactly the same as for normal temp tables.
>> But I do not understand your argument why it is an "actively bad idea".
> Well, it sounds like you're talking about ending up in a situation
> where local temporary tables are still in private buffers, but global
> temporary table data is in shared buffers. I think that would be
> inconsistent. And it would mean that when somebody wanted to make
> local temporary tables accessible in parallel query, they'd have to
> write a patch for that.  In other words, I don't support dividing the
> patches like this:
>
> Patch #1: Support global temporary tables + allow global temporary
> tables to used by parallel query
> Patch #2: Allow local temporary tables to be used by parallel query
>
> I support dividing them like this:
>
> Patch #1: Support global temporary tables
> Patch #2: Allow (all kinds of) temporary tables to be used by parallel query
>
> The second division looks a lot cleaner to me, although as always I
> might be missing something.
>
Logically it may be a good decision. But practically, support for parallel
access to GTT requires just accessing its data through shared buffers.
But in the case of local temp tables, we would also need to somehow share
the table's metadata between parallel workers. That seems much more
complicated, if possible at all.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Mon, Oct 28, 2019 at 9:48 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> Logically it may be a good decision. But practically, support for parallel
> access to GTT requires just accessing its data through shared buffers.
> But in the case of local temp tables, we would also need to somehow share
> the table's metadata between parallel workers. That seems much more
> complicated, if possible at all.

Why? The backends all share a snapshot, and can load whatever they
need from the system catalogs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Mon, Oct 28, 2019 at 9:37 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> Sorry, but both statements are not true.

Well, I think they are true.

> I am not sure how vital the lack of aborts is for transactions working with
> GTT at the replica.
> Some people said that there is no sense in aborts of read-only
> transactions at the replica (despite the fact that they are saving
> intermediate results in GTT).
> Some people said something similar to your "dead on arrival".
> But inconsistency is not possible: if such a transaction is really
> aborted, then the backend is terminated and nobody can see the inconsistency.

Aborting the current transaction is a very different thing from
terminating the backend.

Also, the idea that there is no sense in aborts of read-only
transactions on a replica seems totally wrong. Suppose that you insert
a row into the table and then you go to insert a row in each index,
but one of the index inserts fails - duplicate key, out of memory
error, I/O error, whatever. Now the table and the index are
inconsistent. Normally, we're protected against this by MVCC, but if
you use a solution that breaks MVCC by using the same XID for all
transactions, then it can happen.

> Concerning the second alternative: you can check yourself that it is not
> *extremely* complicated and invasive.
> I extracted the changes related to handling transactions at the
> replica and attached them to this mail.
> It is just 500 lines (including diff context). Certainly there are some
> limitations to this implementation: the number of transactions working with
> GTT at the replica is limited to 2^32,
> and since GTT tuples are not frozen, the analog of the GTT CLOG kept in memory
> is never truncated.

I admit that this patch is not lengthy, but there remains the question
of whether it is correct. It's possible that the problem isn't as
complicated as I think it is, but I do think there are quite a number
of reasons why this patch wouldn't be considered acceptable...

> I agree with this and think that an implementation of GTT with private
> buffers and no replica access is a good starting point.

...but given that we seem to agree on this point, perhaps it isn't
necessary to argue about those things right now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 28.10.2019 19:40, Robert Haas wrote:
> Aborting the current transaction is a very different thing from
> terminating the backend.
>
> Also, the idea that there is no sense in aborts of read-only
> transactions on a replica seems totally wrong. Suppose that you insert
> a row into the table and then you go to insert a row in each index,
> but one of the index inserts fails - duplicate key, out of memory
> error, I/O error, whatever. Now the table and the index are
> inconsistent. Normally, we're protected against this by MVCC, but if
> you use a solution that breaks MVCC by using the same XID for all
> transactions, then it can happen.


Certainly I understand the difference between aborting a transaction and
terminating a backend.
I do not say that it is a good solution. And definitely aborts can happen
for read-only transactions.
I just wanted to make one point: transaction aborts are caused by
two kinds of reasons:
- expected programming errors: deadlocks, conversion errors, unique
constraint violations, ...
- unexpected system errors: disk space exhaustion, out of memory, I/O
errors, ...

Usually at a replica with read-only transactions we do not have to deal
with errors of the first kind.
So a transaction may be aborted, but such an abort most likely means that
something is wrong with the system,
and a restart of the backend is not such a bad solution in this situation.

In any case, I do not insist on this "frozen XID" approach.
The only advantage of this approach is that it is very simple to
implement: the corresponding patch contains just 80 lines of code,
and actually it requires just 5 (five) one-line changes.
I didn't agree with your statement just because a restart of the backend makes
it impossible to observe any inconsistencies in the database.

> ...but given that we seem to agree on this point, perhaps it isn't
> necessary to argue about those things right now.
>
Ok.
I attached a new patch for GTT with local (private) buffers and no 
replica access.
It provides GTT support for all built-in indexes.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 25.10.2019 20:00, Pavel Stehule wrote:

>
>> So except the limitation mentioned above (which I do not consider as critical) there is only one problem which was not addressed: maintaining statistics for GTT.
>> If all of the following conditions are true:
>>
>> 1) GTT are used in joins
>> 2) There are indexes defined for GTT
>> 3) Size and histogram of GTT in different backends can significantly vary.
>> 4) ANALYZE was explicitly called for GTT
>>
>> then query execution plan built in one backend will be also used for other backends where it can be inefficient.
>> I also do not consider this problem as "show stopper" for adding GTT to Postgres.
> I think that's *definitely* a show stopper.
Well, if both you and Pavel think that it is really a "show stopper", then
this problem really has to be addressed.
I am slightly confused by this opinion, because Pavel himself has told me
that 99% of users never create indexes on temp tables
or run ANALYZE on them. And without those, this problem is not a problem
at all.


It's true that users don't run ANALYZE on temp tables in 99% of cases. But the second fact is that those users then have a lot of problems. It's very similar to wrong statistics on persistent tables: when the data are small, it is not a problem for users, although from my perspective it's still not optimal; when the data are not small, the problem can be brutal. Temporary tables are not an exception. And users and developers are people - we only hear about fatal problems. There are lots of unoptimized queries, but because the problem is not fatal, nobody reports it. And many people have no idea how fast the database could be. The state of knowledge among users and app developers is a sad book.

Pavel

It seems to me that I have found quite an elegant solution for per-backend statistics for GTT: I simply insert them into the backend's catalog cache, but not into the pg_statistic table itself.
To do this I had to add InsertSysCache/InsertCatCache functions, which insert a pinned entry into the corresponding cache.
I wonder if there are some pitfalls to such an approach?

A new patch for GTT is attached.
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Attachment

Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
> I wonder if there are some pitfalls of such approach?

That sounds pretty hackish. You'd have to be very careful, for
example, that if the tables were dropped or re-analyzed, all of the
old entries got removed -- and then it would still fail if any code
tried to access the statistics directly from the table, rather than
via the caches. My assumption is that the statistics ought to be
stored in some backend-private data structure designed for that
purpose, and that the code that needs the data should be taught to
look for it there when the table is a GTT.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 01.11.2019 18:26, Robert Haas wrote:
> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> I wonder if there are some pitfalls of such approach?
> That sounds pretty hackish. You'd have to be very careful, for
> example, that if the tables were dropped or re-analyzed, all of the
> old entries got removed --

I have checked it:
- when a table is reanalyzed, the cache entries are replaced;
- when a table is dropped, the cache entries are removed.

> and then it would still fail if any code
> tried to access the statistics directly from the table, rather than
> via the caches. My assumption is that the statistics ought to be
> stored in some backend-private data structure designed for that
> purpose, and that the code that needs the data should be taught to
> look for it there when the table is a GTT.

Yes, if you do "select * from pg_statistic" then you will not see 
statistics for GTT in this case.
But I do not think that this is so critical: I do not believe that 
anybody tries to manually interpret the values in this table.
And the optimizer retrieves statistics through the syscache mechanism, 
so it is able to build a correct plan in this case.
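
To illustrate the intended behavior, here is a hypothetical psql session 
under this approach (the shown output is my assumption, not taken from 
the patch):

postgres=# create global temp table gtt(x integer, y integer);
CREATE TABLE
postgres=# insert into gtt select i, i from generate_series(1,10000) i;
INSERT 0 10000
postgres=# analyze gtt;
ANALYZE
postgres=# select count(*) from pg_statistic where starelid = 'gtt'::regclass;
 count
-------
     0
(1 row)

The row estimates in EXPLAIN output would nevertheless reflect the 
ANALYZE results, because the planner reads the pinned entries from the 
catalog cache rather than scanning pg_statistic.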

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 01.11.2019 18:26, Robert Haas wrote:
> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> I wonder if there are some pitfalls of such approach?
> That sounds pretty hackish. You'd have to be very careful, for
> example, that if the tables were dropped or re-analyzed, all of the
> old entries got removed --

I have checked it:
- when table is reanalyzed, then cache entries are replaced.
- when table is dropped, then cache entries are removed.

> and then it would still fail if any code
> tried to access the statistics directly from the table, rather than
> via the caches. My assumption is that the statistics ought to be
> stored in some backend-private data structure designed for that
> purpose, and that the code that needs the data should be taught to
> look for it there when the table is a GTT.

Yes, if you do "select * from pg_statistic" then you will not see
statistic for GTT in this case.
But I do not think that it is so critical. I do not believe that anybody
is trying to manually interpret values in this table.
And optimizer is retrieving statistic through sys-cache mechanism and so
is able to build correct plan in this case.

Years ago, when I thought about this, I wrote a patch with a similar design. It works, but it is surely ugly.

I have another idea: can pg_statistics be a view instead of a table?

Something like

SELECT * FROM pg_catalog.pg_statistics_rel
UNION ALL
SELECT * FROM pg_catalog.pg_statistics_gtt();

Internally, when the stats cache is filled, pg_statistics_rel and pg_statistics_gtt() can be used directly. As far as I remember, it was not possible to work with queries there, only with plain relations.

Or a crazy idea: today we can implement our own types of heaps. It should be possible to create an engine whose result is a combination of some shared data and local data, so the union would be implemented at the heap level.
This implementation could be simple, just scanning pages from shared buffers and from local buffers; for these tables we don't need complex metadata. It's a crazy idea, though, and I think the union with a table function would be best. A sketch of the view idea follows.
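
A minimal sketch of the idea (all names here are illustrative, not from 
an actual patch: pg_statistic_rel stands for the renamed physical 
catalog, and pg_statistics_gtt() for a set-returning function exposing 
the backend-local entries):

CREATE FUNCTION pg_catalog.pg_statistics_gtt()
    RETURNS SETOF pg_catalog.pg_statistic_rel
    AS 'pg_statistics_gtt' LANGUAGE internal STABLE;

CREATE VIEW pg_catalog.pg_statistic AS
    SELECT * FROM pg_catalog.pg_statistic_rel
    UNION ALL
    SELECT * FROM pg_catalog.pg_statistics_gtt();

Every SQL-level consumer would then keep seeing a single pg_statistic 
relation, and only the internal lookup paths would need to know about 
the two sources.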

Regards

Pavel





--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [Proposal] Global temporary tables

From
Julien Rouhaud
Date:
On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>>
>> On 01.11.2019 18:26, Robert Haas wrote:
>> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> > <k.knizhnik@postgrespro.ru> wrote:
>> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in
backend'scatalog cache, but not in pg_statistic table itself. 
>> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent
cache.
>> >> I wonder if there are some pitfalls of such approach?
>> > That sounds pretty hackish. You'd have to be very careful, for
>> > example, that if the tables were dropped or re-analyzed, all of the
>> > old entries got removed --
>>
>> I have checked it:
>> - when table is reanalyzed, then cache entries are replaced.
>> - when table is dropped, then cache entries are removed.
>>
>> > and then it would still fail if any code
>> > tried to access the statistics directly from the table, rather than
>> > via the caches. My assumption is that the statistics ought to be
>> > stored in some backend-private data structure designed for that
>> > purpose, and that the code that needs the data should be taught to
>> > look for it there when the table is a GTT.
>>
>> Yes, if you do "select * from pg_statistic" then you will not see
>> statistic for GTT in this case.
>> But I do not think that it is so critical. I do not believe that anybody
>> is trying to manually interpret values in this table.
>> And optimizer is retrieving statistic through sys-cache mechanism and so
>> is able to build correct plan in this case.
>
>
> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>
> I have another idea. Can be pg_statistics view instead a table?
>
> Some like
>
> SELECT * FROM pg_catalog.pg_statistics_rel
> UNION ALL
> SELECT * FROM pg_catalog.pg_statistics_gtt();
>
> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

It'd be a loss if you lost the ability to see the statistics, as there
are valid use cases where you need to see the stats, e.g. understanding
why you don't get the plan you wanted.  There's also at least one
extension [1] that allows you to back up and use restored statistics,
so there are definitely people interested in it.

[1]: https://github.com/ossc-db/pg_dbms_stats



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Sat, Nov 2, 2019 at 8:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>>
>> On 01.11.2019 18:26, Robert Haas wrote:
>> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> > <k.knizhnik@postgrespro.ru> wrote:
>> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> I wonder if there are some pitfalls of such approach?
>> > That sounds pretty hackish. You'd have to be very careful, for
>> > example, that if the tables were dropped or re-analyzed, all of the
>> > old entries got removed --
>>
>> I have checked it:
>> - when table is reanalyzed, then cache entries are replaced.
>> - when table is dropped, then cache entries are removed.
>>
>> > and then it would still fail if any code
>> > tried to access the statistics directly from the table, rather than
>> > via the caches. My assumption is that the statistics ought to be
>> > stored in some backend-private data structure designed for that
>> > purpose, and that the code that needs the data should be taught to
>> > look for it there when the table is a GTT.
>>
>> Yes, if you do "select * from pg_statistic" then you will not see
>> statistic for GTT in this case.
>> But I do not think that it is so critical. I do not believe that anybody
>> is trying to manually interpret values in this table.
>> And optimizer is retrieving statistic through sys-cache mechanism and so
>> is able to build correct plan in this case.
>
>
> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>
> I have another idea. Can be pg_statistics view instead a table?
>
> Some like
>
> SELECT * FROM pg_catalog.pg_statistics_rel
> UNION ALL
> SELECT * FROM pg_catalog.pg_statistics_gtt();
>
> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

It'd be a loss if you lose the ability to see the statistics, as there
are valid use cases where you need to see the stats, eg. understanding
why you don't get the plan you wanted.  There's also at least one
extension [1] that allows you to backup and use restored statistics,
so there are definitely people interested in it.

[1]: https://github.com/ossc-db/pg_dbms_stats

I don't think so - the extension can use the UNION view, and its content will be the same as the caches used by the planner.


Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Sat, Nov 2, 2019 at 8:23 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:


On Sat, Nov 2, 2019 at 8:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>>
>> On 01.11.2019 18:26, Robert Haas wrote:
>> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> > <k.knizhnik@postgrespro.ru> wrote:
>> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> I wonder if there are some pitfalls of such approach?
>> > That sounds pretty hackish. You'd have to be very careful, for
>> > example, that if the tables were dropped or re-analyzed, all of the
>> > old entries got removed --
>>
>> I have checked it:
>> - when table is reanalyzed, then cache entries are replaced.
>> - when table is dropped, then cache entries are removed.
>>
>> > and then it would still fail if any code
>> > tried to access the statistics directly from the table, rather than
>> > via the caches. My assumption is that the statistics ought to be
>> > stored in some backend-private data structure designed for that
>> > purpose, and that the code that needs the data should be taught to
>> > look for it there when the table is a GTT.
>>
>> Yes, if you do "select * from pg_statistic" then you will not see
>> statistic for GTT in this case.
>> But I do not think that it is so critical. I do not believe that anybody
>> is trying to manually interpret values in this table.
>> And optimizer is retrieving statistic through sys-cache mechanism and so
>> is able to build correct plan in this case.
>
>
> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>
> I have another idea. Can be pg_statistics view instead a table?
>
> Some like
>
> SELECT * FROM pg_catalog.pg_statistics_rel
> UNION ALL
> SELECT * FROM pg_catalog.pg_statistics_gtt();
>
> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

It'd be a loss if you lose the ability to see the statistics, as there
are valid use cases where you need to see the stats, eg. understanding
why you don't get the plan you wanted.  There's also at least one
extension [1] that allows you to backup and use restored statistics,
so there are definitely people interested in it.

[1]: https://github.com/ossc-db/pg_dbms_stats

I don't think - the extensions can use UNION and the content will be same as caches used by planner.

Sure - if someone tries to modify the system tables directly, then that case would have to be fixed. 

Re: [Proposal] Global temporary tables

From
Julien Rouhaud
Date:
On Sat, Nov 2, 2019 at 8:23 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> On Sat, Nov 2, 2019 at 8:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>>
>> On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> >
>> > On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>> >>
>> >> On 01.11.2019 18:26, Robert Haas wrote:
>> >> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> >> > <k.knizhnik@postgrespro.ru> wrote:
>> >> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> >> I wonder if there are some pitfalls of such approach?
>> >> > That sounds pretty hackish. You'd have to be very careful, for
>> >> > example, that if the tables were dropped or re-analyzed, all of the
>> >> > old entries got removed --
>> >>
>> >> I have checked it:
>> >> - when table is reanalyzed, then cache entries are replaced.
>> >> - when table is dropped, then cache entries are removed.
>> >>
>> >> > and then it would still fail if any code
>> >> > tried to access the statistics directly from the table, rather than
>> >> > via the caches. My assumption is that the statistics ought to be
>> >> > stored in some backend-private data structure designed for that
>> >> > purpose, and that the code that needs the data should be taught to
>> >> > look for it there when the table is a GTT.
>> >>
>> >> Yes, if you do "select * from pg_statistic" then you will not see
>> >> statistic for GTT in this case.
>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.
>> >
>> >
>> > Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>> >
>> > I have another idea. Can be pg_statistics view instead a table?
>> >
>> > Some like
>> >
>> > SELECT * FROM pg_catalog.pg_statistics_rel
>> > UNION ALL
>> > SELECT * FROM pg_catalog.pg_statistics_gtt();
>> >
>> > Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.
>>
>> It'd be a loss if you lose the ability to see the statistics, as there
>> are valid use cases where you need to see the stats, eg. understanding
>> why you don't get the plan you wanted.  There's also at least one
>> extension [1] that allows you to backup and use restored statistics,
>> so there are definitely people interested in it.
>>
>> [1]: https://github.com/ossc-db/pg_dbms_stats
>
>
> I don't think - the extensions can use UNION and the content will be same as caches used by planner.

Yes, I agree that changing pg_statistics to be a view as you showed
would fix the problem.  I was answering Konstantin's point:

>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.

which is IMHO a wrong assumption.



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 02.11.2019 10:19, Julien Rouhaud wrote:
> On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>>> On 01.11.2019 18:26, Robert Haas wrote:
>>>> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>>>> <k.knizhnik@postgrespro.ru> wrote:
>>>>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>>>>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>>>>> I wonder if there are some pitfalls of such approach?
>>>> That sounds pretty hackish. You'd have to be very careful, for
>>>> example, that if the tables were dropped or re-analyzed, all of the
>>>> old entries got removed --
>>> I have checked it:
>>> - when table is reanalyzed, then cache entries are replaced.
>>> - when table is dropped, then cache entries are removed.
>>>
>>>> and then it would still fail if any code
>>>> tried to access the statistics directly from the table, rather than
>>>> via the caches. My assumption is that the statistics ought to be
>>>> stored in some backend-private data structure designed for that
>>>> purpose, and that the code that needs the data should be taught to
>>>> look for it there when the table is a GTT.
>>> Yes, if you do "select * from pg_statistic" then you will not see
>>> statistic for GTT in this case.
>>> But I do not think that it is so critical. I do not believe that anybody
>>> is trying to manually interpret values in this table.
>>> And optimizer is retrieving statistic through sys-cache mechanism and so
>>> is able to build correct plan in this case.
>>
>> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>>
>> I have another idea. Can be pg_statistics view instead a table?
>>
>> Some like
>>
>> SELECT * FROM pg_catalog.pg_statistics_rel
>> UNION ALL
>> SELECT * FROM pg_catalog.pg_statistics_gtt();
>>
>> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.
> It'd be a loss if you lose the ability to see the statistics, as there
> are valid use cases where you need to see the stats, eg. understanding
> why you don't get the plan you wanted.  There's also at least one
> extension [1] that allows you to backup and use restored statistics,
> so there are definitely people interested in it.
>
> [1]: https://github.com/ossc-db/pg_dbms_stats
It seems to make no sense at all to back up and restore statistics for 
temporary tables whose lifetime is limited to the lifetime of the backend,
doesn't it?





Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 02.11.2019 8:30, Pavel Stehule wrote:


On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 01.11.2019 18:26, Robert Haas wrote:
> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> I wonder if there are some pitfalls of such approach?
> That sounds pretty hackish. You'd have to be very careful, for
> example, that if the tables were dropped or re-analyzed, all of the
> old entries got removed --

I have checked it:
- when table is reanalyzed, then cache entries are replaced.
- when table is dropped, then cache entries are removed.

> and then it would still fail if any code
> tried to access the statistics directly from the table, rather than
> via the caches. My assumption is that the statistics ought to be
> stored in some backend-private data structure designed for that
> purpose, and that the code that needs the data should be taught to
> look for it there when the table is a GTT.

Yes, if you do "select * from pg_statistic" then you will not see
statistic for GTT in this case.
But I do not think that it is so critical. I do not believe that anybody
is trying to manually interpret values in this table.
And optimizer is retrieving statistic through sys-cache mechanism and so
is able to build correct plan in this case.

Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.

I have another idea. Can be pg_statistics view instead a table?

Some like

SELECT * FROM pg_catalog.pg_statistics_rel
UNION ALL
SELECT * FROM pg_catalog.pg_statistics_gtt();

And pg_catalog.pg_statistics_gtt() is a set-returning function?
I am afraid that this is not an acceptable solution from a performance point of view: the pg_statistic table is accessed by the keys (<relid>, <attpos>, <inh>).
If that lookup can not be done using an index scan, it can cause a significant performance slowdown.
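
In SQL terms, the lookup that has to stay cheap is the equivalent of the 
following (the planner actually goes through the syscache rather than 
executing SQL; starelid/staattnum/stainherit are the real pg_statistic 
key columns, and 'some_table' is just a placeholder):

SELECT * FROM pg_statistic
 WHERE starelid = 'some_table'::regclass
   AND staattnum = 1
   AND stainherit = false;

Today this is served by the unique index on (starelid, staattnum, 
stainherit); a plain set-returning function inside a UNION cannot use 
such an index.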


Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

Or crazy idea - today we can implement own types of heaps. Is possible to create engine where result can be combination of some shared data and local data. So union will be implemented on heap level.
This implementation can be simple, just scanning pages from shared buffers and from local buffers. For these tables we don't need complex metadata. It's crazy idea, and I think so union with table function should be best.

Frankly speaking, implementing a special heap access method for pg_statistic just to handle the case of global temp tables seems to be overkill
from my point of view. It requires a lot of coding (or at least copying a lot of code from heapam). Also, as I wrote above, we need an index for efficient lookup of statistics.


Re: [Proposal] Global temporary tables

From
Julien Rouhaud
Date:
On Sat, Nov 2, 2019 at 4:09 PM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
>
> On 02.11.2019 10:19, Julien Rouhaud wrote:
> > On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
> >> On Fri, Nov 1, 2019 at 5:09 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> >>> On 01.11.2019 18:26, Robert Haas wrote:
> >>>> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> >>>> <k.knizhnik@postgrespro.ru> wrote:
> >>>>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
> >>>>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
> >>>>> I wonder if there are some pitfalls of such approach?
> >>>> That sounds pretty hackish. You'd have to be very careful, for
> >>>> example, that if the tables were dropped or re-analyzed, all of the
> >>>> old entries got removed --
> >>> I have checked it:
> >>> - when table is reanalyzed, then cache entries are replaced.
> >>> - when table is dropped, then cache entries are removed.
> >>>
> >>>> and then it would still fail if any code
> >>>> tried to access the statistics directly from the table, rather than
> >>>> via the caches. My assumption is that the statistics ought to be
> >>>> stored in some backend-private data structure designed for that
> >>>> purpose, and that the code that needs the data should be taught to
> >>>> look for it there when the table is a GTT.
> >>> Yes, if you do "select * from pg_statistic" then you will not see
> >>> statistic for GTT in this case.
> >>> But I do not think that it is so critical. I do not believe that anybody
> >>> is trying to manually interpret values in this table.
> >>> And optimizer is retrieving statistic through sys-cache mechanism and so
> >>> is able to build correct plan in this case.
> >>
> >> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
> >>
> >> I have another idea. Can be pg_statistics view instead a table?
> >>
> >> Some like
> >>
> >> SELECT * FROM pg_catalog.pg_statistics_rel
> >> UNION ALL
> >> SELECT * FROM pg_catalog.pg_statistics_gtt();
> >>
> >> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.
> > It'd be a loss if you lose the ability to see the statistics, as there
> > are valid use cases where you need to see the stats, eg. understanding
> > why you don't get the plan you wanted.  There's also at least one
> > extension [1] that allows you to backup and use restored statistics,
> > so there are definitely people interested in it.
> >
> > [1]: https://github.com/ossc-db/pg_dbms_stats
> It seems to have completely no sense to backup and restore statistic for
> temporary tables which life time is limited to life time of backend,
> doesn't it?

In general, yes, I agree, but not if the goal is to understand why, even
after an ANALYZE on the temporary table, your query is still behaving
poorly.  It can be useful to allow reproduction, or just to give someone
else the statistics to see what's going on.



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:



And pg_catalog.pg_statistics_gtt() is set returning functions?

yes

I afraid that it is not acceptable solution from performance point of view: pg_statistic table is accessed by keys (<relid>,<attpos>,<inh>)

I don't think it is a problem. Any component that needs fast access can use some special function that checks the index or checks some memory buffers.


If it can not be done using index scan, then it can cause significant performance slow down.

Where do you need fast access when you use SQL-level access? Inside Postgres, the optimizer uses caches everywhere, and the statistics cache would know that it has to check both the index and some memory buffers.

The proposed view will not be used by the optimizer, but it can be used by some higher layers. I think there is an agreement that per-session GTT metadata should not be stored in the system catalog; whether it is stored in some syscache or somewhere else is not important at this moment. But it would be nice if, for the user, the GTT metadata were not a black hole. I think it is better to change some current tables into views than to use some special function specialized just for GTT (these functions would have to exist in both variants anyway). When I think about it, this is important not just for the functionality that we expect from GTT; it is important for the consistency of the Postgres catalog - how different should a GTT be from other types of tables in the system catalog, from the user's perspective?


Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
Dear Hackers


I attached the patch of the GTT implementation I based on PG12.
The GTT design came from my first email.
Some limitations in the patch will be eliminated in later versions.

Later, I will comment on Konstantin's patch and make some proposals for cooperation.
Looking forward to your feedback.

Thanks.

Zeng Wenjing





> On Tue, Oct 29, 2019 at 12:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Oct 28, 2019 at 9:37 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> Sorry, but both statements are not true.
>
> Well, I think they are true.
>
>> I am not sure how vital is lack of aborts for transactions working with
>> GTT at replica.
>> Some people said that there is no sense in aborts of read-only
>> transactions at replica (despite to the fact that them are saving
>> intermediate results in GTT).
>> Some people said something similar to your "dead on arrival".
>> But inconsistency is not possible: if such transaction is really
>> aborted, then backend is terminated and nobody can see this inconsistency.
>
> Aborting the current transaction is a very different thing from
> terminating the backend.
>
> Also, the idea that there is no sense in aborts of read-only
> transactions on a replica seems totally wrong. Suppose that you insert
> a row into the table and then you go to insert a row in each index,
> but one of the index inserts fails - duplicate key, out of memory
> error, I/O error, whatever. Now the table and the index are
> inconsistent. Normally, we're protected against this by MVCC, but if
> you use a solution that breaks MVCC by using the same XID for all
> transactions, then it can happen.
>
>> Concerning second alternative: you can check yourself that it is not
>> *extremely* complicated and invasive.
>> I extracted changes which are related with handling transactions at
>> replica and attached them to this mail.
>> It is just 500 lines (including diff contexts). Certainly there are some
>> limitation of this implementation: number of  transactions working with
>> GTT at replica is limited by 2^32
>> and since GTT tuples are not frozen, analog of GTT CLOG kept in memory
>> is never truncated.
>
> I admit that this patch is not lengthy, but there remains the question
> of whether it is correct. It's possible that the problem isn't as
> complicated as I think it is, but I do think there are quite a number
> of reasons why this patch wouldn't be considered acceptable...
>
>> I agree with it and think that implementation of GTT with private
>> buffers and no replica access is good starting point.
>
> ...but given that we seem to agree on this point, perhaps it isn't
> necessary to argue about those things right now.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 06.11.2019 16:24, 曾文旌(义从) wrote:
> Dear Hackers
>
>
> I attached the patch of GTT implementation I base on PG12.
> The GTT design came from my first email.
> Some limitations in patch will be eliminated in later versions.
>
> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
> Looking forward to your feedback.
>
> Thanks.
>
> Zeng Wenjing
>

Thank you for this patch.
My first comments:

1. I have ported your patch to the latest Postgres version (my patch is 
attached).
2. Your patch supports only B-Tree indexes for GTT. All other index types 
(hash, gin, gist, brin, ...) are not currently supported.
3. I do not understand the reason for the following limitation:
"We allow to create index on global temp table only this session use it"

First of all, it seems to significantly reduce the usefulness of global temp tables.
Why do we need GTT at all? Mostly because we need to access temporary 
data in more than one backend; otherwise we could just use a normal table.
If a temp table is expected to be large enough that we need to create 
an index for it, then it is hard to believe that it will be needed in 
only one backend.

Maybe the assumption is that all indexes have to be created before the GTT 
starts to be used.
But right now this check is not working correctly in any case - if you 
insert some data into the table, then
you can not create an index any more:

postgres=# create global temp table gtt(x integer primary key, y integer);
CREATE TABLE
postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
INSERT 0 100000
postgres=# create index on gtt(y);
ERROR:  can not create index when have one or more backend attached this global temp table

I wonder why you need such a restriction?


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Thu, Nov 7, 2019 at 12:08 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of GTT implementation I base on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1.  I have ported you patch to the latest Postgres version (my patch is attached).
> 2. You patch is supporting only B-Tree index for GTT. All other indexes (hash, gin, gist, brin,...) are not currently supported.
Currently I only support btree indexes.
I noticed that your patch supports more index types, which is an area where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all it seems to significantly reduce usage of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend. Otherwise we can just use normal table.
> If temp table is expected to be larger enough, so that we need to create index for it, then it is hard to believe that it will be needed only in one backend.
>
> May be the assumption is that all indexes has to be created before GTT start to be used.
Yes. Currently an index on a GTT can only be created while the table is empty and no other session is using it.
There are two possible improvements:
1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in the other sessions.
The index build needs to be done in the current session just like for a normal table. This improvement is relatively easy.

2. An index can be created on GTT(A) when more than one session is using GTT(A).
The problem: when I finish creating an index on the GTT in this session and mark it as a valid index, that is not true for the GTT in the other sessions; indexes on the GTT in other sessions would require a "rebuild_index" before being usable.
I don't have a better solution right now; maybe you have some suggestions.


> But right now this check is not working correctly in any case - if you insert some data into the table, then
> you can not create index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why do you need such restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>




Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Nov 7, 2019 at 10:30 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Thu, Nov 7, 2019 at 12:08 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of GTT implementation I base on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1.  I have ported you patch to the latest Postgres version (my patch is attached).
> 2. You patch is supporting only B-Tree index for GTT. All other indexes (hash, gin, gist, brin,...) are not currently supported.
Currently I only support btree index.
I noticed that your patch supports more index types, which is where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all it seems to significantly reduce usage of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend. Otherwise we can just use normal table.
> If temp table is expected to be larger enough, so that we need to create index for it, then it is hard to believe that it will be needed only in one backend.
>
> May be the assumption is that all indexes has to be created before GTT start to be used.
Yes, Currently, GTT's index is only supported and created in an empty table state, and other sessions are not using it.
There has two improvements pointer:
1 Index can create on GTT(A) when the GTT(A)  in the current session is not empty, requiring the GTT table to be empty in the other session.
Index_build needs to be done in the current session just like a normal table. This improvement is relatively easy.

2 Index can create on GTT(A)  when more than one session are using this GTT(A).
Because when I'm done creating an index of the GTT in this session and setting it to be an valid index, it's not true for the GTT in other sessions.
Indexes on gtt in other sessions require "rebuild_index" before using it.
I don't have a better solution right now, maybe you have some suggestions.

I think DDL operations can be implemented in some reduced form - so that DDL is active only for one session and invisible to the other sessions. What matters is the state of the GTT object at session start.

For example, ALTER TABLE DROP COLUMN can have a fatal impact on other sessions. So I think the best pattern for GTT may be that the structure of a GTT is immutable for any session that doesn't perform DDL operations. A sketch of the hazard follows.
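
A sketch of the hazard (hypothetical sessions; the current patch simply 
forbids ALTER on a GTT, so this cannot actually be reproduced with it):

-- session 1
create global temp table gtt(a int, b int);
insert into gtt values (1, 2);

-- session 2
insert into gtt values (3, 4);

-- session 1
alter table gtt drop column b;

-- session 2: its stored tuples still have two columns,
-- while the shared catalog now describes only one
select * from gtt;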



> But right now this check is not working correctly in any case - if you insert some data into the table, then
> you can not create index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why do you need such restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Thu, Nov 7, 2019 at 5:30 PM, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:



On Thu, Nov 7, 2019 at 12:08 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 06.11.2019 16:24, 曾文旌(义从) wrote:
Dear Hackers


I attached the patch of GTT implementation I base on PG12.
The GTT design came from my first email.
Some limitations in patch will be eliminated in later versions.

Later, I will comment on Konstantin's patch and make some proposals for cooperation.
Looking forward to your feedback.

Thanks.

Zeng Wenjing


Thank you for this patch.
My first comments:

1.  I have ported you patch to the latest Postgres version (my patch is attached).
2. You patch is supporting only B-Tree index for GTT. All other indexes (hash, gin, gist, brin,...) are not currently supported.
Currently I only support btree index.
I noticed that your patch supports more index types, which is where I'd like to work with you.

3. I do not understand the reason for the following limitation:
"We allow to create index on global temp table only this session use it"

First of all it seems to significantly reduce usage of global temp tables.
Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend. Otherwise we can just use normal table.
If temp table is expected to be larger enough, so that we need to create index for it, then it is hard to believe that it will be needed only in one backend.

May be the assumption is that all indexes has to be created before GTT start to be used.
Yes, Currently, GTT's index is only supported and created in an empty table state, and other sessions are not using it.
There has two improvements pointer:
1 Index can create on GTT(A) when the GTT(A)  in the current session is not empty, requiring the GTT table to be empty in the other session.
Index_build needs to be done in the current session just like a normal table. This improvement is relatively easy.
This part of the improvement has been completed.
A new patch is attached.


2 Index can create on GTT(A)  when more than one session are using this GTT(A).
Because when I'm done creating an index of the GTT in this session and setting it to be an valid index, it's not true for the GTT in other sessions.
Indexes on gtt in other sessions require "rebuild_index" before using it. 
I don't have a better solution right now, maybe you have some suggestions.


But right now this check is not working correctly in any case - if you insert some data into the table, then
you can not create index any more:

postgres=# create global temp table gtt(x integer primary key, y integer);
CREATE TABLE
postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
INSERT 0 100000
postgres=# create index on gtt(y);
ERROR:  can not create index when have one or more backend attached this global temp table

An index can now be created on GTT(A) when GTT(A) in the current session is not empty.
But it still requires the GTT to be empty in the other sessions.

I wonder why do you need such restriction?


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

<global_temporary_table_v1-pg13.patch>


Zeng Wenjing


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Thu, Nov 7, 2019 at 5:40 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Thu, Nov 7, 2019 at 10:30 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Thu, Nov 7, 2019 at 12:08 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of GTT implementation I base on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1.  I have ported you patch to the latest Postgres version (my patch is attached).
> 2. You patch is supporting only B-Tree index for GTT. All other indexes (hash, gin, gist, brin,...) are not currently supported.
Currently I only support btree index.
I noticed that your patch supports more index types, which is where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all it seems to significantly reduce usage of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend. Otherwise we can just use normal table.
> If temp table is expected to be larger enough, so that we need to create index for it, then it is hard to believe that it will be needed only in one backend.
>
> May be the assumption is that all indexes has to be created before GTT start to be used.
Yes, Currently, GTT's index is only supported and created in an empty table state, and other sessions are not using it.
There has two improvements pointer:
1 Index can create on GTT(A) when the GTT(A)  in the current session is not empty, requiring the GTT table to be empty in the other session.
Index_build needs to be done in the current session just like a normal table. This improvement is relatively easy.

2 Index can create on GTT(A)  when more than one session are using this GTT(A).
Because when I'm done creating an index of the GTT in this session and setting it to be an valid index, it's not true for the GTT in other sessions.
Indexes on gtt in other sessions require "rebuild_index" before using it.
I don't have a better solution right now, maybe you have some suggestions.

I think so DDL operations can be implemented in some reduced form - so DDL are active only for one session, and for other sessions are invisible. Important is state of GTT object on session start.

For example ALTER TABLE DROP COLUMN can has very fatal impact on other sessions. So I think the best of GTT can be pattern - the structure of GTT table is immutable for any session that doesn't do DDL operations.
Yes, DDL that needs to rewrite data files will have this problem.
This is why I disabled ALTER on GTT in the current version.
It can be improved; for example, ALTER on a GTT could also be allowed when only the current session is using it.
Users can also choose to kick off the other sessions that are using the GTT and then run the ALTER.
I provide a function (pg_gtt_attached_pid(relation, schema)) to query which sessions a GTT is being used by.
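
For example (a hypothetical call; the exact signature and result format 
come from the patch and may differ):

postgres=# select pg_gtt_attached_pid('gtt', 'public');
-- returns the pids of the backends currently attached to public.gtt;
-- those sessions could then be ended with pg_terminate_backend(pid)
-- before running the ALTER.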




> But right now this check is not working correctly in any case - if you insert some data into the table, then
> you can not create index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why do you need such restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>


Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Nov 7, 2019 at 1:17 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Thu, Nov 7, 2019 at 5:40 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Thu, Nov 7, 2019 at 10:30 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Thu, Nov 7, 2019 at 12:08 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of GTT implementation I base on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1.  I have ported you patch to the latest Postgres version (my patch is attached).
> 2. You patch is supporting only B-Tree index for GTT. All other indexes (hash, gin, gist, brin,...) are not currently supported.
Currently I only support btree index.
I noticed that your patch supports more index types, which is where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all it seems to significantly reduce usage of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend. Otherwise we can just use normal table.
> If temp table is expected to be larger enough, so that we need to create index for it, then it is hard to believe that it will be needed only in one backend.
>
> May be the assumption is that all indexes has to be created before GTT start to be used.
Yes, Currently, GTT's index is only supported and created in an empty table state, and other sessions are not using it.
There has two improvements pointer:
1 Index can create on GTT(A) when the GTT(A)  in the current session is not empty, requiring the GTT table to be empty in the other session.
Index_build needs to be done in the current session just like a normal table. This improvement is relatively easy.

2 Index can create on GTT(A)  when more than one session are using this GTT(A).
Because when I'm done creating an index of the GTT in this session and setting it to be an valid index, it's not true for the GTT in other sessions.
Indexes on gtt in other sessions require "rebuild_index" before using it.
I don't have a better solution right now, maybe you have some suggestions.

I think so DDL operations can be implemented in some reduced form - so DDL are active only for one session, and for other sessions are invisible. Important is state of GTT object on session start.

For example ALTER TABLE DROP COLUMN can has very fatal impact on other sessions. So I think the best of GTT can be pattern - the structure of GTT table is immutable for any session that doesn't do DDL operations.
Yes, Those ddl that need to rewrite data files will have this problem.
This is why I disabled alter GTT in the current version.
It can be improved, such as Alter GTT can also be allowed when only the current session is in use.

I think this is an acceptable solution for the first steps, but I cannot imagine this behavior being good for production usage. It can be good enough for some time, though.

Regards

Pavel

Users can also choose to kick off other sessions that are using gtt, then do alter GTT.
I provide a function(pg_gtt_attached_pid(relation, schema)) to query which session a GTT is being used by.




> But right now this check is not working correctly in any case - if you insert some data into the table, then
> you can not create index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why do you need such restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>


Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 07.11.2019 12:30, 曾文旌(义从) wrote:
>
>> May be the assumption is that all indexes has to be created before GTT start to be used.
> Yes, Currently, GTT's index is only supported and created in an empty table state, and other sessions are not using it.
> There has two improvements pointer:
> 1 Index can create on GTT(A) when the GTT(A)  in the current session is not empty, requiring the GTT table to be empty in the other session.
> Index_build needs to be done in the current session just like a normal table. This improvement is relatively easy.
>
> 2 Index can create on GTT(A)  when more than one session are using this GTT(A).
> Because when I'm done creating an index of the GTT in this session and setting it to be an valid index, it's not true for the GTT in other sessions.
> Indexes on gtt in other sessions require "rebuild_index" before using it.
> I don't have a better solution right now, maybe you have some suggestions.
It is possible to create index on demand:

Buffer
_bt_getbuf(Relation rel, BlockNumber blkno, int access)
{
    Buffer      buf;

    if (blkno != P_NEW)
    {
        /* Read an existing block of the relation */
        buf = ReadBuffer(rel, blkno);
        /* Session temporary relation may be not yet initialized for this backend. */
        if (blkno == BTREE_METAPAGE &&
            GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
        {
            Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
            ReleaseBuffer(buf);
            DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
            btbuild(heap, rel, BuildIndexInfo(rel));
            RelationClose(heap);
            buf = ReadBuffer(rel, blkno);
            LockBuffer(buf, access);
        }
        else
        {
            LockBuffer(buf, access);
            _bt_checkpage(rel, buf);
        }
    }
    ...


This code initializes the B-Tree and loads data into it when the GTT 
index is accessed and is not yet initialized.
It looks a little bit hackish, but it works.

I also wonder why you are keeping information about GTT in shared 
memory. It looks like the only information we really need to share is 
the table's metadata, but that is already shared through the catalog. 
All other GTT-related information is private to the backend, so I do 
not see a reason to place it in shared memory.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Nov 8, 2019, at 00:32, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>
>
> On 07.11.2019 12:30, 曾文旌(义从) wrote:
>>
>>> Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
>> Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
>> There are two possible improvements:
>> 1 An index can be created on GTT(A) when GTT(A) is not empty in the current session, requiring the GTT to be empty in the other sessions.
>> Index_build needs to be done in the current session just like for a normal table. This improvement is relatively easy.
>>
>> 2 An index can be created on GTT(A) when more than one session is using GTT(A).
>> Because when I'm done creating an index on the GTT in this session and setting it to be a valid index, that is not true for the GTT in other sessions.
>> Indexes on the GTT in other sessions require a "rebuild_index" before they can be used.
>> I don't have a better solution right now; maybe you have some suggestions.
> It is possible to create index on demand:
>
> Buffer
> _bt_getbuf(Relation rel, BlockNumber blkno, int access)
> {
>     Buffer        buf;
>
>     if (blkno != P_NEW)
>     {
>         /* Read an existing block of the relation */
>         buf = ReadBuffer(rel, blkno);
>         /* Session temporary relation may be not yet initialized for this backend. */
>         if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
>         {
>             Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
>             ReleaseBuffer(buf);
>             DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
>             btbuild(heap, rel, BuildIndexInfo(rel));
>             RelationClose(heap);
>             buf = ReadBuffer(rel, blkno);
>             LockBuffer(buf, access);
>         }
>         else
>         {
>             LockBuffer(buf, access);
>             _bt_checkpage(rel, buf);
>         }
>     }
>     ...
In my opinion, it is not a good idea to trigger a btbuild with a SELECT or DML, the cost of which depends on the amount
of data in the GTT.

>
>
> This code initializes the B-Tree and loads data into it when a GTT index is accessed but not yet initialized.
> It looks a little bit hackish, but it works.
>
> I also wonder why you are keeping information about GTT in shared memory. It looks like the only information we really
> need to share is the table's metadata.
> But that is already shared through the catalog. All other GTT-related information is private to the backend, so I do not
> see a reason to place it in shared memory.
The shared hash structure tracks which backends have initialized the storage of a GTT, in order to implement DDL on
the GTT.
For a GTT there is only one definition (including indexes on the GTT), but each backend may have its own data.
For the implementation of DROP on a GTT, I assume that all data and the definition need to be deleted.
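To illustrate, the entry in the shared hash is conceptually something like this (a simplified sketch; the names are illustrative, not the actual code in the patch):

/* Sketch of a per-GTT entry in the shared hash table (illustrative only). */
#define GTT_MAX_ATTACHED 64                /* arbitrary cap for this sketch */

typedef struct GttSharedEntry
{
    Oid         relid;                     /* hash key: the GTT's OID in pg_class */
    int         nattached;                 /* backends that initialized storage */
    BackendId   attached[GTT_MAX_ATTACHED];
} GttSharedEntry;

/*
 * DDL such as DROP TABLE or CREATE INDEX on the GTT is allowed only when
 * nattached is 0, or when the current backend is the only one attached;
 * DROP also has to remove the storage files of every attached backend.
 */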

>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 08.11.2019 10:50, 曾文旌(义从) wrote:
> In my opinion, it is not a good idea to trigger a btbuild with a SELECT or DML, the cost of which depends on the
> amount of data in the GTT.
IMHO it is better than returning an error.
Also, the index will be used only if the cost of the plan with the index is
considered better than the cost of the plan without it. If you do not have
the index, then you have to scan the whole table.
The time for such a scan is comparable with the time needed to build the index.

Yes, I agree that indexes for a GTT are usually created together with
the table itself, before it is used by any application.
But if the DBA later recognizes that efficient execution of queries requires
some more indexes,
it will be strange and dangerous to prevent him from adding such an index
until all clients which have accessed this table have dropped their
connections.
Also, maintaining information about attached backends in shared memory
seems to be overkill.

>>
>> This code initializes the B-Tree and loads data into it when a GTT index is accessed but not yet initialized.
>> It looks a little bit hackish, but it works.
>>
>> I also wonder why you are keeping information about GTT in shared memory. It looks like the only information we really
>> need to share is the table's metadata.
>> But that is already shared through the catalog. All other GTT-related information is private to the backend, so I do not
>> see a reason to place it in shared memory.
> The shared hash structure tracks which backends have initialized the storage of a GTT, in order to implement DDL on
> the GTT.
Sorry, I do not understand this argument.
DDL is performed on shared metadata present in the global catalog.
The standard postgres invalidation mechanism is used to notify all backends
about schema changes.
Why do we need to maintain some extra information in shared memory?
Can you give me an example of DDL which doesn't work without such a shared hash?

> For a GTT there is only one definition (including indexes on the GTT), but each backend may have its own data.
> For the implementation of DROP on a GTT, I assume that all data and the definition need to be deleted.

Data of a dropped GTT is removed on normal backend termination, or cleaned
up at server restart in case of abnormal shutdown (as is done for
local temp tables).
I have not used any shared control structures for GTT in my
implementation, and that is why I wonder why you need them and what the
expected problems with my implementation are.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
My comments for global_private_temp-4.patch

Good side:
1 Lots of index types on GTT. I think we need support for all kinds of indexes.
2 Serial columns on GTT.
3 INHERITS on GTT.
4 PARTITION on GTT.

I didn't choose to support them in the first release, but you did.

Other side:
1 Case: create global temp table gtt2(a int primary key, b text) on commit delete rows;
I think you've lost the meaning of the ON COMMIT DELETE ROWS clause.
After the GTT is created, the other sessions see it as an ON COMMIT PRESERVE ROWS GTT.

2 TRUNCATE on a GTT: maybe this is a bug in DropRelFileNodeBuffers.
The GTT's local buffers are not released.
Case:
postgres=# insert into gtt2 values(1,'xx');
INSERT 0 1
postgres=# truncate gtt2;
TRUNCATE TABLE
postgres=# insert into gtt2 values(1,'xx');
ERROR:  unexpected data beyond EOF in block 0 of relation base/13579/t3_16384
HINT:  This has been seen to occur with buggy kernels; consider updating your system.

3 Lock type of TRUNCATE on a GTT.
I don't think it's a good idea to hold a big lock for TRUNCATE on a GTT, because it only needs to process private data.

4 GTT DDL: DDL that needs to rewrite data files may need attention.
We discussed this in a previous email. This is why I used the shared hash to track GTT storage files.


5 There will be problems with DDL that changes the relfilenode, such as CLUSTER or VACUUM FULL on a GTT.
If one session completes VACUUM FULL on gtt(a), the other sessions will immediately start reading and writing new storage files, and their existing data is lost.
I disable them in my current version.

6 DROP of a GTT
I think dropping a GTT should clean up all storage files and the definition. What do you think?

7 MVCC visibility and clog cleanup
GTT data follows the same visibility rules as regular tables, so a GTT also needs clog.
We need to prevent clog that a GTT still needs from being truncated.
At the same time, GTT is not autovacuumed, and retaining "too old" data will cause wraparound data loss.
I have given a solution for this in my design.
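Roughly, the idea is as follows (a simplified sketch; gtt_frozenxid is an illustrative field name, not the actual patch code): each session tracks the oldest relfrozenxid among its GTTs and publishes it, so that vacuum can take the minimum over all sessions into account before truncating clog.

/*
 * Sketch: compute the oldest GTT relfrozenxid across all sessions, so that
 * vacuum can clamp its clog truncation point to it (illustrative only).
 */
static TransactionId
GetOldestGttFrozenXid(void)
{
    TransactionId oldest = InvalidTransactionId;

    for (int i = 0; i < ProcGlobal->allProcCount; i++)
    {
        TransactionId xid = ProcGlobal->allProcs[i].gtt_frozenxid;

        if (TransactionIdIsValid(xid) &&
            (!TransactionIdIsValid(oldest) ||
             TransactionIdPrecedes(xid, oldest)))
            oldest = xid;
    }

    return oldest;
}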


Zeng Wenjing

On Nov 1, 2019, at 23:15, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 25.10.2019 20:00, Pavel Stehule wrote:

>
>> So except the limitation mentioned above (which I do not consider as critical) there is only one problem which was not addressed: maintaining statistics for GTT.
>> If all of the following conditions are true:
>>
>> 1) GTT are used in joins
>> 2) There are indexes defined for GTT
>> 3) Size and histogram of GTT in different backends can significantly vary.
>> 4) ANALYZE was explicitly called for GTT
>>
>> then the query execution plan built in one backend will also be used by other backends, where it can be inefficient.
>> I also do not consider this problem a "show stopper" for adding GTT to Postgres.
> I think that's *definitely* a show stopper.
Well, if both you and Pavel think that it is really a "show stopper", then
this problem really has to be addressed.
I am slightly confused about this opinion, because Pavel has told me
himself that 99% of users never create indexes for temp tables
or run "analyze" on them. And without those, this problem is not a problem
at all.


Users don't do ANALYZE on temp tables in 99% of cases. It's true. But the second fact is that those users have a lot of problems. It's very similar to wrong statistics on persistent tables. When the data is small, it is not a problem for users, although from my perspective it's not optimal. When the data is not small, the problem can be brutal. Temporary tables are not an exception. And users and developers are people - we only hear about the fatal problems. There are lots of unoptimized queries, but because the problem is not fatal, nobody reports it. And a lot of people have no idea how fast databases can be. The knowledge of users and app developers is a sad book.

Pavel

It seems to me that I have found a quite elegant solution for per-backend statistics for GTT: I just insert them into the backend's catalog cache, but not into the pg_statistic table itself.
To do this I had to add InsertSysCache/InsertCatCache functions which insert a pinned entry into the corresponding cache.
I wonder if there are any pitfalls in such an approach?

New patch for GTT is attached.
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
<global_private_temp-4.patch>

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 08.11.2019 18:06, 曾文旌(义从) wrote:
My comments for global_private_temp-4.patch

Thank you very much for inspecting my patch.

Good side:
1 Lots of index types on GTT. I think we need support for all kinds of indexes.
2 Serial columns on GTT.
3 INHERITS on GTT.
4 PARTITION on GTT.

I didn't choose to support them in the first release, but you did.

Other side:
1 Case: create global temp table gtt2(a int primary key, b text) on commit delete rows;
I think you've lost the meaning of the ON COMMIT DELETE ROWS clause.
After the GTT is created, the other sessions see it as an ON COMMIT PRESERVE ROWS GTT.


Yes, there was a bug in my implementation of ON COMMIT DELETE ROWS for GTT.
It is fixed in global_private_temp-6.patch

2 TRUNCATE on a GTT: maybe this is a bug in DropRelFileNodeBuffers.
The GTT's local buffers are not released.
Case:
postgres=# insert into gtt2 values(1,'xx');
INSERT 0 1
postgres=# truncate gtt2;
TRUNCATE TABLE
postgres=# insert into gtt2 values(1,'xx');
ERROR:  unexpected data beyond EOF in block 0 of relation base/13579/t3_16384
HINT:  This has been seen to occur with buggy kernels; consider updating your system.


Yes, another bug, also fixed in the new version of the patch.

3 Lock type of TRUNCATE on a GTT.
I don't think it's a good idea to hold a big lock for TRUNCATE on a GTT, because it only needs to process private data.

Sorry, I do not understand which lock you are talking about.
I have not introduced any special locks for GTT.

4 GTT DDL: DDL that needs to rewrite data files may need attention.
We discussed this in a previous email. This is why I used the shared hash to track GTT storage files.


You are right.
But instead of prohibiting ALTER TABLE for GTT altogether, we can check
that there are no other backends using it.
I do not think that we should maintain some hash in shared memory to check this.
Since ALTER TABLE is a rare and slow operation in any case, we can just check for the presence of GTT files
created by other backends.
I have implemented this check in global_private_temp-6.patch



5 There will be problems with DDL that changes the relfilenode, such as CLUSTER or VACUUM FULL on a GTT.
If one session completes VACUUM FULL on gtt(a), the other sessions will immediately start reading and writing new storage files, and their existing data is lost.
I disable them in my current version.

Thank you for noticing it.
VACUUM FULL should really be prohibited for GTT.


6 DROP of a GTT
I think dropping a GTT should clean up all storage files and the definition. What do you think?

Storage files will be cleaned up in any case on backend termination.
Certainly, if a backend creates and deletes a huge number of GTTs in a loop, it can cause space exhaustion.
But that seems to be a very strange pattern of GTT usage.



7 MVCC visibility and clog cleanup
GTT data follows the same visibility rules as regular tables, so a GTT also needs clog.
We need to prevent clog that a GTT still needs from being truncated.
At the same time, GTT is not autovacuumed, and retaining "too old" data will cause wraparound data loss.
I have given a solution for this in my design.

But why do we need some special handling of visibility rules for GTT compared with normal (local) temp tables?
They are also not processed by autovacuum, are they?

In principle, I have also implemented special visibility rules for GTT, but only for the case when they
are accessed on a replica. And that is not included in this patch, because everybody thinks that access to GTT
on a replica should be considered in a separate patch.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
In the previous communication:

1 We agreed on the general direction:
1.1 GTT uses local (private) buffers
1.2 No replica access in the first version

2 We feel that GTT needs to maintain statistics, but there is no agreement on how it will be done.

3 Still no one has commented on GTT's transaction information processing, which includes:
3.1 Does gtt's frozenxid need to be taken care of?
3.2 gtt's clog cleanup
3.3 How to deal with "too old" gtt data

I suggest we discuss further, reach an agreement, and merge the two patches into one.


Wenjing


> On Jan 6, 2020, at 04:06, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> I think we need to do something with having two patches aiming to add
> global temporary tables:
>
> [1] https://commitfest.postgresql.org/26/2349/
>
> [2] https://commitfest.postgresql.org/26/2233/
>
> As a reviewer I have no idea which of the threads to look at - certainly
> not without reading both threads, which I doubt anyone will really do.
> The reviews and discussions are somewhat intermixed between those two
> threads, which makes it even more confusing.
>
> I think we should agree on a minimal patch combining the necessary/good
> bits from the various patches, and terminate one of the threads (i.e.
> mark it as rejected or RWF). And we need to do that now, otherwise
> there's about 0% chance of getting this into v13.
>
> In general, I agree with the sentiment Robert expressed in [1] - the
> patch needs to be as small as possible, not adding "nice to have"
> features (like support for parallel queries - I very much doubt just
> using shared instead of local buffers is enough to make it work).
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:
>In the previous communication:
>
>1 We agreed on the general direction:
>1.1 GTT uses local (private) buffers
>1.2 No replica access in the first version
>

OK, good.

>2 We feel that GTT needs to maintain statistics, but there is no
>agreement on how it will be done.
>

I certainly agree GTT needs to maintain statistics, otherwise it'll lead
to poor query plans. AFAIK the current patch stores the info in a hash
table in backend-private memory, and I don't see how else to do that
(e.g. storing it in a catalog would cause catalog bloat).

FWIW this is one of the reasons why I think just using shared buffers (instead of
local ones) is not sufficient to support parallel queries as proposed
by Alexander. The workers would not know the stats, breaking planning of
queries in PARALLEL SAFE plpgsql functions etc.

>3 Still no one has commented on GTT's transaction information processing, which includes:
>3.1 Does gtt's frozenxid need to be taken care of?
>3.2 gtt's clog cleanup
>3.3 How to deal with "too old" gtt data
>

No idea what to do about this.

>I suggest we discuss further, reach an agreement, and merge the two patches into one.
>

OK, cool. Thanks for the clarification.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Global temporary tables

From
Dean Rasheed
Date:
On Mon, 6 Jan 2020 at 11:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:
>
> >2 We feel that GTT needs to maintain statistics, but there is no
> >agreement on how it will be done.
> >
>
> I certainly agree GTT needs to maintain statistics, otherwise it'll lead
> to poor query plans.

+1

> AFAIK the current patch stores the info in a hash
> table in a backend private memory, and I don't see how else to do that
> (e.g. storing it in a catalog would cause catalog bloat).
>

It sounds like it needs a pair of system GTTs to hold the table and
column statistics for other GTTs. One would probably have the same
columns as pg_statistic, and the other just the relevant columns from
pg_class. I can see it being useful for the user to be able to see
these stats, so perhaps they could be UNIONed into the existing stats
view.
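Something like this, perhaps (only a sketch to show the shape; the names
and the column subset are invented):

-- hypothetical per-session statistics storage, itself implemented as GTTs
create global temp table pg_gtt_statistic
    (starelid oid, staattnum int2, stanullfrac real,
     stawidth int4, stadistinct real /* ... remaining pg_statistic columns ... */)
    on commit preserve rows;

create global temp table pg_gtt_class_stats
    (relid oid, relpages int4, reltuples float4)
    on commit preserve rows;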

Regards,
Dean



Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Mon, Jan 06, 2020 at 12:17:43PM +0000, Dean Rasheed wrote:
>On Mon, 6 Jan 2020 at 11:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:
>>
>> >2 We feel that GTT needs to maintain statistics, but there is no
>> >agreement on how it will be done.
>> >
>>
>> I certainly agree GTT needs to maintain statistics, otherwise it'll lead
>> to poor query plans.
>
>+1
>
>> AFAIK the current patch stores the info in a hash
>> table in a backend private memory, and I don't see how else to do that
>> (e.g. storing it in a catalog would cause catalog bloat).
>>
>
>It sounds like it needs a pair of system GTTs to hold the table and
>column statistics for other GTTs. One would probably have the same
>columns as pg_statistic, and the other just the relevant columns from
>pg_class. I can see it being useful for the user to be able to see
>these stats, so perhaps they could be UNIONed into the existing stats
>view.
>

Hmmm, yeah. A "temporary catalog" (not sure if it can work exactly the
same as GTT) storing pg_statistic data for GTTs might work, I think. It
would not have the catalog bloat issue, which is good.

I still think we'd need to integrate this with the regular pg_statistic
catalogs somehow, so that people don't have to care about two things. I
mean, extensions like hypopg do use pg_statistic data to propose indexes
etc. and it would be nice if we don't make them more complicated.

Not sure why we'd need a temporary version of pg_class, though?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Mon, Jan 6, 2020 at 13:17, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Mon, 6 Jan 2020 at 11:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:
>
> >2 We feel that GTT needs to maintain statistics, but there is no
> >agreement on how it will be done.
> >
>
> I certainly agree GTT needs to maintain statistics, otherwise it'll lead
> to poor query plans.

+1

> AFAIK the current patch stores the info in a hash
> table in a backend private memory, and I don't see how else to do that
> (e.g. storing it in a catalog would cause catalog bloat).
>

It sounds like it needs a pair of system GTTs to hold the table and
column statistics for other GTTs. One would probably have the same
columns as pg_statistic, and the other just the relevant columns from
pg_class. I can see it being useful for the user to be able to see
these stats, so perhaps they could be UNIONed into the existing stats
view.

+1

Pavel


Regards,
Dean

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 6, 2020, at 20:17, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

On Mon, 6 Jan 2020 at 11:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:

2 We feel that GTT needs to maintain statistics, but there is no
agreement on how it will be done.


I certainly agree GTT needs to maintain statistics, otherwise it'll lead
to poor query plans.

+1

AFAIK the current patch stores the info in a hash
table in a backend private memory, and I don't see how else to do that
(e.g. storing it in a catalog would cause catalog bloat).


It sounds like it needs a pair of system GTTs to hold the table and
column statistics for other GTTs. One would probably have the same
columns as pg_statistic, and the other just the relevant columns from
pg_class. I can see it being useful for the user to be able to see
these stats, so perhaps they could be UNIONed into the existing stats
view.
The current patch provides several functions in an extension (pg_gtt) for reading GTT statistics.
Next, I can move them into the kernel and let the view pg_stats show GTT statistics.


Regards,
Dean

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 06.01.2020 8:04, 曾文旌(义从) wrote:
> In the previous communication:
>
> 1 We agreed on the general direction:
> 1.1 GTT uses local (private) buffers
> 1.2 No replica access in the first version
>
> 2 We feel that GTT needs to maintain statistics, but there is no agreement on how it will be done.
>
> 3 Still no one has commented on GTT's transaction information processing, which includes:
> 3.1 Does gtt's frozenxid need to be taken care of?
> 3.2 gtt's clog cleanup
> 3.3 How to deal with "too old" gtt data
>
> I suggest we discuss further, reach an agreement, and merge the two patches into one.
>

I also hope that we can come to a common solution for GTT.
If we do not try to address parallel execution issues and access to temp
tables on replicas (and I agree
that those should be avoided in the first version of the patch), then the GTT
patch becomes quite small.

The most complex and challenging task is to support GTT for all kinds of
indexes. Unfortunately I cannot propose a good universal solution
for it.
Just patching all existing index implementations seems to be the only
choice.

Statistics are another important case.
But once again, I do not completely understand why we want to address all
these issues with statistics in the first version of the patch. It contradicts
the idea of making this patch as small as possible.
Also, it seems to me that everybody agreed that users very rarely create
indexes for temp tables or explicitly analyze them.
So I think GTT will be useful even with limited support for statistics. In
my version, statistics for GTT are provided by pushing the corresponding
information into the backend's cache for the pg_statistic table.
I also provided a pg_temp_statistic view for users to inspect it. The
idea of making pg_statistic a view which combines statistics of normal and
temporary tables is overkill from my point of view.

I do not understand why we need to maintain a hash with some extra
information for GTT in backend memory (as it was done in Wenjing's patch).
Also, the idea of using CREATE EXTENSION for accessing this information
seems dubious.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 06.01.2020 14:01, Tomas Vondra wrote:
> On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:
>> In the previous communication:
>>
>> 1 We agreed on the general direction:
>> 1.1 GTT uses local (private) buffers
>> 1.2 No replica access in the first version
>>
>
> OK, good.
>
>> 2 We feel that GTT needs to maintain statistics, but there is no
>> agreement on how it will be done.
>>
>
> I certainly agree GTT needs to maintain statistics, otherwise it'll lead
> to poor query plans. AFAIK the current patch stores the info in a hash
> table in backend-private memory, and I don't see how else to do that
> (e.g. storing it in a catalog would cause catalog bloat).
>
> FWIW this is one of the reasons why I think just using shared buffers (instead of
> local ones) is not sufficient to support parallel queries as proposed
> by Alexander. The workers would not know the stats, breaking planning of
> queries in PARALLEL SAFE plpgsql functions etc.


I do not think that an "all or nothing" approach is as good for software
development as it is for database transactions.
Yes, if we have a function in PL/pgSQL which performs queries on temporary
tables, then
parallel workers may build inefficient plans for these queries due to lack
of statistics.
From my point of view this is not a pitfall of GTT but a result of the lack
of a global plan cache in Postgres. And it should be fixed not at the GTT level.

Also, I have never seen real use cases with such functions, even in systems
which use temporary tables and stored procedures heavily.
But there are many other real problems with temp tables (besides those already
mentioned in this thread).
In PgPro/EE we have fixes for some of them, for example:

1. Do not reserve space in the file for temp relations. Right now, appending
to a relation causes a zero page to be written to the disk by mdextend.
This causes useless disk IO for temp tables, which in most cases fit in
memory and should not be written to disk at all.

2. Implicitly perform analyze of a temp table immediately after storing
data in it. Usually tables are analyzed by autovacuum in the background.
But that doesn't work for temp tables, which are not processed by
autovacuum and are accessed immediately after being filled with data; the
lack of statistics may cause
a very inefficient plan to be built. We have an online_analyze extension which
forces analyze of a table after appending some bulk of data to it.
It can be used for normal tables, but most of all it is useful for temp
relations.
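For example (a configuration sketch from memory; the exact GUC names may differ):

LOAD 'online_analyze';
SET online_analyze.enable = on;
-- after a bulk insert into a temp table, fresh statistics are available
-- immediately, without waiting for an explicit ANALYZE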

Unlike the hypothetical example with a parallel-safe function working with
temp tables,
these are real problems observed by some of our customers.
They are applicable both to local and global temp tables, and this is why
I do not want to discuss them in the context of GTT.


>
>> 3 Still no one has commented on GTT's transaction information processing, 
>> which includes:
>> 3.1 Does gtt's frozenxid need to be taken care of?
>> 3.2 gtt's clog cleanup
>> 3.3 How to deal with "too old" gtt data
>>
>
> No idea what to do about this.
>

I wonder what is specific to GTT here?
The same problem exists for normal (local) temp tables, doesn't it?


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Thu, Jan 09, 2020 at 06:07:46PM +0300, Konstantin Knizhnik wrote:
>
>
>On 06.01.2020 14:01, Tomas Vondra wrote:
>>On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:
>>>In the previous communication:
>>>
>>>1 We agreed on the general direction: 1.1 GTT uses local (private)
>>>buffers; 1.2 No replica access in the first version
>>>
>>
>>OK, good.
>>
>>>2 We feel that GTT needs to maintain statistics, but there is no
>>>agreement on how it will be done.
>>>
>>
>>I certainly agree GTT needs to maintain statistics, otherwise it'll
>>lead to poor query plans. AFAIK the current patch stores the info in a
>>hash table in backend-private memory, and I don't see how else to do
>>that (e.g. storing it in a catalog would cause catalog bloat).
>>
>>FWIW this is one of the reasons why I think just using shared buffers (instead
>>of local ones) is not sufficient to support parallel queries as
>>proposed by Alexander. The workers would not know the stats, breaking
>>planning of queries in PARALLEL SAFE plpgsql functions etc.
>
>
>I do not think that an "all or nothing" approach is as good for software
>development as it is for database transactions.

Well, sure. I'm not saying we need to have a perfect solution in v1. I'm
saying if we have two choices:

(1) Use shared buffers even if it means the parallel query plan may be
     arbitrarily bad.

(2) Use private buffers, even if it means no parallel queries with temp
     tables.

Then I'm voting for (2) because it's less likely to break down. I can
imagine allowing parallel queries with GTT when there's no risk of
having to plan in the worker, but that's not there yet.

If we can come up with a reasonable solution for the parallel case, we
can enable it later.

>Yes, if we have a function in PL/pgSQL which performs queries on 
>temporary tables, then
>parallel workers may build inefficient plans for these queries due to 
>lack of statistics.

IMHO that's a pretty awful deficiency, because it essentially means
users may need to disable parallelism for such queries. Which means
we'll get complaints from users, and we'll have to come up with some
sort of solution. I'd rather not be in that position.

>From my point of view this is not a pitfall of GTT but a result of the lack 
>of a global plan cache in Postgres. And it should be fixed not at the GTT 
>level.
>

That doesn't give us a free pass to just ignore the issue. Even if it
really was due to a lack of global plan cache, the fact is we don't have
that feature, so we have a problem. I mean, if you need infrastructure
that is not available, you either have to implement that infrastructure
or make it work properly without it.

>Also, I have never seen real use cases with such functions, even in 
>systems which use temporary tables and stored procedures heavily.
>But there are many other real problems with temp tables (besides those 
>already mentioned in this thread).

Oh, I'm sure there are pretty large plpgsql applications, and I'd be
surprised if at least some of those were not affected. And I'm sure
there are apps using UDF to do all sorts of stuff (e.g. I wonder if
PostGIS would have this issue - IIRC it's using SPI etc.).

The question is whether we should consider existing apps affected,
because they are using the regular temporary tables and not GTT. So
unless they switch to GTT there is no regression ...

But even in that case I don't think it's a good idea to treat this as
an acceptable limitation. I admit one of the reasons why I think that
may be that statistics and planning are my areas of interest, so I'm not
quite willing to accept incomplete stuff as OK.

>In PgPro/EE we have fixes for some of them, for example:
>
>1. Do not reserve space in the file for temp relations. Right now, 
>appending to a relation causes a zero page to be written to disk by mdextend.
>This causes useless disk IO for temp tables, which in most cases fit in 
>memory and should not be written to disk at all.
>
>2. Implicitly perform analyze of a temp table immediately after 
>storing data in it. Usually tables are analyzed by autovacuum in the 
>background.
>But that doesn't work for temp tables, which are not processed by 
>autovacuum and are accessed immediately after being filled with data; 
>the lack of statistics may cause
>a very inefficient plan to be built. We have an online_analyze extension which 
>forces analyze of a table after appending some bulk of data to it.
>It can be used for normal tables, but most of all it is useful for temp 
>relations.
>
>Unlike the hypothetical example with a parallel-safe function working with 
>temp tables,
>these are real problems observed by some of our customers.
>They are applicable both to local and global temp tables, and this is 
>why I do not want to discuss them in the context of GTT.
>

I think those are both interesting issues worth fixing, but I don't
think it makes the issue discussed here less important.

>
>>
>>>3 Still no one has commented on GTT's transaction information 
>>>processing, which includes:
>>>3.1 Does gtt's frozenxid need to be taken care of?
>>>3.2 gtt's clog cleanup
>>>3.3 How to deal with "too old" gtt data
>>>
>>
>>No idea what to do about this.
>>
>
>I wonder what is specific to GTT here?
>The same problem exists for normal (local) temp tables, doesn't it?
>

Not sure. TBH I'm not sure I understand what the issue actually is.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Thu, Jan 09, 2020 at 02:17:08PM +0300, Konstantin Knizhnik wrote:
>
>
>On 06.01.2020 8:04, 曾文旌(义从) wrote:
>>In the previous communication:
>>
>>1 We agreed on the general direction:
>>1.1 GTT uses local (private) buffers
>>1.2 No replica access in the first version
>>
>>2 We feel that GTT needs to maintain statistics, but there is no agreement on how it will be done.
>>
>>3 Still no one has commented on GTT's transaction information processing, which includes:
>>3.1 Does gtt's frozenxid need to be taken care of?
>>3.2 gtt's clog cleanup
>>3.3 How to deal with "too old" gtt data
>>
>>I suggest we discuss further, reach an agreement, and merge the two patches into one.
>>
>
>I also hope that we can come to a common solution for GTT.
>If we do not try to address parallel execution issues and access to 
>temp tables on replicas (and I agree
>that those should be avoided in the first version of the patch), then the GTT 
>patch becomes quite small.
>

Well, that was kinda my goal - making the patch as small as possible by
eliminating bits that are contentious or where we don't know the
solution (like planning for parallel queries).

>The most complex and challenging task is to support GTT for all kinds of 
>indexes. Unfortunately I cannot propose a good universal solution 
>for it.
>Just patching all existing index implementations seems to be the only 
>choice.
>

I haven't looked at the indexing issue closely, but IMO we need to
ensure that every session sees/uses only indexes on GTT that were
defined before the session started using the table.

Can't we track which indexes a particular session sees, somehow?

>Statistics are another important case.
>But once again, I do not completely understand why we want to address 
>all these issues with statistics in the first version of the patch?

I think the question is which "issues with statistic" you mean. I'm sure
we can ignore some of them, e.g. the one with parallel workers not
having any stats (assuming we consider functions using GTT to be
parallel restricted).

>It contradicts the idea of making this patch as small as possible.

Well, there's "making patch as small as possible" vs. "patch behaving
correctly" trade-off ;-)

>Also, it seems to me that everybody agreed that users very rarely 
>create indexes for temp tables or explicitly analyze them.

I certainly *disagree* with this.

We often see temporary tables used as a fix for misestimates in complex
queries, and/or as a replacement for CTEs with statistics/indexes. In
fact it's a pretty valuable tool when helping customers with complex
queries affected by poor estimates.

>So I think GTT will be useful even with limited support for statistics. 
>In my version, statistics for GTT are provided by pushing the corresponding 
>information into the backend's cache for the pg_statistic table.

I think someone pointed out pushing stuff directly into the cache is
rather problematic, but I don't recall the details.

>I also provided a pg_temp_statistic view for users to inspect it. The 
>idea of making pg_statistic a view which combines statistics of normal 
>and temporary tables is overkill from my point of view.
>
>I do not understand why we need to maintain a hash with some extra 
>information for GTT in backend memory (as it was done in Wenjing's 
>patch).
>Also, the idea of using CREATE EXTENSION for accessing this information 
>seems dubious.
>

I think the extension was more of a PoC than a final solution.


regards
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 09.01.2020 19:48, Tomas Vondra wrote:
>
>> The most complex and challenging task is to support GTT for all kinds 
>> of indexes. Unfortunately I cannot propose a good universal 
>> solution for it.
>> Just patching all existing index implementations seems to be the only 
>> choice.
>>
>
> I haven't looked at the indexing issue closely, but IMO we need to
> ensure that every session sees/uses only indexes on GTT that were
> defined before the session started using the table.

Why? It contradicts the behavior of normal tables.
Assume that you have active clients, and at some point in time the DBA 
recognizes that they are spending too much time scanning some GTT.
It can create an index for this GTT, but if existing clients are not able 
to use this index, then we need to somehow make these clients restart 
their sessions?
In my patch I have implemented building indexes for GTT on demand: if 
an accessed index on a GTT is not yet initialized, then it is filled with 
local data.
>
> Can't we track which indexes a particular session sees, somehow?
>
>> Statistics are another important case.
>> But once again, I do not completely understand why we want to address 
>> all these issues with statistics in the first version of the patch?
>
> I think the question is which "issues with statistic" you mean. I'm sure
> we can ignore some of them, e.g. the one with parallel workers not
> having any stats (assuming we consider functions using GTT to be
> parallel restricted).

If we do not use shared buffers for GTT, then parallel processing of GTT 
is not possible at all, so there is no problem with statistics for 
parallel workers.

>
> I think someone pointed out pushing stuff directly into the cache is
> rather problematic, but I don't recall the details.
>
I have not encountered any problems, so if you can point out what is 
wrong with this approach, I will think about an alternative solution.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 09.01.2020 19:30, Tomas Vondra wrote:





3 Still no one has commented on GTT's transaction information processing, which includes:
3.1 Does gtt's frozenxid need to be taken care of?
3.2 gtt's clog cleanup
3.3 How to deal with "too old" gtt data


No idea what to do about this.


I wonder what is specific to GTT here?
The same problem exists for normal (local) temp tables, doesn't it?


Not sure. TBH I'm not sure I understand what the issue actually is.

Just open a session, create a temporary table and insert some data into it.
Then in another session run 2^31 transactions (at my desktop it takes about 2 hours).
Since temp tables are not processed by vacuum, the database is stalled:

 ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"

It seems to be quite dubious behavior, and it is strange to me that nobody complains about it.
We discuss many issues related to temp tables (statistics, parallel queries, ...) which seem to be less critical.

But this problem is not specific to GTT - it can be reproduced with normal (local) temp tables.
This is why I wonder why we need to solve it in the GTT patch.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp inherited tables and global temp partitioned tables
2. Support serial columns in GTT
3. Provide views pg_gtt_relstats and pg_gtt_stats for GTT statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. ALTER GTT and RENAME GTT are allowed under some conditions
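For example, the new views and functions can be queried directly (output omitted; the comments describe the intent):

select * from pg_gtt_relstats;         -- per-session relpages/reltuples/relfrozenxid of each GTT
select * from pg_gtt_stats;            -- per-session column statistics of each GTT
select * from pg_gtt_attached_pids;    -- which sessions are attached to which GTT
select pg_list_gtt_relfrozenxids();    -- the GTT relfrozenxid each session holds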


Please give me feedback.

Wenjing





On Jan 6, 2020, at 04:06, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp inherited tables and global temp partitioned tables
2. Support serial columns in GTT
3. Provide views pg_gtt_relstats and pg_gtt_stats for GTT statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. ALTER GTT and RENAME GTT are allowed under some conditions


Please give me feedback.

I tested the functionality

1. I think "ON COMMIT PRESERVE ROWS" should be the default mode (like for local temp tables).

I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute has grown to 3.2MB,
and I got 64 tps, 6446 transactions

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So the performance difference is dramatic, as we expected.

From my perspective, this functionality is great.

Todo:

pg_table_size function doesn't work

Regards

Pavel


Wenjing





On Jan 6, 2020, at 04:06, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Fri, Jan 10, 2020 at 03:24:34PM +0300, Konstantin Knizhnik wrote:
>
>
>On 09.01.2020 19:30, Tomas Vondra wrote:
>
>
>>
>>>
>>>>
>>>>>3 Still no one has commented on GTT's transaction information 
>>>>>processing, which includes:
>>>>>3.1 Does gtt's frozenxid need to be taken care of?
>>>>>3.2 gtt's clog cleanup
>>>>>3.3 How to deal with "too old" gtt data
>>>>>
>>>>
>>>>No idea what to do about this.
>>>>
>>>
>>>I wonder what is the specific of GTT here?
>>>The same problem takes place for normal (local) temp tables, doesn't it?
>>>
>>
>>Not sure. TBH I'm not sure I understand what the issue actually is.
>
>Just open a session, create a temporary table and insert some data into it.
>Then in another session run 2^31 transactions (at my desktop it takes 
>about 2 hours).
>Since temp tables are not processed by vacuum, the database is stalled:
>
> ERROR:  database is not accepting commands to avoid wraparound data 
>loss in database "postgres"
>
>It seems to be quite dubious behavior, and it is strange to me that 
>nobody complains about it.
>We discuss many issues related to temp tables (statistics, parallel 
>queries, ...) which seem to be less critical.
>
>But this problem is not specific to GTT - it can be reproduced with 
>normal (local) temp tables.
>This is why I wonder why we need to solve it in the GTT patch.
>

Yeah, I think that's out of scope for the GTT patch. Once we solve it for
plain temporary tables, we'll solve it for GTT too.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Fri, Jan 10, 2020 at 11:47:42AM +0300, Konstantin Knizhnik wrote:
>
>
>On 09.01.2020 19:48, Tomas Vondra wrote:
>>
>>>The most complex and challenging task is to support GTT for all 
>>>kinds of indexes. Unfortunately I cannot propose a good 
>>>universal solution for it.
>>>Just patching all existing index implementations seems to be the 
>>>only choice.
>>>
>>
>>I haven't looked at the indexing issue closely, but IMO we need to
>>ensure that every session sees/uses only indexes on GTT that were
>>defined before the session started using the table.
>
>Why? It contradicts the behavior of normal tables.
>Assume that you have active clients, and at some point in time the DBA 
>recognizes that they are spending too much time scanning some GTT.
>It can create an index for this GTT, but if existing clients are not 
>able to use this index, then we need to somehow make these clients 
>restart their sessions?
>In my patch I have implemented building indexes for GTT on demand: if 
>an accessed index on a GTT is not yet initialized, then it is filled with 
>local data.

Yes, I know the behavior would be different from behavior for regular
tables. And yes, it would not allow fixing slow queries in sessions
without interrupting those sessions.

I proposed just ignoring those new indexes because it seems much simpler
than alternative solutions that I can think of, and it's not like those
other solutions don't have other issues.

For example, I've looked at the "on demand" building as implemented in
global_private_temp-8.patch, and adding a bunch of index build
calls into various places in the index code seems somewhat suspicious.

* brinbuild is added to brinRevmapInitialize, which is meant to
   initialize state for scanning. It seems wrong to build the index we're
   scanning from this function (layering and all that).

* btbuild is called from _bt_getbuf. That seems a bit ... suspicious?

... and so on for other index types. Also, what about custom indexes
implemented in extensions? It seems a bit strange each of them has to
support this separately.

IMHO if this really is the right solution, we need to make it work for
existing indexes without having to tweak them individually. Why don't we
track a flag indicating whether an index on a GTT was initialized in a given
session, and if it was not, call the build function before calling any other
function from the index AM?
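Roughly something like this (a sketch only, with invented names; the real
index_build signature differs across versions, and the flag would have to
live in backend-local state):

/*
 * Sketch: generic "build on first touch" for GTT indexes, called from
 * the index-AM dispatch layer rather than from each AM individually.
 * RelationIsGlobalTemp() and rd_gtt_built are invented for illustration.
 */
static void
EnsureGttIndexBuilt(Relation indexRel)
{
    if (RelationIsGlobalTemp(indexRel) && !indexRel->rd_gtt_built)
    {
        Relation heap = RelationIdGetRelation(indexRel->rd_index->indrelid);

        index_build(heap, indexRel, BuildIndexInfo(indexRel), false, false);
        RelationClose(heap);

        indexRel->rd_gtt_built = true;   /* per-session flag, invented */
    }
}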

But let's talk about other issues caused by the "on demand" build. Imagine
you have 50 sessions, each using the same GTT with a GB of per-session
data. Now you create a new index on the GTT, which forces the sessions
to build their "local" indexes. Those builds will use maintenance_work_mem
each, so 50 * m_w_m. I doubt that's expected/sensible.

So I suggest we start by just ignoring the *new* indexes, and improve
this in the future (by building the indexes on demand or whatever).

>>
>>Can't we track which indexes a particular session sees, somehow?
>>
>>>Statistic is another important case.
>>>But once again I do not completely understand why we want to 
>>>address all this issues with statistic in first version of the 
>>>patch?
>>
>>I think the question is which "issues with statistic" you mean. I'm sure
>>we can ignore some of them, e.g. the one with parallel workers not
>>having any stats (assuming we consider functions using GTT to be
>>parallel restricted).
>
>If we do not use shared buffers for GTT then parallel processing of 
>GTT is not possible at all, so there is no problem with statistic for 
>parallel workers.
>

Right.

>>
>>I think someone pointed out pushing stuff directly into the cache is
>>rather problematic, but I don't recall the details.
>>
>I have not encountered any problems, so if you can point me on what is 
>wrong with this approach, I will think about alternative solution.
>

I meant this comment by Robert:

https://www.postgresql.org/message-id/CA%2BTgmoZFWaND4PpT_CJbeu6VZGZKi2rrTuSTL-Ykd97fexTN-w%40mail.gmail.com


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 12.01.2020 4:51, Tomas Vondra wrote:
> On Fri, Jan 10, 2020 at 11:47:42AM +0300, Konstantin Knizhnik wrote:
>>
>>
>> On 09.01.2020 19:48, Tomas Vondra wrote:
>>>
>>>> The most complex and challenging task is to support GTT for all kinds 
>>>> of indexes. Unfortunately I cannot propose a good universal 
>>>> solution for it.
>>>> Just patching all existing index implementations seems to be the 
>>>> only choice.
>>>>
>>>
>>> I haven't looked at the indexing issue closely, but IMO we need to
>>> ensure that every session sees/uses only indexes on GTT that were
>>> defined before the session started using the table.
>>
>> Why? It contradicts the behavior of normal tables.
>> Assume that you have active clients, and at some point in time the DBA 
>> recognizes that they are spending too much time scanning some GTT.
>> It can create an index for this GTT, but if existing clients are not 
>> able to use this index, then we need to somehow make these clients 
>> restart their sessions?
>> In my patch I have implemented building indexes for GTT on demand: if 
>> an accessed index on a GTT is not yet initialized, then it is filled with 
>> local data.
>
> Yes, I know the behavior would be different from behavior for regular
> tables. And yes, it would not allow fixing slow queries in sessions
> without interrupting those sessions.
>
> I proposed just ignoring those new indexes because it seems much simpler
> than alternative solutions that I can think of, and it's not like those
> other solutions don't have other issues.

Quite the opposite: prohibiting sessions from seeing indexes created after the 
session started to use the GTT requires more effort. We need to somehow 
maintain and check the GTT's first access time.

>
> For example, I've looked at the "on demand" building as implemented in
> global_private_temp-8.patch, and adding a bunch of index build
> calls into various places in the index code seems somewhat suspicious.

We in any case have to initialize GTT indexes on demand, even if we 
prohibit use of indexes created after a session's first access to the GTT.
So the difference is only in one thing: should we just initialize an empty 
index, or populate it with local data (if the rules for index usability are 
the same for GTT as for normal tables).
From an implementation point of view there is no big difference. Actually, 
building the index in the standard way is even simpler than constructing an 
empty index. Originally I implemented
the first approach (I just forgot to consider the case when the GTT was already 
used by a session). Then I rewrote it using the second approach, and the patch 
even became simpler.

>
> * brinbuild is added to brinRevmapInitialize, which is meant to
>   initialize state for scanning. It seems wrong to build the index we're
>   scanning from this function (layering and all that).
>
> * btbuild is called from _bt_getbuf. That seems a bit ... suspicious?


As I already mentioned, support of indexes for GTT is one of the most 
challenging things in my patch.
I didn't find a good and universal solution. So I agree that the call of 
btbuild from _bt_getbuf may be considered suspicious.
I will be pleased if you or somebody else can propose a better alternative, 
and not only for B-Tree but for all other indexes.

But as I already wrote above, prohibiting a session from using indexes 
created after its first access to the GTT doesn't solve the problem.
For normal tables (and for local temp tables), indexes are initialized at 
the time of their creation.
With GTT it doesn't work that way, because each session has its own local data 
in the GTT.
We should either initialize/build the index on demand (when it is first 
accessed), or initialize the indexes of all existing GTTs at session start.
The last option seems to be much worse from my point of view: there may be a 
huge number of GTTs, and the session may not need to access GTTs at all.
>
> ... and so on for other index types. Also, what about custom indexes
> implemented in extensions? It seems a bit strange each of them has to
> support this separately.

I have already complained about it: my patch supports GTT for all 
built-in indexes, but custom indexes have to handle it themselves.
It looks like to provide some generic solution we need to extend the index API, 
providing two different operations: creation and initialization.
But extending the index API is a very invasive change... And also it doesn't 
solve the problem for all existing extensions: they in any case have
to be rewritten to implement the new API version in order to support GTT.
>
> IMHO if this really is the right solution, we need to make it work for
> existing indexes without having to tweak them individually. Why don't we
> track a flag indicating whether an index on a GTT was initialized in a given
> session, and if it was not, call the build function before calling any other
> function from the index AM?
> But let's talk about other issues caused by the "on demand" build. Imagine
> you have 50 sessions, each using the same GTT with a GB of per-session
> data. Now you create a new index on the GTT, which forces the sessions
> to build their "local" indexes. Those builds will use maintenance_work_mem
> each, so 50 * m_w_m. I doubt that's expected/sensible.

I do not see a fundamental difference here from the scenario where 50 sessions 
create a (local) temp table,
populate it with a GB of data, and create an index for it.

>
> So I suggest we start by just ignoring the *new* indexes, and improve
> this in the future (by building the indexes on demand or whatever).

Sorry, but I still do not agree with this suggestion:
- it doesn't simplify things
- it makes the behavior of GTT incompatible with normal tables
- it doesn't prevent any bad or unexpected behavior that can't 
already be reproduced with normal (local) temp tables.

>
>>>
>>> I think someone pointed out pushing stuff directly into the cache is
>>> rather problematic, but I don't recall the details.
>>>
>> I have not encountered any problems, so if you can point me on what 
>> is wrong with this approach, I will think about alternative solution.
>>
>
> I meant this comment by Robert:
>
> https://www.postgresql.org/message-id/CA%2BTgmoZFWaND4PpT_CJbeu6VZGZKi2rrTuSTL-Ykd97fexTN-w%40mail.gmail.com 
>
>
"if any code tried to access the statistics directly from the table, 
rather than via the caches".

Currently the optimizer accesses statistics through the caches, so this 
approach works. If somebody rewrites the optimizer, or provides their own 
custom optimizer in an extension that accesses statistics directly,
then it will really be a problem. But I wonder why bypassing the catalog 
cache would ever be needed.

Moreover, if we implement an alternative solution - for example, make 
pg_statistic a view which combines results for normal tables and GTTs - 
then the existing optimizer has to be rewritten,
because it cannot access statistics the way it does now. And the same 
problem arises for all existing extensions which access statistics in 
the most natural way - through the system cache.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Mon, Jan 13, 2020 at 11:08:40AM +0300, Konstantin Knizhnik wrote:
>
>
>On 12.01.2020 4:51, Tomas Vondra wrote:
>>On Fri, Jan 10, 2020 at 11:47:42AM +0300, Konstantin Knizhnik wrote:
>>>
>>>
>>>On 09.01.2020 19:48, Tomas Vondra wrote:
>>>>
>>>>>The most complex and challenged task is to support GTT for all 
>>>>>kind of indexes. Unfortunately I can not proposed some good 
>>>>>universal solution for it.
>>>>>Just patching all existed indexes implementation seems to be 
>>>>>the only choice.
>>>>>
>>>>
>>>>I haven't looked at the indexing issue closely, but IMO we need to
>>>>ensure that every session sees/uses only indexes on GTT that were
>>>>defined before the seesion started using the table.
>>>
>>>Why? It contradicts with behavior of normal tables.
>>>Assume that you have active clients and at some point of time DBA 
>>>recognizes that them are spending to much time in scanning some 
>>>GTT.
>>>It cab create index for this GTT but if existed client will not be 
>>>able to use this index, then we need somehow make this clients to 
>>>restart their sessions?
>>>In my patch I have implemented building indexes for GTT on demand: 
>>>if accessed index on GTT is not yet initialized, then it is filled 
>>>with local data.
>>
>>Yes, I know the behavior would be different from behavior for regular
>>tables. And yes, it would not allow fixing slow queries in sessions
>>without interrupting those sessions.
>>
>>I proposed just ignoring those new indexes because it seems much simpler
>>than alternative solutions that I can think of, and it's not like those
>>other solutions don't have other issues.
>
>Quit opposite: prohibiting sessions to see indexes created before 
>session start to use GTT requires more efforts. We need to somehow 
>maintain and check GTT first access time.
>

Hmmm, OK. I'd expect such a check to be much simpler than the on-demand
index building, but I admit I haven't tried implementing either of those
options.

>>
>>For example, I've looked at the "on demand" building as implemented in
>>global_private_temp-8.patch, I kinda doubt adding a bunch of index build
>>calls into various places in index code seems somewht suspicious.
>
>We in any case has to initialize GTT indexes on demand even if we 
>prohibit usages of indexes created after first access by session to 
>GTT.
>So the difference is only in one thing: should we just initialize 
>empty index or populate it with local data (if rules for index 
>usability are the same for GTT as for normal tables).
>From implementation point of view there is no big difference. Actually 
>building index in standard way is even simpler than constructing empty 
>index. Originally I have implemented
>first approach (I just forgot to consider case when GTT was already 
>user by a session). Then I rewrited it using second approach and patch 
>even became simpler.
>
>>
>>* brinbuild is added to brinRevmapInitialize, which is meant to
>>  initialize state for scanning. It seems wrong to build the index we're
>>  scanning from this function (layering and all that).
>>
>>* btbuild is called from _bt_getbuf. That seems a bit ... suspicious?
>
>
>As I already mentioned - support of indexes for GTT is one of the most 
>challenged things in my patch.
>I didn't find good and universal solution. So I agreed that call of 
>btbuild from _bt_getbuf may be considered as suspicious.
>I will be pleased if you or sombody else can propose better 
>elternative and not only for B-Tree, but for all other indexes.
>
>But as I already wrote above, prohibiting session to used indexes 
>created after first access to GTT doesn't solve the problem.
>For normal tables (and for local temp tables) indexes are initialized 
>at the time of their creation.
>With GTT it doesn't work, because each session has its own local data 
>of GTT.
>We should either initialize/build index on demand (when it is first 
>accessed), either at the moment of session start initialize indexes 
>for all existed GTTs.
>Last options seem to be much worser from my point of view: there may 
>me huge number of GTT and session may not need to access GTT at all.
>>
>>... and so on for other index types. Also, what about custom indexes
>>implemented in extensions? It seems a bit strange each of them has to
>>support this separately.
>
>I have already complained about it: my patch supports GTT for all 
>built-in indexes, but custom indexes has to handle it themselves.
>Looks like to provide some generic solution we need to extend index 
>API, providing two diffrent operations: creation and initialization.
>But extending index API is very critical change... And also it doesn't 
>solve the problem with all existed extensions: them in any case have
>to be rewritten to implement new API version in order to support GTT.
>

Why not allow creating (on GTT) only indexes that implement this new API
method?

>>
>>IMHO if this really is the right solution, we need to make it work for
>>existing indexes without having to tweak them individually. Why don't we
>>track a flag whether an index on GTT was initialized in a given session,
>>and if it was not then call the build function before calling any other
>>function from the index AM?
>>But let's talk about other issues caused by "on demand" build. Imagine
>>you have 50 sessions, each using the same GTT with a GB of per-session
>>data. Now you create a new index on the GTT, which forces the sessions
>>to build it's "local" index. Those builds will use maintenance_work_mem
>>each, so 50 * m_w_m. I doubt that's expected/sensible.
>
>I do not see principle difference here with scenario when 50 sessions 
>create (local) temp table,
>populate it with GB of data and create index for it.
>

I'd say the high memory consumption is pretty significant.

>>
>>So I suggest we start by just ignoring the *new* indexes, and improve
>>this in the future (by building the indexes on demand or whatever).
>
>Sorry, but still do not agree with this suggestions:
>- it doesn't simplify things
>- it makes behavior of GTT incompatible with normal tables.
>- it doesn't prevent some bad or unexpected behavior which can't be 
>currently reproduced with normal (local) temp tables.
>
>>
>>>>
>>>>I think someone pointed out pushing stuff directly into the cache is
>>>>rather problematic, but I don't recall the details.
>>>>
>>>I have not encountered any problems, so if you can point me on 
>>>what is wrong with this approach, I will think about alternative 
>>>solution.
>>>
>>
>>I meant this comment by Robert:
>>
>>https://www.postgresql.org/message-id/CA%2BTgmoZFWaND4PpT_CJbeu6VZGZKi2rrTuSTL-Ykd97fexTN-w%40mail.gmail.com
>>
>>
>"if any code tried to access the statistics directly from the table, 
>rather than via the caches".
>
>Currently optimizer is accessing statistic though caches. So this 
>approach works. If somebody will rewrite optimizer or provide own 
>custom optimizer in extension which access statistic directly
>then it we really be a problem. But I wonder why bypassing catalog 
>cache may be needed.
>

I don't know, but it seems extensions like hypopg do it.

>Moreover, if we implement alternative solution - for example make 
>pg_statistic a view which combines results for normal tables and GTT, 
>then existed optimizer has to be rewritten
>because it can not access statistic in the way it is doing now. And 
>there will be all problem with all existed extensions which are 
>accessing statistic in most natural way - through system cache.
>

Perhaps. I don't know enough about this part of the code to have a
strong opinion.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: [Proposal] Global temporary tables

From
Julien Rouhaud
Date:
On Mon, Jan 13, 2020 at 05:32:53PM +0100, Tomas Vondra wrote:
> On Mon, Jan 13, 2020 at 11:08:40AM +0300, Konstantin Knizhnik wrote:
> >
> >"if any code tried to access the statistics directly from the table, 
> >rather than via the caches".
> >
> >Currently optimizer is accessing statistic though caches. So this 
> >approach works. If somebody will rewrite optimizer or provide own 
> >custom optimizer in extension which access statistic directly
> >then it we really be a problem. But I wonder why bypassing catalog 
> >cache may be needed.
> >
> 
> I don't know, but it seems extensions like hypopg do it.

AFAIR, hypopg only opens pg_statistic to use its tupledesc when creating
statistics on hypothetical partitions; otherwise it should never read or
need plain pg_statistic rows.



Re: [Proposal] Global temporary tables

From
Tomas Vondra
Date:
On Mon, Jan 13, 2020 at 09:12:38PM +0100, Julien Rouhaud wrote:
>On Mon, Jan 13, 2020 at 05:32:53PM +0100, Tomas Vondra wrote:
>> On Mon, Jan 13, 2020 at 11:08:40AM +0300, Konstantin Knizhnik wrote:
>> >
>> >"if any code tried to access the statistics directly from the table,
>> >rather than via the caches".
>> >
>> >Currently optimizer is accessing statistic though caches. So this
>> >approach works. If somebody will rewrite optimizer or provide own
>> >custom optimizer in extension which access statistic directly
>> >then it we really be a problem. But I wonder why bypassing catalog
>> >cache may be needed.
>> >
>>
>> I don't know, but it seems extensions like hypopg do it.
>
>AFAIR, hypopg only opens pg_statistic to use its tupledesc when creating
>statistics on hypothetical partitions, but it should otherwise never reads or
>need plain pg_statistic rows.

Ah, OK! Thanks for the clarification. I knew it did something with the
catalog; I didn't realize it only gets the descriptor.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
Thank you for reviewing my patch.


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp inherited tables and global temp partition tables
2. Support serial columns in GTT
3. Provide views pg_gtt_relstats and pg_gtt_stats for GTT statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. Altering or renaming a GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. I think "ON COMMIT PRESERVE ROWS" should be the default mode (like local temp tables).
Makes sense, I will fix it.


I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute had grown to 3.2MB,
at 64 tps (6446 transactions)

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So the performance is completely different, as we expected.

From my perspective, this functionality is great.
Yes, frequent DDL causes catalog bloat; GTT avoids this problem.


Todo:

pg_table_size function doesn't work
Do you mean that pg_table_size() needs to report the storage space used by one GTT across the entire database (including all sessions)?
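
For example (a sketch, assuming the patch is applied; the open question
is what the last call should report):

CREATE GLOBAL TEMP TABLE gt(a int);
INSERT INTO gt SELECT generate_series(1, 10000);
-- This session's storage only, or the total across all sessions?
SELECT pg_table_size('gt');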


Regards

Pavel


Wenjing





On Jan 6, 2020, at 4:06 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Tue, Jan 14, 2020 at 14:09, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Thank you for review my patch.


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp Inherit table global temp partition table
2. Support serial column in GTT
3. Provide views pg_gtt_relstats pg_gtt_stats for GTT’s statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. Alter GTT or rename GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. i think so "ON COMMIT PRESERVE ROWS" should be default mode (like local temp tables).
makes sense, I will fix it.


I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute has 3.2MB
and 64 tps, 6446 transaction

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So performance is absolutely different as we expected.

From my perspective, this functionality is great.
Yes, frequent ddl causes catalog bloat, GTT avoids this problem.


Todo:

pg_table_size function doesn't work
Do you mean that function pg_table_size() need get the storage space used by the one GTT in the entire db(include all session) .

It's a question how similar GTT tables should be to classic tables. But the reporting in psql should work: \dt+, \l+, \di+




Regards

Pavel


Wenjing





On Jan 6, 2020, at 4:06 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Jan 12, 2020, at 9:14 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Fri, Jan 10, 2020 at 03:24:34PM +0300, Konstantin Knizhnik wrote:
>>
>>
>> On 09.01.2020 19:30, Tomas Vondra wrote:
>>
>>
>>>
>>>>
>>>>>
>>>>>> 3 Still no one commented on GTT's transaction information processing, they include
>>>>>> 3.1 Should gtt's frozenxid need to be care?
>>>>>> 3.2 gtt’s clog clean
>>>>>> 3.3 How to deal with "too old" gtt data
>>>>>>
>>>>>
>>>>> No idea what to do about this.
>>>>>
>>>>
>>>> I wonder what is the specific of GTT here?
>>>> The same problem takes place for normal (local) temp tables, doesn't it?
>>>>
>>>
>>> Not sure. TBH I'm not sure I understand what the issue actually is.
>>
>> Just open session, create temporary table and insert some data in it.
>> Then in other session run 2^31 transactions (at my desktop it takes about 2 hours).
>> As far as temp tables are not proceeded by vacuum, database is stalled:
>>
>>  ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
>>
>> It seems to be quite dubious behavior and it is strange to me that nobody complains about it.
>> We discuss  many issues related with temp tables (statistic, parallel queries,...) which seems to be less critical.
>>
>> But this problem is not specific to GTT - it can be reproduced with normal (local) temp tables.
>> This is why I wonder why do we need to solve it in GTT patch.
>>
>
> Yeah, I think that's out of scope for GTT patch. Once we solve it for
> plain temporary tables, we'll solve it for GTT too.
1. The core problem is that the data contains transaction information (xid), which needs to be vacuumed (frozen) regularly
to avoid running out of xids.
Autovacuum supports vacuuming regular tables, but not local temp tables, and it does not support GTT either.

2. However, the difference between a local temp table and a global temp table (GTT) is that:
a) For a local temp table: one table has one copy of data, and the frozenxid of the table is stored in the
catalog (pg_class).
b) For a global temp table: each session has a separate copy of the data, so one GTT may carry one frozenxid per attached backend,
and I don't think it's a good idea to keep the frozenxids of a GTT in the catalog (pg_class).
This raises the question: how should GTT transaction information be handled?

I agree that problem 1 should be solved completely by some separate feature, such as local transactions; it is definitely
not included in the GTT patch.

But I think we need to ensure the durability of GTT data. For example, data in a GTT must not become unreadable because
the clog has been truncated. That belongs to problem 2.
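
For example, a DBA could inspect the per-session state with the
management function and view listed in my update notes above (a sketch;
the exact output columns are defined by the patch):

-- Which relfrozenxids do attached sessions still hold for GTTs?
SELECT * FROM pg_list_gtt_relfrozenxids();
-- Which backends are attached to each GTT?
SELECT * FROM pg_gtt_attached_pids;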



Wenjing


>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 14, 2020, at 9:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Tue, Jan 14, 2020 at 14:09, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Thank you for review my patch.


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp Inherit table global temp partition table
2. Support serial column in GTT
3. Provide views pg_gtt_relstats pg_gtt_stats for GTT’s statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. Alter GTT or rename GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. i think so "ON COMMIT PRESERVE ROWS" should be default mode (like local temp tables).
makes sense, I will fix it.


I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute has 3.2MB
and 64 tps, 6446 transaction

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So performance is absolutely different as we expected.

From my perspective, this functionality is great.
Yes, frequent ddl causes catalog bloat, GTT avoids this problem.


Todo:

pg_table_size function doesn't work
Do you mean that function pg_table_size() need get the storage space used by the one GTT in the entire db(include all session) .

It's question how much GTT tables should be similar to classic tables. But the reporting in psql should to work \dt+, \l+, \di+
Got it, I will fix it.




Regards

Pavel


Wenjing





On Jan 6, 2020, at 4:06 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Jan 13, 2020, at 4:08 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>
>
> On 12.01.2020 4:51, Tomas Vondra wrote:
>> On Fri, Jan 10, 2020 at 11:47:42AM +0300, Konstantin Knizhnik wrote:
>>>
>>>
>>> On 09.01.2020 19:48, Tomas Vondra wrote:
>>>>
>>>>> The most complex and challenged task is to support GTT for all kind of indexes. Unfortunately I can not proposed
some good universal solution for it.
>>>>> Just patching all existed indexes implementation seems to be the only choice.
>>>>>
>>>>
>>>> I haven't looked at the indexing issue closely, but IMO we need to
>>>> ensure that every session sees/uses only indexes on GTT that were
>>>> defined before the seesion started using the table.
>>>
>>> Why? It contradicts with behavior of normal tables.
>>> Assume that you have active clients and at some point of time DBA recognizes that them are spending to much time in
scanning some GTT.
>>> It cab create index for this GTT but if existed client will not be able to use this index, then we need somehow
make this clients to restart their sessions?
>>> In my patch I have implemented building indexes for GTT on demand: if accessed index on GTT is not yet initialized,
then it is filled with local data.
>>
>> Yes, I know the behavior would be different from behavior for regular
>> tables. And yes, it would not allow fixing slow queries in sessions
>> without interrupting those sessions.
>>
>> I proposed just ignoring those new indexes because it seems much simpler
>> than alternative solutions that I can think of, and it's not like those
>> other solutions don't have other issues.
>
> Quit opposite: prohibiting sessions to see indexes created before session start to use GTT requires more efforts. We
need to somehow maintain and check GTT first access time.
>
>>
>> For example, I've looked at the "on demand" building as implemented in
>> global_private_temp-8.patch, I kinda doubt adding a bunch of index build
>> calls into various places in index code seems somewht suspicious.
>
> We in any case has to initialize GTT indexes on demand even if we prohibit usages of indexes created after first
access by session to GTT.
> So the difference is only in one thing: should we just initialize empty index or populate it with local data (if
rules for index usability are the same for GTT as for normal tables).
> From implementation point of view there is no big difference. Actually building index in standard way is even simpler
than constructing empty index. Originally I have implemented
> first approach (I just forgot to consider case when GTT was already user by a session). Then I rewrited it using
second approach and patch even became simpler.
>
>>
>> * brinbuild is added to brinRevmapInitialize, which is meant to
>>   initialize state for scanning. It seems wrong to build the index we're
>>   scanning from this function (layering and all that).
>>
>> * btbuild is called from _bt_getbuf. That seems a bit ... suspicious?
>
>
> As I already mentioned - support of indexes for GTT is one of the most challenged things in my patch.
> I didn't find good and universal solution. So I agreed that call of btbuild from _bt_getbuf may be considered as
suspicious.
> I will be pleased if you or sombody else can propose better elternative and not only for B-Tree, but for all other
indexes.
>
> But as I already wrote above, prohibiting session to used indexes created after first access to GTT doesn't solve the
problem.
> For normal tables (and for local temp tables) indexes are initialized at the time of their creation.
> With GTT it doesn't work, because each session has its own local data of GTT.
> We should either initialize/build index on demand (when it is first accessed), either at the moment of session start
initialize indexes for all existed GTTs.
> Last options seem to be much worser from my point of view: there may me huge number of GTT and session may not need
to access GTT at all.
>>
>> ... and so on for other index types. Also, what about custom indexes
>> implemented in extensions? It seems a bit strange each of them has to
>> support this separately.
>
> I have already complained about it: my patch supports GTT for all built-in indexes, but custom indexes has to handle
it themselves.
> Looks like to provide some generic solution we need to extend index API, providing two diffrent operations: creation
and initialization.
> But extending index API is very critical change... And also it doesn't solve the problem with all existed extensions:
them in any case have
> to be rewritten to implement new API version in order to support GTT.
>>
>> IMHO if this really is the right solution, we need to make it work for
>> existing indexes without having to tweak them individually. Why don't we
>> track a flag whether an index on GTT was initialized in a given session,
>> and if it was not then call the build function before calling any other
>> function from the index AM?
>> But let's talk about other issues caused by "on demand" build. Imagine
>> you have 50 sessions, each using the same GTT with a GB of per-session
>> data. Now you create a new index on the GTT, which forces the sessions
>> to build it's "local" index. Those builds will use maintenance_work_mem
>> each, so 50 * m_w_m. I doubt that's expected/sensible.
>
> I do not see principle difference here with scenario when 50 sessions create (local) temp table,
> populate it with GB of data and create index for it.
I think the problem is that when one session completes the creation of an index on a GTT,
it triggers the other sessions to build their own local copies of the index at around the same time.
This will consume a lot of hardware resources (CPU, I/O, memory) in a short period,
and the database service may even become slow, because 50 sessions are building indexes at once.
I think this is not what we expect.

>
>>
>> So I suggest we start by just ignoring the *new* indexes, and improve
>> this in the future (by building the indexes on demand or whatever).
>
> Sorry, but still do not agree with this suggestions:
> - it doesn't simplify things
> - it makes behavior of GTT incompatible with normal tables.
> - it doesn't prevent some bad or unexpected behavior which can't be currently reproduced with normal (local) temp
tables.
From a user perspective, this proposal is reasonable.
From an implementation perspective, the same GTT index would need to maintain different states (valid or invalid) in
different sessions,
which seems difficult to do in the current framework.

So in my first version, I chose to require all index creation to be completed before the GTT is first used.
I think this will satisfy most use cases.
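
In other words, the intended workflow under my first version looks like
this (a sketch; names are only examples):

CREATE GLOBAL TEMP TABLE gtt_orders(id int, note text);
-- Allowed: no session has attached to the GTT yet.
CREATE INDEX gtt_orders_id_idx ON gtt_orders(id);
-- Only afterwards do sessions begin filling their own copies:
INSERT INTO gtt_orders VALUES (1, 'per-session row');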

>
>>
>>>>
>>>> I think someone pointed out pushing stuff directly into the cache is
>>>> rather problematic, but I don't recall the details.
>>>>
>>> I have not encountered any problems, so if you can point me on what is wrong with this approach, I will think about
alternative solution.
>>>
>>
>> I meant this comment by Robert:
>>
>> https://www.postgresql.org/message-id/CA%2BTgmoZFWaND4PpT_CJbeu6VZGZKi2rrTuSTL-Ykd97fexTN-w%40mail.gmail.com
>>
> "if any code tried to access the statistics directly from the table, rather than via the caches".
>
> Currently optimizer is accessing statistic though caches. So this approach works. If somebody will rewrite optimizer
or provide own custom optimizer in extension which access statistic directly
> then it we really be a problem. But I wonder why bypassing catalog cache may be needed.
>
> Moreover, if we implement alternative solution - for example make pg_statistic a view which combines results for
normal tables and GTT, then existed optimizer has to be rewritten
> because it can not access statistic in the way it is doing now. And there will be all problem with all existed
extensions which are accessing statistic in most natural way - through system cache.
>
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 15.01.2020 16:10, 曾文旌(义从) wrote:
>
>> I do not see principle difference here with scenario when 50 sessions create (local) temp table,
>> populate it with GB of data and create index for it.
> I think the problem is that when one session completes the creation of the index on GTT,
> it will trigger the other sessions build own local index of GTT in a centralized time.
> This will consume a lot of hardware resources (cpu io memory) in a short time,
> and even the database service becomes slow, because 50 sessions are building index.
> I think this is not what we expected.


First of all, creating an index for a GTT in one session doesn't 
immediately initiate index builds in all other sessions.
Indexes are built on demand: if a session is not using the GTT any more, 
the index will not be built there at all.
And if the GTT really is actively used by all sessions, then building 
the index and using it to construct an optimal execution plan is better
than continuing to use sequential scans that read all the GTT data from disk.

And as I already mentioned, I do not see any principal difference in 
resource consumption compared with the current usage of local temp tables.
If we have many sessions, each creating a temp table, populating it 
with data, and building an index on it, then we will
observe the same CPU utilization and memory consumption as in 
the case of using a GTT and creating an index on it.

Sorry, but I am still not convinced by your and Tomas's arguments.
Yes, building a GTT index may cause high memory consumption 
(maintenance_work_mem * n_backends).
But such consumption can also be observed without GTT, and it has to be 
taken into account when choosing a value for maintenance_work_mem.
From my point of view it is much more important to make the behavior of 
GTT as compatible with normal tables as possible.
Also, from a database administration point of view, the necessity to 
restart sessions to make them use new indexes seems very strange and 
inconvenient.
The DBA can instead address the high memory consumption by adjusting 
maintenance_work_mem, so this solution is more flexible.
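
A minimal sketch of that DBA-side adjustment (the value is illustrative):

ALTER SYSTEM SET maintenance_work_mem = '64MB';
SELECT pg_reload_conf();
-- Subsequent on-demand index builds in each backend are then capped at 64MB.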



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 14, 2020, at 9:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Tue, Jan 14, 2020 at 14:09, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Thank you for review my patch.


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp Inherit table global temp partition table
2. Support serial column in GTT
3. Provide views pg_gtt_relstats pg_gtt_stats for GTT’s statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. Alter GTT or rename GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. i think so "ON COMMIT PRESERVE ROWS" should be default mode (like local temp tables).
makes sense, I will fix it.


I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute has 3.2MB
and 64 tps, 6446 transaction

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So performance is absolutely different as we expected.

From my perspective, this functionality is great.
Yes, frequent ddl causes catalog bloat, GTT avoids this problem.


Todo:

pg_table_size function doesn't work
Do you mean that function pg_table_size() need get the storage space used by the one GTT in the entire db(include all session) .

It's question how much GTT tables should be similar to classic tables. But the reporting in psql should to work \dt+, \l+, \di+

I have fixed this problem.

Please let me know where I need to improve.

Thanks


Wenjing








Regards

Pavel


Wenjing





On Jan 6, 2020, at 4:06 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Attachment

Re: [Proposal] Global temporary tables

From
Erik Rijkers
Date:
On 2020-01-19 18:04, 曾文旌(义从) wrote:
>> On Jan 14, 2020, at 9:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> On Tue, Jan 14, 2020 at 14:09, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com 
>> <mailto:wenjing.zwj@alibaba-inc.com>> wrote:

>> [global_temporary_table_v4-pg13.patch ]

Hi,

This patch doesn't quite apply for me:

patching file src/backend/access/common/reloptions.c
patching file src/backend/access/gist/gistutil.c
patching file src/backend/access/hash/hash.c
Hunk #1 succeeded at 149 (offset 3 lines).
patching file src/backend/access/heap/heapam_handler.c
patching file src/backend/access/heap/vacuumlazy.c
patching file src/backend/access/nbtree/nbtpage.c
patching file src/backend/access/table/tableam.c
patching file src/backend/access/transam/xlog.c
patching file src/backend/catalog/Makefile
Hunk #1 FAILED at 44.
1 out of 1 hunk FAILED -- saving rejects to file 
src/backend/catalog/Makefile.rej
[...]
    (The rest applies without errors)

src/backend/catalog/Makefile.rej contains:

------------------------
--- src/backend/catalog/Makefile
+++ src/backend/catalog/Makefile
@@ -44,6 +44,8 @@ OBJS = \
      storage.o \
      toasting.o

+OBJS += storage_gtt.o
+
  BKIFILES = postgres.bki postgres.description postgres.shdescription

  include $(top_srcdir)/src/backend/common.mk
------------------------

Can you have a look?


thanks,

Erik Rijkers









Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Jan 20, 2020, at 1:32 AM, Erik Rijkers <er@xs4all.nl> wrote:
>
> On 2020-01-19 18:04, 曾文旌(义从) wrote:
>>> On Jan 14, 2020, at 9:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> On Tue, Jan 14, 2020 at 14:09, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com <mailto:wenjing.zwj@alibaba-inc.com>>
wrote:
>
>>> [global_temporary_table_v4-pg13.patch ]
>
> Hi,
>
> This patch doesn't quiet apply for me:
>
> patching file src/backend/access/common/reloptions.c
> patching file src/backend/access/gist/gistutil.c
> patching file src/backend/access/hash/hash.c
> Hunk #1 succeeded at 149 (offset 3 lines).
> patching file src/backend/access/heap/heapam_handler.c
> patching file src/backend/access/heap/vacuumlazy.c
> patching file src/backend/access/nbtree/nbtpage.c
> patching file src/backend/access/table/tableam.c
> patching file src/backend/access/transam/xlog.c
> patching file src/backend/catalog/Makefile
> Hunk #1 FAILED at 44.
> 1 out of 1 hunk FAILED -- saving rejects to file src/backend/catalog/Makefile.rej
> [...]
>   (The rest applies without errors)
>
> src/backend/catalog/Makefile.rej contains:
>
> ------------------------
> --- src/backend/catalog/Makefile
> +++ src/backend/catalog/Makefile
> @@ -44,6 +44,8 @@ OBJS = \
>     storage.o \
>     toasting.o
>
> +OBJS += storage_gtt.o
> +
> BKIFILES = postgres.bki postgres.description postgres.shdescription
>
> include $(top_srcdir)/src/backend/common.mk
> ------------------------
>
> Can you have a look?
I updated the code and remade the patch.
Please give me feedback if you have any more questions.




> 
> 
> thanks,
> 
> Erik Rijkers
> 
> 
> 
> 
> 


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

I have some free time this evening, so I will check this patch.

I have one question:

+	/* global temp table get relstats from localhash */
+	if (RELATION_IS_GLOBAL_TEMP(rel))
+	{
+		get_gtt_relstats(RelationGetRelid(rel),
+						 &relpages, &reltuples, &relallvisible,
+						 NULL, NULL);
+	}
+	else
+	{
+		/* coerce values in pg_class to more desirable types */
+		relpages = (BlockNumber) rel->rd_rel->relpages;
+		reltuples = (double) rel->rd_rel->reltuples;
+		relallvisible = (BlockNumber) rel->rd_rel->relallvisible;
+	}

Isn't it possible to fill the rd_rel structure too, so this branching can be reduced?

Regards

Pavel

On Mon, Jan 20, 2020 at 17:27, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 20, 2020, at 1:32 AM, Erik Rijkers <er@xs4all.nl> wrote:
>
> On 2020-01-19 18:04, 曾文旌(义从) wrote:
>>> On Jan 14, 2020, at 9:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> On Tue, Jan 14, 2020 at 14:09, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com <mailto:wenjing.zwj@alibaba-inc.com>> wrote:
>
>>> [global_temporary_table_v4-pg13.patch ]
>
> Hi,
>
> This patch doesn't quiet apply for me:
>
> patching file src/backend/access/common/reloptions.c
> patching file src/backend/access/gist/gistutil.c
> patching file src/backend/access/hash/hash.c
> Hunk #1 succeeded at 149 (offset 3 lines).
> patching file src/backend/access/heap/heapam_handler.c
> patching file src/backend/access/heap/vacuumlazy.c
> patching file src/backend/access/nbtree/nbtpage.c
> patching file src/backend/access/table/tableam.c
> patching file src/backend/access/transam/xlog.c
> patching file src/backend/catalog/Makefile
> Hunk #1 FAILED at 44.
> 1 out of 1 hunk FAILED -- saving rejects to file src/backend/catalog/Makefile.rej
> [...]
>   (The rest applies without errors)
>
> src/backend/catalog/Makefile.rej contains:
>
> ------------------------
> --- src/backend/catalog/Makefile
> +++ src/backend/catalog/Makefile
> @@ -44,6 +44,8 @@ OBJS = \
>       storage.o \
>       toasting.o
>
> +OBJS += storage_gtt.o
> +
> BKIFILES = postgres.bki postgres.description postgres.shdescription
>
> include $(top_srcdir)/src/backend/common.mk
> ------------------------
>
> Can you have a look?
I updated the code and remade the patch.
Please give me feedback if you have any more questions.




>
>
> thanks,
>
> Erik Rijkers
>
>
>
>
>

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp Inherit table global temp partition table
2. Support serial column in GTT
3. Provide views pg_gtt_relstats pg_gtt_stats for GTT’s statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. Alter GTT or rename GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. i think so "ON COMMIT PRESERVE ROWS" should be default mode (like local temp tables).

ON COMMIT PRESERVE ROWS is the default mode now.


Wenjing




I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute has 3.2MB
and 64 tps, 6446 transaction

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So performance is absolutely different as we expected.

From my perspective, this functionality is great.

Todo:

pg_table_size function doesn't work

Regards

Pavel


Wenjing





On Jan 6, 2020, at 4:06 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Tue, Jan 21, 2020 at 9:46, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp Inherit table global temp partition table
2. Support serial column in GTT
3. Provide views pg_gtt_relstats pg_gtt_stats for GTT’s statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. Alter GTT or rename GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. i think so "ON COMMIT PRESERVE ROWS" should be default mode (like local temp tables).

ON COMMIT PRESERVE ROWS is default mode now.

Thank you

* I tried to create a global temp table with an index. When I tried to drop this table (while it was in use by a second session), I got an error message:

postgres=# drop table foo;
ERROR:  can not drop index when other backend attached this global temp table

It is expected, but it is not very user friendly. It would be better to check whether you can drop the table, then lock it, and then drop all objects.

* Tab completion would be nice for CREATE GLOBAL TEMP TABLE.

\dt+ and \di+ don't work correctly, or maybe I don't understand the implementation.

I see the same size in all sessions. Do global temp tables share the same files?

Regards

Pavel





Wenjing




I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute has 3.2MB
and 64 tps, 6446 transaction

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So performance is absolutely different as we expected.

From my perspective, this functionality is great.

Todo:

pg_table_size function doesn't work

Regards

Pavel


Wenjing





On Jan 6, 2020, at 4:06 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 21, 2020, at 1:43 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I have a free time this evening, so I will check this patch

I have a one question

+	/* global temp table get relstats from localhash */
+	if (RELATION_IS_GLOBAL_TEMP(rel))
+	{
+		get_gtt_relstats(RelationGetRelid(rel),
+						 &relpages, &reltuples, &relallvisible,
+						 NULL, NULL);
+	}
+	else
+	{
+		/* coerce values in pg_class to more desirable types */
+		relpages = (BlockNumber) rel->rd_rel->relpages;
+		reltuples = (double) rel->rd_rel->reltuples;
+		relallvisible = (BlockNumber) rel->rd_rel->relallvisible;
+	}

Isn't it possible to fill the rd_rel structure too, so this branching can be reduced?
I'll make some improvements to optimize this part of the code.


Regards

Pavel

On Mon, Jan 20, 2020 at 17:27, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 20, 2020, at 1:32 AM, Erik Rijkers <er@xs4all.nl> wrote:
>
> On 2020-01-19 18:04, 曾文旌(义从) wrote:
>>> On Jan 14, 2020, at 9:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> On Tue, Jan 14, 2020 at 14:09, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com <mailto:wenjing.zwj@alibaba-inc.com>> wrote:
>
>>> [global_temporary_table_v4-pg13.patch ]
>
> Hi,
>
> This patch doesn't quiet apply for me:
>
> patching file src/backend/access/common/reloptions.c
> patching file src/backend/access/gist/gistutil.c
> patching file src/backend/access/hash/hash.c
> Hunk #1 succeeded at 149 (offset 3 lines).
> patching file src/backend/access/heap/heapam_handler.c
> patching file src/backend/access/heap/vacuumlazy.c
> patching file src/backend/access/nbtree/nbtpage.c
> patching file src/backend/access/table/tableam.c
> patching file src/backend/access/transam/xlog.c
> patching file src/backend/catalog/Makefile
> Hunk #1 FAILED at 44.
> 1 out of 1 hunk FAILED -- saving rejects to file src/backend/catalog/Makefile.rej
> [...]
>   (The rest applies without errors)
>
> src/backend/catalog/Makefile.rej contains:
>
> ------------------------
> --- src/backend/catalog/Makefile
> +++ src/backend/catalog/Makefile
> @@ -44,6 +44,8 @@ OBJS = \
>       storage.o \
>       toasting.o
>
> +OBJS += storage_gtt.o
> +
> BKIFILES = postgres.bki postgres.description postgres.shdescription
>
> include $(top_srcdir)/src/backend/common.mk
> ------------------------
>
> Can you have a look?
I updated the code and remade the patch.
Please give me feedback if you have any more questions.




>
>
> thanks,
>
> Erik Rijkers
>
>
>
>
>


Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 22, 2020, at 2:51 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Tue, Jan 21, 2020 at 9:46, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support global temp Inherit table global temp partition table
2. Support serial column in GTT
3. Provide views pg_gtt_relstats pg_gtt_stats for GTT’s statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. Alter GTT or rename GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. i think so "ON COMMIT PRESERVE ROWS" should be default mode (like local temp tables).

ON COMMIT PRESERVE ROWS is default mode now.

Thank you

* I tried to create global temp table with index. When I tried to drop this table (and this table was used by second instance), then I got error message

postgres=# drop table foo;
ERROR:  can not drop index when other backend attached this global temp table

It is expected, but it is not too much user friendly. Is better to check if you can drop table, then lock it, and then drop all objects.
I don't understand what needs to be improved. Could you describe it in detail?


* tab complete can be nice for CREATE GLOBAL TEMP table
Yes, I will improve it.

\dt+ \di+ doesn't work correctly, or maybe I don't understand to the implementation.


postgres=# create table t(a int primary key);
CREATE TABLE
postgres=# create global temp table gt(a int primary key);
CREATE TABLE
postgres=# insert into t values(generate_series(1,10000));
INSERT 0 10000
postgres=# insert into gt values(generate_series(1,10000));
INSERT 0 10000

postgres=# \dt+
                            List of relations
 Schema | Name | Type  |    Owner    | Persistence |  Size  | Description 
--------+------+-------+-------------+-------------+--------+-------------
 public | gt   | table | wenjing.zwj | session     | 384 kB | 
 public | t    | table | wenjing.zwj | permanent   | 384 kB | 
(2 rows)

postgres=# \di+
                                  List of relations
 Schema |  Name   | Type  |    Owner    | Table | Persistence |  Size  | Description 
--------+---------+-------+-------------+-------+-------------+--------+-------------
 public | gt_pkey | index | wenjing.zwj | gt    | session     | 240 kB | 
 public | t_pkey  | index | wenjing.zwj | t     | permanent   | 240 kB | 
(2 rows)


I see same size in all sessions. Global temp tables shares same files?
No, they use their own files.
But \dt+ and \di+ count the total file size across all sessions for each GTT.



Wenjing


Regards

Pavel





Wenjing




I tested some simple scripts

test01.sql

CREATE TEMP TABLE foo(a int, b int);
INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DROP TABLE foo; -- simulate disconnect


after 100 sec, the table pg_attribute has 3.2MB
and 64 tps, 6446 transaction

test02.sql

INSERT INTO foo SELECT random()*100, random()*1000 FROM generate_series(1,1000);
ANALYZE foo;
SELECT sum(a), sum(b) FROM foo;
DELETE FROM foo; -- simulate disconnect


after 100 sec, 1688 tps, 168830 transactions

So performance is absolutely different as we expected.

From my perspective, this functionality is great.

Todo:

pg_table_size function doesn't work

Regards

Pavel


Wenjing





On Jan 6, 2020, at 4:06 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I think we need to do something with having two patches aiming to add
global temporary tables:

[1] https://commitfest.postgresql.org/26/2349/

[2] https://commitfest.postgresql.org/26/2233/

As a reviewer I have no idea which of the threads to look at - certainly
not without reading both threads, which I doubt anyone will really do.
The reviews and discussions are somewhat intermixed between those two
threads, which makes it even more confusing.

I think we should agree on a minimal patch combining the necessary/good
bits from the various patches, and terminate one of the threads (i.e.
mark it as rejected or RWF). And we need to do that now, otherwise
there's about 0% chance of getting this into v13.

In general, I agree with the sentiment Robert expressed in [1] - the
patch needs to be as small as possible, not adding "nice to have"
features (like support for parallel queries - I very much doubt just
using shared instead of local buffers is enough to make it work.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Wed, Jan 22, 2020 at 7:16, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 22, 2020, at 2:51 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Tue, Jan 21, 2020 at 9:46, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 12, 2020, at 4:27 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Sat, Jan 11, 2020 at 15:00, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi all

This is the latest patch

The updates are as follows:
1. Support inheritance and partitioning for global temp tables
2. Support serial columns in GTT
3. Provide views pg_gtt_relstats and pg_gtt_stats for GTT statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT (a usage sketch of these views and this function follows below)
6. ALTER or RENAME on a GTT is allowed under some conditions
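A hypothetical usage sketch of those management objects - the names are taken from the list above, but the exact output columns and the function's signature are assumptions, not verified against the patch:

-- session-local statistics kept for GTTs:
SELECT * FROM pg_gtt_relstats;
SELECT * FROM pg_gtt_stats;
-- backends currently attached to each GTT:
SELECT * FROM pg_gtt_attached_pids;
-- per-session relfrozenxids tracked for GTTs:
SELECT * FROM pg_list_gtt_relfrozenxids();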


Please give me feedback.

I tested the functionality

1. I think "ON COMMIT PRESERVE ROWS" should be the default mode (like local temp tables).

ON COMMIT PRESERVE ROWS is the default mode now.

Thank you

* I tried to create a global temp table with an index. When I tried to drop this table (while it was used by a second session), I got an error message

postgres=# drop table foo;
ERROR:  can not drop index when other backend attached this global temp table

It is expected, but it is not very user friendly. It would be better to check whether you can drop the table, then lock it, and then drop all its objects.
I don't understand what needs to be improved. Could you describe it in detail?

the error message should be something like

can not drop table when other backend attached this global temp table.

It is a little bit messy when you try to drop a table and get a message about an index



* tab completion would be nice for CREATE GLOBAL TEMP TABLE
Yes, I will improve it.

\dt+ and \di+ don't work correctly, or maybe I don't understand the implementation.


postgres=# create table t(a int primary key);
CREATE TABLE
postgres=# create global temp table gt(a int primary key);
CREATE TABLE
postgres=# insert into t values(generate_series(1,10000));
INSERT 0 10000
postgres=# insert into gt values(generate_series(1,10000));
INSERT 0 10000

postgres=# \dt+
                            List of relations
 Schema | Name | Type  |    Owner    | Persistence |  Size  | Description 
--------+------+-------+-------------+-------------+--------+-------------
 public | gt   | table | wenjing.zwj | session     | 384 kB | 
 public | t    | table | wenjing.zwj | permanent   | 384 kB | 
(2 rows)

postgres=# \di+
                                  List of relations
 Schema |  Name   | Type  |    Owner    | Table | Persistence |  Size  | Description 
--------+---------+-------+-------------+-------+-------------+--------+-------------
 public | gt_pkey | index | wenjing.zwj | gt    | session     | 240 kB | 
 public | t_pkey  | index | wenjing.zwj | t     | permanent   | 240 kB | 
(2 rows)


I see the same size in all sessions. Do global temp tables share the same files?
No, they use their own files.
But \dt+ and \di+ count the total file size across all sessions for each GTT.

I think it is wrong. The data are independent, so the sizes should be independent too







Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年1月22日 下午2:31,Pavel Stehule <pavel.stehule@gmail.com> 写道:



st 22. 1. 2020 v 7:16 odesílatel 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> napsal:


2020年1月22日 上午2:51,Pavel Stehule <pavel.stehule@gmail.com> 写道:



út 21. 1. 2020 v 9:46 odesílatel 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> napsal:


2020年1月12日 上午4:27,Pavel Stehule <pavel.stehule@gmail.com> 写道:

Hi

so 11. 1. 2020 v 15:00 odesílatel 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> napsal:
Hi all

This is the latest patch

The updates are as follows:
1. Support inheritance and partitioning for global temp tables
2. Support serial columns in GTT
3. Provide views pg_gtt_relstats and pg_gtt_stats for GTT statistics
4. Provide view pg_gtt_attached_pids to manage GTT
5. Provide function pg_list_gtt_relfrozenxids() to manage GTT
6. ALTER or RENAME on a GTT is allowed under some conditions


Please give me feedback.

I tested the functionality

1. I think "ON COMMIT PRESERVE ROWS" should be the default mode (like local temp tables).

ON COMMIT PRESERVE ROWS is the default mode now.

Thank you

* I tried to create a global temp table with an index. When I tried to drop this table (while it was used by a second session), I got an error message

postgres=# drop table foo;
ERROR:  can not drop index when other backend attached this global temp table

It is expected, but it is not very user friendly. It would be better to check whether you can drop the table, then lock it, and then drop all its objects.
I don't understand what needs to be improved. Could you describe it in detail?

the error message should be something like

can not drop table when other backend attached this global temp table.

It is a little bit messy when you try to drop a table and get a message about an index
It has been repaired in global_temporary_table_v7-pg13.patch




* tab completion would be nice for CREATE GLOBAL TEMP TABLE
Yes, I will improve it.
It has been repaired in global_temporary_table_v7-pg13.patch


\dt+ and \di+ don't work correctly, or maybe I don't understand the implementation.


postgres=# create table t(a int primary key);
CREATE TABLE
postgres=# create global temp table gt(a int primary key);
CREATE TABLE
postgres=# insert into t values(generate_series(1,10000));
INSERT 0 10000
postgres=# insert into gt values(generate_series(1,10000));
INSERT 0 10000

postgres=# \dt+
                            List of relations
 Schema | Name | Type  |    Owner    | Persistence |  Size  | Description 
--------+------+-------+-------------+-------------+--------+-------------
 public | gt   | table | wenjing.zwj | session     | 384 kB | 
 public | t    | table | wenjing.zwj | permanent   | 384 kB | 
(2 rows)

postgres=# \di+
                                  List of relations
 Schema |  Name   | Type  |    Owner    | Table | Persistence |  Size  | Description 
--------+---------+-------+-------------+-------+-------------+--------+-------------
 public | gt_pkey | index | wenjing.zwj | gt    | session     | 240 kB | 
 public | t_pkey  | index | wenjing.zwj | t     | permanent   | 240 kB | 
(2 rows)


I see the same size in all sessions. Do global temp tables share the same files?
No, they use their own files.
But \dt+ and \di+ count the total file size across all sessions for each GTT.

I think it is wrong. The data are independent, so the sizes should be independent too
It has been repaired in global_temporary_table_v7-pg13.patch.


Wenjing











Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年1月22日 下午1:29,曾文旌(义从) <wenjing.zwj@alibaba-inc.com> 写道:



2020年1月21日 下午1:43,Pavel Stehule <pavel.stehule@gmail.com> 写道:

Hi

I have some free time this evening, so I will check this patch.

I have one question

+ /* global temp table get relstats from localhash */
+ if (RELATION_IS_GLOBAL_TEMP(rel))
+ {
+ get_gtt_relstats(RelationGetRelid(rel),
+ &relpages, &reltuples, &relallvisible,
+ NULL, NULL);
+ }
+ else
+ {
+ /* coerce values in pg_class to more desirable types */
+ relpages = (BlockNumber) rel->rd_rel->relpages;
+ reltuples = (double) rel->rd_rel->reltuples;
+ relallvisible = (BlockNumber) rel->rd_rel->relallvisible;
+ }

Isn't it possible to fill the rd_rel structure too, so this branching can be reduced?
I'll make some improvements to optimize this part of the code.
I'm trying to improve this part of the implementation in global_temporary_table_v7-pg13.patch
Please check my patch and give me feedback.


Thanks

Wenjing





Regards

Pavel

po 20. 1. 2020 v 17:27 odesílatel 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> napsal:


> 2020年1月20日 上午1:32,Erik Rijkers <er@xs4all.nl> 写道:
>
> On 2020-01-19 18:04, 曾文旌(义从) wrote:
>>> 2020年1月14日 下午9:20,Pavel Stehule <pavel.stehule@gmail.com> 写道:
>>> út 14. 1. 2020 v 14:09 odesílatel 曾文旌(义从) <wenjing.zwj@alibaba-inc.com <mailto:wenjing.zwj@alibaba-inc.com>> napsal:
>
>>> [global_temporary_table_v4-pg13.patch ]
>
> Hi,
>
This patch doesn't quite apply for me:
>
> patching file src/backend/access/common/reloptions.c
> patching file src/backend/access/gist/gistutil.c
> patching file src/backend/access/hash/hash.c
> Hunk #1 succeeded at 149 (offset 3 lines).
> patching file src/backend/access/heap/heapam_handler.c
> patching file src/backend/access/heap/vacuumlazy.c
> patching file src/backend/access/nbtree/nbtpage.c
> patching file src/backend/access/table/tableam.c
> patching file src/backend/access/transam/xlog.c
> patching file src/backend/catalog/Makefile
> Hunk #1 FAILED at 44.
> 1 out of 1 hunk FAILED -- saving rejects to file src/backend/catalog/Makefile.rej
> [...]
>   (The rest applies without errors)
>
> src/backend/catalog/Makefile.rej contains:
>
> ------------------------
> --- src/backend/catalog/Makefile
> +++ src/backend/catalog/Makefile
> @@ -44,6 +44,8 @@ OBJS = \
>       storage.o \
>       toasting.o
>
> +OBJS += storage_gtt.o
> +
> BKIFILES = postgres.bki postgres.description postgres.shdescription
>
> include $(top_srcdir)/src/backend/common.mk
> ------------------------
>
> Can you have a look?
I updated the code and remade the patch.
Please give me feedback if you have any more questions.




>
>
> thanks,
>
> Erik Rijkers
>
>
>
>
>



Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


čt 23. 1. 2020 v 17:28 odesílatel 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> napsal:


2020年1月22日 下午1:29,曾文旌(义从) <wenjing.zwj@alibaba-inc.com> 写道:



2020年1月21日 下午1:43,Pavel Stehule <pavel.stehule@gmail.com> 写道:

Hi

I have some free time this evening, so I will check this patch.

I have one question

+ /* global temp table get relstats from localhash */
+ if (RELATION_IS_GLOBAL_TEMP(rel))
+ {
+ get_gtt_relstats(RelationGetRelid(rel),
+ &relpages, &reltuples, &relallvisible,
+ NULL, NULL);
+ }
+ else
+ {
+ /* coerce values in pg_class to more desirable types */
+ relpages = (BlockNumber) rel->rd_rel->relpages;
+ reltuples = (double) rel->rd_rel->reltuples;
+ relallvisible = (BlockNumber) rel->rd_rel->relallvisible;
+ }

Isn't it possible to fill the rd_rel structure too, so this branching can be reduced?
I'll make some improvements to optimize this part of the code.
I'm trying to improve this part of the implementation in global_temporary_table_v7-pg13.patch
Please check my patch and give me feedback.


It is looking better; still, there are some strange things (I didn't test the functionality yet)

  elog(ERROR, "invalid relpersistence: %c",
  relation->rd_rel->relpersistence);
@@ -3313,6 +3336,10 @@ RelationBuildLocalRelation(const char *relname,
  rel->rd_backend = BackendIdForTempRelations();
  rel->rd_islocaltemp = true;
  break;
+ case RELPERSISTENCE_GLOBAL_TEMP:
+ rel->rd_backend = BackendIdForTempRelations();
+ rel->rd_islocaltemp = true;
+ break;
  default:

+ rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best



regards

Pavel


 




Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Sat, Jan 11, 2020 at 8:51 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> I proposed just ignoring those new indexes because it seems much simpler
> than alternative solutions that I can think of, and it's not like those
> other solutions don't have other issues.

+1.

> For example, I've looked at the "on demand" building as implemented in
> global_private_temp-8.patch, I kinda doubt adding a bunch of index build
> calls into various places in index code seems somewhat suspicious.

+1. I can't imagine that's a safe or sane thing to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 23.01.2020 19:28, 曾文旌(义从) wrote:

I'm trying to improve this part of the implementation in global_temporary_table_v7-pg13.patch
Please check my patch and give me feedback.


Thanks

Wenjing



Below is my short review of the patch:

+    /*
+     * For global temp table only
+     * use AccessExclusiveLock for ensure safety
+     */
+    {
+        {
+            "on_commit_delete_rows",
+            "global temp table on commit options",
+            RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+            ShareUpdateExclusiveLock
+        },
+        true
+    },   


The comment seems confusing: it mentions AccessExclusiveLock, but the code actually uses ShareUpdateExclusiveLock.

-    Assert(TransactionIdIsNormal(onerel->rd_rel->relfrozenxid));
-    Assert(MultiXactIdIsValid(onerel->rd_rel->relminmxid));
+    Assert((RELATION_IS_GLOBAL_TEMP(onerel) && onerel->rd_rel->relfrozenxid == InvalidTransactionId) ||
+        (!RELATION_IS_GLOBAL_TEMP(onerel) && TransactionIdIsNormal(onerel->rd_rel->relfrozenxid)));
+    Assert((RELATION_IS_GLOBAL_TEMP(onerel) && onerel->rd_rel->relminmxid == InvalidMultiXactId) ||
+        (!RELATION_IS_GLOBAL_TEMP(onerel) && MultiXactIdIsValid(onerel->rd_rel->relminmxid)));
 
It is actually equivalent to:

Assert(RELATION_IS_GLOBAL_TEMP(onerel) ^ TransactionIdIsNormal(onerel->rd_rel->relfrozenxid));
Assert(RELATION_IS_GLOBAL_TEMP(onerel) ^ MultiXactIdIsValid(onerel->rd_rel->relminmxid));

+    /* clean temp relation files */
+    if (max_active_gtt > 0)
+        RemovePgTempFiles();
+
     /*
 
I wonder why we need a special check for GTT here.
From my point of view, cleanup of local temp table storage at startup should be performed in the same way for local and global temp tables.


-    new_rel_reltup->relfrozenxid = relfrozenxid;
-    new_rel_reltup->relminmxid = relminmxid;
+    /* global temp table not remember transaction info in catalog */
+    if (relpersistence == RELPERSISTENCE_GLOBAL_TEMP)
+    {
+        new_rel_reltup->relfrozenxid = InvalidTransactionId;
+        new_rel_reltup->relminmxid = InvalidMultiXactId;
+    }
+    else
+    {
+        new_rel_reltup->relfrozenxid = relfrozenxid;
+        new_rel_reltup->relminmxid = relminmxid;
+    }
+


Why do we need to do this for GTT?
Did you check that there will be no problems with GTT in case of XID wraparound?
Right now, if you create a temp table and keep the session open, it will block XID wraparound.

+    /* We allow to drop global temp table only this session use it */
+    if (RELATION_IS_GLOBAL_TEMP(rel))
+    {
+        if (is_other_backend_use_gtt(rel->rd_node))
+            elog(ERROR, "can not drop relation when other backend attached this global temp table");
+    }
+

Here we once again introduce an incompatibility with normal (permanent) tables.
Assume that a DBA or programmer needs to change the format of a GTT, but there are some active sessions which have used this GTT sometime in the past.
We will not be able to drop this GTT until all those sessions are terminated.
I do not think that is acceptable behaviour.

+        LOCKMODE    lockmode = AccessExclusiveLock;
+
+        /* truncate global temp table only need RowExclusiveLock */
+        if (get_rel_persistence(rid) == RELPERSISTENCE_GLOBAL_TEMP)
+            lockmode = RowExclusiveLock;


What are the reasons for using RowExclusiveLock for GTT instead of AccessExclusiveLock?
Yes, GTT data is accessed by only one backend, so no locking seems to be needed here at all.
But I wonder what the motivation/benefit of using a weaker lock level here is?
There should be no conflicts in any case...

+        /* We allow to create index on global temp table only this session use it */
+        if (is_other_backend_use_gtt(heapRelation->rd_node))
+            elog(ERROR, "can not create index when have other backend attached this global temp table");
+

The same argument as in the case of dropping a GTT: I do not think that prohibiting DDL operations on a GTT used by more than one backend is a good idea.

+    /* global temp table not support foreign key constraint yet */
+    if (RELATION_IS_GLOBAL_TEMP(pkrel))
+        ereport(ERROR,
+                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                 errmsg("referenced relation \"%s\" is not a global temp table",
+                        RelationGetRelationName(pkrel))));
+

Why do we need to prohibit foreign key constraints on GTT?

+    /*
+     * Global temp table get frozenxid from MyProc
+     * to avoid the vacuum truncate clog that gtt need.
+     */
+    if (max_active_gtt > 0)
+    {
+        TransactionId oldest_gtt_frozenxid =
+            list_all_session_gtt_frozenxids(0, NULL, NULL, NULL);
+
+        if (TransactionIdIsNormal(oldest_gtt_frozenxid) &&
+            TransactionIdPrecedes(oldest_gtt_frozenxid, newFrozenXid))
+        {
+            ereport(WARNING,
+                (errmsg("global temp table oldest FrozenXid is far in the past"),
+                 errhint("please truncate them or kill those sessions that use them.")));
+            newFrozenXid = oldest_gtt_frozenxid;
+        }
+    }
+

As far as I understand, the content of a GTT will never be processed by autovacuum.
So who will update the frozenxid of a GTT?
I see that up_gtt_relstats is invoked when:
- an index is created on the GTT
- the GTT is truncated
- the GTT is vacuumed
So unless a GTT is explicitly vacuumed by the user, its relfrozenxid will not be taken into account
when computing the new frozen xid value. Autovacuum will produce these warnings (which will not be visible to the end user and are only appended to the log).
And at some moment wraparound will happen, and if there are still some old active GTTs, we will get incorrect results.
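To make that concrete, a sketch (table name illustrative) of the only operations that, per the description above, refresh a GTT's session-local stats and relfrozenxid:

-- each of these invokes up_gtt_relstats for the current session:
CREATE INDEX my_gtt_a_idx ON my_gtt (a);  -- index creation on the GTT
TRUNCATE my_gtt;                          -- truncation
VACUUM my_gtt;                            -- explicit vacuum; autovacuum never processes the GTT's contents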



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 23.01.2020 23:47, Robert Haas wrote:
> On Sat, Jan 11, 2020 at 8:51 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> I proposed just ignoring those new indexes because it seems much simpler
>> than alternative solutions that I can think of, and it's not like those
>> other solutions don't have other issues.
> +1.
>
>> For example, I've looked at the "on demand" building as implemented in
>> global_private_temp-8.patch, I kinda doubt adding a bunch of index build
>> calls into various places in index code seems somewhat suspicious.
> +1. I can't imagine that's a safe or sane thing to do.
>

As you know, there are two versions of GTT implementations now,
and we are going to merge them into a single patch.
But there is a principal question concerning the provided functionality
which has to be discussed:
should we prohibit DDL on a GTT if more than one session is using
it? This includes creating/dropping indexes, dropping the table, altering the table...

If the answer is "yes", then the question whether to populate new
indexes with data is not relevant at all, because such a situation will not
be possible.
But in this case we will get behavior incompatible with normal
(permanent) tables, and it seems very inconvenient from the DBA's point
of view:
it will be necessary to force all clients to close their sessions to
perform some DDL manipulations on a GTT.
Some DDL will be very difficult to implement if the GTT is used by more
than one backend, for example altering the table schema.

My current solution is to allow creating/dropping an index on a GTT and
dropping the table itself, while prohibiting schema alteration entirely for GTT.
Wenjing's approach is to prohibit any DDL if the GTT is used by more than
one backend.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


pá 24. 1. 2020 v 9:39 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:



My current solution is to allow creating/dropping an index on a GTT and
dropping the table itself, while prohibiting schema alteration entirely for GTT.
Wenjing's approach is to prohibit any DDL if the GTT is used by more than
one backend.

When I create an index on a GTT in one session, I don't expect the same index to be created in all other sessions that use the same GTT.

But I can imagine that creating an index on a GTT enforces the index in the current session, while for other sessions the index will be invalid until the end of the session.
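A sketch of that suggestion as a two-session timeline (hypothetical behavior - this is the suggestion, not what any posted patch implements; names are illustrative):

-- session 1:
CREATE GLOBAL TEMP TABLE gtt_demo (a int);       -- syntax from the patch
-- session 2 attaches to the GTT:
INSERT INTO gtt_demo SELECT generate_series(1, 1000);
-- session 1 adds an index:
CREATE INDEX gtt_demo_a_idx ON gtt_demo (a);
-- session 1: the index is built and usable immediately
-- session 2: the index stays invalid until that session ends
-- sessions started afterwards build and use it normally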

Regards

Pavel

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 24.01.2020 12:09, Pavel Stehule wrote:



When I create an index on a GTT in one session, I don't expect the same index to be created in all other sessions that use the same GTT.

But I can imagine that creating an index on a GTT enforces the index in the current session, while for other sessions the index will be invalid until the end of the session.

So there are three possible alternatives:

1. Prohibit index creation on a GTT when it is used by more than one session.
2. Create the index and populate it with data in all sessions using this GTT.
3. Create the index only in the current session and do not allow other sessions already using this GTT to use it (but definitely allow new sessions to use it).

1 is Wenjing's approach, 2 is my approach, 3 is your suggestion :)

I can construct the following table with pro/cons of each approach:

Approach 1:
  Compatibility with normal table: -
  User (DBA) friendly: 1: requires restart of all sessions to perform operation
  Complexity of implementation: 2: requires global cache of GTT
  Consistency: 3: no man, no problem

Approach 2:
  Compatibility with normal table: +
  User (DBA) friendly: 3: if index is created then it is actually needed, isn't it?
  Complexity of implementation: 1: use existing functionality to create index
  Consistency: 2: if alter schema is prohibited

Approach 3:
  Compatibility with normal table: -
  User (DBA) friendly: 2: requires restart of all sessions to use created index
  Complexity of implementation: 3: requires some mechanism for prohibiting index creation after the first session access to the GTT
  Consistency: 1: can perform DDL but do not see its effect

(The digits 1-3 rank the three approaches against each criterion.)



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


pá 24. 1. 2020 v 10:43 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:



So there are three possible alternatives:

1. Prohibit index creation on a GTT when it is used by more than one session.
2. Create the index and populate it with data in all sessions using this GTT.
3. Create the index only in the current session and do not allow other sessions already using this GTT to use it (but definitely allow new sessions to use it).

1 is Wenjing's approach, 2 is my approach, 3 is your suggestion :)



You will see the effect of the DDL in the current session (where you made the change); all other sessions should live without any change until they reconnect or reset the connection.

I don't like 2 - when I create an index on a global temp table, I don't want to wait for indexing in all other sessions. These operations should be maximally independent.

Regards

Pavel
 


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 24.01.2020 15:15, Pavel Stehule wrote:
You will see the effect of the DDL in the current session (where you made the change); all other sessions should live without any change until they reconnect or reset the connection.

Why? I find this requirement quite unnatural and contradictory to the behavior of normal tables.
Actually, one of the motivations for adding global temp tables to Postgres is to provide compatibility with Oracle.
Although I know that Oracle design decisions were never considered axioms by the Postgres community,
in the case of the GTT design I think we should take the Oracle approach into account.
And GTT in Oracle behaves exactly as in my implementation:

https://www.oracletutorial.com/oracle-basics/oracle-global-temporary-table/

It is not clear from this documentation whether an index created for a GTT in one session can be used in another session which already has some data in this GTT.
But I did experiment with an installed Oracle server and can confirm that it actually works this way.

So I do not understand why we need to complicate our GTT implementation in order to prohibit useful functionality and introduce an inconsistency between the behavior of normal and global temp tables.



I don't like 2 - when I create an index on a global temp table, I don't want to wait for indexing in all other sessions. These operations should be maximally independent.


Nobody suggests waiting for the index to be built in all sessions.
Indexes will be constructed on demand when a session accesses the table.
If a session never accesses this table, the index will never be constructed.

Once again: the logic of dealing with indexes in GTT is very simple.
For normal tables, indexes are initialized at the time they are created.
For GTT this is not true. We have to initialize the index on demand, when it is first accessed in a session.

So it has to be handled in some way in any case.
The question is only whether we should allow creation of an index on a table already populated with some data.
This actually doesn't require any additional effort: we can use the existing index build function, which initializes the index and populates it with data.
So the solution I propose is the most natural, convenient, and simplest solution at the same time. And it is compatible with Oracle.




-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


pá 24. 1. 2020 v 14:17 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:



I cannot evaluate your proposal, and I am sure you know more about this code than I do.

There is a question whether we can allow building a local temp index on a global temp table. That is a different situation. When I work with global properties, I personally prefer a totally asynchronous implementation of any DDL operation for sessions other than the current one. When that holds, I have no objection. For me, a good enough design for any DDL can be based on a catalog change, without forcing it onto living tables.

I see the following disadvantage of your proposal. Consider this scenario:

1. I have two sessions:

A - a small GTT with an active owner
B - a big GTT with some active application.

Session A creates a new index - it is fast, but if creating the index is forced on B on demand (when B is touched), then that operation has to wait until the index is built.

So I am afraid of building an index in other sessions on a GTT when the GTT in those sessions is not empty.

Regards

Pavel

 





Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
Thank you for reviewing the patch.

2020年1月24日 下午4:20,Konstantin Knizhnik <k.knizhnik@postgrespro.ru> 写道:



On 23.01.2020 19:28, 曾文旌(义从) wrote:

I'm trying to improve this part of the implementation in global_temporary_table_v7-pg13.patch
Please check my patch and give me feedback.


Thanks

Wenjing



Below is my short review of the patch:

+    /*
+     * For global temp table only
+     * use AccessExclusiveLock for ensure safety
+     */
+    {
+        {
+            "on_commit_delete_rows",
+            "global temp table on commit options",
+            RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+            ShareUpdateExclusiveLock
+        },
+        true
+    },   


The comment seems confusing: it mentions AccessExclusiveLock, but the code actually uses ShareUpdateExclusiveLock.
There is a problem with the comment description; I will fix it.


-    Assert(TransactionIdIsNormal(onerel->rd_rel->relfrozenxid));
-    Assert(MultiXactIdIsValid(onerel->rd_rel->relminmxid));
+    Assert((RELATION_IS_GLOBAL_TEMP(onerel) && onerel->rd_rel->relfrozenxid == InvalidTransactionId) ||
+        (!RELATION_IS_GLOBAL_TEMP(onerel) && TransactionIdIsNormal(onerel->rd_rel->relfrozenxid)));
+    Assert((RELATION_IS_GLOBAL_TEMP(onerel) && onerel->rd_rel->relminmxid == InvalidMultiXactId) ||
+        (!RELATION_IS_GLOBAL_TEMP(onerel) && MultiXactIdIsValid(onerel->rd_rel->relminmxid)));
 
It is actually equivalent to:

Assert(RELATION_IS_GLOBAL_TEMP(onerel) ^ TransactionIdIsNormal(onerel->rd_rel->relfrozenxid));
Assert(RELATION_IS_GLOBAL_TEMP(onerel) ^ MultiXactIdIsValid(onerel->rd_rel->relminmxid));
Yes, thank you for pointing that out. It's simpler.


+    /* clean temp relation files */
+    if (max_active_gtt > 0)
+        RemovePgTempFiles();
+
     /*
 
I wonder why we need a special check for GTT here.
From my point of view, cleanup of local temp table storage at startup should be performed in the same way for local and global temp tables.
After an OOM kill, autovacuum cleans up isolated local temp tables like orphan temporary tables: the definition of the local temp table is deleted along with the storage file.
But a GTT cannot do that, so we have this implementation in my patch.
If you have other solutions, please let me know.



-    new_rel_reltup->relfrozenxid = relfrozenxid;
-    new_rel_reltup->relminmxid = relminmxid;
+    /* global temp table not remember transaction info in catalog */
+    if (relpersistence == RELPERSISTENCE_GLOBAL_TEMP)
+    {
+        new_rel_reltup->relfrozenxid = InvalidTransactionId;
+        new_rel_reltup->relminmxid = InvalidMultiXactId;
+    }
+    else
+    {
+        new_rel_reltup->relfrozenxid = relfrozenxid;
+        new_rel_reltup->relminmxid = relminmxid;
+    }
+


Why do we need to do this for GTT?
Did you check that there will be no problems with GTT in case of XID wraparound?
Right now, if you create a temp table and keep the session open, it will block XID wraparound.
In my design:
1. Because different sessions have different transaction information, I chose to store the transaction information of a GTT in MyProc, not in the catalog.
2. As for the XID wraparound problem, the cause is the design of temp table storage (both local and global temp tables), which prevents autovacuum from vacuuming them.
It should be completely solved at the storage level.


+    /* We allow to drop global temp table only this session use it */
+    if (RELATION_IS_GLOBAL_TEMP(rel))
+    {
+        if (is_other_backend_use_gtt(rel->rd_node))
+            elog(ERROR, "can not drop relation when other backend attached this global temp table");
+    }
+

Here we once again introduce an incompatibility with normal (permanent) tables.
Assume that a DBA or programmer needs to change the format of a GTT, but there are some active sessions which have used this GTT sometime in the past.
We will not be able to drop this GTT until all those sessions are terminated.
I do not think that is acceptable behaviour.
In fact, the DBA can still complete DDL on a GTT.
I've provided a set of functions for this case.
If the DBA needs to modify a GTT A (or drop it, or create an index on it), he needs to (see the sketch below):
1. Use the pg_gtt_attached_pids view to list the pids of the sessions that are using GTT A.
2. Use pg_terminate_backend(pid) to terminate them, except his own session.
3. Run the ALTER on GTT A.
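A sketch of that workflow (the pid and table name are illustrative; pg_terminate_backend is the stock function, while the output columns of pg_gtt_attached_pids are an assumption):

-- 1. find the sessions attached to the GTT:
SELECT * FROM pg_gtt_attached_pids;
-- 2. terminate each listed pid except the current session:
SELECT pg_terminate_backend(12345);      -- pid taken from step 1
-- 3. perform the DDL:
ALTER TABLE a ADD COLUMN b int;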


+        LOCKMODE    lockmode = AccessExclusiveLock;
+
+        /* truncate global temp table only need RowExclusiveLock */
+        if (get_rel_persistence(rid) == RELPERSISTENCE_GLOBAL_TEMP)
+            lockmode = RowExclusiveLock;


What are the reasons for using RowExclusiveLock for GTT instead of AccessExclusiveLock?
Yes, GTT data is accessed by only one backend, so no locking seems to be needed here at all.
But I wonder what the motivation/benefit of using a weaker lock level here is?
1. TRUNCATE on a GTT deletes only the data in the current session, so there is no need for a high-level lock.
2. I think it still needs to be blocked by DDL on the GTT, which is why I use RowExclusiveLock.
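A sketch of the behavior this choice enables, assuming the patch's semantics (session labels are comments; the table name is illustrative):

-- session 1:
TRUNCATE gt;    -- takes RowExclusiveLock under the patch; removes only this session's rows
-- session 2, at the same time:
TRUNCATE gt;    -- does not block on session 1
-- session 3:
DROP TABLE gt;  -- needs AccessExclusiveLock, so it does conflict with an in-progress TRUNCATE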

There should be no conflicts in any case...

+        /* We allow to create index on global temp table only this session use it */
+        if (is_other_backend_use_gtt(heapRelation->rd_node))
+            elog(ERROR, "can not create index when have other backend attached this global temp table");
+

The same argument as in the case of dropping a GTT: I do not think that prohibiting DDL operations on a GTT used by more than one backend is a good idea.
The idea was to give the GTT almost all the features of a regular table with few code changes.
In the current version the DBA can still run all DDL on a GTT, as I've already described.


+    /* global temp table not support foreign key constraint yet */
+    if (RELATION_IS_GLOBAL_TEMP(pkrel))
+        ereport(ERROR,
+                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                 errmsg("referenced relation \"%s\" is not a global temp table",
+                        RelationGetRelationName(pkrel))));
+

Why do we need to prohibit foreign key constraints on GTT?
It may be possible to support FK on GTT in later versions. Before that, I need to check some code.


+    /*
+     * Global temp table get frozenxid from MyProc
+     * to avoid the vacuum truncate clog that gtt need.
+     */
+    if (max_active_gtt > 0)
+    {
+        TransactionId oldest_gtt_frozenxid =
+            list_all_session_gtt_frozenxids(0, NULL, NULL, NULL);
+
+        if (TransactionIdIsNormal(oldest_gtt_frozenxid) &&
+            TransactionIdPrecedes(oldest_gtt_frozenxid, newFrozenXid))
+        {
+            ereport(WARNING,
+                (errmsg("global temp table oldest FrozenXid is far in the past"),
+                 errhint("please truncate them or kill those sessions that use them.")));
+            newFrozenXid = oldest_gtt_frozenxid;
+        }
+    }
+

As far as I understand, the content of a GTT will never be processed by autovacuum.
So who will update the frozenxid of a GTT?
I see that up_gtt_relstats is invoked when:
- an index is created on the GTT
- the GTT is truncated
- the GTT is vacuumed
So unless a GTT is explicitly vacuumed by the user, its relfrozenxid will not be taken into account
when computing the new frozen xid value. Autovacuum will produce these warnings (which will not be visible to the end user and are only appended to the log).
And at some moment wraparound will happen, and if there are still some old active GTTs, we will get incorrect results.
I have already described my point in previous emails.

1. The core problem is that the data contains transaction information (xids), which needs to be vacuumed (frozen) regularly to avoid running out of xids.
Autovacuum supports vacuuming regular tables, but not local temp tables; it does not support GTT either.

2. However, the difference between a local temp table and a global temp table (GTT) is that:
a) For a local temp table: one table has one copy of data; the frozenxid of one local temp table is stored in the catalog (pg_class).
b) For a global temp table: each session has a separate copy of the data, so one GTT may carry up to maxbackend frozenxids,
and I don't think it's a good idea to keep the frozenxids of a GTT in the catalog (pg_class).
It becomes a question: how should GTT transaction information be handled?

I agree that problem 1 should be completely solved by some feature, such as local transactions. That is definitely not included in the GTT patch.
But I think we need to ensure the durability of GTT data. For example, data in a GTT must not be lost because the clog was cleaned up. That belongs to problem 2.

For problem 2:
if we ignore the frozenxids of GTTs, then when vacuum truncates clog that a GTT still needs, the GTT data in some sessions is completely lost.
Perhaps we could consider letting autovacuum terminate sessions that contain "too old" data,
but that is not very friendly, so I did not choose to implement it in the first version.
Maybe you have a better idea.


Wenjing




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 24.01.2020 22:39, Pavel Stehule wrote:
I cannot evaluate your proposal, and I am sure you know more about this code than I do.

There is a question whether we can allow building a local temp index on a global temp table. That is a different situation. When I work with global properties, I personally prefer a totally asynchronous implementation of any DDL operation for sessions other than the current one. When that holds, I have no objection. For me, a good enough design for any DDL can be based on a catalog change, without forcing it onto living tables.


From my point of view there are two different use cases for temp tables:
1. A backend needs some private data source which is specific to this session and has no relation to the activities of other sessions.
2. We need a table containing private session data, but which is used in the same way by all database users.

In the first case, current Postgres temp tables work well (if we forget for a moment about all the known issues related to temp tables).
Global temp tables address the second scenario. Assume that we write some stored procedure, or implement some business logic outside the database, and
want to perform some complex analytic query which requires a temp table for storing intermediate results. In this case we can create the GTT with all needed indexes at the moment of database initialization
and not perform any extra DDL during query execution. This prevents catalog bloat and makes query execution more efficient.
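A sketch of that deployment pattern under the patch's syntax (table and index names are illustrative):

-- run once, at database initialization:
CREATE GLOBAL TEMP TABLE report_scratch (
    id     int,
    amount numeric
) ON COMMIT PRESERVE ROWS;
CREATE INDEX report_scratch_id_idx ON report_scratch (id);

-- later, any session can use the table with no DDL at all:
INSERT INTO report_scratch SELECT g, random() FROM generate_series(1, 1000) g;
SELECT sum(amount) FROM report_scratch;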

I do not see any reason to allow building local indexes on a global table. Yes, it can happen that one session has a small amount of data in a particular GTT and another session a large amount. But if the access pattern is the same (and the nature of GTT assumes it is), then an index is either appropriate in both cases or useless in both cases.




I see the following disadvantage of your proposal. Consider this scenario:

1. I have two sessions:

A - a small GTT with an active owner
B - a big GTT with some active application.

Session A creates a new index - it is fast, but if creating the index is forced on B on demand (when B is touched), then that operation has to wait until the index is built.

So I am afraid of building an index in other sessions on a GTT when the GTT in those sessions is not empty.


Yes, that is true. But it is not the most realistic scenario, from my point of view.
As I explained above, GTT should be used when we need temporary storage accessed in the same way by all clients.
If (as with normal tables) at some moment the DBA realizes that efficient execution of some queries needs extra indexes,
then he should be able to add them. It is very inconvenient and unnatural to prohibit the DBA from doing so until all sessions using this GTT are closed (which may never happen),
or to require all sessions to restart to be able to use the index.

So it is possible to imagine different scenarios of working with GTTs.
But from my point of view the only non-contradictory model of their behavior is to make them compatible with normal tables.
And do not forget about compatibility with Oracle. Simplifying the porting of existing applications from Oracle to Postgres may be the
main motivation for adding GTT to Postgres, and making them incompatible with Oracle would be very strange.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 25.01.2020 18:15, 曾文旌(义从) wrote:
I wonder why we need a special check for GTT here.
From my point of view, cleanup of local temp table storage at startup should be performed in the same way for local and global temp tables.
After an OOM kill, autovacuum cleans up isolated local temp tables like orphan temporary tables: the definition of the local temp table is deleted along with the storage file.
But a GTT cannot do that, so we have this implementation in my patch.
If you have other solutions, please let me know.

I wonder: is it possible that autovacuum or some other Postgres process is killed by OOM without the postmaster noticing it and restarting the Postgres instance?
As far as I know, a crash of any process connected to Postgres shared memory (and autovacuum definitely has such a connection) causes a Postgres restart.


In my design:
1. Because different sessions have different transaction information, I chose to store the transaction information of a GTT in MyProc, not in the catalog.
2. As for the XID wraparound problem, the cause is the design of temp table storage (both local and global temp tables), which prevents autovacuum from vacuuming them.
It should be completely solved at the storage level.


My point of view is that vacuuming of temp tables is a common problem for local and global temp tables.
So it has to be addressed in a common way, and we should not try to fix this problem only for GTT.


In fact, the DBA can still complete DDL on a GTT.
I've provided a set of functions for this case.
If the DBA needs to modify a GTT A (or drop it, or create an index on it), he needs to:
1. Use the pg_gtt_attached_pids view to list the pids of the sessions that are using GTT A.
2. Use pg_terminate_backend(pid) to terminate them, except his own session.
3. Run the ALTER on GTT A.

IMHO forced termination of client sessions is not an acceptable solution,
and it is not an absolutely necessary requirement.
So from my point of view we should not add such limitations to the GTT design.




What are the reasons for using RowExclusiveLock for GTT instead of AccessExclusiveLock?
Yes, GTT data is accessed by only one backend, so no locking seems to be needed here at all.
But I wonder what the motivation/benefit of using a weaker lock level here is?
1. TRUNCATE on a GTT deletes only the data in the current session, so there is no need for a high-level lock.
2. I think it still needs to be blocked by DDL on the GTT, which is why I use RowExclusiveLock.

Sorry, I do not understand your arguments: we do not need an exclusive lock because we drop only local (private) data,
but we do need some kind of lock. I agree with 1) but not with 2).


There should be no conflicts in any case...

+        /* We allow to create index on global temp table only this session use it */
+        if (is_other_backend_use_gtt(heapRelation->rd_node))
+            elog(ERROR, "can not create index when have other backend attached this global temp table");
+

The same argument as in the case of dropping a GTT: I do not think that prohibiting DDL operations on a GTT used by more than one backend is a good idea.
The idea was to give the GTT almost all the features of a regular table with few code changes.
In the current version the DBA can still run all DDL on a GTT, as I've already described.

I absolutely agree with you that GTT should be given the same features as regular tables.
The irony is that this most natural and convenient behavior is also the easiest to implement, without adding any extra restrictions.
Just let indexes for GTT be constructed on demand. It can be done using the same function used for regular index creation.




+    /* global temp table not support foreign key constraint yet */
+    if (RELATION_IS_GLOBAL_TEMP(pkrel))
+        ereport(ERROR,
+                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                 errmsg("referenced relation \"%s\" is not a global temp table",
+                        RelationGetRelationName(pkrel))));
+

Why do we need to prohibit foreign key constraints on GTT?
It may be possible to support FK on GTT in later versions. Before that, I need to check some code.

OK, maybe the approach of prohibiting everything except the minimally required functionality is safe and reliable.
But frankly speaking, I prefer a different approach: if I do not see any contradiction between a new feature and existing operations,
and it passes the tests, then we should not prohibit those operations for the new feature.


I have already described my point in previous emails.

1. The core problem is that the data contains transaction information (xids), which needs to be vacuumed (frozen) regularly to avoid running out of xids.
Autovacuum supports vacuuming regular tables, but not local temp tables; it does not support GTT either.

2. However, the difference between a local temp table and a global temp table (GTT) is:
a) For a local temp table: one table has one copy of data, and the frozenxid of the local temp table is stored in the catalog (pg_class).
b) For a global temp table: each session has a separate copy of the data, so one GTT may carry up to MaxBackends frozenxids,
and I don't think it's a good idea to keep a GTT's frozenxids in the catalog (pg_class).
It becomes a question: how to handle GTT transaction information?

I agree that problem 1 should be completely solved by some separate feature, such as local transactions; that is definitely not included in the GTT patch.
But I think we need to ensure the durability of GTT data: for example, data in a GTT must not be lost because the clog it needs was cleaned up. That belongs to problem 2.

For problem 2: if we ignore the frozenxids of GTTs, then when vacuum truncates clog that a GTT still needs, the GTT data in some sessions is completely lost.
Perhaps we could let autovacuum terminate sessions that contain "too old" data,
but that is not very friendly, so I chose not to implement it in the first version.
Maybe you have a better idea.

Sorry, I do not have a better idea.
I would prefer not to address this problem in the first version of the patch at all.
The frozenxid of a temp table is never changed unless the user explicitly invokes vacuum on it,
and I do not think anybody does that (because it typically contains temporary data which is not expected to live a long time).
Certainly it is possible to imagine a situation where a session uses a GTT to store local data that remains valid for the whole session lifetime (which can be long enough),
but I am not sure that is a popular scenario.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Mon, Jan 27, 2020 at 10:11, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 24.01.2020 22:39, Pavel Stehule wrote:
I cannot evaluate your proposal, and I am sure you know more about this code than I do.

There is a question whether we can allow building a local temp index on a global temp table; that is a different situation. When working with global properties I personally prefer a fully asynchronous implementation of any DDL operation with respect to sessions other than the current one. When that holds, I have no objection. For me, a good enough design for any DDL can be based on a catalog change without forcing it onto live tables.


From my point of view there are two different use cases for temp tables:
1. A backend needs some private data source which is specific to this session and has no relation to the activities of other sessions.
2. We need a table containing private session data, but which is used in the same way by all database users.

In the first case, current Postgres temp tables work well (if we forget for a moment about all the known issues related to temp tables).
Global temp tables address the second scenario. Assume that we write some stored procedure, or implement some business logic outside the database, and
want to perform a complex analytic query which requires a temp table for storing intermediate results. In this case we can create the GTT, with all the needed indexes, at database initialization time,
and not perform any extra DDL during query execution. That prevents catalog bloat and makes execution of the query more efficient.

I do not see any reason to allow building local indexes for a global table. Yes, it can happen that some session will have a large amount of data in a particular GTT and another a small amount of data in this table. But if the access pattern is the same (and the nature of GTT assumes it), then the index is either appropriate in both cases or useless in both cases.




I see the following disadvantage in your proposal. Consider this scenario:

I have two sessions:

A - the GTT is small, with an active owner;
B - the GTT is big, with some active application.

Session A creates a new index - it is fast; but if the index build is forced on B on demand (when B next touches the table), then that operation has to wait until the index is built.

So I am afraid of building indexes in other sessions' GTT storage when the GTT in those sessions is not empty.


Yes, that is true, but it is not the most realistic scenario from my point of view.
As I explained above, GTT should be used when we need temporary storage accessed in the same way by all clients.
If (as with normal tables) at some moment the DBA realizes that efficient execution of some queries needs extra indexes,
then he should be able to add them. It is very inconvenient and unnatural to prohibit the DBA from doing so until all sessions using this GTT are closed (which may never happen),
or to require all sessions to restart in order to use the index.

So it is possible to imagine different scenarios of working with GTTs.
But from my point of view the only non-contradictory model of their behavior is to make it compatible with normal tables.
And do not forget about compatibility with Oracle. Simplifying the porting of existing applications from Oracle to Postgres may be the
main motivation for adding GTT to Postgres, and making them incompatible with Oracle would be very strange.

I don't think compatibility with Oracle is a valid point in this case. We need GTT, but the mechanism of index building should be designed for Postgres, and for its users.

Maybe the method you propose can be activated by some option like CREATE INDEX IMMEDIATELY FOR ALL SESSIONS. When you use a GTT without an index, it has to keep working for some time more, and if you use short-lived sessions, then the index build can be the last or almost the last operation over the table and can be suboptimal.

Anyway, this behavior can be changed later without bigger complications - and right now I have a strong preference for not allowing any DDL (including index creation) on a GTT that is active in other sessions.
Probably your proposal - building indexes in other sessions when the GTT is touched - could share code with the approach of just modifying metadata and waiting for a session reset or GTT reset.

Usually it is not a hard problem to refresh sessions, and as far as I know, when you update plpgsql code it is best practice to refresh sessions early.




-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 24, 2020, at 4:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Jan 11, 2020 at 8:51 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
I proposed just ignoring those new indexes because it seems much simpler
than alternative solutions that I can think of, and it's not like those
other solutions don't have other issues.

+1.
I have completed the implementation of this feature.
When a session x creates an index idx_a on GTT A:
For session x, idx_a is valid after CREATE INDEX completes.
For session y, which had some data in GTT A before session x finished creating the index, idx_a is invalid.
For session z, which had no data in GTT A before session x finished creating the index, idx_a is valid.
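In other words, the validity of a GTT index is decided per session. A sketch of such a check (the helper names are hypothetical assumptions, not the patch's API):

/* Sketch: per-session usability test for a GTT index. */
static bool
gtt_index_usable(Relation indexRel)
{
	/*
	 * The creating session, and any session whose GTT storage was empty
	 * when the index was created, can use the index: for them it starts
	 * empty and is maintained by ordinary inserts from then on.  A session
	 * that already held data must skip it, because the index lacks that
	 * session's rows.
	 */
	return gtt_index_built_here(indexRel) ||
		!gtt_had_data_at_index_creation(indexRel);
}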


For example, I've looked at the "on demand" building as implemented in
global_private_temp-8.patch, and I kinda doubt it; adding a bunch of index build
calls into various places in the index code seems somewhat suspicious.

+1. I can't imagine that's a safe or sane thing to do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Opinion by Pavel
+ rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of field "rd_islocaltemp" is not probably best
I renamed rd_islocaltemp

Opinion by Konstantin Knizhnik
1. Fixed comments
2. Fixed assertion


Please help me review.


Wenjing

Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Tue, Jan 28, 2020 at 17:01, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:

Opinion by Pavel
+ rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of field "rd_islocaltemp" is not probably best
I renamed rd_islocaltemp

I don't see any change?




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 29, 2020, at 12:40 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

On Tue, Jan 28, 2020 at 17:01, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:

Opinion by Pavel
+ rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of field "rd_islocaltemp" is not probably best
I renamed rd_islocaltemp

I don't see any change?
Rename rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch


Wenjing








Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Tue, Jan 28, 2020 at 18:12, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:

Rename rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch

ok :)

Pavel




Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Tue, Jan 28, 2020 at 18:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:

ok :)

I found a bug

postgres=# create global temp table x(a int);
CREATE TABLE
postgres=# insert into x values(1);
INSERT 0 1
postgres=# create index on x (a);
CREATE INDEX
postgres=# create index on x((a + 1));
CREATE INDEX
postgres=# analyze x;
WARNING:  oid 16468 not a relation
ANALYZE

Other behavior looks good to me.

Regards

Pavel




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 27.01.2020 22:44, Pavel Stehule wrote:

I don't think compatibility with Oracle is a valid point in this case. We need GTT, but the mechanism of index building should be designed for Postgres, and for its users.

Maybe the method you propose can be activated by some option like CREATE INDEX IMMEDIATELY FOR ALL SESSIONS. When you use a GTT without an index, it has to keep working for some time more, and if you use short-lived sessions, then the index build can be the last or almost the last operation over the table and can be suboptimal.

Anyway, this behavior can be changed later without bigger complications - and right now I have a strong preference for not allowing any DDL (including index creation) on a GTT that is active in other sessions.
Probably your proposal - building indexes in other sessions when the GTT is touched - could share code with the approach of just modifying metadata and waiting for a session reset or GTT reset.

Well, compatibility with Oracle was never treated as an important argument in this group:)
But I hope you agree that it is a real argument against your proposal.
A much more important argument is incompatibility with the behavior of regular tables.
If you propose such an incompatibility, then you should have some very strong arguments for behavior which will definitely confuse users.

But I heard only two arguments:

1. Concurrent building of indexes by all backends may consume much memory (n_backends * maintenance_work_mem) and consume a lot of disk/CPU resources.

First of all, this is not completely true: indexes will be created on demand when the GTT is accessed, and the chance that all sessions start building indexes simultaneously is very small.

But what will happen if we prohibit access to this index for existing sessions? If we need an index for a GTT, then most likely it is used for joins.
If there is no index, the optimizer has to choose some other plan to perform the join, for example a hash join. A hash join also requires memory,
so if all backends perform such a join simultaneously, they consume (n_backends * work_mem) memory.
Yes, work_mem is usually smaller than maintenance_work_mem, but in any case the DBA has the choice to adjust these parameters to avoid the problem.
With your proposal (prohibit access to the index) you give him no way to optimize query execution in existing sessions.

Also, if all sessions simultaneously perform a sequential scan of the GTT instead of building an index for it, they will read the same amount of data and consume comparable CPU time.
So prohibiting access to the indexes will not save us from high resource consumption if all existing sessions are really actively working with this GTT.

2. A GTT in one session can contain a large amount of data, so we need an index for it, while it contains a small amount of data in another session, where we do not need the index.

Such a situation can definitely happen. But it contradicts the main assumption of the GTT use case (that it is accessed in the same way by all sessions).
I might agree with this argument if you proposed to create indexes locally for each session.
But your proposal is to prohibit access to the index for sessions which have already populated the GTT with data, while allowing it for sessions which have not accessed this GTT yet.
So if some session stores data in the GTT after the index was created, it will build the index for it, no matter whether the table is small or large.
Why do we make an exception for sessions which already have data in the GTT in this case?

So from my point of view both arguments are doubtful and cannot explain why the rules of index usability for GTT should be different from regular tables.

Usually it is not a hard problem to refresh sessions, and as far as I know, when you update plpgsql code it is best practice to refresh sessions early.


I know many systems where a session is established once the client connects to the system and is not closed until the client disconnects.
Any attempt to force termination of the session will cause application errors which are not expected by the client.


Sorry, I think this is a principal point in the discussion concerning GTT design.
The implementation of GTT can be changed in the future, but it is bad if the behavior of GTT changes.
It is not clear to me why, from the very beginning, we should provide inconsistent behavior which is even more difficult to implement than behavior compatible with regular tables,
and say that it can be changed in the future...

Sorry, but I do not consider proposals to create indexes locally for each session (i.e. global tables but private indexes), or to use special complicated SQL syntax constructs like
CREATE INDEX IMMEDIATELY FOR ALL SESSIONS, as real alternatives which have to be discussed.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 29, 2020, at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

I found a bug

postgres=# create global temp table x(a int);
CREATE TABLE
postgres=# insert into x values(1);
INSERT 0 1
postgres=# create index on x (a);
CREATE INDEX
postgres=# create index on x((a + 1));
CREATE INDEX
postgres=# analyze x;
WARNING:  oid 16468 not a relation
ANALYZE
Thanks for the review.

The index expression needs to store statistics on the index; I missed it and I'll fix it later.


Wenjing





Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Mon, Jan 27, 2020 at 4:11 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> I do not see any reason to allow building local indexes for a global table. Yes, it can happen that some session will
have a large amount of data in a particular GTT and another a small amount of data in this table. But if the access pattern is the
same (and the nature of GTT assumes it), then the index is either appropriate in both cases or useless in both cases.

I agree. I think allowing different backends to have different indexes
is overly complicated.

Regarding another point that was raised, I think it's not a good idea
to prohibit DDL on global temporary tables altogether. It should be
fine to change things when no sessions are using the GTT. Going
further and allowing changes when there are attached sessions would be
nice, but I think we shouldn't try. Getting this feature committed is
going to be a huge amount of work with even a minimal feature set;
complicating the problem by adding what are essentially new
DDL-concurrency features on top of the basic feature seems very much
unwise.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>> Opinion by Pavel
>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of field "rd_islocaltemp" is not probably best
>> I renamed rd_islocaltemp
>
> I don't see any change?
>
> Rename rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch

In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
that this has approximately a 0% chance of being acceptable. If you're
setting a field in a way that is inconsistent with the current use of
the field, you're probably doing it wrong, because the field has an
existing purpose to which new code must conform. And if you're not
doing that, then you don't need to rename it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Wed, Jan 29, 2020 at 3:13 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> But I heard only two arguments:
>
> 1. Concurrent building of indexes by all backends may consume much memory (n_backends * maintenance_work_mem) and consume a lot of disk/CPU resources.
>
> 2. A GTT in one session can contain a large amount of data and we need an index for it, but a small amount of data in another session and we do not need an index for it.

You seem to be ignoring the fact that two committers told you this
probably wasn't safe.

Perhaps your view is that those people made no argument, and therefore
you don't have to respond to it. But the onus is not on somebody else
to tell you why a completely novel idea is not safe. The onus is on
you to analyze it in detail and prove that it is safe. What you need
to show is that there is no code anywhere in the system which will be
confused by an index springing into existence at whatever time you're
creating it.

One problem is that there are various backend-local data structures in
the relcache, the planner, and the executor that remember information
about indexes, and that may not respond well to having more indexes
show up unexpectedly. On the one hand, they might crash; on the other
hand, they might ignore the new index when they shouldn't. Another
problem is that the code which creates indexes might fail or misbehave
when run in an environment different from the one in which it
currently runs. I haven't really studied your code, so I don't know
exactly what it does, but for example it would be really bad to try to
build an index while holding a buffer lock, both because it might
cause (low-probability) undetected deadlocks and also because it might
block another process that wants that buffer lock in a
non-interruptible wait state for a long time.

Now, maybe you can make an argument that you only create indexes at
points in the query that are "safe." But I am skeptical, because of
this example:

rhaas=# create table foo (a int primary key, b text, c text, d text);
CREATE TABLE
rhaas=# create function blump() returns trigger as $$begin create
index on foo (b); return new; end$$ language plpgsql;
CREATE FUNCTION
rhaas=# create trigger thud before insert on foo execute function blump();
CREATE TRIGGER
rhaas=# insert into foo (a) select generate_series(1,10);
ERROR:  cannot CREATE INDEX "foo" because it is being used by active
queries in this session
CONTEXT:  SQL statement "create index on foo (b)"
PL/pgSQL function blump() line 1 at SQL statement

That prohibition is there for some reason. Someone did not just decide
to arbitrarily prohibit it. A CREATE INDEX command run in that context
won't run afoul of many of the things that might be problems in other
places -- e.g. there won't be a buffer lock held. Yet, despite the
fact that a trigger context is safe for executing a wide variety of
user-defined code, this particular operation is not allowed here. That
is the sort of thing that should worry you.

At any rate, even if this somehow were or could be made safe,
on-the-fly index creation is a feature that cannot and should not be
combined with a patch to implement global temporary tables. Surely, it
will require a lot of study and work to get the details right. And so
will GTT. As I said in the other email I wrote, this feature is hard
enough without adding this kind of thing to it. There's a reason why I
never got around to implementing this ten years ago when I did
unlogged tables; I was intending that to be a precursor to the GTT
work. I found that it was too hard and I gave up. I'm glad to see
people trying again, but the idea that we can afford to add in extra
features, or frankly that either of the dueling patches on this thread
are close to committable, is just plain wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 29.01.2020 17:47, Robert Haas wrote:
> You seem to be ignoring the fact that two committers told you this
> probably wasn't safe.
>
> Perhaps your view is that those people made no argument, and therefore
> you don't have to respond to it. But the onus is not on somebody else
> to tell you why a completely novel idea is not safe. The onus is on
> you to analyze it in detail and prove that it is safe.

Sorry, I really didn't consider statements containing the word "probably" as arguments.
But I agree with you: it is the task of the developer of a new feature to prove that the
proposed approach is safe, rather than the reviewers' task to demonstrate that it is unsafe.
Can I provide such a proof now? I am afraid not.
But please consider two arguments:

1. An index for a GTT in any case has to be initialized on demand. For a
regular table the index is initialized at the moment of its creation; in
the case of a GTT that doesn't work.
So we should somehow detect that an accessed index is not initialized and
perform lazy initialization of the index.
The only difference from the approach proposed by Pavel (allow an index
on an empty GTT but prohibit it on a GTT already filled with data) is
whether we also need to populate the index with data or not.
I can imagine that implicit initialization of an index in a read-only
query (a select) can be unsafe and cause some problems. I have not
encountered such problems yet after performing many tests with GTTs, but
certainly I have not covered all possible scenarios (and I am not sure
that is possible at all).
But I do not understand how populating the index with data can add any
extra unsafety.

So I cannot prove that building an index for a GTT on demand is safe, but
it is no more unsafe than initialization of the index on demand, which is
required in any case.

2. Actually I do not propose a completely new approach; I try to
provide behavior which is compatible with regular tables.
If you create an index on a regular table, then it can be used in all
sessions, right?
And all the "various backend-local data structures in the relcache, the
planner, and the executor that remember information about indexes"
have to be properly updated. That is done using the invalidation
mechanism. The same mechanism is used for DDL operations on GTT, because
we change the system catalog.

So my point here is that creating an index on a GTT is almost the same as
creating an index on a regular table, and the same mechanism will be used
to ensure the correctness of this operation.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:



2. Actually I do not propose a completely new approach; I try to
provide behavior which is compatible with regular tables.
If you create an index on a regular table, then it can be used in all
sessions, right?

I don't understand this point. Regular tables share data and share files; you cannot separate that. Moreover, you have to use relatively aggressive locks to make this operation safe.

None of these points is valid for GTT.

Regards

Pavel
 
And all "various backend-local data structures in the relcache, the
planner, and the executor that remember information about indexes"
have to be properly updated.  It is done using invalidation mechanism.
The same mechanism is used in case of DDL operations with GTT, because
we change system catalog.

So my point here is that creation index of GTT is almost the same as
creation of index for regular tables and the same mechanism will be used
to provide correctness of this operation.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 29.01.2020 20:08, Pavel Stehule wrote:



2. Actually I do not propose a completely new approach; I try to provide behavior which is compatible with regular tables. If you create an index on a regular table, then it can be used in all sessions, right?

I don't understand this point. Regular tables share data and share files; you cannot separate that. Moreover, you have to use relatively aggressive locks to make this operation safe.

None of these points is valid for GTT.

GTTs share metadata.
Since they do not share data, GTTs are safer than regular tables, aren't they?
"Safer" means that we need less "aggressive" locks for them: we need to protect only the metadata, not the data itself.

My point is that if we allow other sessions to access indexes created on regular tables, then it will be no more complex to support that for GTT.
Actually "no more complex" in this case means "no extra effort is needed".

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Wed, Jan 29, 2020 at 18:21, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

GTTs share metadata.
Since they do not share data, GTTs are safer than regular tables, aren't they?
"Safer" means that we need less "aggressive" locks for them: we need to protect only the metadata, not the data itself.

My point is that if we allow other sessions to access indexes created on regular tables, then it will be no more complex to support that for GTT.
Actually "no more complex" in this case means "no extra effort is needed".

It is hard to say; I see a significant difference. When I build an index on a regular table, I don't change the context of other processes: I have to wait for a lock, and after I get the lock, other processes wait.

With GTT, I don't want to wait for others - and other processes would have to build indexes internally, outside the expected sequence of operations. Maybe that can have a positive effect, but it can have a negative effect too. At this moment I prefer zero effect on other sessions. So I would build the index in my session, I would not want to wait for other sessions, and if possible other sessions should not need to interact with or react to my action either; things should be as independent as possible. The simplest solution is to require exclusive usage; I understand that may not be very practical. Better is to allow the GTT to be used by other sessions, but keep the changes invisible in those sessions until a session reset. It is a minimalistic strategy: it has no benefit for other sessions, but it has no negative impact either.




Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Wed, Jan 29, 2020 at 10:30 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> But please consider two arguments:
>
> 1. An index for a GTT in any case has to be initialized on demand. So
> we should somehow detect that an accessed index is not initialized and
> perform lazy initialization of the index.
>
> So I cannot prove that building an index for a GTT on demand is safe,
> but it is no more unsafe than initialization of the index on demand,
> which is required in any case.

I think that the idea of calling ambuild() on the fly is not going to
work, because, again, I don't think that calling that from random
places in the code is safe. What I expect we're going to need to do
here is model this on the approach used for unlogged tables. For an
unlogged table, each table and index has an init fork which contains
the correct initial contents for that relation - which is nothing at
all for a heap table, and a couple of boilerplate pages for an index.
In the case of an unlogged table, the init forks get copied over the
main forks after crash recovery, and then we have a brand-new, empty
relation with brand-new empty indexes which everyone can use. In the
case of global temporary tables, I think that we should do the same
kind of copying, but at the time when the session first tries to
access the table. There is some fuzziness in my mind about what
exactly constitutes accessing the table - it probably shouldn't be
when the relcache entry is built, because that seems too early, but
I'm not sure what is exactly right. In any event, it's easier to find
a context where copying some files on disk (that are certain not to be
changing) is safe than it is to find a context where index builds are
safe.
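For illustration, the shape of that approach might be the following (a sketch under the stated assumptions; gtt_session_storage_exists() is hypothetical, while relpathperm(), relpathbackend() and copy_file() are existing backend helpers):

/* Sketch: initialize a session's private GTT storage from the init fork. */
static void
gtt_init_on_first_access(Relation rel)
{
	char	   *srcpath;
	char	   *dstpath;

	/* Nothing to do once this session has its own copy. */
	if (gtt_session_storage_exists(rel))	/* hypothetical */
		return;

	/* Boilerplate pages written once, at CREATE TABLE / CREATE INDEX time. */
	srcpath = relpathperm(rel->rd_node, INIT_FORKNUM);

	/* Session-private main fork; the path includes the backend id. */
	dstpath = relpathbackend(rel->rd_node, MyBackendId, MAIN_FORKNUM);

	/* The same file-level copy that unlogged-relation reset performs. */
	copy_file(srcpath, dstpath);
}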

> 2. Actually I do not propose a completely new approach; I try to
> provide behavior which is compatible with regular tables.
> If you create an index on a regular table, then it can be used in all
> sessions, right?

Yes. :-)

> And all "various backend-local data structures in the relcache, the
> planner, and the executor that remember information about indexes"
> have to be properly updated.  It is done using invalidation mechanism.
> The same mechanism is used in case of DDL operations with GTT, because
> we change system catalog.

I mean, that's not really a valid argument. Invalidations can only
take effect at certain points in the code, and the whole argument here
is about which places in the code are safe for which operations, so
the fact that some things (like accepting invalidations) are safe at
some points in the code (like the places where we accept them) does
not prove that other things (like calling ambuild) are safe at other
points in the code (like wherever you are proposing to call it). In
particular, if you've got a relation open, there's currently no way
for another index to show up while you've still got that relation
open. That means that the planner and executor (which keep the
relevant relations open) don't ever have to worry about updating their
data structures, because it can never be necessary. It also means that
any code anywhere in the system that keeps a lock on a relation can
count on the list of indexes for that relation staying the same until
it releases the lock. In fact, it can hold on to pointers to data
allocated by the relcache and count on those pointers being stable for
as long as it holds the lock, and RelationClearRelation contains
specific code that aims to make sure that certain objects don't get
deallocated and reallocated at a different address precisely for that
reason. That code, however, only works as long as nothing actually
changes. The upshot is that it's entirely possible for changing
catalog entries in one backend with an inadequate lock level -- or at
unexpected point in the code -- to cause a core dump either in that
backend or in some other backend. This stuff is really subtle, and
super-easy to screw up.

I am speaking a bit generally here, because I haven't really studied
*exactly* what might go wrong in the relcache, or elsewhere, as a
result of creating an index on the fly. However, I'm very sure that a
general appeal to invalidation messages is not sufficient to make
something like what you want to do safe. Invalidation messages are a
complex, ancient, under-documented, fragile system for solving a very
specific problem that is not the one you are hoping they'll solve
here. They could justifiably be called magic, but it's not the sort of
magic where the fairy godmother waves her wand and solves all of your
problems; it's more like the kind where you go explore the forbidden
forest and are never seen or heard from again.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 29.01.2020 20:37, Pavel Stehule wrote:


It is hard to say; I see a significant difference. When I build an index on a regular table, I don't change the context of other processes: I have to wait for a lock, and after I get the lock, other processes wait.

With GTT, I don't want to wait for others - and other processes would have to build indexes internally, outside the expected sequence of operations. At this moment I prefer zero effect on other sessions.


Building a regular index requires two kinds of locks:
1. You have to lock pg_class to make changes in the system catalog.
2. You need to lock the heap relation to prevent concurrent updates while building the index.

GTT requires 1) but not 2).
Once a backend inserts the information about the new index into the system catalog, all other sessions may use it; the pg_class lock prevents any race condition here.
And building the index itself doesn't affect any other backends.
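In code terms, the distinction is roughly the following (a sketch only, not the patch's literal code):

/* Sketch: locks needed to build an index on relation heapRel. */
void
lock_for_index_build(Relation heapRel)
{
	/*
	 * 1. Catalog change: taken for regular tables and GTTs alike, so that
	 *    concurrent DDL on the catalog entry is serialized.
	 */
	LockRelationOid(RelationRelationId, RowExclusiveLock);

	/*
	 * 2. Data protection: block concurrent writers of the heap while the
	 *    index is built.  A GTT's heap is private to each backend, so this
	 *    lock is unnecessary for it.
	 */
	if (!RELATION_IS_GLOBAL_TEMP(heapRel))
		LockRelationOid(RelationGetRelid(heapRel), ShareLock);
}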


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Jan 30, 2020 at 9:45, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

Building a regular index requires two kinds of locks:
1. You have to lock pg_class to make changes in the system catalog.
2. You need to lock the heap relation to prevent concurrent updates while building the index.

GTT requires 1) but not 2).
Once a backend inserts the information about the new index into the system catalog, all other sessions may use it; the pg_class lock prevents any race condition here.
And building the index itself doesn't affect any other backends.

That is true. The difference for GTT is that (in your proposal) every other session has to build the index as an extra operation on top of its original plan.

Pavel




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 29.01.2020 21:16, Robert Haas wrote:
> On Wed, Jan 29, 2020 at 10:30 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>
> I think that the idea of calling ambuild() on the fly is not going to
> work, because, again, I don't think that calling that from random
> places in the code is safe.

It is not a random place in the code; actually it is just one place: _bt_getbuf.
Why would it be unsafe, if it affects only the backend's private data?
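The shape of that hook is roughly this (a simplified sketch; the helper names are hypothetical): on entry, _bt_getbuf would check whether this backend's private copy of the index exists and build it if not, before the normal buffer lookup.

/* Sketch: on-demand check performed before reading an index buffer. */
static void
gtt_index_check_init(Relation indexRel)
{
	/*
	 * If this backend has no private storage for the GTT index yet, build
	 * it now from the session's (possibly empty) heap data.  Only this
	 * backend's files are read and written.
	 */
	if (RELATION_IS_GLOBAL_TEMP(indexRel) &&
		!gtt_index_initialized(indexRel))		/* hypothetical helpers */
		gtt_build_index_on_demand(indexRel);
}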


> What I expect we're going to need to do here is model this on the
> approach used for unlogged tables. In the case of global temporary
> tables, I think that we should do the same kind of copying, but at the
> time when the session first tries to access the table. In any event,
> it's easier to find a context where copying some files on disk (that
> are certain not to be changing) is safe than it is to find a context
> where index builds are safe.

I do not think that the approach used for unlogged tables is good for GTT.
Unlogged tables have to be reinitialized only after a server restart, while
a GTT has to be initialized by each backend on demand.
It seems to me that the init fork is used for unlogged tables because the
recovery process does not have enough context to be able to reinitialize
tables and indexes; it is much safer and simpler for the recovery process
to just copy files. But the GTT case is different: heap and indexes can
easily be initialized by the backend using existing functions.

The approach of just calling btbuild is much simpler than what you propose
with creating extra forks and copying data from them.
You say it is not safe, but you have not explained why it is unsafe.
Yes, I agree that it is my responsibility to prove that it is safe,
and as I already wrote, I cannot provide such a proof now. I will be
pleased if you or anybody else can help convince me that this approach
is safe, or demonstrate problems with it.

Copying data from a fork doesn't help to give GTT indexes the same behavior
as regular indexes.
And from my point of view compatibility with regular tables is the most
important point in the GTT design.
If for some reason that is not possible, then we should think about other
solutions.
But right now I do not know of such problems. We have two working
prototypes of GTT. Certainly that does not mean the current
implementations are free of problems.
But I would really like to receive more constructive criticism than "this
approach is wrong because it is unsafe".

>> And all the "various backend-local data structures in the relcache,
>> the planner, and the executor that remember information about indexes"
>> have to be properly updated. That is done using the invalidation
>> mechanism.
> I mean, that's not really a valid argument. Invalidations can only
> take effect at certain points in the code, and the whole argument here
> is about which places in the code are safe for which operations. In
> particular, if you've got a relation open, there's currently no way
> for another index to show up while you've still got that relation
> open.
The same is true for GTT: right now, building a GTT index also locks the
relation.
That may not be absolutely needed, because the relation's data is local
and cannot be changed by any other backend.
But I have not added special handling of GTT here,
mostly because I want to follow the same path as with regular indexes and
prevent the possible problems which, as you mention, can happen
if we change the locking policy.


> I am speaking a bit generally here, because I haven't really studied
> *exactly* what might go wrong in the relcache, or elsewhere, as a
> result of creating an index on the fly. However, I'm very sure that a
> general appeal to invalidation messages is not sufficient to make
> something like what you want to do safe.

Actually, the index is not created on the fly.
The index is created in the usual way, by executing the CREATE INDEX command.
So all components of Postgres (planner, executor, ...) treat GTT
indexes in the same way as regular indexes.
The locking and invalidation policies are exactly the same for them.
The only difference is that the content of a GTT index is constructed on
demand from private backend data.
Is that safe or not? We are just reading data from local buffers/files and
writing it there.
Maybe I missed something, but I do not see any unsafety here.
There are issues with updating statistics, but they can be solved.

-- 

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 30.01.2020 12:23, Pavel Stehule wrote:

Building a regular index requires two kinds of lock:
1. You have to lock pg_class to make changes in the system catalog.
2. You need to lock the heap relation to prevent concurrent updates while building the index.

GTT requires 1) but not 2).
Once a backend inserts information about the new index in the system catalog, all other sessions may use it. The pg_class lock prevents any race condition here.
And building the index itself doesn't affect any other backends.
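To make the two lock kinds concrete, here is a minimal sketch, assuming PG13-era backend APIs (an illustrative example, not code from any posted patch):

    #include "postgres.h"
    #include "access/table.h"
    #include "storage/lockdefs.h"

    /*
     * Lock (2): CREATE INDEX opens the heap with ShareLock, which blocks
     * concurrent INSERT/UPDATE/DELETE while the index is built.  Lock (1),
     * on pg_class, is taken internally when the catalog row for the new
     * index is inserted.
     */
    static Relation
    open_heap_for_index_build(Oid heapOid)
    {
        return table_open(heapOid, ShareLock);
    }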

That is true. The difference for GTT is that any other session has to build the index (in your proposal) as an extra operation on top of the original plan.

What is "index"?
For most parts of Postgres it is just an entry in system catalog.
And only executor deals with its particular implementation and content.

My point is that if we process GTT index metadata in the same way as regular index metadata,
then there will be no differences for the postgres between GTT and regular indexes.
And we can provide the same behavior.

Concerning actual content of the index - it is local to the backend and it is safe to construct it a t any point of time (on demand).
It depends only on private data and can not be somehow affected by other backends (taken in account that we preserve locking policy of regular tables).
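As a concrete sketch of what "constructed on demand" could look like, assuming PG13-era APIs (the helper name and placement are hypothetical, not taken from the patch):

    #include "postgres.h"
    #include "catalog/index.h"
    #include "utils/rel.h"

    /*
     * Hypothetical helper: populate the backend-local storage of a GTT
     * index from the backend-local heap, reusing the normal index build
     * path.  Only private buffers/files are read and written here.
     */
    static void
    gtt_build_index_on_demand(Relation heapRel, Relation indexRel)
    {
        IndexInfo *indexInfo = BuildIndexInfo(indexRel);

        index_build(heapRel, indexRel, indexInfo, false, false);
    }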


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Jan 30, 2020 at 10:44, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 30.01.2020 12:23, Pavel Stehule wrote:

Building a regular index requires two kinds of lock:
1. You have to lock pg_class to make changes in the system catalog.
2. You need to lock the heap relation to prevent concurrent updates while building the index.

GTT requires 1) but not 2).
Once a backend inserts information about the new index in the system catalog, all other sessions may use it. The pg_class lock prevents any race condition here.
And building the index itself doesn't affect any other backends.

That is true. The difference for GTT is that any other session has to build the index (in your proposal) as an extra operation on top of the original plan.

What is "index"?
For most parts of Postgres it is just an entry in system catalog.
And only executor deals with its particular implementation and content.

My point is that if we process GTT index metadata in the same way as regular index metadata,
then there will be no differences for the postgres between GTT and regular indexes.
And we can provide the same behavior.

There should be a difference - an index on a regular table is created by one process. The same is not possible on a GTT, so there has to be a difference every time.

You can reduce some differences, but at minimum Robert and I are not comfortable with it. Starting an index build from a routine that is used for reading a buffer does not look right. I can accept some strangeness, but I need to feel that it is necessary, and I don't think it is necessary in this case.

Regards

Pavel


Concerning the actual content of the index - it is local to the backend, and it is safe to construct it at any point in time (on demand).
It depends only on private data and cannot be affected by other backends (taking into account that we preserve the locking policy of regular tables).


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 30.01.2020 12:52, Pavel Stehule wrote:


On Thu, Jan 30, 2020 at 10:44, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 30.01.2020 12:23, Pavel Stehule wrote:

Building a regular index requires two kinds of lock:
1. You have to lock pg_class to make changes in the system catalog.
2. You need to lock the heap relation to prevent concurrent updates while building the index.

GTT requires 1) but not 2).
Once a backend inserts information about the new index in the system catalog, all other sessions may use it. The pg_class lock prevents any race condition here.
And building the index itself doesn't affect any other backends.

That is true. The difference for GTT is that any other session has to build the index (in your proposal) as an extra operation on top of the original plan.

What is "index"?
For most parts of Postgres it is just an entry in system catalog.
And only executor deals with its particular implementation and content.

My point is that if we process GTT index metadata in the same way as regular index metadata,
then there will be no differences for the postgres between GTT and regular indexes.
And we can provide the same behavior.

There should be a difference - an index on a regular table is created by one process. The same is not possible on a GTT, so there has to be a difference every time.

The metadata of a GTT index is also created by one process. And the actual content of the index is not interesting to most parts of Postgres.


You can reduce some differences, but at minimum Robert and I are not comfortable with it. Starting an index build from a routine that is used for reading a buffer does not look right. I can accept some strangeness, but I need to feel that it is necessary, and I don't think it is necessary in this case.

Sorry, but "don't feel it well" and "doesn't look well" sound more like literary criticism than code review ;)
Yes, I agree that it is unnatural to call btbuild from _bt_getbuf. But what can go wrong here?

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Jan 30, 2020 at 11:02, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 30.01.2020 12:52, Pavel Stehule wrote:


On Thu, Jan 30, 2020 at 10:44, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 30.01.2020 12:23, Pavel Stehule wrote:

Building a regular index requires two kinds of lock:
1. You have to lock pg_class to make changes in the system catalog.
2. You need to lock the heap relation to prevent concurrent updates while building the index.

GTT requires 1) but not 2).
Once a backend inserts information about the new index in the system catalog, all other sessions may use it. The pg_class lock prevents any race condition here.
And building the index itself doesn't affect any other backends.

That is true. The difference for GTT is that any other session has to build the index (in your proposal) as an extra operation on top of the original plan.

What is "index"?
For most parts of Postgres it is just an entry in system catalog.
And only executor deals with its particular implementation and content.

My point is that if we process GTT index metadata in the same way as regular index metadata,
then there will be no differences for the postgres between GTT and regular indexes.
And we can provide the same behavior.

There should be a difference - an index on a regular table is created by one process. The same is not possible on a GTT, so there has to be a difference every time.

The metadata of a GTT index is also created by one process. And the actual content of the index is not interesting to most parts of Postgres.


You can reduce some differences, but at minimum Robert and I are not comfortable with it. Starting an index build from a routine that is used for reading a buffer does not look right. I can accept some strangeness, but I need to feel that it is necessary, and I don't think it is necessary in this case.

Sorry, but "don't feel it well" and "doesn't look well" sound more like literary criticism than code review ;)

The design is subjective. I am sure your solution can work, like mine or any other. But I am not sure your solution is good for practical usage.

Yes, I agree that it is unnatural to call btbuild from _bt_getbuf. But what can go wrong here?

Creating an index as a side effect of reading a table - that side effect is just too big.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 29, 2020 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Tue, Jan 28, 2020 at 18:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:


On Tue, Jan 28, 2020 at 18:12, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 29, 2020 at 12:40 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Tue, Jan 28, 2020 at 17:01, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 24, 2020 at 4:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Jan 11, 2020 at 8:51 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
I proposed just ignoring those new indexes because it seems much simpler
than alternative solutions that I can think of, and it's not like those
other solutions don't have other issues.

+1.
I have completed the implementation of this feature.
When a session X creates an index idx_a on GTT A:
For session X, idx_a is valid as soon as CREATE INDEX completes.
For session Y, if GTT A already contained some data before session X finished creating the index, idx_a is invalid.
For session Z, if GTT A contained no data before session X finished creating the index, idx_a is valid.


For example, I've looked at the "on demand" building as implemented in
global_private_temp-8.patch, and adding a bunch of index build
calls into various places in the index code seems somewhat suspicious.

+1. I can't imagine that's a safe or sane thing to do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Opinion by Pavel
+ rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best
I renamed rd_islocaltemp

I don't see any change?
I renamed rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch

ok :)

I found a bug

postgres=# create global temp table x(a int);
CREATE TABLE
postgres=# insert into x values(1);
INSERT 0 1
postgres=# create index on x (a);
CREATE INDEX
postgres=# create index on x((a + 1));
CREATE INDEX
postgres=# analyze x;
WARNING:  oid 16468 not a relation
ANALYZE
The bug has been fixed in global_temporary_table_v9-pg13.patch



Wenjing





The other behavior looks good to me.

Regards

Pavel


Opinion by Konstantin Knizhnik
1. Fixed comments
2. Fixed assertion


Please help me review.


Wenjing




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Jan 29, 2020 at 9:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>>> Opinion by Pavel
>>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best
>>> I renamed rd_islocaltemp
>>
>> I don't see any change?
>>
>> I renamed rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch
>
> In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
> that this has approximately a 0% chance of being acceptable. If you're
> setting a field in a way that is inconsistent with the current use of
> the field, you're probably doing it wrong, because the field has an
> existing purpose to which new code must conform. And if you're not
> doing that, then you don't need to rename it.
Thank you for pointing it out.
I've rolled back the rename.
But I still need rd_islocaltemp to be true. The reasons are:
1. A GTT needs to support DML in read-only transactions, like a local temp table.
2. A GTT does not need to hold the lock before modifying an index buffer, also like a local temp table.

Please give me feedback.


Wenjing




> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Jan 30, 2020 at 15:17, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 29, 2020 at 9:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>>> Opinion by Pavel
>>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best
>>> I renamed rd_islocaltemp
>>
>> I don't see any change?
>>
>> I renamed rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch
>
> In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
> that this has approximately a 0% chance of being acceptable. If you're
> setting a field in a way that is inconsistent with the current use of
> the field, you're probably doing it wrong, because the field has an
> existing purpose to which new code must conform. And if you're not
> doing that, then you don't need to rename it.
Thank you for pointing it out.
I've rolled back the rename.
But I still need rd_islocaltemp to be true. The reasons are:
1. A GTT needs to support DML in read-only transactions, like a local temp table.
2. A GTT does not need to hold the lock before modifying an index buffer, also like a local temp table.

Please give me feedback.

maybe something like

rel->rd_globaltemp = true;

and somewhere else

if (rel->rd_islocaltemp || rel->rd_globaltemp)
{
  ...
}



Wenjing




>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Thu, Jan 30, 2020 at 4:33 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> On 29.01.2020 21:16, Robert Haas wrote:
> > On Wed, Jan 29, 2020 at 10:30 AM Konstantin Knizhnik
> > <k.knizhnik@postgrespro.ru> wrote:
> >
> > I think that the idea of calling ambuild() on the fly is not going to
> > work, because, again, I don't think that calling that from random
> > places in the code is safe.
>
> It is not a random place in the code.
> Actually it is just one place - _bt_getbuf.
> Why can it be unsafe if it affects only private backend data?

Because, as I already said, not every operation is safe at every point
in the code. This is true even when there's no concurrency involved.
For example, executing user-defined code is not safe while holding a
buffer lock, because the user-defined code might try to do something
that locks the same buffer, which would cause an undetected,
uninterruptible deadlock.

> But the GTT case is different. Heap and indexes can be easily initialized by
> the backend using existing functions.

That would be nice if we could make it work. Someone would need to
show, however, that it's safe.

> You say that it is not safe. But you have not explained why it is unsafe.
> Yes, I agree that it is my responsibility to prove that it is safe.
> And as I already wrote, I cannot provide such proof now. I will be
> pleased if you or anybody else can help show that this approach
> is safe or demonstrate problems with this approach.

That's fair, but nobody's obliged to spend time on that.

> But I would really like to receive more constructive criticism rather than "this
> approach is wrong because it is unsafe".

I'm sure, and that's probably valid. Equally, however, I'd like to
receive more analysis of why it is safe than "I don't see anything
wrong with it so it's probably fine." And I think that's pretty valid,
too.

> Actually the index is not created on the fly.
> The index is created in the usual way, by executing the "create index" command.
> So all components of Postgres (planner, executor, ...) treat GTT
> indexes in the same way as regular indexes.
> Locking and invalidation policies are exactly the same for them.
> The only difference is that the content of a GTT index is constructed on
> demand using private backend data.
> Is it safe or not? We are just reading data from local buffers/files and
> writing it back.
> Maybe I missed something, but I do not see any unsafety here.
> There are issues with updating statistics, but they can be solved.

But that's not all you are doing. To build the index, you'll have to
sort the data. To sort the data, you'll have to call btree support
functions. Those functions can be user-defined, and can do complex
operations like catalog access that depend on a good transaction
state, no buffer locks being held, and nothing already in progress
within this backend that can get confused as a result of this
operation.

Just as a quick test, I tried doing this in _bt_getbuf:

+    if (InterruptHoldoffCount != 0)
+        elog(WARNING, "INTERRUPTS ARE HELD OFF");

That causes 103 out of 196 regression tests to fail, which means that
it's pretty common to arrive in _bt_getbuf() with interrupts held off.
At the very least, that means that the index build would be
uninterruptible, which already seems unacceptable. Probably, it means
that the calling code is already holding an LWLock, because that's
normally what causes HOLD_INTERRUPTS() to happen. And as I have
already explained, that is super-problematic, because of deadlock
risks, and because it risks putting other backends into
non-interruptible waits if they should happen to need the LWLock we're
holding here.

I really don't understand why the basic point here remains obscure. In
general, it tends to be unsafe to call high-level code from low-level
code, not just in PostgreSQL but in pretty much any program. Do you
think that we can safely add a GUC that executes a user-defined SQL
query every time an LWLock is acquired? If you do, why don't you try
adding code to do that to LWLockAcquire and testing it out a little
bit? Try making the SQL query do something like query pg_class, find a
table name that's not in use, and create a table by that name. Then
run the regression tests with the GUC set to run that query and see
how it goes. I always hate to say that things are "obvious," because
what's obvious to me may not be obvious to somebody else, but it is
clear to me, at least, that this has no chance of working. Even though
I can't say exactly what will break, or what will break first, I'm
very sure that a lot of things will break and that most of them are
unfixable.

Now, your idea is not quite as crazy as that, but it has the same
basic problem: you can't insert code into a low-level facility that
uses a high level facility which may in turn use and depend on that
very same low-level facility to not be in the middle of an operation.
If you do, it's going to break somehow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 30, 2020 at 10:21 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Thu, Jan 30, 2020 at 15:17, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 29, 2020 at 9:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>>> Opinion by Pavel
>>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best
>>> I renamed rd_islocaltemp
>>
>> I don't see any change?
>>
>> I renamed rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch
>
> In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
> that this has approximately a 0% chance of being acceptable. If you're
> setting a field in a way that is inconsistent with the current use of
> the field, you're probably doing it wrong, because the field has an
> existing purpose to which new code must conform. And if you're not
> doing that, then you don't need to rename it.
Thank you for pointing it out.
I've rolled back the rename.
But I still need rd_islocaltemp to be true. The reasons are:
1. A GTT needs to support DML in read-only transactions, like a local temp table.
2. A GTT does not need to hold the lock before modifying an index buffer, also like a local temp table.

Please give me feedback.

maybe something like

rel->rd_globaltemp = true;

and somewhere else

if (rel->rd_islocaltemp || rel->rd_globaltemp)
{
  ...
}
I tried to optimize the code in global_temporary_table_v10-pg13.patch


Please give me feedback.

Wenjing

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Jan 27, 2020 at 5:38 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 25.01.2020 18:15, 曾文旌(义从) wrote:
I wonder why we need some special check for GTT here.
From my point of view, cleanup at startup of local storage of temp tables should be performed in the same way for local and global temp tables.
After an OOM kill, the isolated local temp table will be cleaned up in autovacuum like orphan temporary tables: the definition of the local temp table is deleted together with the storage file.
But GTT cannot do that. So we have this implementation in my patch.
If you have other solutions, please let me know.

I wonder if it is possible that autovacuum or some other Postgres process is killed by OOM without the postmaster noticing it, so that it doesn't restart the Postgres instance?
As far as I know, a crash of any process connected to Postgres shared memory (and autovacuum definitely has such a connection) causes a Postgres restart.
The postmaster itself will not restart after an OOM happens, but the startup process will run, and GTT data files are cleaned up in the startup process.


In my design:
1. Because different sessions have different transaction information, I chose to store the transaction information of GTT in MyProc, not in the catalog.
2. As for the XID wraparound problem, the reason is that the design of temp table storage (local temp table and global temp table) makes it impossible for autovacuum to vacuum them.
It should be completely solved at the storage level.


My point of view is that vacuuming of temp tables is a common problem for local and global temp tables.
So it has to be addressed in a common way, and we should not try to fix this problem only for GTT.
I think I agree with you on this point.
However, this does not mean that storing GTT transaction information in pg_class is correct.
If you keep it that way, as in global_private_temp-8.patch, it may cause data loss in GTT after autovacuum.



In fact, the DBA can still perform all DDL on a GTT.
I've provided a set of functions for this case.
If the DBA needs to modify a GTT A (or drop the GTT, or create an index on it), he needs to:
1. Use the pg_gtt_attached_pids view to list the pids of the sessions that are using GTT A.
2. Use pg_terminate_backend(pid) to terminate them, except his own session.
3. Alter GTT A.

IMHO forced termination of client sessions is not an acceptable solution.
And it is not an absolutely necessary requirement.
So from my point of view we should not add such limitations to the GTT design.
This limitation is what makes it possible to support all DDL on a GTT.
IMHO even Oracle's GTT has similar limitations.





What are the reasons for using RowExclusiveLock for GTT instead of AccessExclusiveLock?
Yes, GTT data is accessed by only one backend, so no locking here seems to be needed at all.
But I wonder what the motivations/benefits of using a weaker lock level here are?
1. TRUNCATE on a GTT deletes only the data in the current session, so there is no need to use a high-level lock.
2. I think it still needs to be blocked by DDL on the GTT, which is why I use RowExclusiveLock.

Sorry, I do not understand your arguments: we do not need an exclusive lock because we drop only local (private) data,
but we do need some kind of lock. I agree with 1) and not 2).
Yes, we don't need a lock for the private data, but the metadata needs one.


There should be no conflicts in any case...

+        /* We allow to create index on global temp table only this session use it */
+        if (is_other_backend_use_gtt(heapRelation->rd_node))
+            elog(ERROR, "can not create index when have other backend attached this global temp table");
+

The same argument as in the case of dropping GTT: I do not think that prohibiting DDL operations on a GTT used by more than one backend is a bad idea.
The idea was to give the GTT almost all the features of a regular table with few code changes.
In the current version the DBA can still do all DDL on a GTT, as I've already described.

I absolutely agree with you that GTT should be given the same features as regular tables.
The irony is that this most natural and convenient behavior is easiest to implement without adding extra restrictions.
Just let indexes for GTT be constructed on demand. It can be done using the same function used for regular index creation.
The limitation on index creation has been improved in global_temporary_table_v10-pg13.patch.





+    /* global temp table not support foreign key constraint yet */
+    if (RELATION_IS_GLOBAL_TEMP(pkrel))
+        ereport(ERROR,
+                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                 errmsg("referenced relation \"%s\" is not a global temp table",
+                        RelationGetRelationName(pkrel))));
+

Why do we need to prohibit foreign key constraints on GTT?
It may be possible to support FK on GTT in later versions. Before that, I need to check some code.

Ok, maybe the approach of prohibiting everything except the minimally required functionality is safe and reliable.
But frankly speaking I prefer a different approach: if I do not see any contradiction between a new feature and existing operations,
and it is passing the tests, then we should not prohibit those operations for the new feature.


I have already described my point in previous emails.

1. The core problem is that the data contains transaction information (xid), which needs to be vacuumed (frozen) regularly to avoid running out of xids.
Autovacuum supports vacuuming regular tables, but not local temp tables. Autovacuum also does not support GTT.

2. However, the difference between the local temp table and the global temp table (GTT) is that:
a) For a local temp table: one table has one copy of data. The frozenxid of one local temp table is stored in the catalog (pg_class).
b) For a global temp table: each session has a separate copy of data, so one GTT may have up to MaxBackends frozenxids,
and I don't think it's a good idea to keep the frozenxids of a GTT in the catalog (pg_class).
It becomes a question: how to handle GTT transaction information?
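As an illustrative sketch of the MyProc-based bookkeeping described above (the PGPROC field name here is hypothetical, not the patch's actual structure; assuming PG13-era APIs):

    #include "postgres.h"
    #include "access/transam.h"
    #include "storage/proc.h"

    /*
     * Hypothetical helper: remember the oldest relfrozenxid among all GTTs
     * used by this session in its PGPROC entry, so vacuum can compute a
     * safe global cutoff for clog truncation without relying on pg_class.
     * oldestGTTFrozenXid is an assumed extra PGPROC field, not an existing one.
     */
    static void
    gtt_update_session_frozenxid(TransactionId gttFrozenXid)
    {
        if (!TransactionIdIsValid(MyProc->oldestGTTFrozenXid) ||
            TransactionIdPrecedes(gttFrozenXid, MyProc->oldestGTTFrozenXid))
            MyProc->oldestGTTFrozenXid = gttFrozenXid;
    }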

I agree that problem 1 should be completely solved by some separate feature, such as local transactions. That is definitely not included in the GTT patch.
But I think we need to ensure the durability of GTT data. For example, data in a GTT must not be lost because the clog got truncated. That belongs to problem 2.

For problem 2:
If we ignore the frozenxid of GTT, then when vacuum truncates the clog that the GTT still needs, the GTT data in some sessions is completely lost.
Perhaps we could consider letting autovacuum terminate sessions that contain "too old" data,
but that's not very friendly, so I didn't choose to implement it in the first version.
Maybe you have a better idea.

Sorry, I do not have a better idea.
I prefer not to address this problem in the first version of the patch at all.
The frozen xid of a temp table is never changed unless the user explicitly invokes vacuum on it.
I do not think that anybody is doing that (because it contains temporary data which is not expected to live a long time).
Certainly it is possible to imagine a situation where a session uses a GTT to store some local data which is valid during the whole session lifetime (which can be long enough).
But I am not sure that it is a popular scenario.
With global_private_temp-8.patch, think about:
1. Session X takes several hours doing some statistical work with GTT A, which generated some data using transaction 100. The work is not over.
2. Then session Y vacuums A, and the GTT's relfrozenxid (in pg_class) is updated to 10000000.
3. Then autovacuum happens, and the clog before 10000000 is cleaned up.
4. The data in session X could be lost due to the missing clog, and the analysis task fails.

However, this is likely to happen because you allowed vacuum on the GTT.
And this is not a common problem; it does not happen with local temp tables.
I feel uneasy about leaving such a question open. We can improve it.




-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Sat, Feb 1, 2020 at 14:39, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 30, 2020 at 10:21 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Thu, Jan 30, 2020 at 15:17, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 29, 2020 at 9:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>>> Opinion by Pavel
>>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best
>>> I renamed rd_islocaltemp
>>
>> I don't see any change?
>>
>> I renamed rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch
>
> In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
> that this has approximately a 0% chance of being acceptable. If you're
> setting a field in a way that is inconsistent with the current use of
> the field, you're probably doing it wrong, because the field has an
> existing purpose to which new code must conform. And if you're not
> doing that, then you don't need to rename it.
Thank you for pointing it out.
I've rolled back the rename.
But I still need rd_islocaltemp to be true. The reasons are:
1. A GTT needs to support DML in read-only transactions, like a local temp table.
2. A GTT does not need to hold the lock before modifying an index buffer, also like a local temp table.

Please give me feedback.

maybe something like

rel->rd_globaltemp = true;

and somewhere else

if (rel->rd_islocaltemp || rel->rd_globaltemp)
{
  ...
}
I tried to optimize the code in global_temporary_table_v10-pg13.patch


Please give me feedback.

I tested this patch and I don't have any objections - from my user perspective it works as I expect

+#define RELATION_IS_TEMP(relation) \
+ ((relation)->rd_islocaltemp || \
+ (relation)->rd_rel->relpersistence == RELPERSISTENCE_GLOBAL_TEMP)

It looks a little bit unbalanced.

Maybe it is better to inject rd_isglobaltemp into the relation structure, and then it should look like:

+#define RELATION_IS_TEMP(relation) \
+ ((relation)->rd_islocaltemp || \
+ (relation)->rd_isglobaltemp)

But I have no idea if it helps in the complex case.







Wenjing

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 31.01.2020 22:38, Robert Haas wrote:
> Now, your idea is not quite as crazy as that, but it has the same
> basic problem: you can't insert code into a low-level facility that
> uses a high level facility which may in turn use and depend on that
> very same low-level facility to not be in the middle of an operation.
> If you do, it's going to break somehow.
>

Thank you for the explanation.
You convinced me that building indexes from _bt_getbuf is not a good idea.
What do you think about the idea of checking and building indexes for GTT prior to 
query execution?

In this case we do not need to patch the code of all indexes - it can be 
done in just one place.
We can use the build function of the access method to initialize the index and 
populate it with data.

So right now, when building the query execution plan, the optimizer checks if an 
index is valid.
If the index belongs to a GTT, it can check whether the first page of the index is 
initialized and, if not, call the build method for this index.
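A rough sketch of that check, assuming PG13-era storage manager APIs (the helper name and exact placement are hypothetical):

    #include "postgres.h"
    #include "storage/smgr.h"
    #include "utils/rel.h"

    /*
     * Hypothetical helper: a GTT index that has never been built in this
     * backend has no pages in its backend-local storage yet.
     */
    static bool
    gtt_index_is_built(Relation indexRel)
    {
        RelationOpenSmgr(indexRel);
        return smgrnblocks(indexRel->rd_smgr, MAIN_FORKNUM) > 0;
    }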

If building the index while building the query plan is not desirable, we can 
just construct a list of indexes which should be checked, and
perform the check itself and build the indexes somewhere after building the plan 
but before execution of the query.

Do you see some problems with such an approach?


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 01.02.2020 19:14, 曾文旌(义从) wrote:


On Jan 27, 2020 at 5:38 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 25.01.2020 18:15, 曾文旌(义从) wrote:
I wonder why we need some special check for GTT here.
From my point of view, cleanup at startup of local storage of temp tables should be performed in the same way for local and global temp tables.
After an OOM kill, the isolated local temp table will be cleaned up in autovacuum like orphan temporary tables: the definition of the local temp table is deleted together with the storage file.
But GTT cannot do that. So we have this implementation in my patch.
If you have other solutions, please let me know.

I wonder if it is possible that autovacuum or some other Postgres process is killed by OOM without the postmaster noticing it, so that it doesn't restart the Postgres instance?
As far as I know, a crash of any process connected to Postgres shared memory (and autovacuum definitely has such a connection) causes a Postgres restart.
The postmaster itself will not restart after an OOM happens, but the startup process will run, and GTT data files are cleaned up in the startup process.

Yes, exactly.
But it is still not clear to me why we need some special handling for GTT.
Shared memory is reinitialized and the storage of temporary tables is removed.
This is true for both local and global temp tables.



In my design:
1. Because different sessions have different transaction information, I chose to store the transaction information of GTT in MyProc, not in the catalog.
2. As for the XID wraparound problem, the reason is that the design of temp table storage (local temp table and global temp table) makes it impossible for autovacuum to vacuum them.
It should be completely solved at the storage level.


My point of view is that vacuuming of temp tables is a common problem for local and global temp tables.
So it has to be addressed in a common way, and we should not try to fix this problem only for GTT.
I think I agree with you on this point.
However, this does not mean that storing GTT transaction information in pg_class is correct.
If you keep it that way, as in global_private_temp-8.patch, it may cause data loss in GTT after autovacuum.

In my patch autovacuum is prohibited for GTT.

IMHO forced termination of client sessions is not an acceptable solution.
And it is not an absolutely necessary requirement.
So from my point of view we should not add such limitations to the GTT design.
This limitation is what makes it possible to support all DDL on a GTT.
IMHO even Oracle's GTT has similar limitations.

I have checked that Oracle does not prevent creation of an index on a GTT if there are active sessions working with this table. And this index becomes visible to all these sessions.


With global_private_temp-8.patch, think about:
1. Session X takes several hours doing some statistical work with GTT A, which generated some data using transaction 100. The work is not over.
2. Then session Y vacuums A, and the GTT's relfrozenxid (in pg_class) is updated to 10000000.
3. Then autovacuum happens, and the clog before 10000000 is cleaned up.
4. The data in session X could be lost due to the missing clog, and the analysis task fails.

However, this is likely to happen because you allowed vacuum on the GTT.
And this is not a common problem; it does not happen with local temp tables.
I feel uneasy about leaving such a question open. We can improve it.


Maybe the easiest solution is to prohibit explicit vacuum of GTT?

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 2, 2020 at 2:00 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Sat, Feb 1, 2020 at 14:39, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 30, 2020 at 10:21 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Thu, Jan 30, 2020 at 15:17, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 29, 2020 at 9:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>>> Opinion by Pavel
>>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best
>>> I renamed rd_islocaltemp
>>
>> I don't see any change?
>>
>> I renamed rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch
>
> In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
> that this has approximately a 0% chance of being acceptable. If you're
> setting a field in a way that is inconsistent with the current use of
> the field, you're probably doing it wrong, because the field has an
> existing purpose to which new code must conform. And if you're not
> doing that, then you don't need to rename it.
Thank you for pointing it out.
I've rolled back the rename.
But I still need rd_islocaltemp to be true. The reasons are:
1. A GTT needs to support DML in read-only transactions, like a local temp table.
2. A GTT does not need to hold the lock before modifying an index buffer, also like a local temp table.

Please give me feedback.

maybe something like

rel->rd_globaltemp = true;

and somewhere else

if (rel->rd_islocaltemp || rel->rd_globaltemp)
{
  ...
}
I tried to optimize the code in global_temporary_table_v10-pg13.patch


Please give me feedback.

I tested this patch and I don't have any objections - from my user perspective it works as I expect

+#define RELATION_IS_TEMP(relation) \
+ ((relation)->rd_islocaltemp || \
+ (relation)->rd_rel->relpersistence == RELPERSISTENCE_GLOBAL_TEMP)

It looks a little bit unbalanced.

Maybe it is better to inject rd_isglobaltemp into the relation structure, and then it should look like:

+#define RELATION_IS_TEMP(relation) \
+ ((relation)->rd_islocaltemp || \
+ (relation)->rd_isglobaltemp)

But I have no idea if it helps in the complex case.
In my opinion:
For a local temp table we need (relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP,
and because one local temp table belongs to only one session, we need to mark rd_islocaltemp = true in that session and rd_islocaltemp = false in the others.

But for GTT, we just need (relation)->rd_rel->relpersistence == RELPERSISTENCE_GLOBAL_TEMP.
One GTT can be used by every session, so there is no need for rd_isglobaltemp. It seems duplicated and redundant.








Wenjing

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Mon, Feb 3, 2020 at 14:03, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 2, 2020 at 2:00 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Sat, Feb 1, 2020 at 14:39, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Jan 30, 2020 at 10:21 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Thu, Jan 30, 2020 at 15:17, 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 29, 2020 at 9:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>>> Opinion by Pavel
>>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of the field "rd_islocaltemp" is probably not the best
>>> I renamed rd_islocaltemp
>>
>> I don't see any change?
>>
>> I renamed rd_islocaltemp to rd_istemp in global_temporary_table_v8-pg13.patch
>
> In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
> that this has approximately a 0% chance of being acceptable. If you're
> setting a field in a way that is inconsistent with the current use of
> the field, you're probably doing it wrong, because the field has an
> existing purpose to which new code must conform. And if you're not
> doing that, then you don't need to rename it.
Thank you for pointing it out.
I've rolled back the rename.
But I still need rd_islocaltemp to be true. The reasons are:
1. A GTT needs to support DML in read-only transactions, like a local temp table.
2. A GTT does not need to hold the lock before modifying an index buffer, also like a local temp table.

Please give me feedback.

maybe something like

rel->rd_globaltemp = true;

and somewhere else

if (rel->rd_islocaltemp || rel->rd_globaltemp)
{
  ...
}
I tried to optimize the code in global_temporary_table_v10-pg13.patch


Please give me feedback.

I tested this patch and I don't have any objections - from my user perspective it works as I expect

+#define RELATION_IS_TEMP(relation) \
+ ((relation)->rd_islocaltemp || \
+ (relation)->rd_rel->relpersistence == RELPERSISTENCE_GLOBAL_TEMP)

It looks a little bit unbalanced.

Maybe it is better to inject rd_isglobaltemp into the relation structure, and then it should look like:

+#define RELATION_IS_TEMP(relation) \
+ ((relation)->rd_islocaltemp || \
+ (relation)->rd_isglobaltemp)

But I have no idea if it helps in the complex case.
In my opinion:
For a local temp table we need (relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP,
and because one local temp table belongs to only one session, we need to mark rd_islocaltemp = true in that session and rd_islocaltemp = false in the others.

So it indicates whether the table is assigned to the current session or not. At this moment I think the name "islocaltemp" is not the best - because there can be a "local temp table" that has this value false.

The name should better describe whether this table is attached only to the current session and not to others, or is accessed by all sessions.

In this case I can understand why it is possible to write rel->rd_islocaltemp = true for GTT. But it is a signal that rd_islocaltemp is not a good name, and rd_istemp is not a good name either.



But for GTT, we just need (relation)->rd_rel->relpersistence == RELPERSISTENCE_GLOBAL_TEMP.
One GTT can be used by every session, so there is no need for rd_isglobaltemp. It seems duplicated and redundant.

I didn't understand the semantics of rd_islocaltemp well, so my ideas on this topic were not good. Now I think rd_islocaltemp is not a good name and could be renamed if somebody finds a better one. "istemptable" is not good either, because the important thing is whether the relation is attached to the session or not.










Wenjing

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 3, 2020 at 4:16 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 01.02.2020 19:14, 曾文旌(义从) wrote:


On Jan 27, 2020 at 5:38 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 25.01.2020 18:15, 曾文旌(义从) wrote:
I wonder why we need some special check for GTT here.
From my point of view, cleanup at startup of local storage of temp tables should be performed in the same way for local and global temp tables.
After an OOM kill, the isolated local temp table will be cleaned up in autovacuum like orphan temporary tables: the definition of the local temp table is deleted together with the storage file.
But GTT cannot do that. So we have this implementation in my patch.
If you have other solutions, please let me know.

I wonder if it is possible that autovacuum or some other Postgres process is killed by OOM without the postmaster noticing it, so that it doesn't restart the Postgres instance?
As far as I know, a crash of any process connected to Postgres shared memory (and autovacuum definitely has such a connection) causes a Postgres restart.
The postmaster itself will not restart after an OOM happens, but the startup process will run, and GTT data files are cleaned up in the startup process.

Yes, exactly.
But it is still not clear to me why we need some special handling for GTT.
Shared memory is reinitialized and the storage of temporary tables is removed.
This is true for both local and global temp tables.
Of course not. The local temp table cleans up the entire table (including catalog buffer and datafile). GTT does not.




In my design:
1. Because different sessions have different transaction information, I chose to store the transaction information of GTT in MyProc, not in the catalog.
2. As for the XID wraparound problem, the reason is that the design of temp table storage (local temp table and global temp table) makes it impossible for autovacuum to vacuum them.
It should be completely solved at the storage level.


My point of view is that vacuuming of temp tables is a common problem for local and global temp tables.
So it has to be addressed in a common way, and we should not try to fix this problem only for GTT.
I think I agree with you on this point.
However, this does not mean that storing GTT transaction information in pg_class is correct.
If you keep it that way, as in global_private_temp-8.patch, it may cause data loss in GTT after autovacuum.

In my patch autovacuum is prohibited for GTT.
But vacuum of GTT is not prohibited.


IMHO forced termination of client sessions is not an acceptable solution.
And it is not an absolutely necessary requirement.
So from my point of view we should not add such limitations to the GTT design.
This limitation is what makes it possible to support all DDL on a GTT.
IMHO even Oracle's GTT has similar limitations.

I have checked that Oracle does not prevent creation of an index on a GTT if there are active sessions working with this table. And this index becomes visible to all these sessions.
1. Yes. The creation of an index on a GTT has been improved in global_temporary_table_v10-pg13.patch.
2. But ALTER on a GTT, DROP of a GTT, and DROP INDEX on a GTT are blocked by other sessions:

SQL> drop table gtt;
drop table gtt
           *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already
in use


SQL> ALTER TABLE gtt add b int ; 
ALTER TABLE gtt add b int
*
ERROR at line 1:
ORA-14450: attempt to access a transactional temp table already in use

SQL> drop index idx_gtt;
drop index idx_gtt
           *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already
in use

I'm not saying we should do this, but from an implementation perspective we face similar issues.
If a DBA needs to change a GTT, he can still do it. Therefore, I think this is acceptable.



With global_private_temp-8.patch, think about:
1. Session X takes several hours doing some statistical work with GTT A, which generated some data using transaction 100. The work is not over.
2. Then session Y vacuums A, and the GTT's relfrozenxid (in pg_class) is updated to 10000000.
3. Then autovacuum happens, and the clog before 10000000 is cleaned up.
4. The data in session X could be lost due to the missing clog, and the analysis task fails.

However, this is likely to happen because you allowed vacuum on the GTT.
And this is not a common problem; it does not happen with local temp tables.
I feel uneasy about leaving such a question open. We can improve it.


Maybe the easiest solution is to prohibit explicit vacuum of GTT?
I think vacuum is an important part of GTT.

Looking back at previous emails, Robert once said that vacuuming GTT is pretty important.
https://www.postgresql.org/message-id/CA%2BTgmob%3DL1k0cpXRcipdsaE07ok%2BOn%3DtTjRiw7FtD_D2T%3DJwhg%40mail.gmail.com


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 04.02.2020 18:01, 曾文旌(义从) wrote:



Yes, exactly.
But it is still not clear to me why we need some special handling for GTT.
Shared memory is reinitialized and the storage of temporary tables is removed.
This is true for both local and global temp tables.
Of course not. The local temp table cleans up the entire table (including catalog buffer and datafile). GTT does not.


What do you mean by "catalog buffer"?
Yes, cleanup of a local temp table requires deletion of the corresponding entry from the catalog, and GTT should not do that.
But I am speaking only about the cleanup of the data files of temp relations. That is done in the same way for local and global temp tables.


In my patch autovacuum is prohibited for GTT.
But vacuum of GTT is not prohibited.

Yes, but the simplest solution is to also prohibit explicit vacuum of GTT, isn't it?


IMHO forced termination of client sessions is not an acceptable solution.
And it is not an absolutely necessary requirement.
So from my point of view we should not add such limitations to the GTT design.
This limitation is what makes it possible to support all DDL on a GTT.
IMHO even Oracle's GTT has similar limitations.

I have checked that Oracle does not prevent creation of an index on a GTT if there are active sessions working with this table. And this index becomes visible to all these sessions.
1. Yes. The creation of an index on a GTT has been improved in global_temporary_table_v10-pg13.patch.
2. But ALTER on a GTT, DROP of a GTT, and DROP INDEX on a GTT are blocked by other sessions.

Yes, you are right.
Oracle documentation says:
>  1) DDL operation on global temporary tables

> It is not possible to perform a DDL operation (except TRUNCATE) on an existing global temporary table if one or more sessions are currently bound to that table.
But it looks like CREATE INDEX is not considered a DDL operation on GTT and is also supported by Oracle.

Your approach of prohibiting such access using a shared cache is certainly better than my attempt to prohibit such DDLs for GTT altogether.
I just want to eliminate the maintenance of such a shared cache to simplify the patch.

But I still think that we should allow truncation of GTT and creating/dropping indexes on it without any limitations.


Maybe the easiest solution is to prohibit explicit vacuum of GTT?
I think vacuum is an important part of GTT.

Looking back at previous emails, Robert once said that vacuuming GTT is pretty important.
https://www.postgresql.org/message-id/CA%2BTgmob%3DL1k0cpXRcipdsaE07ok%2BOn%3DtTjRiw7FtD_D2T%3DJwhg%40mail.gmail.com


Well, maybe I am not right.
I have never seen use cases where temp tables are used other than as append-only storage (i.e. where temp table tuples are updated multiple times).
But I think that if such a problem actually exists, then the solution is to support autovacuum for temp tables rather than to allow manual vacuum.
Certainly it cannot be done by another worker, because that worker has no access to the backend's private data. But it can be done incrementally by the backend itself.


Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Sat, Feb 1, 2020 at 11:14 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
> With global_private_temp-8.patch, think about:
> 1. Session X takes several hours doing some statistical work with GTT A, which generated some data using transaction 100. The work is not over.
> 2. Then session Y vacuums A, and the GTT's relfrozenxid (in pg_class) is updated to 10000000.
> 3. Then autovacuum happens, and the clog before 10000000 is cleaned up.
> 4. The data in session X could be lost due to the missing clog, and the analysis task fails.
>
> However, this is likely to happen because you allowed vacuum on the GTT.
> And this is not a common problem; it does not happen with local temp tables.
> I feel uneasy about leaving such a question open. We can improve it.

Each session is going to need to maintain its own notion of the
relfrozenxid and relminmxid of each GTT to which it is attached.
Storing the values in pg_class makes no sense and is completely
unacceptable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Mon, Feb 3, 2020 at 3:08 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> Thank you for the explanation.
> You convinced me that building indexes from _bt_getbuf is not a good idea.
> What do you think about the idea of checking and building indexes for GTT prior to
> query execution?
>
> In this case we do not need to patch the code of all indexes - it can be
> done in just one place.
> We can use the build function of the access method to initialize the index and
> populate it with data.
>
> So right now, when building the query execution plan, the optimizer checks if an
> index is valid.
> If the index belongs to a GTT, it can check whether the first page of the index is
> initialized and, if not, call the build method for this index.
>
> If building the index while building the query plan is not desirable, we can
> just construct a list of indexes which should be checked, and
> perform the check itself and build the indexes somewhere after building the plan
> but before execution of the query.
>
> Do you see some problems with such an approach?

My guess it that the right time to do this work is just after we
acquire locks, at the end of parse analysis. I think trying to do it
during execution is too late, since the planner looks at indexes, and
trying to do it in the planner instead of before we start planning
seems more likely to cause bugs and has no real advantages. It's just
better to do complicated things (like creating indexes) separately
rather than in the middle of some other complicated thing (like
planning). I could tie my shoelaces the first time they get tangled up
with my brake pedal, but it's better to do it before I get in the car.

And I'm still inclined to do it by flat-copying files rather than
calling ambuild. It will be slightly faster, but also importantly, it
will guarantee that (1) every backend gets exactly the same initial
state and (2) it has fewer ways to fail because it doesn't involve
calling any user-defined code. Those seem like fairly compelling
advantages, and I don't see what the disadvantages are. I think
calling ambuild() at the point in time proposed in the preceding
paragraph would be fairly safe and would probably work OK most of the
time, but I can't think of any reason it would be better.

Incidentally, what I'd be inclined to do is - if the session is
running a query that does only read-only operations, let it continue
to point to the "master" copy of the GTT and its indexes, which is
stored in the relfilenodes indicated for those relations in pg_class.
If it's going to acquire a lock heavier than AccessShareLock, then
give it its own copies of the table and indexes, stored in a temporary
relfilenode (tXXX_YYY) and redirect all future access to that GTT by
this backend to there. Maybe there's some reason this won't work, but
it seems nice to avoid saying that we've "attached" to the GTT if all
we did is read the empty table.
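An illustrative sketch of the flat-copy initialization described here (an editorial example under PG13-era smgr APIs, not code from any posted patch):

    #include "postgres.h"
    #include "storage/smgr.h"

    /*
     * Initialize this backend's private copy of a GTT relation fork by
     * copying the blocks of the shared "master" relfilenode, instead of
     * calling ambuild.
     */
    static void
    gtt_flat_copy_fork(SMgrRelation src, SMgrRelation dst, ForkNumber forknum)
    {
        PGAlignedBlock buf;
        BlockNumber    nblocks = smgrnblocks(src, forknum);

        for (BlockNumber blkno = 0; blkno < nblocks; blkno++)
        {
            smgrread(src, forknum, blkno, buf.data);
            /* skipFsync = true: backend-local GTT storage is not WAL-logged. */
            smgrextend(dst, forknum, blkno, buf.data, true);
        }
    }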

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 05.02.2020 00:38, Robert Haas wrote:
>
> My guess it that the right time to do this work is just after we
> acquire locks, at the end of parse analysis. I think trying to do it
> during execution is too late, since the planner looks at indexes, and
> trying to do it in the planner instead of before we start planning
> seems more likely to cause bugs and has no real advantages. It's just
> better to do complicated things (like creating indexes) separately
> rather than in the middle of some other complicated thing (like
> planning). I could tie my shoelaces the first time they get tangled up
> with my break pedal but it's better to do it before I get in the car.
I have implemented this approach in my new patch

https://www.postgresql.org/message-id/3e88b59f-73e8-685e-4983-9026f94c57c5%40postgrespro.ru

I have added a check of whether the index is initialized or not to plancat.c, 
where the optimizer checks if an index is valid.
Now it should work for all kinds of indexes (B-Tree, hash, user-defined 
access methods...).
>
> And I'm still inclined to do it by flat-copying files rather than
> calling ambuild. It will be slightly faster, but also importantly, it
> will guarantee that (1) every backend gets exactly the same initial
> state and (2) it has fewer ways to fail because it doesn't involve
> calling any user-defined code. Those seem like fairly compelling
> advantages, and I don't see what the disadvantages are. I think
> calling ambuild() at the point in time proposed in the preceding
> paragraph would be fairly safe and would probably work OK most of the
> time, but I can't think of any reason it would be better.

There is a very important reason (from my point of view): to allow other 
sessions to use the created index and
so provide behavior compatible with regular tables (and with Oracle).
So we should be able to populate the index with existing GTT data,
and ambuild will do it.

>
> Incidentally, what I'd be inclined to do is - if the session is
> running a query that does only read-only operations, let it continue
> to point to the "master" copy of the GTT and its indexes, which is
> stored in the relfilenodes indicated for those relations in pg_class.
> If it's going to acquire a lock heavier than AccessShareLock, then
> give it is own copies of the table and indexes, stored in a temporary
> relfilenode (tXXX_YYY) and redirect all future access to that GTT by
> this backend to there. Maybe there's some reason this won't work, but
> it seems nice to avoid saying that we've "attached" to the GTT if all
> we did is read the empty table.
>
Sorry, I do not understand the benefits of such an optimization. It seems 
to be a very rare situation when a session tries to access a temp table 
which was not previously filled with data. But even if it happens, 
keeping the "master" copy will not save much: in any case we have shared 
metadata and no data. Yes, with the current approach, the first access to 
a GTT will cause creation of empty indexes. But it is just the 
initialization of 1-3 pages. I do not think that delaying index 
initialization can be really useful.

In any case, calling ambuild is the simplest and most universal 
approach, providing the desired and compatible behavior.
I really do not understand why we should try to invent some alternative 
solution.





Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Feb 5, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Feb 1, 2020 at 11:14 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>> As global_private_temp-8.patch, think about:
>> 1. Session X takes several hours doing some statistical work with GTT A, which generates some data using
>> transaction 100; the work is not over.
>> 2. Then session Y vacuums A, and the GTT's relfrozenxid (in pg_class) is updated to 10000000.
>> 3. Then autovacuum happens, and the clog before 10000000 is cleaned up.
>> 4. The data in session X could be lost due to the missing clog, and the analysis task fails.
>>
>> However, this is likely to happen because you allowed vacuum on the GTT.
>> And this is not a common problem; it does not happen with local temp tables.
>> I feel uneasy about leaving such a question open. We can improve it.
>
> Each session is going to need to maintain its own notion of the
> relfrozenxid and relminmxid of each GTT to which it is attached.
> Storing the values in pg_class makes no sense and is completely
> unacceptable.
Yes, I've implemented it in global_temporary_table_v10-pg13.patch

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 5, 2020, at 12:47 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



On 04.02.2020 18:01, 曾文旌(义从) wrote:



Yes, exactly.
But it is still not clear to me why we need some special handling for GTT.
Shared memory is reinitialized and the storage of temporary tables is removed.
It is true for both local and global temp tables.
Of course not. A local temp table cleans up the entire table (including the catalog entry, the local buffers, and the data file). A GTT does not.


What do you mean by "catalog buffer"?
Yes, cleanup of a local temp table requires deletion of the corresponding catalog entry, and a GTT should not do that.
But I am speaking only about the cleanup of data files of temp relations. It is done in the same way for local and global temp tables.
For native PG, the data file of a temp table is not cleaned up directly after an OOM happens,
because the orphan local temp table (including catalog entry, local buffers, and data file) is cleaned up by autovacuum dropping the orphan temp schema.
So for GTT we cannot do the same with just deleting data files. This is why I dealt with it specifically.



In my patch autovacuum is prohibited for GTT.
But VACUUM on a GTT is not prohibited.

Yes, but the simplest solution is to also prohibit explicit VACUUM of a GTT, isn't it?


IMHO forced termination of client sessions is not an acceptable solution,
and it is not an absolutely necessary requirement,
so from my point of view we should not add such limitations to the GTT design.
This limitation makes it possible for the GTT to support all DDL.
IMHO even Oracle's GTT has similar limitations.

I have checked that Oracle does not prevent creation of an index on a GTT if there are active sessions working with this table. And the index becomes visible to all these sessions.
1. Yes, the creation of an index on a GTT has been improved in global_temporary_table_v10-pg13.patch.
2. But ALTER on a GTT, DROP of a GTT, and DROP INDEX on a GTT are blocked by other sessions.

Yes, you are right.
Oracle documentation says:
>  1) DDL operation on global temporary tables

> It is not possible to perform a DDL operation (except TRUNCATE) on an existing global temporary table if one or more sessions are currently bound to that table.
But it looks like CREATE INDEX is not considered a DDL operation on a GTT and is also supported by Oracle.


Your approach of prohibiting such access using a shared cache is certainly better than my attempt to prohibit such DDL on GTT at all.
I just want to eliminate the maintenance of such a shared cache, to simplify the patch.

But I still think that we should allow truncation of GTT and creating/dropping indexes on it without any limitations. 
I think that is the goal of this work.
But the first step is to let GTT get as many features of regular tables as possible, even with some limitations.


Maybe the easiest solution is to prohibit explicit VACUUM of GTT?
I think vacuum is an important part of GTT.

Looking back at previous emails, Robert once said that vacuuming a GTT is pretty important.
https://www.postgresql.org/message-id/CA%2BTgmob%3DL1k0cpXRcipdsaE07ok%2BOn%3DtTjRiw7FtD_D2T%3DJwhg%40mail.gmail.com


Well, maybe I am not right.
I have never seen use cases where temp tables are used as anything other than append-only storage (i.e., where temp table tuples are updated multiple times).
But I think that if such a problem actually exists, then the solution is to support autovacuum for temp tables rather than to allow manual vacuum.
Certainly it cannot be done by another worker, because that worker has no access to the backend's private data. But it can be done incrementally by the backend itself.



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Wed, Feb 5, 2020 at 2:28 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> There is a very important reason (from my point of view): allow other
> sessions to use the created index and
> so provide behavior compatible with regular tables (and with Oracle).
> So we should be able to populate the index with existing GTT data.
> And ambuild will do it.

I don't understand. A global temporary table, as I understand it, is a
table for which each session sees separate contents. So you would
never need to populate it with existing data.

Besides, even if you did, how are you going to get the data for the
table? If you get the table data by flat-copying the table, then you
could copy the index files too. And you would want to, because if the
table contains a large amount of data, building indexes will be
expensive. If the index is *empty*, a file copy will not be much
cheaper than calling ambuild(), but if it's got a lot of data in it,
it will.

> Sorry, I do not understand the benefits of such an optimization. It seems
> to be a very rare situation when a session tries to access a temp table
> which was not previously filled with data. But even if it happens,
> keeping the "master" copy will not save much: in any case we have shared
> metadata and no data. Yes, with the current approach, the first access to
> a GTT will cause creation of empty indexes. But it is just the
> initialization of 1-3 pages. I do not think that delaying index
> initialization can be really useful.

You might be right, but you're misunderstanding the nature of my
concern. We probably can't allow DDL on a GTT unless no sessions are
attached. Having sessions that just read the empty GTT be considered
as "not attached" might make it easier for some users to find a time
when no backend is attached and thus DDL is possible.

> In any case, calling ambuild is the simplest and most universal
> approach, providing desired and compatible behavior.

Calling ambuild is definitely not simpler than a plain file copy. I
don't know how you can contend otherwise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Wed, Feb 5, 2020 at 8:21 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
> What do you mean by "catalog buffer"?
> Yes, cleanup of a local temp table requires deletion of the corresponding catalog entry, and a GTT should not do that.
> But I am speaking only about the cleanup of data files of temp relations. It is done in the same way for local and
> global temp tables.
>
> For native PG, the data file of a temp table is not cleaned up directly after an OOM happens,
> because the orphan local temp table (including catalog entry, local buffers, and data file) is cleaned up by
> autovacuum dropping the orphan temp schema.
> So for GTT we cannot do the same with just deleting data files. This is why I dealt with it specifically.

After a crash restart, all temporary relfilenodes (e.g. t12345_67890)
are removed. I think GTTs should use relfilenodes of this general
form, and then they'll be cleaned up by the existing code. For a
regular temporary table, there is also the problem of removing the
catalog entries, but GTTs shouldn't have this problem, because a GTT
doesn't have any catalog entries for individual sessions, just for the
main object, which isn't going away just because the system restarted.
Right?
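For reference, a simplified standalone version of the file-name test that
this existing cleanup applies, modeled on PostgreSQL's
looks_like_temp_rel_name(); the real function also accepts fork and
segment suffixes such as "_fsm" and ".1":

    #include <ctype.h>
    #include <stdbool.h>

    /* Does "name" look like a temp relation file, i.e. t<digits>_<digits>? */
    static bool
    is_temp_rel_filename(const char *name)
    {
        int     i = 0;
        int     digits;

        if (name[i++] != 't')
            return false;
        for (digits = 0; isdigit((unsigned char) name[i]); i++)
            digits++;                   /* backend ID */
        if (digits == 0 || name[i++] != '_')
            return false;
        for (digits = 0; isdigit((unsigned char) name[i]); i++)
            digits++;                   /* relfilenode */
        return digits > 0 && name[i] == '\0';
    }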

> In my patch autovacuum is prohibited for GTT.
>
> But vacuum GTT is not prohibited.

That sounds right to me.

This thread is getting very hard to follow because neither Konstantin
nor Wenjing seem to be using the standard method of quoting. When I
reply, I get the whole thing quoted with "> " but can't easily tell
the difference between what Wenjing wrote and what Konstantin wrote,
because both of your mailers are quoting using indentation rather than
"> " and it gets wiped out by my mailer. Please see if you can get
your mailer to do what is normally done on this mailing list.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 05.02.2020 17:10, Robert Haas wrote:
> On Wed, Feb 5, 2020 at 2:28 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> There is a very important reason (from my point of view): allow other
>> sessions to use the created index and
>> so provide behavior compatible with regular tables (and with Oracle).
>> So we should be able to populate the index with existing GTT data.
>> And ambuild will do it.
> I don't understand. A global temporary table, as I understand it, is a
> table for which each session sees separate contents. So you would
> never need to populate it with existing data.
Session 1:
create global temp table gtt(x integer);
insert into gtt values (generate_series(1,100000));

Session 2:
insert into gtt values (generate_series(1,200000));

Session1:
create index on gtt(x);
explain select * from gtt where x = 1;

Session2:
explain select * from gtt where x = 1;
??? Should we use index here?

My answer is - yes.
Just because:
- Such behavior is compatible with regular tables, so it will not 
confuse users and doesn't require complex explanations.
- It is compatible with Oracle.
- It is what a DBA usually wants when creating an index.
There are several arguments against such behavior:
- Concurrent building of index in multiple sessions can consume a lot of 
memory
- Building index can increase query execution time (which can be not 
expected by clients)

I had a discussion about it with Pavel at PGCon Moscow, but we could 
not convince each other.
Maybe we should provide a choice to the user, by means of a GUC or an 
index-creation parameter.


>
> Besides, even if you did, how are you going to get the data for the
> table? If you get the table data by flat-copying the table, then you
> could copy the index files too. And you would want to, because if the
> table contains a large amount of data, building indexes will be
> expensive. If the index is *empty*, a file copy will not be much
> cheaper than calling ambuild(), but if it's got a lot of data in it,
> it will.

Sorry, I do not understand you.
ambuild is called locally by each backend on first access to the GTT index.
It is done at the moment of building the query execution plan, when we 
check whether the index is valid.
Maybe it would be sensible to postpone this check and do it only for 
indexes which are actually used in the query execution plan.

>
>> Sorry, I do not understand the benefits of such an optimization. It seems
>> to be a very rare situation when a session tries to access a temp table
>> which was not previously filled with data. But even if it happens,
>> keeping the "master" copy will not save much: in any case we have shared
>> metadata and no data. Yes, with the current approach, the first access to
>> a GTT will cause creation of empty indexes. But it is just the
>> initialization of 1-3 pages. I do not think that delaying index
>> initialization can be really useful.
> You might be right, but you're misunderstanding the nature of my
> concern. We probably can't allow DDL on a GTT unless no sessions are
> attached. Having sessions that just read the empty GTT be considered
> as "not attached" might make it easier for some users to find a time
> when no backend is attached and thus DDL is possible.

Ok, now I understand the problem you are going to address.
But still, I have never seen use cases where empty temp tables are accessed.
Usually we save in a temp table some intermediate results of a complex query.
Certainly it can happen that the query returns an empty result.
But usually temp tables are used when we expect a huge result (otherwise 
materializing the result in a temp table is not needed).
So I do not think that such an optimization can help much in performing 
DDL on GTT.



>
>> In any case, calling ambuild is the simplest and most universal
>> approach, providing desired and compatible behavior.
> Calling ambuild is definitely not simpler than a plain file copy. I
> don't know how you can contend otherwise.
>

This is the code fragment which builds a GTT index on demand:

     if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
     {
         /*
          * Peek at the metapage: an all-zero (new) page means this
          * backend has not yet built its local copy of the index.
          */
         Buffer  metapage = ReadBuffer(index, 0);
         bool    isNew = PageIsNew(BufferGetPage(metapage));

         ReleaseBuffer(metapage);
         if (isNew)
         {
             Relation heap;

             /* Drop any stale local buffers, then build the index
              * from this session's heap data. */
             DropRelFileNodeAllLocalBuffers(index->rd_smgr->smgr_rnode.node);
             heap = RelationIdGetRelation(index->rd_index->indrelid);
             index->rd_indam->ambuild(heap, index, BuildIndexInfo(index));
             RelationClose(heap);
         }
     }

That is all - just 10 lines of code.
I can bet that maintaining a separate fork for indexes and copying 
data from it would require much more coding.







Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Wed, Feb 5, 2020 at 4:48 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



I had a discussion about it with Pavel at PGCon Moscow, but we could
not convince each other.
Maybe we should provide a choice to the user, by means of a GUC or an
index-creation parameter.

I would prefer a CREATE INDEX parameter that enforces building the index in other live sessions.

In this case, I think the best design depends so strongly on context that there cannot be a single design (both proposed behaviors make sense and have contrary advantages and disadvantages). Unfortunately, only one behavior can be the default.
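Purely as an illustration of such a switch - this syntax is hypothetical,
and no such reloption exists in PostgreSQL or in any posted patch:

    CREATE INDEX ON gtt (x) WITH (build_in_live_sessions = on);   -- built for already-attached sessions too
    CREATE INDEX ON gtt (x) WITH (build_in_live_sessions = off);  -- built lazily, per session, on first use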

Regards

Pavel






Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Feb 5, 2020, at 10:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Feb 5, 2020 at 8:21 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>> What do you mean by "catalog buffer"?
>> Yes, cleanup of a local temp table requires deletion of the corresponding catalog entry, and a GTT should not do that.
>> But I am speaking only about the cleanup of data files of temp relations. It is done in the same way for local and
>> global temp tables.
>>
>> For native PG, the data file of a temp table is not cleaned up directly after an OOM happens,
>> because the orphan local temp table (including catalog entry, local buffers, and data file) is cleaned up by
>> autovacuum dropping the orphan temp schema.
>> So for GTT we cannot do the same with just deleting data files. This is why I dealt with it specifically.
>
> After a crash restart, all temporary relfilenodes (e.g t12345_67890)
> are removed. I think GTTs should use relfilenodes of this general
> form, and then they'll be cleaned up by the existing code. For a
> regular temporary table, there is also the problem of removing the
> catalog entries, but GTTs shouldn't have this problem, because a GTT
> doesn't have any catalog entries for individual sessions, just for the
> main object, which isn't going away just because the system restarted.
> Right?
Wenjing wrote:
I have implemented its handling in global_temporary_table_v10-pg13.patch.
When OOM happens, all backends are killed;
then I chose to clean up these files (all of the form t12345_67890) in the startup process.

Wenjing

>
>> In my patch autovacuum is prohibited for GTT.
>>
>> But vacuum GTT is not prohibited.
>
> That sounds right to me.
Wenjing wrote:
Also implemented in global_temporary_table_v10-pg13.patch

Wenjing





Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Wed, Feb 5, 2020 at 10:48 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> > I don't understand. A global temporary table, as I understand it, is a
> > table for which each session sees separate contents. So you would
> > never need to populate it with existing data.
> Session 1:
> create global temp table gtt(x integer);
> insert into gtt values (generate_series(1,100000));
>
> Session 2:
> insert into gtt values (generate_series(1,200000));
>
> Session1:
> create index on gtt(x);
> explain select * from gtt where x = 1;
>
> Session2:
> explain select * from gtt where x = 1;
> ??? Should we use index here?

OK, I see where you're coming from now.

> My answer is - yes.
> Just because:
> - Such behavior is compatible with regular tables. So it will not
> confuse users and doesn't require some complex explanations.
> - It is compatible with Oracle.
> - It is what DBA usually want when creating index.
> -
> There are several arguments against such behavior:
> - Concurrent building of index in multiple sessions can consume a lot of
> memory
> - Building index can increase query execution time (which can be not
> expected by clients)

I think those are good arguments, especially the second one. There's
no limit on how long building a new index might take, and it could be
several minutes. A user who was running a query that could have
completed in a few seconds or even milliseconds will be unhappy to
suddenly wait a long time for a new index to be built. And that is an
entirely realistic scenario, because the new index might be better,
but only marginally.

Also, an important point to which I've already alluded a few times is
that creating an index can fail. Now, one way it can fail is that
there could be some problem writing to disk, or you could run out of
memory, or whatever. However, it can also fail because the new index
is UNIQUE and the data this backend has in the table doesn't conform
to the associated constraint. It will be confusing if all access to a
table suddenly starts complaining about uniqueness violations.

> That is all - just 10 lines of code.

I don't believe that the feature you are proposing can be correctly
implemented in 10 lines of code. I would be pleasantly surprised if it
can be done in 1000.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:

On 07.02.2020 18:15, Robert Haas wrote:
> On Wed, Feb 5, 2020 at 10:48 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
> My answer is - yes.
>> Just because:
>> - Such behavior is compatible with regular tables. So it will not
>> confuse users and doesn't require some complex explanations.
>> - It is compatible with Oracle.
>> - It is what DBA usually want when creating index.
>> -
>> There are several arguments against such behavior:
>> - Concurrent building of index in multiple sessions can consume a lot of
>> memory
>> - Building index can increase query execution time (which can be not
>> expected by clients)
> I think those are good arguments, especially the second one. There's
> no limit on how long building a new index might take, and it could be
> several minutes. A user who was running a query that could have
> completed in a few seconds or even milliseconds will be unhappy to
> suddenly wait a long time for a new index to be built. And that is an
> entirely realistic scenario, because the new index might be better,
> but only marginally.
Yes, I agree that these arguments are important,
but IMHO less important than the incompatible behavior (Pavel doesn't agree 
with the word "incompatible" in this context,
since the semantics of temp tables are in any case different from the 
semantics of regular tables).

I just want to note that if we have a huge GTT (so that creation of an index 
takes a significant amount of time),
a sequential scan of this table will not be fast either.

But in any case, if we agree that we can control this behavior using a GUC 
or an index property,
then it is OK with me.



>
> Also, an important point to which I've already alluded a few times is
> that creating an index can fail. Now, one way it can fail is that
> there could be some problem writing to disk, or you could run out of
> memory, or whatever. However, it can also fail because the new index
> is UNIQUE and the data this backend has in the table doesn't conform
> to the associated constraint. It will be confusing if all access to a
> table suddenly starts complaining about uniqueness violations.

Yes, building an index can fail (like any other operation with a database).
What's wrong with that?
If it is a fatal error, then the backend is terminated and the content of 
its temp table disappears.
If it is a non-fatal error, then the current transaction is aborted:


Session1:
postgres=# create global temp table gtt(x integer);
CREATE TABLE
postgres=# insert into gtt values (generate_series(1,100000));
INSERT 0 100000

Session2:
postgres=# insert into gtt values (generate_series(1,100000));
INSERT 0 100000
postgres=# insert into gtt values (1);
INSERT 0 1

Session1:
postgres=# create unique index on gtt(x);
CREATE INDEX

Session 2:
postgres=# explain select * from gtt where x=1;
ERROR:  could not create unique index "gtt_x_idx"
DETAIL:  Key (x)=(1) is duplicated.

> I don't believe that the feature you are proposing can be correctly
> implemented in 10 lines of code. I would be pleasantly surprised if it
> can be done in 1000.
>
Right now I do not see any sources of extra complexity.
I will be pleased if you can point them out to me.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Fri, Feb 7, 2020 at 12:28 PM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> But in any case, if we agree that we can control this behavior using a GUC
> or an index property,
> then it is OK with me.

Nope, I am not going to agree to that, and I don't believe that any
other committer will, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Fri, Feb 7, 2020 at 6:28 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


If it is a non-fatal error, then the current transaction is aborted:


Session1:
postgres=# create global temp table gtt(x integer);
CREATE TABLE
postgres=# insert into gtt values (generate_series(1,100000));
INSERT 0 100000

Session2:
postgres=# insert into gtt values (generate_series(1,100000));
INSERT 0 100000
postgres=# insert into gtt values (1);
INSERT 0 1

What if session 2 has an active transaction? Then, to be correct, you should wait with the index creation until the end of that transaction.


Session1:
postgres=# create unique index on gtt(x);
CREATE INDEX

Session 2:
postgres=# explain select * from gtt where x=1;
ERROR:  could not create unique index "gtt_x_idx"
DETAIL:  Key (x)=(1) is duplicated.

This is a little bit unexpected behavior (probably nobody expects any SELECT to fail with the error "could not create index"). I understand the reason and the context exactly, but this side effect is something I am afraid of.


Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 07.02.2020 21:37, Pavel Stehule wrote:

What if session 2 has an active transaction? Then, to be correct, you should wait with the index creation until the end of that transaction.


Session1:
postgres=# create unique index on gtt(x);
CREATE INDEX

Session 2:
postgres=# explain select * from gtt where x=1;
ERROR:  could not create unique index "gtt_x_idx"
DETAIL:  Key (x)=(1) is duplicated.

This is a little bit unexpected behavior (probably nobody expects any SELECT to fail with the error "could not create index"). I understand the reason and the context exactly, but this side effect is something I am afraid of.
The more I think about creating indexes for GTT on demand, the more contradictions I see.
So it looks like there are only two safe alternatives:
1. Allow DDL on a GTT (including index creation) only if no other session is using this GTT ("using" meaning that the session has inserted data into the GTT). Things can get even more complicated if we take inter-table dependencies into account (like foreign key constraints).
2. Create indexes for GTT locally.

2) seems to be very contradictory (global table metadata, but private indexes) and hard to implement, because in this case we would have to maintain some private copy of the index catalog to keep information about the private indexes.

1) is currently implemented by Wenjing. Frankly speaking, I still find such a limitation too restrictive and inconvenient for users. From my point of view the Oracle developers have implemented a better compromise. But if I am the only person voting for such a solution, then let's stop this discussion.
But in any case, I think that calling ambuild to construct the index for an empty table is a better solution than reimplementing empty-index initialization for every index type (which still does not solve the problem with custom access methods).
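For illustration, a sketch of the gate that alternative 1 implies,
assuming a shared-memory registry of which backends have written to each
GTT; all names here are hypothetical, not taken from Wenjing's patch:

    /* Called before executing DDL (other than TRUNCATE) on a GTT. */
    static void
    CheckGttNotInUse(Oid relid)
    {
        int nusers = GttCountUsingBackends(relid);      /* hypothetical */

        if (nusers > 1 || (nusers == 1 && !GttUsedBySelf(relid)))
            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_IN_USE),
                     errmsg("global temporary table is in use by other sessions")));
    }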

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Sun, Feb 9, 2020 at 1:05 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


1) is currently implemented by Wenjing. Frankly speaking, I still find such a limitation too restrictive and inconvenient for users. From my point of view the Oracle developers have implemented a better compromise. But if I am the only person voting for such a solution, then let's stop this discussion.

Thank you. I respect your opinion.
 
But in any case, I think that calling ambuild to construct the index for an empty table is a better solution than reimplementing empty-index initialization for every index type (which still does not solve the problem with custom access methods).

I know nothing about this area - I expect you and Wenjing will find a good solution.

We have to start with something that is simple and usable and, if possible, fits well into Postgres's architecture.

On @1: when the table is empty for all other sessions, then I don't see any problem. For practical reasons, I think the requirement not to use the table in other sessions is too hard; it would be nice if creating the index were not blocked, but if I create the index too late, then for other sessions (where the table is already in use) the index is invalid (again, this can be done in the future).

I am sure this is not the end of days - there is space for future enhancement and for testing other variants. I can imagine several variations with different advantages and disadvantages. Just to begin with, I prefer a design whose concept is closer to current Postgres.

Regards

Pavel

 

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Jan 30, 2020 at 3:21 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


On Thu, Jan 30, 2020 at 3:17 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Jan 29, 2020, at 9:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 12:12 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>>> Opinion by Pavel
>>> + rel->rd_islocaltemp = true;  <<<<<<< if this is valid, then the name of field "rd_islocaltemp" is not probably best
>>> I renamed rd_islocaltemp
>>
>> I don't see any change?
>>
>> Rename rd_islocaltemp to rd_istemp  in global_temporary_table_v8-pg13.patch
>
> In view of commit 6919b7e3294702adc39effd16634b2715d04f012, I think
> that this has approximately a 0% chance of being acceptable. If you're
> setting a field in a way that is inconsistent with the current use of
> the field, you're probably doing it wrong, because the field has an
> existing purpose to which new code must conform. And if you're not
> doing that, then you don't need to rename it.
Thank you for pointing it out.
I've rolled back the rename.
But I still need rd_islocaltemp to be true. The reasons are that
1. a GTT needs to support DML in read-only transactions, like a local temp table, and
2. a GTT does not need to hold the lock before modifying the index buffer, also like a local temp table.

Please give me feedback.

maybe something like

rel->rd_globaltemp = true;

and somewhere else

if (rel->rd_islocaltemp || rel->rd_globaltemp)
{
  ...
}


I tested this patch again and I am very satisfied with its behavior.

What still doesn't work is the TRUNCATE statement:

postgres=# insert into foo select generate_series(1,10000);
INSERT 0 10000
postgres=# \dt+ foo
                          List of relations
┌────────┬──────┬───────┬───────┬─────────────┬────────┬─────────────┐
│ Schema │ Name │ Type  │ Owner │ Persistence │  Size  │ Description │
╞════════╪══════╪═══════╪═══════╪═════════════╪════════╪═════════════╡
│ public │ foo  │ table │ pavel │ session     │ 384 kB │             │
└────────┴──────┴───────┴───────┴─────────────┴────────┴─────────────┘
(1 row)

postgres=# truncate foo;
TRUNCATE TABLE
postgres=# \dt+ foo
                          List of relations
┌────────┬──────┬───────┬───────┬─────────────┬───────┬─────────────┐
│ Schema │ Name │ Type  │ Owner │ Persistence │ Size  │ Description │
╞════════╪══════╪═══════╪═══════╪═════════════╪═══════╪═════════════╡
│ public │ foo  │ table │ pavel │ session     │ 16 kB │             │
└────────┴──────┴───────┴───────┴─────────────┴───────┴─────────────┘
(1 row)

I expect zero size after truncate.

Regards

Pavel




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 14, 2020, at 5:19 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



I expect zero size after truncate.
Thanks for the review.

I can explain; I don't think it's a bug.
The current implementation of truncating a GTT retains two blocks of FSM pages.
The same is true for truncating a regular table in a subtransaction.
This is an implementation that truncates the table without changing its relfilenode.
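One way to see where the residual 16 kB lives is to compare the fork
sizes; this is plain PostgreSQL SQL, not part of the patch, and the
expected numbers simply follow from the two-FSM-page behavior described
above:

    SELECT pg_relation_size('foo', 'main') AS main_bytes,
           pg_relation_size('foo', 'fsm')  AS fsm_bytes;
    -- expected after TRUNCATE: main_bytes = 0, fsm_bytes = 16384 (two 8 kB pages)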


Wenjing




Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


I can explain; I don't think it's a bug.
The current implementation of truncating a GTT retains two blocks of FSM pages.
The same is true for truncating a regular table in a subtransaction.
This is an implementation that truncates the table without changing its relfilenode.


This is not an extra important feature - it is just a little surprising now, because I was not inside a transaction.

Changing the relfilenode, I think, is necessary, minimally for future VACUUM FULL support.

Regards

Pavel Stehule
 



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 9, 2020, at 8:05 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:



But in any case, I think that calling ambuild to construct the index for an empty table is a better solution than reimplementing empty-index initialization for every index type (which still does not solve the problem with custom access methods).
I made some improvements:
1. Support for all indexes on GTT (using index_build to build an empty index).
2. Removed some ugly code in md.c and bufmgr.c.

Please give me feedback.


Wenjing

Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 15, 2020, at 6:06 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:




This is not an extra important feature - it is just a little surprising now, because I was not inside a transaction.

Changing the relfilenode, I think, is necessary, minimally for future VACUUM FULL support.
Not allowing relfilenode changes is a current limitation.
I think it can be improved, but this is a bit complicated,
so I'd like to know the necessity of this improvement.
Could you give me more details?





Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Sun, Feb 16, 2020 at 4:15 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


Changing the relfilenode, I think, is necessary, minimally for future VACUUM FULL support.
Not allowing relfilenode changes is a current limitation.
I think it can be improved, but this is a bit complicated,
so I'd like to know the necessity of this improvement.
Could you give me more details?

I don't think a GTT without VACUUM FULL support can be accepted, just due to consistency.

Regards

Pavel






Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi,
I have started testing the "Global temporary table" feature,
from "gtt_v11-pg13.patch". Below is my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?






--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below is my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug; I fixed it.

Wenjing



Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:

Oh, this is a bug; I fixed it.
Thanks for the patch.
I have verified it; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi All,

I observe different behavior between "temporary table" and "global temporary table".
Not sure if it is expected?

postgres=# create global temporary table parent1(a int)  on commit delete rows;
CREATE TABLE
postgres=# create global temporary table child1() inherits (parent1);
CREATE TABLE
postgres=# insert into parent1 values(1);
INSERT 0 1
postgres=# insert into child1 values(2);
INSERT 0 1
postgres=# select * from parent1;
 a
---
(0 rows)

postgres=# select * from child1;
 a
---
(0 rows)


postgres=# create temporary table parent2(a int)  on commit delete rows;
CREATE TABLE
postgres=# create temporary table child2() inherits (parent2);
CREATE TABLE
postgres=# insert into parent2 values(1);
INSERT 0 1
postgres=# insert into child2 values(2);
INSERT 0 1
postgres=# select * from parent2;
 a
---
 2
(1 row)

postgres=# select * from child2;
 a
---
 2
(1 row)


Thanks,
Prabhat Sahu

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Mon, Feb 24, 2020 at 2:34 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi All,

I observe different behavior between "temporary table" and "global temporary table".
Not sure if it is expected?

postgres=# create global temporary table parent1(a int)  on commit delete rows;
CREATE TABLE
postgres=# create global temporary table child1() inherits (parent1);
CREATE TABLE
postgres=# insert into parent1 values(1);
INSERT 0 1
postgres=# insert into child1 values(2);
INSERT 0 1
postgres=# select * from parent1;
 a
---
(0 rows)

postgres=# select * from child1;
 a
---
(0 rows)

It is a bug. Probably the INHERITS clause is not well implemented for GTT




postgres=# create temporary table parent2(a int)  on commit delete rows;
CREATE TABLE
postgres=# create temporary table child2() inherits (parent2);
CREATE TABLE
postgres=# insert into parent2 values(1);
INSERT 0 1
postgres=# insert into child2 values(2);
INSERT 0 1
postgres=# select * from parent2;
 a
---
 2
(1 row)

postgres=# select * from child2;
 a
---
 2
(1 row)


Thanks,
Prabhat Sahu

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 24, 2020, at 9:34 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi All,

I observe different behavior between "temporary table" and "global temporary table".
Not sure if it is expected?

postgres=# create global temporary table parent1(a int)  on commit delete rows;
CREATE TABLE
postgres=# create global temporary table child1() inherits (parent1);
CREATE TABLE
postgres=# insert into parent1 values(1);
INSERT 0 1
postgres=# insert into child1 values(2);
INSERT 0 1
postgres=# select * from parent1;
 a
---
(0 rows)

postgres=# select * from child1;
 a
---
(0 rows)
Because child1 inherits its parent's ON COMMIT property.
I can make GTT behave like a local temp table.




postgres=# create temporary table parent2(a int)  on commit delete rows;
CREATE TABLE
postgres=# create temporary table child2() inherits (parent2);
CREATE TABLE
postgres=# insert into parent2 values(1);
INSERT 0 1
postgres=# insert into child2 values(2);
INSERT 0 1
postgres=# select * from parent2;
 a
---
 2
(1 row)

postgres=# select * from child2;
 a
---
 2
(1 row)


Thanks,
Prabhat Sahu


Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 24, 2020, at 9:41 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Mon, Feb 24, 2020 at 2:34 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi All,

I observe different behavior between "temporary table" and "global temporary table".
Not sure if it is expected?

postgres=# create global temporary table parent1(a int)  on commit delete rows;
CREATE TABLE
postgres=# create global temporary table child1() inherits (parent1);
CREATE TABLE
postgres=# insert into parent1 values(1);
INSERT 0 1
postgres=# insert into child1 values(2);
INSERT 0 1
postgres=# select * from parent1;
 a
---
(0 rows)

postgres=# select * from child1;
 a
---
(0 rows)

It is a bug. Probably the INHERITS clause is not well implemented for GTT
I fixed the GTT's behavior to match the local temp table.
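
With this fix, the GTT pair should match the local temp pair below: since child1 itself was created without an ON COMMIT clause, its row should survive the commit (expected output, assuming the fixed behavior):

postgres=# select * from child1;
 a
---
 2
(1 row)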


Wenjing







postgres=# create temporary table parent2(a int)  on commit delete rows;
CREATE TABLE
postgres=# create temporary table child2() inherits (parent2);
CREATE TABLE
postgres=# insert into parent2 values(1);
INSERT 0 1
postgres=# insert into child2 values(2);
INSERT 0 1
postgres=# select * from parent2;
 a
---
 2
(1 row)

postgres=# select * from child2;
 a
---
 2
(1 row)


Thanks,
Prabhat Sahu


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch
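
So with the v13 patch, the GTT-to-GTT case from the report is expected to work (a sketch of the intended behavior; the output is assumed):

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
CREATE TABLE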


Wenjing



Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Tue, Feb 25, 2020 at 2:36 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

It is ok for me. Temporary tables should be created only in their private schema. For GTT there is no risk of collision, so it can be created in any schema where the necessary access rights exist.

Pavel


postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
Hi ,

The pg_upgrade scenario fails if the database contains a global temporary table

=============================
centos@tushar-ldap-docker bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create global temporary table  t(n int);
CREATE TABLE
postgres=# \q
===============================

run pg_upgrade -

[centos@tushar-ldap-docker bin]$ ./pg_upgrade -d /tmp/t1/ -D /tmp/t2 -b . -B .
Performing Consistency Checks
-----------------------------
Checking cluster versions                                             ok
Checking database user is the install user                   ok
Checking database connection settings                       ok
Checking for prepared transactions                             ok
Checking for reg* data types in user tables                 ok
--
--
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                        ok
Freezing all rows in the new cluster                          ok
Deleting files from new pg_xact                                ok
--
--
Restoring database schemas in the new cluster
                                                                                  ok
Copying user relation files
  /tmp/t1/base/13585/16384                                 
error while copying relation "public.t": could not open file "/tmp/t1/base/13585/16384": No such file or directory
Failure, exiting

regards,

On 2/25/20 7:06 PM, Prabhat Sahu wrote:
Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
tushar
Date:
Hi,

I have created two  global temporary tables like this -

Case 1-
postgres=# create global  temp table foo(n int) with (on_commit_delete_rows='true');
CREATE TABLE

Case 2-
postgres=# create global  temp table bar1(n int) on commit delete rows;
CREATE TABLE


but if I try to do the same with only the 'temp' keyword, Case 2 works fine while Case 1 gives this error -

postgres=# create   temp table foo1(n int) with (on_commit_delete_rows='true');
ERROR:  regular table cannot specifie on_commit_delete_rows
postgres=#

postgres=#  create   temp table bar1(n int) on commit delete rows;
CREATE TABLE

I think this error message needs to be clearer.

regards,
tushar

On 2/25/20 7:19 PM, Pavel Stehule wrote:


On Tue, Feb 25, 2020 at 2:36 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

It is ok for me. Temporary tables should be created only in their private schema. For GTT there is no risk of collision, so it can be created in any schema where the necessary access rights exist.

Pavel


postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:
Thanks for review.


On Feb 25, 2020, at 9:56 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:

Hi ,

The pg_upgrade scenario fails if the database contains a global temporary table

=============================
centos@tushar-ldap-docker bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create global temporary table  t(n int);
CREATE TABLE
postgres=# \q
===============================

run pg_upgrade -

[centos@tushar-ldap-docker bin]$ ./pg_upgrade -d /tmp/t1/ -D /tmp/t2 -b . -B .
Performing Consistency Checks
-----------------------------
Checking cluster versions                                             ok
Checking database user is the install user                   ok
Checking database connection settings                       ok
Checking for prepared transactions                             ok
Checking for reg* data types in user tables                 ok
--
--
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                        ok
Freezing all rows in the new cluster                          ok
Deleting files from new pg_xact                                ok
--
--
Restoring database schemas in the new cluster
                                                                                  ok
Copying user relation files
  /tmp/t1/base/13585/16384                                 
error while copying relation "public.t": could not open file "/tmp/t1/base/13585/16384": No such file or directory
Failure, exiting
This is a bug.
I fixed it in global_temporary_table_v14-pg13.patch


Wenjing





regards,

On 2/25/20 7:06 PM, Prabhat Sahu wrote:
Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 25, 2020, at 11:31 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:

Hi,

I have created two  global temporary tables like this -

Case 1-
postgres=# create global  temp table foo(n int) with (on_commit_delete_rows='true');
CREATE TABLE

Case 2-
postgres=# create global  temp table bar1(n int) on commit delete rows;
CREATE TABLE


but if I try to do the same with only the 'temp' keyword, Case 2 works fine while Case 1 gives this error -

postgres=# create   temp table foo1(n int) with (on_commit_delete_rows='true');
ERROR:  regular table cannot specifie on_commit_delete_rows
postgres=#

postgres=#  create   temp table bar1(n int) on commit delete rows;
CREATE TABLE

I think this error message needs to be clearer.
Also fixed in global_temporary_table_v14-pg13.patch

Wenjing




regards,
tushar

On 2/25/20 7:19 PM, Pavel Stehule wrote:


On Tue, Feb 25, 2020 at 2:36 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

It is ok for me. Temporary tables should be created only in their private schema. For GTT there is no risk of collision, so it can be created in any schema where the necessary access rights exist.

Pavel


postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 25, 2020, at 9:56 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:

Hi ,

The pg_upgrade scenario fails if the database contains a global temporary table

=============================
centos@tushar-ldap-docker bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create global temporary table  t(n int);
CREATE TABLE
postgres=# \q
===============================

run pg_upgrade -

[centos@tushar-ldap-docker bin]$ ./pg_upgrade -d /tmp/t1/ -D /tmp/t2 -b . -B .
Performing Consistency Checks
-----------------------------
Checking cluster versions                                             ok
Checking database user is the install user                   ok
Checking database connection settings                       ok
Checking for prepared transactions                             ok
Checking for reg* data types in user tables                 ok
--
--
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                        ok
Freezing all rows in the new cluster                          ok
Deleting files from new pg_xact                                ok
--
--
Restoring database schemas in the new cluster
                                                                                  ok
Copying user relation files
  /tmp/t1/base/13585/16384                                 
error while copying relation "public.t": could not open file "/tmp/t1/base/13585/16384": No such file or directory
Failure, exiting
I fixed some bugs in global_temporary_table_v14-pg13.patch.

Please check global_temporary_table_v15-pg13.patch

Wenjing




regards,

On 2/25/20 7:06 PM, Prabhat Sahu wrote:
Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Feb 25, 2020, at 9:36 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi All,

Please check the below findings on GTT.
-- Scenario 1:
Under "information_schema", we are not allowed to create a "temporary table", whereas we can CREATE/DROP a "Global Temporary Table". Is this expected?

postgres=# create temporary table information_schema.temp1(c1 int);
ERROR:  cannot create temporary relation in non-temporary schema
LINE 1: create temporary table information_schema.temp1(c1 int);
                               ^

postgres=# create global temporary table information_schema.temp1(c1 int);
CREATE TABLE

postgres=# drop table information_schema.temp1 ;
DROP TABLE

-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables
Fixed in global_temporary_table_v15-pg13.patch
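
With the v15 patch, the mixed temp/GTT cases should be rejected with messages that distinguish the two table kinds (expected behavior; the wording below is assumed, not quoted from the patch):

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on global temporary tables may reference only global temporary tables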


Wenjing



Thanks,
Prabhat Sahu

On Tue, Feb 25, 2020 at 2:25 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


On Feb 24, 2020, at 5:44 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Feb 21, 2020 at 9:10 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
Hi,
I have started testing the "Global temporary table" feature,
That's great, I see hope.
from "gtt_v11-pg13.patch". Below are my findings:

-- session 1:
postgres=# create global temporary table gtt1(a int);
CREATE TABLE

-- session 2:
postgres=# truncate gtt1 ;
ERROR:  could not open file "base/13585/t3_16384": No such file or directory

Is it expected?

Oh, this is a bug. I fixed it.
Thanks for the patch.
I have verified the same; the issue is now resolved with the v12 patch.

Kindly confirm the below scenario:

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE

postgres=# create global temporary table gtt2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

postgres=# create table tab2 (c1 int references gtt1(c1) );
ERROR:  referenced relation "gtt1" is not a global temp table

Thanks,
Prabhat Sahu

GTT supports foreign key constraints in global_temporary_table_v13-pg13.patch


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: [Proposal] Global temporary tables

From
tushar
Date:
On 2/27/20 9:43 AM, 曾文旌(义从) wrote:
-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables
Fixed in global_temporary_table_v15-pg13.patch


Thanks Wenjing.

The below scenario is not working, i.e. even when 'on_commit_delete_rows' is true, rows are NOT removed after commit

postgres=#  create global  temp table foo1(n int) with (on_commit_delete_rows='true');
CREATE TABLE
postgres=#
postgres=# begin;
BEGIN
postgres=*# insert into foo1 values (9);
INSERT 0 1
postgres=*# insert into foo1 values (9);
INSERT 0 1
postgres=*# select * from foo1;
 n
---
 9
 9
(2 rows)

postgres=*# commit;
COMMIT
postgres=# select * from foo1;   -- after commit there should be 0 rows as on_commit_delete_rows is 'true'
 n
---
 9
 9
(2 rows)

postgres=# \d+ foo1
                                   Table "public.foo1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 n      | integer |           |          |         | plain   |              |
Access method: heap
Options: on_commit_delete_rows=true

postgres=#

but if the user creates the table this way then it works as expected

postgres=#  create global  temp table foo2(n int) on commit delete rows;
CREATE TABLE
postgres=# begin; insert into foo2 values (9); insert into foo2 values (9); commit; select * from foo2;
BEGIN
INSERT 0 1
INSERT 0 1
COMMIT
 n
---
(0 rows)

postgres=#

I guess the problem is something with this syntax - create global temp table foo1(n int) with (on_commit_delete_rows='true');

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Mar 2, 2020, at 10:47 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:

On 2/27/20 9:43 AM, 曾文旌(义从) wrote:
-- Scenario 2:
Here I am getting the same error message in both the below cases.
We may add a "global" keyword to the GTT-related error message.

postgres=# create global temporary table gtt1 (c1 int unique);
CREATE TABLE
postgres=# create temporary table tmp1 (c1 int unique);
CREATE TABLE

postgres=# create temporary table tmp2 (c1 int references gtt1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables

postgres=# create global temporary table gtt2 (c1 int references tmp1(c1) );
ERROR:  constraints on temporary tables may reference only temporary tables
Fixed in global_temporary_table_v15-pg13.patch


Thanks Wenjing.

The below scenario is not working, i.e. even when 'on_commit_delete_rows' is true, rows are NOT removed after commit

postgres=#  create global  temp table foo1(n int) with (on_commit_delete_rows='true');
CREATE TABLE
postgres=#
postgres=# begin;
BEGIN
postgres=*# insert into foo1 values (9);
INSERT 0 1
postgres=*# insert into foo1 values (9);
INSERT 0 1
postgres=*# select * from foo1;
 n
---
 9
 9
(2 rows)

postgres=*# commit;
COMMIT
postgres=# select * from foo1;   -- after commit there should be 0 rows as on_commit_delete_rows is 'true'
 n
---
 9
 9
(2 rows)

postgres=# \d+ foo1
                                   Table "public.foo1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 n      | integer |           |          |         | plain   |              |
Access method: heap
Options: on_commit_delete_rows=true

postgres=#

but if the user creates the table this way then it works as expected

postgres=#  create global  temp table foo2(n int) on commit delete rows;
CREATE TABLE
postgres=# begin; insert into foo2 values (9); insert into foo2 values (9); commit; select * from foo2;
BEGIN
INSERT 0 1
INSERT 0 1
COMMIT
 n
---
(0 rows)

postgres=#

I guess the problem is something with this syntax - create global temp table foo1(n int) with (on_commit_delete_rows='true');

Thanks for review.

I fixed it in global_temporary_table_v16-pg13.patch.
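
With the v16 fix, the reloption form should behave like the ON COMMIT DELETE ROWS clause (expected output, assuming the fixed behavior):

postgres=# create global temp table foo1(n int) with (on_commit_delete_rows='true');
CREATE TABLE
postgres=# begin; insert into foo1 values (9); commit; select * from foo1;
BEGIN
INSERT 0 1
COMMIT
 n
---
(0 rows)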



Wenjing




-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
On Tue, Mar 3, 2020 at 2:11 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:



I fixed in global_temporary_table_v16-pg13.patch.
  
Thank you Wenjing for the patch.
Now we are getting corruption with GTT in the below scenario.

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 bigint, c2 bigserial) on commit delete rows;
CREATE TABLE
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 bigint, c2 bigserial) on commit preserve rows;
CREATE TABLE
postgres=# \q

[edb@localhost bin]$ echo "1
> 2
> 3
> "> t.dat

[edb@localhost bin]$ ./psql  postgres
psql (13devel)
Type "help" for help.

postgres=# \copy gtt1(c1) from 't.dat' with  csv;
ERROR:  could not read block 0 in file "base/13585/t3_16384": read only 0 of 8192 bytes
CONTEXT:  COPY gtt1, line 1: "1"

postgres=# \copy gtt2(c1) from 't.dat' with  csv;
ERROR:  could not read block 0 in file "base/13585/t3_16390": read only 0 of 8192 bytes
CONTEXT:  COPY gtt2, line 1: "1"

NOTE: We end up with such corruption for "bigserial/smallserial/serial" datatype columns.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/3/20 2:10 PM, 曾文旌(义从) wrote:
> I fixed in global_temporary_table_v16-pg13.patch.
Thanks Wenjing. The reported issue is fixed now, but there is
another similar scenario -
if we enable 'on_commit_delete_rows' to true using the alter command then
we get the same issue, i.e. rows are not removed after commit.

x=# create global  temp table foo123(n int) with (on_commit_delete_rows='false');
CREATE TABLE
x=#
x=# alter table foo123 set ( on_commit_delete_rows='true');
ALTER TABLE
x=#
x=# insert into foo123 values (1);
INSERT 0 1
x=# select * from foo123;   <- row should get removed.
  n
---
  1
(1 row)

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/3/20 2:10 PM, 曾文旌(义从) wrote:
I fixed in global_temporary_table_v16-pg13.patch.

Please refer this scenario -

--Connect to psql -

postgres=# alter system set max_active_global_temporary_table =1;
ALTER SYSTEM

--restart the server (./pg_ctl -D data restart)

--create global temp table

postgres=# create global temp  table ccc1  (c int);
CREATE TABLE

--Try to Create another global temp table

postgres=# create global temp  table ccc2  (c int);
WARNING:  relfilenode 13589/1663/19063 not exist in gtt shared hash when forget
ERROR:  out of shared memory
HINT:  You might need to increase max_active_gtt.

postgres=# show max_active_gtt;
ERROR:  unrecognized configuration parameter "max_active_gtt"
postgres=#
postgres=# show max_active_global_temporary_table ;
 max_active_global_temporary_table
-----------------------------------
 1
(1 row)

postgres=#

I cannot find the "max_active_gtt" GUC. I think you are referring to "max_active_global_temporary_table" here?

Also, it would be great if we can make this error message user-friendly, like "max connection reached", rather than a memory error.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Thu, Mar 5, 2020 at 9:19 AM tushar <tushar.ahuja@enterprisedb.com> wrote:
> WARNING:  relfilenode 13589/1663/19063 not exist in gtt shared hash when forget
> ERROR:  out of shared memory
> HINT:  You might need to increase max_active_gtt.
>
> also, it would be great if we can make this error message user-friendly, like "max connection reached", rather than a memory error
 

That would be nice, but the bigger problem is that the WARNING there
looks totally unacceptable. It looks like it's complaining of some
internal issue (i.e. a bug or corruption) and the grammar is poor,
too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Mar 4, 2020, at 3:49 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Tue, Mar 3, 2020 at 2:11 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:



I fixed in global_temporary_table_v16-pg13.patch.
  
Thank you Wenjing for the patch.
Now we are getting corruption with GTT in the below scenario.

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 bigint, c2 bigserial) on commit delete rows;
CREATE TABLE
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 bigint, c2 bigserial) on commit preserve rows;
CREATE TABLE
postgres=# \q

[edb@localhost bin]$ echo "1
> 2
> 3
> "> t.dat

[edb@localhost bin]$ ./psql  postgres
psql (13devel)
Type "help" for help.

postgres=# \copy gtt1(c1) from 't.dat' with  csv;
ERROR:  could not read block 0 in file "base/13585/t3_16384": read only 0 of 8192 bytes
CONTEXT:  COPY gtt1, line 1: "1"

postgres=# \copy gtt2(c1) from 't.dat' with  csv;
ERROR:  could not read block 0 in file "base/13585/t3_16390": read only 0 of 8192 bytes
CONTEXT:  COPY gtt2, line 1: "1"

NOTE: We end up with such corruption for "bigserial/smallserial/serial" datatype columns.
Thanks for review.

I fixed this issue in global_temporary_table_v17-pg13.patch
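
With the v17 fix, the same \copy should succeed for serial columns (expected output, assuming the fixed behavior):

postgres=# \copy gtt1(c1) from 't.dat' with csv;
COPY 3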


Wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Mar 4, 2020, at 11:39 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 3/3/20 2:10 PM, 曾文旌(义从) wrote:
>> I fixed in global_temporary_table_v16-pg13.patch.
> Thanks Wenjing. The reported issue is fixed now, but there is another similar scenario -
> if we enable 'on_commit_delete_rows' to true using the alter command then we get the same issue, i.e. rows are not removed after commit.
>
> x=# create global  temp table foo123(n int) with (on_commit_delete_rows='false');
> CREATE TABLE
> x=#
> x=# alter table foo123 set ( on_commit_delete_rows='true');
> ALTER TABLE
I blocked modifying this parameter.

Fixed in global_temporary_table_v17-pg13.patch
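
So with the v17 patch, the ALTER path should be rejected just like the CREATE path (expected behavior; the exact error wording is assumed):

x=# alter table foo123 set ( on_commit_delete_rows='true');
ERROR:  cannot change on_commit_delete_rows on an existing table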


Wenjing



> x=#
> x=# insert into foo123 values (1);
> INSERT 0 1
> x=# select * from foo123;   <- row should get removed.
>  n
> ---
>  1
> (1 row)
> 
> -- 
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Mar 5, 2020, at 10:19 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:

On 3/3/20 2:10 PM, 曾文旌(义从) wrote:
I fixed in global_temporary_table_v16-pg13.patch.

Please refer this scenario -

--Connect to psql -

postgres=# alter system set max_active_global_temporary_table =1;
ALTER SYSTEM

--restart the server (./pg_ctl -D data restart)

--create global temp table

postgres=# create global temp  table ccc1  (c int);
CREATE TABLE

--Try to Create another global temp table

postgres=# create global temp  table ccc2  (c int);
WARNING:  relfilenode 13589/1663/19063 not exist in gtt shared hash when forget
ERROR:  out of shared memory
HINT:  You might need to increase max_active_gtt.

postgres=# show max_active_gtt;
ERROR:  unrecognized configuration parameter "max_active_gtt"
postgres=#
postgres=# show max_active_global_temporary_table ;
 max_active_global_temporary_table
-----------------------------------
 1
(1 row)

postgres=#

I cannot find the "max_active_gtt" GUC. I think you are referring to "max_active_global_temporary_table" here?

You're right.

Fixed in global_temporary_table_v17-pg13.patch


Wenjing


Also, it would be great if we can make this error message user-friendly, like "max connection reached", rather than a memory error.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Mar 5, 2020, at 10:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 5, 2020 at 9:19 AM tushar <tushar.ahuja@enterprisedb.com> wrote:
>> WARNING:  relfilenode 13589/1663/19063 not exist in gtt shared hash when forget
>> ERROR:  out of shared memory
>> HINT:  You might need to increase max_active_gtt.
>>
>> also, it would be great if we can make this error message user-friendly, like "max connection reached", rather than a memory error
>
> That would be nice, but the bigger problem is that the WARNING there
> looks totally unacceptable. It looks like it's complaining of some
> internal issue (i.e. a bug or corruption) and the grammar is poor,
> too.

Yes, the WARNING should not exist.
This is a bug in the rollback process and I have fixed it in global_temporary_table_v17-pg13.patch


Wenjing


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi All,

Kindly check the below scenario.

Case 1:
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int) on commit delete rows;
CREATE TABLE
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# vacuum gtt1;
VACUUM
postgres=# vacuum gtt2;
VACUUM
postgres=# vacuum;
VACUUM
postgres=# \q

Case 2: Exit and reconnect to psql prompt.
[edb@localhost bin]$ ./psql  postgres
psql (13devel)
Type "help" for help.

postgres=# vacuum gtt1;
WARNING:  skipping vacuum empty global temp table "gtt1"
VACUUM
postgres=# vacuum gtt2;
WARNING:  skipping vacuum empty global temp table "gtt2"
VACUUM
postgres=# vacuum;
WARNING:  skipping vacuum empty global temp table "gtt1"
WARNING:  skipping vacuum empty global temp table "gtt2"
VACUUM

Although in "Case 1" gtt1/gtt2 are empty, we are not getting "WARNING:  skipping vacuum empty global temp table" for VACUUM in "Case 1",
whereas we are getting the "WARNING" for VACUUM in "Case 2".


On Fri, Mar 6, 2020 at 12:41 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Mar 5, 2020, at 10:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 5, 2020 at 9:19 AM tushar <tushar.ahuja@enterprisedb.com> wrote:
>> WARNING:  relfilenode 13589/1663/19063 not exist in gtt shared hash when forget
>> ERROR:  out of shared memory
>> HINT:  You might need to increase max_active_gtt.
>>
>> also, it would be great if we can make this error message user-friendly, like "max connection reached", rather than a memory error
>
> That would be nice, but the bigger problem is that the WARNING there
> looks totally unacceptable. It looks like it's complaining of some
> internal issue (i.e. a bug or corruption) and the grammar is poor,
> too.

Yes, the WARNING should not exist.
This is a bug in the rollback process and I have fixed it in global_temporary_table_v17-pg13.patch


Wenjing


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/6/20 12:35 PM, 曾文旌(义从) wrote:
> Fixed in global_temporary_table_v17-pg13.patch

Thanks Wenjing.

Please refer to this scenario, where I am able to set
'on_commit_delete_rows=true' on a regular table using the 'alter' syntax,
which is not allowed using the 'create' syntax

--Expected -

postgres=# CREATE TABLE foo () WITH (on_commit_delete_rows='true');
ERROR:  The parameter on_commit_delete_rows is exclusive to the global 
temp table, which cannot be specified by a regular table

--But user can do this with 'alter' command -
postgres=# create table foo();
CREATE TABLE
postgres=# alter table foo set (on_commit_delete_rows='true');
ALTER TABLE
postgres=#

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/6/20 12:35 PM, 曾文旌(义从) wrote:
> Fixed in global_temporary_table_v17-pg13.patch

I observed that we do support the 'global temp' keyword with views

postgres=# create or replace  global temp view v1 as select 5;
CREATE VIEW

but if we take the dump (using pg_dumpall) then it only displays 'create view'

Looks like we are skipping it?

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Mar 9, 2020, at 8:24 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi All,

Kindly check the below scenario.

Case 1:
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int) on commit delete rows;
CREATE TABLE
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# vacuum gtt1;
VACUUM
postgres=# vacuum gtt2;
VACUUM
postgres=# vacuum;
VACUUM
postgres=# \q

Case 2: Exit and reconnect to psql prompt.
[edb@localhost bin]$ ./psql  postgres
psql (13devel)
Type "help" for help.

postgres=# vacuum gtt1;
WARNING:  skipping vacuum empty global temp table "gtt1"
VACUUM
postgres=# vacuum gtt2;
WARNING:  skipping vacuum empty global temp table "gtt2"
VACUUM
postgres=# vacuum;
WARNING:  skipping vacuum empty global temp table "gtt1"
WARNING:  skipping vacuum empty global temp table "gtt2"
VACUUM

Although in "Case 1" gtt1/gtt2 are empty, we are not getting "WARNING:  skipping vacuum empty global temp table" for VACUUM in "Case 1",
whereas we are getting the "WARNING" for VACUUM in "Case 2".
I fixed the warning message; it's more accurate now.

Wenjing





On Fri, Mar 6, 2020 at 12:41 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> On Mar 5, 2020, at 10:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 5, 2020 at 9:19 AM tushar <tushar.ahuja@enterprisedb.com> wrote:
>> WARNING:  relfilenode 13589/1663/19063 not exist in gtt shared hash when forget
>> ERROR:  out of shared memory
>> HINT:  You might need to increase max_active_gtt.
>>
>> also, it would be great if we can make this error message user-friendly, like "max connection reached", rather than a memory error
>
> That would be nice, but the bigger problem is that the WARNING there
> looks totally unacceptable. It looks like it's complaining of some
> internal issue (i.e. a bug or corruption) and the grammar is poor,
> too.

Yes, the WARNING should not exist.
This is a bug in the rollback process and I have fixed it in global_temporary_table_v17-pg13.patch


Wenjing


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Mar 9, 2020, at 9:34 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 3/6/20 12:35 PM, 曾文旌(义从) wrote:
>> Fixed in global_temporary_table_v17-pg13.patch
>
> Thanks Wenjing.
>
> Please refer to this scenario, where I am able to set 'on_commit_delete_rows=true' on a regular table using the 'alter' syntax, which is not allowed using the 'create' syntax
>
> --Expected -
>
> postgres=# CREATE TABLE foo () WITH (on_commit_delete_rows='true');
> ERROR:  The parameter on_commit_delete_rows is exclusive to the global temp table, which cannot be specified by a regular table
>
> --But user can do this with 'alter' command -
> postgres=# create table foo();
> CREATE TABLE
> postgres=# alter table foo set (on_commit_delete_rows='true');
> ALTER TABLE
This is a bug; I fixed it.


Wenjing



> postgres=#
> 
> -- 
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> On Mar 9, 2020, at 10:37 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 3/6/20 12:35 PM, 曾文旌(义从) wrote:
>> Fixed in global_temporary_table_v17-pg13.patch
>
> I observed that we do support the 'global temp' keyword with views
>
> postgres=# create or replace  global temp view v1 as select 5;
> CREATE VIEW
I think we should not support global temp view.
Fixed in global_temporary_table_v18-pg13.patch.
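
So with the v18 patch, the GLOBAL TEMP keywords on a view should be rejected outright instead of being silently dropped from the dump (expected behavior; the error wording is assumed):

postgres=# create or replace global temp view v1 as select 5;
ERROR:  views cannot be global temporary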



Wenjing


>
> but if we take the dump (using pg_dumpall) then it only displays 'create view'
>
> Looks like we are skipping it?
>
> --
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
On Mon, Mar 9, 2020 at 10:02 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


Fixed in global_temporary_table_v18-pg13.patch.
Hi Wenjing,
Thanks for the patch. I have verified the previous issues with "gtt_v18_pg13.patch" and those are resolved.
Please find below case:

postgres=# create sequence seq;
CREATE SEQUENCE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int PRIMARY KEY) ON COMMIT DELETE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 int PRIMARY KEY) ON COMMIT PRESERVE ROWS;
CREATE TABLE

postgres=# alter table gtt1 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables

postgres=# alter table gtt2 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables

Note: We are getting this error if we have a key column (PK/UNIQUE) in a GTT and try to add a column with a default sequence to it.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


On Mar 11, 2020, at 3:52 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Mon, Mar 9, 2020 at 10:02 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


Fixed in global_temporary_table_v18-pg13.patch.
Hi Wenjing,
Thanks for the patch. I have verified the previous issues with "gtt_v18_pg13.patch" and those are resolved.
Please find below case:

postgres=# create sequence seq;
CREATE SEQUENCE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int PRIMARY KEY) ON COMMIT DELETE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 int PRIMARY KEY) ON COMMIT PRESERVE ROWS;
CREATE TABLE

postgres=# alter table gtt1 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables

postgres=# alter table gtt2 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables

Note: We are getting this error if we have a key column (PK/UNIQUE) in a GTT and try to add a column with a default sequence to it.
This is because ALTER TABLE ... ADD COLUMN with a default value needs to reindex the primary key;
reindex needs to change the relfilenode, which GTT does not currently support.
I made the error message clearer in global_temporary_table_v19-pg13.patch.
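Until reindex of a GTT is supported, a possible workaround is to split the operation in two, since adding a nullable column without a default and then attaching the default are catalog-only changes that rewrite nothing and rebuild no index. A minimal sketch, reusing the gtt1 and seq objects from the test case above (note the semantics differ: existing rows keep NULL rather than receiving sequence values):

-- Catalog-only change: no table rewrite, so the primary key index is not rebuilt.
alter table gtt1 add c2 int;
-- The default applies only to rows inserted from now on.
alter table gtt1 alter column c2 set default nextval('seq');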


Wenjing




--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Wed, Mar 11, 2020 at 9:07 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
> reindex needs to change the relfilenode, which GTT does not currently support.

In my view that'd have to be fixed somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> 2020年3月12日 上午4:12,Robert Haas <robertmhaas@gmail.com> 写道:
>
> On Wed, Mar 11, 2020 at 9:07 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>> reindex needs to change the relfilenode, which GTT does not currently support.
>
> In my view that'd have to be fixed somehow.
OK, I am working on it.



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,

Please check the below findings:
After running the TRUNCATE command, the "relfilenode" field does not change for a GTT,
whereas for a simple table/temp table the "relfilenode" field changes after TRUNCATE.

Case 1: Getting same "relfilenode" for GTT after and before "TRUNCATE"
postgres=# create global temporary table gtt1(c1 int) on commit delete rows;
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt1';
 relfilenode
-------------
       16384
(1 row)
postgres=# truncate gtt1;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt1';
 relfilenode
-------------
       16384
(1 row)

postgres=# create global temporary table gtt2(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt2';
 relfilenode
-------------
       16387
(1 row)
postgres=# truncate gtt2;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt2';
 relfilenode
-------------
       16387
(1 row)


Case 2: "relfilenode" changes after "TRUNCATE" for Simple table/Temp table
postgres=# create temporary table temp3(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='temp3';
 relfilenode
-------------
       16392
(1 row)
postgres=# truncate temp3;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='temp3';
 relfilenode
-------------
       16395
(1 row)


postgres=# create table tabl4(c1 int);
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='tabl4';
 relfilenode
-------------
       16396
(1 row)
postgres=# truncate tabl4;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='tabl4';
 relfilenode
-------------
       16399
(1 row)


On Thu, Mar 12, 2020 at 3:36 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> 2020年3月12日 上午4:12,Robert Haas <robertmhaas@gmail.com> 写道:
>
> On Wed, Mar 11, 2020 at 9:07 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>> reindex needs to change the relfilenode, which GTT does not currently support.
>
> In my view that'd have to be fixed somehow.
Ok , I am working on it.



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/9/20 10:01 PM, 曾文旌(义从) wrote:
> Fixed in global_temporary_table_v18-pg13.patch.

Thanks Wenjing.

I am getting this error  "ERROR:  could not open file
"base/13589/t3_16440": No such file or directory" if
max_active_global_temporary_table is set to 0

Please refer this scenario -

postgres=# create global temp table  tab1 (n int ) with ( 
on_commit_delete_rows='true');
CREATE TABLE
postgres=# insert into tab1 values (1);
INSERT 0 1
postgres=# select * from tab1;
  n
---
(0 rows)

postgres=# alter system set max_active_global_temporary_table=0;
ALTER SYSTEM
postgres=# \q
[tushar@localhost bin]$ ./pg_ctl -D data/ restart -c -l logs123

waiting for server to start.... done
server started

[tushar@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# insert into tab1 values (1);
ERROR:  could not open file "base/13589/t3_16440": No such file or directory
postgres=#

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,

Please check the below combination of GTT with Primary and Foreign key relations, with the ERROR message.

Case1:
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT DELETE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer NOT NULL,
PRIMARY KEY (c1, c2),
FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT PRESERVE ROWS;
ERROR:  unsupported ON COMMIT and foreign key combination
DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON COMMIT setting.

Case2:
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT PRESERVE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer NOT NULL,
PRIMARY KEY (c1, c2),
FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT DELETE ROWS;
CREATE TABLE

In "case2" although both the primary table and foreign key GTT do not have the same ON COMMIT setting, still we are able to create the PK-FK relations with GTT.

So I hope the detail message(DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON COMMIT setting.) in "Case1" should be more clear(something like "wrong combination of ON COMMIT setting").

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,
Please check the below scenario, where a foreign table on a GTT is not showing records.

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# do $d$
    begin
        execute $$create server fdw foreign data wrapper postgres_fdw options (host 'localhost',dbname 'postgres',port '$$||current_setting('port')||$$')$$;
    end;
$d$;
DO
postgres=# create user mapping for public server fdw;
CREATE USER MAPPING

postgres=# create table lt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into lt1 values (1,'c21');
INSERT 0 1
postgres=# create foreign table ft1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'lt1');
CREATE FOREIGN TABLE
postgres=# select * from ft1;
 c1 | c2  
----+-----
  1 | c21
(1 row)

postgres=# create global temporary table gtt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# create foreign table f_gtt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'gtt1');
CREATE FOREIGN TABLE

postgres=# select * from gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

postgres=# select * from f_gtt1;
 c1 | c2
----+----
(0 rows)

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Konstantin Knizhnik
Date:


On 16.03.2020 9:23, Prabhat Sahu wrote:
Hi Wenjing,
Please check the below scenario, where a foreign table on a GTT is not showing records.

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# do $d$
    begin
        execute $$create server fdw foreign data wrapper postgres_fdw options (host 'localhost',dbname 'postgres',port '$$||current_setting('port')||$$')$$;
    end;
$d$;
DO
postgres=# create user mapping for public server fdw;
CREATE USER MAPPING

postgres=# create table lt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into lt1 values (1,'c21');
INSERT 0 1
postgres=# create foreign table ft1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'lt1');
CREATE FOREIGN TABLE
postgres=# select * from ft1;
 c1 | c2  
----+-----
  1 | c21
(1 row)

postgres=# create global temporary table gtt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# create foreign table f_gtt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'gtt1');
CREATE FOREIGN TABLE

postgres=# select * from gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

postgres=# select * from f_gtt1;
 c1 | c2
----+----
(0 rows)

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


It seems to be expected behavior: GTT data is private to the session, and postgres_fdw establishes its own session in which the content of the table is empty.
But if you insert some data into f_gtt1, then you will be able to select this data from it because of the connection cache in postgres_fdw.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年3月16日 下午2:23,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,
Please check the below scenario, where a foreign table on a GTT is not showing records.

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# do $d$
    begin
        execute $$create server fdw foreign data wrapper postgres_fdw options (host 'localhost',dbname 'postgres',port '$$||current_setting('port')||$$')$$;
    end;
$d$;
DO
postgres=# create user mapping for public server fdw;
CREATE USER MAPPING

postgres=# create table lt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into lt1 values (1,'c21');
INSERT 0 1
postgres=# create foreign table ft1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'lt1');
CREATE FOREIGN TABLE
postgres=# select * from ft1;
 c1 | c2  
----+-----
  1 | c21
(1 row)

postgres=# create global temporary table gtt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# create foreign table f_gtt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'gtt1');
CREATE FOREIGN TABLE

postgres=# select * from gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

postgres=# select * from f_gtt1;
 c1 | c2
----+----
(0 rows)

--

I understand that postgres_fdw works similarly to dblink:
postgres_fdw access to the table requires a new connection,
and the data in the GTT is empty in that newly established connection,
because a GTT shares its structure, but not its data, between connections.

Try local temp table:
create temporary table ltt1 (c1 integer, c2 varchar(50));

insert into ltt1 values (1,'gtt_c21');

create foreign table f_ltt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'ltt1');

select * from ltt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

select * from l_gtt1;
ERROR:  relation "l_gtt1" does not exist
LINE 1: select * from l_gtt1;


Wenjing


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: [Proposal] Global temporary tables

From
tushar
Date:
Hi Wenjing,

I have created a global temp table in session X but I am not able to drop it from session Y.

X session - ( connect to psql terminal )
postgres=# create global temp table foo(n int);
CREATE TABLE
postgres=# select * from foo;
 n
---
(0 rows)


Y session - ( connect to psql terminal )
postgres=# drop table foo;
ERROR:  can not drop relation foo when other backend attached this global temp table

The table has been created, so I think the user should be able to drop it from another session as well, without exiting session X.

regards,

On 3/16/20 1:35 PM, 曾文旌(义从) wrote:


2020年3月16日 下午2:23,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,
Please check the below scenario, where a foreign table on a GTT is not showing records.

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# do $d$
    begin
        execute $$create server fdw foreign data wrapper postgres_fdw options (host 'localhost',dbname 'postgres',port '$$||current_setting('port')||$$')$$;
    end;
$d$;
DO
postgres=# create user mapping for public server fdw;
CREATE USER MAPPING

postgres=# create table lt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into lt1 values (1,'c21');
INSERT 0 1
postgres=# create foreign table ft1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'lt1');
CREATE FOREIGN TABLE
postgres=# select * from ft1;
 c1 | c2  
----+-----
  1 | c21
(1 row)

postgres=# create global temporary table gtt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# create foreign table f_gtt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'gtt1');
CREATE FOREIGN TABLE

postgres=# select * from gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

postgres=# select * from f_gtt1;
 c1 | c2
----+----
(0 rows)

--

I understand that postgres_fdw works similarly to dblink:
postgres_fdw access to the table requires a new connection,
and the data in the GTT is empty in that newly established connection,
because a GTT shares its structure, but not its data, between connections.

Try local temp table:
create temporary table ltt1 (c1 integer, c2 varchar(50));

insert into ltt1 values (1,'gtt_c21');

create foreign table f_ltt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'ltt1');

select * from ltt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

select * from l_gtt1;
ERROR:  relation "l_gtt1" does not exist
LINE 1: select * from l_gtt1;


Wenjing


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


po 16. 3. 2020 v 9:58 odesílatel tushar <tushar.ahuja@enterprisedb.com> napsal:
Hi Wenjing,

I have created a global temp table in session X but I am not able to drop it from session Y.

X session - ( connect to psql terminal )
postgres=# create global temp table foo(n int);
CREATE TABLE
postgres=# select * from foo;
 n
---
(0 rows)


Y session - ( connect to psql terminal )
postgres=# drop table foo;
ERROR:  can not drop relation foo when other backend attached this global temp table

The table has been created, so I think the user should be able to drop it from another session as well, without exiting session X.

By the original design a GTT was not modifiable while it is in use by any session. Likewise, you cannot drop a normal table while it is in use.

It is hard to say what the most correct behavior and design is, but for the moment I think protecting the table against being dropped while it is in use by another session is the best behavior.

Maybe for the next release we can introduce DROP TABLE x (FORCE) - like we have for DROP DATABASE. That behavior is very similar.
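For comparison, a sketch of the database-level precedent (new in PostgreSQL 13) that such a syntax would mirror; the object names are placeholders:

-- Existing syntax: forcibly drop a database by terminating the sessions using it.
DROP DATABASE mydb WITH (FORCE);
-- Hypothetical table-level analogue suggested above (not implemented in this patch):
-- DROP TABLE foo WITH (FORCE);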

Pavel


regards,

On 3/16/20 1:35 PM, 曾文旌(义从) wrote:


2020年3月16日 下午2:23,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,
Please check the below scenario, where a foreign table on a GTT is not showing records.

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# do $d$
    begin
        execute $$create server fdw foreign data wrapper postgres_fdw options (host 'localhost',dbname 'postgres',port '$$||current_setting('port')||$$')$$;
    end;
$d$;
DO
postgres=# create user mapping for public server fdw;
CREATE USER MAPPING

postgres=# create table lt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into lt1 values (1,'c21');
INSERT 0 1
postgres=# create foreign table ft1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'lt1');
CREATE FOREIGN TABLE
postgres=# select * from ft1;
 c1 | c2  
----+-----
  1 | c21
(1 row)

postgres=# create global temporary table gtt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# create foreign table f_gtt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'gtt1');
CREATE FOREIGN TABLE

postgres=# select * from gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

postgres=# select * from f_gtt1;
 c1 | c2
----+----
(0 rows)

--

I understand that postgres_fdw works similarly to dblink:
postgres_fdw access to the table requires a new connection,
and the data in the GTT is empty in that newly established connection,
because a GTT shares its structure, but not its data, between connections.

Try local temp table:
create temporary table ltt1 (c1 integer, c2 varchar(50));

insert into ltt1 values (1,'gtt_c21');

create foreign table f_ltt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'ltt1');

select * from ltt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

select * from l_gtt1;
ERROR:  relation "l_gtt1" does not exist
LINE 1: select * from l_gtt1;


Wenjing


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年3月16日 下午4:58,tushar <tushar.ahuja@enterprisedb.com> 写道:

Hi Wenjing,

I have created a global temp table in session X but I am not able to drop it from session Y.

X session - ( connect to psql terminal )
postgres=# create global temp table foo(n int);
CREATE TABLE
postgres=# select * from foo;
 n
---
(0 rows)


Y session - ( connect to psql terminal )
postgres=# drop table foo;
ERROR:  can not drop relation foo when other backend attached this global temp table
For now, if a DBA wants to drop a GTT,
he can use the view pg_gtt_attached_pids to see which backends are using the GTT,
then kill those sessions with pg_terminate_backend, after which he can drop the GTT.
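A minimal sketch of that workflow, using the foo table from above and the patch's pg_gtt_attached_pids view:

-- See which backends still have this GTT attached.
select pid from pg_gtt_attached_pids where schemaname = 'public' and tablename = 'foo';
-- Terminate those backends (requires appropriate privileges), then the drop succeeds.
select pg_terminate_backend(pid) from pg_gtt_attached_pids where schemaname = 'public' and tablename = 'foo';
drop table foo;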


The table has been created, so I think the user should be able to drop it from another session as well, without exiting session X.

regards,

On 3/16/20 1:35 PM, 曾文旌(义从) wrote:


2020年3月16日 下午2:23,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,
Please check the below scenario, where a foreign table on a GTT is not showing records.

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# do $d$
    begin
        execute $$create server fdw foreign data wrapper postgres_fdw options (host 'localhost',dbname 'postgres',port '$$||current_setting('port')||$$')$$;
    end;
$d$;
DO
postgres=# create user mapping for public server fdw;
CREATE USER MAPPING

postgres=# create table lt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into lt1 values (1,'c21');
INSERT 0 1
postgres=# create foreign table ft1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'lt1');
CREATE FOREIGN TABLE
postgres=# select * from ft1;
 c1 | c2  
----+-----
  1 | c21
(1 row)

postgres=# create global temporary table gtt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# create foreign table f_gtt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'gtt1');
CREATE FOREIGN TABLE

postgres=# select * from gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

postgres=# select * from f_gtt1;
 c1 | c2
----+----
(0 rows)

--

I understand that postgres_fdw works similarly to dblink:
postgres_fdw access to the table requires a new connection,
and the data in the GTT is empty in that newly established connection,
because a GTT shares its structure, but not its data, between connections.

Try local temp table:
create temporary table ltt1 (c1 integer, c2 varchar(50));

insert into ltt1 values (1,'gtt_c21');

create foreign table f_ltt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'ltt1');

select * from ltt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

select * from l_gtt1;
ERROR:  relation "l_gtt1" does not exist
LINE 1: select * from l_gtt1;


Wenjing


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年3月16日 下午5:04,Pavel Stehule <pavel.stehule@gmail.com> 写道:



po 16. 3. 2020 v 9:58 odesílatel tushar <tushar.ahuja@enterprisedb.com> napsal:
Hi Wenjing,

I have created a global temp table in session X but I am not able to drop it from session Y.

X session - ( connect to psql terminal )
postgres=# create global temp table foo(n int);
CREATE TABLE
postgres=# select * from foo;
 n
---
(0 rows)


Y session - ( connect to psql terminal )
postgres=# drop table foo;
ERROR:  can not drop relation foo when other backend attached this global temp table

The table has been created, so I think the user should be able to drop it from another session as well, without exiting session X.

By the original design a GTT was not modifiable while it is in use by any session. Likewise, you cannot drop a normal table while it is in use.

It is hard to say what the most correct behavior and design is, but for the moment I think protecting the table against being dropped while it is in use by another session is the best behavior.

Maybe for the next release we can introduce DROP TABLE x (FORCE) - like we have for DROP DATABASE. That behavior is very similar.
I agree with that.


Wenjing


Pavel


regards,

On 3/16/20 1:35 PM, 曾文旌(义从) wrote:


2020年3月16日 下午2:23,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,
Please check the below scenario, where a foreign table on a GTT is not showing records.

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# do $d$
    begin
        execute $$create server fdw foreign data wrapper postgres_fdw options (host 'localhost',dbname 'postgres',port '$$||current_setting('port')||$$')$$;
    end;
$d$;
DO
postgres=# create user mapping for public server fdw;
CREATE USER MAPPING

postgres=# create table lt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into lt1 values (1,'c21');
INSERT 0 1
postgres=# create foreign table ft1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'lt1');
CREATE FOREIGN TABLE
postgres=# select * from ft1;
 c1 | c2  
----+-----
  1 | c21
(1 row)

postgres=# create global temporary table gtt1 (c1 integer, c2 varchar(50));
CREATE TABLE
postgres=# insert into gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# create foreign table f_gtt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'gtt1');
CREATE FOREIGN TABLE

postgres=# select * from gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

postgres=# select * from f_gtt1;
 c1 | c2
----+----
(0 rows)

--

I understand that postgres_fdw works similarly to dblink:
postgres_fdw access to the table requires a new connection,
and the data in the GTT is empty in that newly established connection,
because a GTT shares its structure, but not its data, between connections.

Try local temp table:
create temporary table ltt1 (c1 integer, c2 varchar(50));

insert into ltt1 values (1,'gtt_c21');

create foreign table f_ltt1 (c1 integer, c2 varchar(50)) server fdw options (table_name 'ltt1');

select * from ltt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

select * from l_gtt1;
ERROR:  relation "l_gtt1" does not exist
LINE 1: select * from l_gtt1;


Wenjing


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:


On Mon, Mar 16, 2020 at 1:30 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

It seems to be expected behavior: GTT data is private to the session, and postgres_fdw establishes its own session in which the content of the table is empty.
But if you insert some data into f_gtt1, then you will be able to select this data from it because of the connection cache in postgres_fdw.

Thanks for the explanation.
I am able to insert and select the value from f_gtt1.

 postgres=# insert into f_gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# select * from f_gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

I have one more doubt.
As you said above, "GTT data is private to the session, and postgres_fdw establishes its own session in which the content of the table is empty."
Please check the below scenario:
we can select data from the "root GTT" and the "foreign GTT partition", but we are unable to select data from the "GTT partition".

postgres=# create global temporary table gtt2 (c1 integer, c2 integer) partition by range(c1);
CREATE TABLE
postgres=# create global temporary table gtt2_p1 (c1 integer, c2 integer);
CREATE TABLE
postgres=# create foreign table f_gtt2_p1 (c1 integer, c2 integer) server fdw options (table_name 'gtt2_p1');
CREATE FOREIGN TABLE
postgres=# alter table gtt2 attach partition f_gtt2_p1 for values from (minvalue) to (10);
ALTER TABLE
postgres=# insert into gtt2 select i,i from generate_series(1,5,2)i;
INSERT 0 3
postgres=# select * from gtt2;
 c1 | c2
----+----
  1 |  1
  3 |  3
  5 |  5
(3 rows)

postgres=# select * from gtt2_p1;
 c1 | c2
----+----
(0 rows)

postgres=# select * from f_gtt2_p1;
 c1 | c2
----+----
  1 |  1
  3 |  3
  5 |  5
(3 rows)

Is this expected behavior?

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年3月16日 下午5:31,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:



On Mon, Mar 16, 2020 at 1:30 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

It seems to be expected behavior: GTT data is private to the session, and postgres_fdw establishes its own session in which the content of the table is empty.
But if you insert some data into f_gtt1, then you will be able to select this data from it because of the connection cache in postgres_fdw.

Thanks for the explanation.
I am able to insert and select the value from f_gtt1.

 postgres=# insert into f_gtt1 values (1,'gtt_c21');
INSERT 0 1
postgres=# select * from f_gtt1;
 c1 |   c2    
----+---------
  1 | gtt_c21
(1 row)

I have one more doubt.
As you said above, "GTT data is private to the session, and postgres_fdw establishes its own session in which the content of the table is empty."
Please check the below scenario:
we can select data from the "root GTT" and the "foreign GTT partition", but we are unable to select data from the "GTT partition".
postgres=# select pg_backend_pid();
 pg_backend_pid 
----------------
         119135
(1 row)

postgres=# select * from pg_gtt_attached_pids;
 schemaname | tablename | relid |  pid   
------------+-----------+-------+--------
 public     | gtt2_p1   | 73845 | 119135
 public     | gtt2_p1   | 73845 |  51482
(2 rows)


postgres=# select datid,datname,pid,application_name,query from pg_stat_activity where usename = 'wenjing';
 datid | datname  |  pid   | application_name |                                                query                                                 
-------+----------+--------+------------------+------------------------------------------------------------------------------------------------------
 13589 | postgres | 119135 | psql             | select datid,datname,pid,application_name,query from pg_stat_activity where usename = 'wenjing';
 13589 | postgres |  51482 | postgres_fdw     | COMMIT TRANSACTION
(2 rows)

This can be explained:
the postgres_fdw connection has not been disconnected, and it produced the data in another session.
In other words, gtt2_p1 is empty in session 119135, but not in session 51482.



postgres=# create global temporary table gtt2 (c1 integer, c2 integer) partition by range(c1);
CREATE TABLE
postgres=# create global temporary table gtt2_p1 (c1 integer, c2 integer);
CREATE TABLE
postgres=# create foreign table f_gtt2_p1 (c1 integer, c2 integer) server fdw options (table_name 'gtt2_p1');
CREATE FOREIGN TABLE
postgres=# alter table gtt2 attach partition f_gtt2_p1 for values from (minvalue) to (10);
ALTER TABLE
postgres=# insert into gtt2 select i,i from generate_series(1,5,2)i;
INSERT 0 3
postgres=# select * from gtt2;
 c1 | c2
----+----
  1 |  1
  3 |  3
  5 |  5
(3 rows)

postgres=# select * from gtt2_p1;
 c1 | c2
----+----
(0 rows)

postgres=# select * from f_gtt2_p1;
 c1 | c2
----+----
  1 |  1
  3 |  3
  5 |  5
(3 rows)

Is this expected behavior?

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年3月12日 下午8:22,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,

Please check the below findings:
After running the TRUNCATE command, the "relfilenode" field does not change for a GTT,
whereas for a simple table/temp table the "relfilenode" field changes after TRUNCATE.

Case 1: Getting same "relfilenode" for GTT after and before "TRUNCATE"
postgres=# create global temporary table gtt1(c1 int) on commit delete rows;
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt1';
 relfilenode
-------------
       16384
(1 row)
postgres=# truncate gtt1;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt1';
 relfilenode
-------------
       16384
(1 row)

postgres=# create global temporary table gtt2(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt2';
 relfilenode
-------------
       16387
(1 row)
postgres=# truncate gtt2;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='gtt2';
 relfilenode
-------------
       16387
(1 row)


Case 2: "relfilenode" changes after "TRUNCATE" for Simple table/Temp table
postgres=# create temporary table temp3(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='temp3';
 relfilenode
-------------
       16392
(1 row)
postgres=# truncate temp3;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='temp3';
 relfilenode
-------------
       16395
(1 row)


postgres=# create table tabl4(c1 int);
CREATE TABLE
postgres=# select relfilenode from pg_class  where relname ='tabl4';
 relfilenode
-------------
       16396
(1 row)
postgres=# truncate tabl4;
TRUNCATE TABLE
postgres=# select relfilenode from pg_class  where relname ='tabl4';
 relfilenode
-------------
       16399
(1 row)

TRUNCATE of a GTT is now supported.
It clears the data in the table by switching the relfilenode, and it supports rollback.
Note that the latest relfilenode of a GTT is not stored in pg_class; you can view it in the view pg_gtt_relstats.

postgres=# create global temp table gtt1(a int primary key);
CREATE TABLE
postgres=# insert into gtt1 select generate_series(1,10000);
INSERT 0 10000
postgres=# select tablename,relfilenode from pg_gtt_relstats;
 tablename | relfilenode 
-----------+-------------
 gtt1      |       16406
 gtt1_pkey |       16409
(2 rows)
postgres=# truncate gtt1;
TRUNCATE TABLE
postgres=# 
postgres=# select tablename,relfilenode from pg_gtt_relstats;
 tablename | relfilenode 
-----------+-------------
 gtt1      |       16411
 gtt1_pkey |       16412
(2 rows)
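Because the old relfilenode is kept until the transaction ends, the TRUNCATE can be rolled back. A small sketch of the expected behavior (assuming the default on_commit_delete_rows=false, i.e. rows are preserved across commits):

insert into gtt1 select generate_series(1,100);
begin;
truncate gtt1;
select count(*) from gtt1;   -- 0: the new, empty relfilenode is visible inside the transaction
rollback;
select count(*) from gtt1;   -- 100: the pre-TRUNCATE relfilenode is restored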



Wenjing






On Thu, Mar 12, 2020 at 3:36 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


> 2020年3月12日 上午4:12,Robert Haas <robertmhaas@gmail.com> 写道:
>
> On Wed, Mar 11, 2020 at 9:07 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:
>> reindex needs to change the relfilenode, which GTT does not currently support.
>
> In my view that'd have to be fixed somehow.
Ok , I am working on it.



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

[Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:

> 2020年3月13日 下午8:40,tushar <tushar.ahuja@enterprisedb.com> 写道:
>
> On 3/9/20 10:01 PM, 曾文旌(义从) wrote:
>> Fixed in global_temporary_table_v18-pg13.patch.
>
> Thanks Wenjing.
>
> I am getting this error  "ERROR:  could not open file "base/13589/t3_16440": No such file or directory" if
> max_active_global_temporary_table is set to 0
>
> Please refer this scenario -
>
> postgres=# create global temp table  tab1 (n int ) with ( on_commit_delete_rows='true');
> CREATE TABLE
> postgres=# insert into tab1 values (1);
> INSERT 0 1
> postgres=# select * from tab1;
>  n
> ---
> (0 rows)
>
> postgres=# alter system set max_active_global_temporary_table=0;
> ALTER SYSTEM
> postgres=# \q
> [tushar@localhost bin]$ ./pg_ctl -D data/ restart -c -l logs123
>
> waiting for server to start.... done
> server started
>
> [tushar@localhost bin]$ ./psql postgres
> psql (13devel)
> Type "help" for help.
>
> postgres=# insert into tab1 values (1);
> ERROR:  could not open file "base/13589/t3_16440": No such file or directory
> postgres=#
Thanks for the review.
It is a bug; I fixed it in global_temporary_table_v20-pg13.patch.


Wenjing




> 
> -- 
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
"曾文旌(义从)"
Date:


2020年3月11日 下午3:52,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

On Mon, Mar 9, 2020 at 10:02 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


Fixed in global_temporary_table_v18-pg13.patch.
Hi Wenjing,
Thanks for the patch. I have verified the previous issues with "gtt_v18_pg13.patch" and those are resolved.
Please find below case:

postgres=# create sequence seq;
CREATE SEQUENCE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int PRIMARY KEY) ON COMMIT DELETE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 int PRIMARY KEY) ON COMMIT PRESERVE ROWS;
CREATE TABLE

postgres=# alter table gtt1 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables

postgres=# alter table gtt2 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables
Reindex on a GTT is now supported.

Please check global_temporary_table_v20-pg13.patch


Wenjing




Note: We get this error if we have a key column (PK/UNIQUE) in a GTT and try to add a column with a sequence default to it.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: [Proposal] Global temporary tables

From
"wenjing.zwj"
Date:
postgres=# CREATE LOCAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT DELETE ROWS;
CREATE TABLE
postgres=# CREATE LOCAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer NOT NULL,
postgres(# PRIMARY KEY (c1, c2),
postgres(# FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT PRESERVE ROWS;
ERROR:  unsupported ON COMMIT and foreign key combination
DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON COMMIT setting.

postgres=# CREATE LOCAL TEMPORARY TABLE gtt3(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT PRESERVE ROWS;
CREATE TABLE
postgres=# 
postgres=# CREATE LOCAL TEMPORARY TABLE gtt4(c1 integer NOT NULL, c2 integer NOT NULL,
postgres(# PRIMARY KEY (c1, c2),
postgres(# FOREIGN KEY (c1) REFERENCES gtt3 (c1)) ON COMMIT DELETE ROWS;
CREATE TABLE

The same behavior applies to the local temp table.
I think the cause of the problem is that a temp table with ON COMMIT DELETE ROWS is not suitable as a referenced table.
So should the error message be "cannot reference an ON COMMIT DELETE ROWS temporary table"?



2020年3月13日 下午10:16,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,

Please check the below combination of GTT with Primary and Foreign key relations, with the ERROR message.

Case1:
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT DELETE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer NOT NULL,
PRIMARY KEY (c1, c2),
FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT PRESERVE ROWS;
ERROR:  unsupported ON COMMIT and foreign key combination
DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON COMMIT setting.

Case2:
postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT PRESERVE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer NOT NULL,
PRIMARY KEY (c1, c2),
FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT DELETE ROWS;
CREATE TABLE

In "case2" although both the primary table and foreign key GTT do not have the same ON COMMIT setting, still we are able to create the PK-FK relations with GTT.

So I hope the detail message(DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON COMMIT setting.) in "Case1" should be more clear(something like "wrong combination of ON COMMIT setting").

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
On Thu, Mar 19, 2020 at 3:51 PM wenjing.zwj <wenjing.zwj@alibaba-inc.com> wrote:
postgres=# CREATE LOCAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT DELETE ROWS;
CREATE TABLE
postgres=# CREATE LOCAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer NOT NULL,
postgres(# PRIMARY KEY (c1, c2),
postgres(# FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT PRESERVE ROWS;
ERROR:  unsupported ON COMMIT and foreign key combination
DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON COMMIT setting.

postgres=# CREATE LOCAL TEMPORARY TABLE gtt3(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT PRESERVE ROWS;
CREATE TABLE
postgres=# 
postgres=# CREATE LOCAL TEMPORARY TABLE gtt4(c1 integer NOT NULL, c2 integer NOT NULL,
postgres(# PRIMARY KEY (c1, c2),
postgres(# FOREIGN KEY (c1) REFERENCES gtt3 (c1)) ON COMMIT DELETE ROWS;
CREATE TABLE

The same behavior applies to the local temp table.
Yes, the issue is related to "local temp table".

I think the cause of the problem is that a temp table with ON COMMIT DELETE ROWS is not suitable as a referenced table.
So should the error message be "cannot reference an ON COMMIT DELETE ROWS temporary table"?
No, this is not always true.
We can create a GTT/"local temp table" with "ON COMMIT DELETE ROWS" which can reference an "ON COMMIT DELETE ROWS" table.
 
Below are the 4 combinations of GTT/"local temp table" references (see the sketch after this list):
1. "ON COMMIT PRESERVE ROWS" can reference "ON COMMIT PRESERVE ROWS"
2. "ON COMMIT DELETE ROWS"   can reference "ON COMMIT PRESERVE ROWS"
3. "ON COMMIT DELETE ROWS"   can reference "ON COMMIT DELETE ROWS"
But
4. "ON COMMIT PRESERVE ROWS" fails to reference "ON COMMIT DELETE ROWS"
 
--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,
Please check my findings(on gtt_v20.patch) as below:

TestCase1: (cache lookup failed on GTT)
-- Session1:
postgres=# create global temporary table gtt1(c1 int) on commit delete rows;
CREATE TABLE

-- Session2:
postgres=# drop table gtt1 ;
DROP TABLE

-- Session1:
postgres=# create global temporary table gtt1(c1 int) on commit delete rows;
ERROR:  cache lookup failed for relation 16384

TestCase2:
-- Session1:
postgres=# create global temporary table gtt (c1 integer) on commit preserve rows;
CREATE TABLE
postgres=# insert into gtt values(10);
INSERT 0 1

-- Session2:
postgres=# drop table gtt;
DROP TABLE

I hope "session2" should not allow to perform the "DROP" operation on GTT having data.

Behavior of GTT in Oracle Database in such a scenario: for a completed transaction on a GTT with (on_commit_delete_rows='FALSE') that has data in a session, we are not able to DROP the table from any session; we need to TRUNCATE the data first in order to DROP the table.
SQL> drop table gtt;
drop table gtt
           *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already
in use


On Tue, Mar 17, 2020 at 9:16 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


2020年3月11日 下午3:52,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

On Mon, Mar 9, 2020 at 10:02 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


Fixed in global_temporary_table_v18-pg13.patch.
Hi Wenjing,
Thanks for the patch. I have verified the previous issues with "gtt_v18_pg13.patch" and those are resolved.
Please find below case:

postgres=# create sequence seq;
CREATE SEQUENCE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int PRIMARY KEY) ON COMMIT DELETE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 int PRIMARY KEY) ON COMMIT PRESERVE ROWS;
CREATE TABLE

postgres=# alter table gtt1 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables

postgres=# alter table gtt2 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables
reindex GTT is already supported

Please check global_temporary_table_v20-pg13.patch


Wenjing




Note: We get this error if we have a key column (PK/UNIQUE) in a GTT and try to add a column with a sequence default to it.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com




--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/17/20 9:15 AM, 曾文旌(义从) wrote:
> reindex GTT is already supported
>
> Please check global_temporary_table_v20-pg13.patch
>
Please refer this scenario -


postgres=# create global temp table co(n int) ;
CREATE TABLE

postgres=# create index fff on co(n);
CREATE INDEX

Case 1-
postgres=# reindex table  co;
REINDEX

Case -2
postgres=# reindex database postgres ;
WARNING:  global temp table "public.co" skip reindexed
REINDEX
postgres=#

Case 2 should work the same as Case 1.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi All,

Please check the behavior of a GTT having a column with the SERIAL datatype and a column whose default value comes from a SEQUENCE, as below:

Session1:
postgres=# create sequence gtt_c3_seq;
CREATE SEQUENCE
postgres=# create global temporary table gtt(c1 int, c2 serial, c3 int default nextval('gtt_c3_seq') not null) on commit preserve rows;
CREATE TABLE

-- Structure of column c2 and c3 are similar:
postgres=# \d+ gtt
                                                Table "public.gtt"
 Column |  Type   | Collation | Nullable |             Default             | Storage | Stats target | Description
--------+---------+-----------+----------+---------------------------------+---------+--------------+-------------
 c1     | integer |           |          |                                 | plain   |              |
 c2     | integer |           | not null | nextval('gtt_c2_seq'::regclass) | plain   |              |
 c3     | integer |           | not null | nextval('gtt_c3_seq'::regclass) | plain   |              |
Access method: heap
Options: on_commit_delete_rows=false

postgres=# insert into gtt select generate_series(1,3);
INSERT 0 3
postgres=# select * from gtt;
 c1 | c2 | c3
----+----+----
  1 |  1 |  1
  2 |  2 |  2
  3 |  3 |  3
(3 rows)

Session2:
postgres=# insert into gtt select generate_series(1,3);
INSERT 0 3
postgres=# select * from gtt;
 c1 | c2 | c3
----+----+----
  1 |  1 |  4
  2 |  2 |  5
  3 |  3 |  6
(3 rows)


Kindly let me know: is this behavior expected?

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


st 25. 3. 2020 v 13:53 odesílatel Prabhat Sahu <prabhat.sahu@enterprisedb.com> napsal:
Hi All,

Please check the behavior of a GTT having a column with the SERIAL datatype and a column whose default value comes from a SEQUENCE, as below:

Session1:
postgres=# create sequence gtt_c3_seq;
CREATE SEQUENCE
postgres=# create global temporary table gtt(c1 int, c2 serial, c3 int default nextval('gtt_c3_seq') not null) on commit preserve rows;
CREATE TABLE

-- Structure of column c2 and c3 are similar:
postgres=# \d+ gtt
                                                Table "public.gtt"
 Column |  Type   | Collation | Nullable |             Default             | Storage | Stats target | Description
--------+---------+-----------+----------+---------------------------------+---------+--------------+-------------
 c1     | integer |           |          |                                 | plain   |              |
 c2     | integer |           | not null | nextval('gtt_c2_seq'::regclass) | plain   |              |
 c3     | integer |           | not null | nextval('gtt_c3_seq'::regclass) | plain   |              |
Access method: heap
Options: on_commit_delete_rows=false

postgres=# insert into gtt select generate_series(1,3);
INSERT 0 3
postgres=# select * from gtt;
 c1 | c2 | c3
----+----+----
  1 |  1 |  1
  2 |  2 |  2
  3 |  3 |  3
(3 rows)

Session2:
postgres=# insert into gtt select generate_series(1,3);
INSERT 0 3
postgres=# select * from gtt;
 c1 | c2 | c3
----+----+----
  1 |  1 |  4
  2 |  2 |  5
  3 |  3 |  6
(3 rows)


Kindly let me know, Is this behavior expected?

It is an interesting side effect - theoretically it is not important, because a sequence only guarantees unique values - so the exact values do not matter.

You created a classic shared sequence, so the behavior is correct and expected.

Pavel


--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/17/20 9:15 AM, 曾文旌(义从) wrote:
> Please check global_temporary_table_v20-pg13.patch

There is a typo in the error message

postgres=# create global temp table test(a int ) 
with(on_commit_delete_rows=true) on commit delete rows;
ERROR:  can not defeine global temp table with on commit and with clause 
at same time
postgres=#

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
wjzeng
Date:


2020年3月25日 下午8:52,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi All,

Please check the behavior of a GTT having a column with the SERIAL datatype and a column whose default value comes from a SEQUENCE, as below:

Session1:
postgres=# create sequence gtt_c3_seq;
CREATE SEQUENCE
postgres=# create global temporary table gtt(c1 int, c2 serial, c3 int default nextval('gtt_c3_seq') not null) on commit preserve rows;
CREATE TABLE

-- Structure of column c2 and c3 are similar:
postgres=# \d+ gtt
                                                Table "public.gtt"
 Column |  Type   | Collation | Nullable |             Default             | Storage | Stats target | Description
--------+---------+-----------+----------+---------------------------------+---------+--------------+-------------
 c1     | integer |           |          |                                 | plain   |              |
 c2     | integer |           | not null | nextval('gtt_c2_seq'::regclass) | plain   |              |
 c3     | integer |           | not null | nextval('gtt_c3_seq'::regclass) | plain   |              |
Access method: heap
Options: on_commit_delete_rows=false

postgres=# insert into gtt select generate_series(1,3);
INSERT 0 3
postgres=# select * from gtt;
 c1 | c2 | c3
----+----+----
  1 |  1 |  1
  2 |  2 |  2
  3 |  3 |  3
(3 rows)

Session2:
postgres=# insert into gtt select generate_series(1,3);
INSERT 0 3
postgres=# select * from gtt;
 c1 | c2 | c3
----+----+----
  1 |  1 |  4
  2 |  2 |  5
  3 |  3 |  6
(3 rows)


Kindly let me know, Is this behavior expected?

--

postgres=# \d+
                                   List of relations
 Schema |    Name    |   Type   |    Owner    | Persistence |    Size    | Description 
--------+------------+----------+-------------+-------------+------------+-------------
 public | gtt        | table    | wenjing.zwj | session     | 8192 bytes | 
 public | gtt_c2_seq | sequence | wenjing.zwj | session     | 8192 bytes | 
 public | gtt_c3_seq | sequence | wenjing.zwj | permanent   | 8192 bytes | 
(3 rows)

This is expected.
A GTT's sequence behaves the same as the GTT itself, so gtt_c2_seq is independent in each session.
gtt_c3_seq is a classic (shared) sequence.



Wenjing


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
wjzeng
Date:


2020年3月24日 下午9:34,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

Hi Wenjing,
Please check my findings(on gtt_v20.patch) as below:

TestCase1: (cache lookup failed on GTT)
-- Session1:
postgres=# create global temporary table gtt1(c1 int) on commit delete rows;
CREATE TABLE

-- Session2:
postgres=# drop table gtt1 ;
DROP TABLE

-- Session1:
postgres=# create global temporary table gtt1(c1 int) on commit delete rows;
ERROR:  cache lookup failed for relation 16384

TestCase2:
-- Session1:
postgres=# create global temporary table gtt (c1 integer) on commit preserve rows;
CREATE TABLE
postgres=# insert into gtt values(10);
INSERT 0 1

-- Session2:
postgres=# drop table gtt;
DROP TABLE

I hope "session2" should not allow to perform the "DROP" operation on GTT having data.

Sorry, I introduced this bug in my refactoring.
It's been fixed.

Wenjing




Behavior of GTT in Oracle Database in such a scenario: for a completed transaction on a GTT with (on_commit_delete_rows='FALSE') that has data in a session, we are not able to DROP the table from any session; we need to TRUNCATE the data first in order to DROP the table.
SQL> drop table gtt;
drop table gtt
           *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already
in use


On Tue, Mar 17, 2020 at 9:16 AM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


2020年3月11日 下午3:52,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:

On Mon, Mar 9, 2020 at 10:02 PM 曾文旌(义从) <wenjing.zwj@alibaba-inc.com> wrote:


Fixed in global_temporary_table_v18-pg13.patch.
Hi Wenjing,
Thanks for the patch. I have verified the previous issues with "gtt_v18_pg13.patch" and those are resolved.
Please find below case:

postgres=# create sequence seq;
CREATE SEQUENCE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int PRIMARY KEY) ON COMMIT DELETE ROWS;
CREATE TABLE

postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 int PRIMARY KEY) ON COMMIT PRESERVE ROWS;
CREATE TABLE

postgres=# alter table gtt1 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables

postgres=# alter table gtt2 add c2 int default nextval('seq');
ERROR:  cannot reindex global temporary tables
reindex GTT is already supported

Please check global_temporary_table_v20-pg13.patch


Wenjing




Note: We get this error if we have a key column (PK/UNIQUE) in a GTT and try to add a column with a sequence default to it.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com




--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
wjzeng
Date:

> 2020年3月25日 下午6:44,tushar <tushar.ahuja@enterprisedb.com> 写道:
>
> On 3/17/20 9:15 AM, 曾文旌(义从) wrote:
>> reindex GTT is already supported
>>
>> Please check global_temporary_table_v20-pg13.patch
>>
> Please refer this scenario -
>
>
> postgres=# create global temp table co(n int) ;
> CREATE TABLE
>
> postgres=# create index fff on co(n);
> CREATE INDEX
>
> Case 1-
> postgres=# reindex table  co;
> REINDEX
>
> Case -2
> postgres=# reindex database postgres ;
> WARNING:  global temp table "public.co" skip reindexed
I fixed it in global_temporary_table_v21-pg13.patch


Wenjing


> REINDEX
> postgres=#
>
> Case 2 should work the same as Case 1.
>
> --
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
wjzeng
Date:

> 2020年3月25日 下午10:16,tushar <tushar.ahuja@enterprisedb.com> 写道:
>
> On 3/17/20 9:15 AM, 曾文旌(义从) wrote:
>> Please check global_temporary_table_v20-pg13.patch
>
> There is a typo in the error message
>
> postgres=# create global temp table test(a int ) with(on_commit_delete_rows=true) on commit delete rows;
> ERROR:  can not defeine global temp table with on commit and with clause at same time
> postgres=#
Thank you for pointing it out.
I fixed it in global_temporary_table_v21-pg13.patch


Wenjing


>
> --
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:

Sorry, I introduced this bug in my refactoring.
It's been fixed.

Wenjing

Hi Wenjing,
This patch (gtt_v21_pg13.patch) does not apply on PG HEAD; I suspect you prepared it on top of some previous commit.
Could you please rebase the patch so that we can apply it on HEAD?

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


2020年3月26日 下午12:34,Prabhat Sahu <prabhat.sahu@enterprisedb.com> 写道:


Sorry, I introduced this bug in my refactoring.
It's been fixed.

Wenjing

Hi Wenjing,
This patch (gtt_v21_pg13.patch) does not apply on PG HEAD; I suspect you prepared it on top of some previous commit.
Could you please rebase the patch so that we can apply it on HEAD?
Yes, it looks like the built-in functions are in conflict with the new code.


Wenjing




--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/27/20 10:55 AM, 曾文旌 wrote:
Hi Wenjing,
This patch (gtt_v21_pg13.patch) does not apply on PG HEAD; I suspect you prepared it on top of some previous commit.
Could you please rebase the patch so that we can apply it on HEAD?
Yes, It looks like the built-in functions are in conflict with new code.


This error message looks wrong to me:

postgres=# reindex table concurrently t ;
ERROR:  cannot create indexes on global temporary tables using concurrent mode
postgres=# 

A better message would be:

ERROR:  cannot reindex global temporary tables concurrently

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 3/27/20 10:55 AM, 曾文旌 wrote:
Hi Wenjing,
This patch (gtt_v21_pg13.patch) does not apply on PG HEAD; I suspect you prepared it on top of some previous commit.
Could you please rebase the patch so that we can apply it on HEAD?
Yes, It looks like the built-in functions are in conflict with new code.

In this below scenario, pg_dump is failing -

test=# CREATE database foo;
CREATE DATABASE
test=# \c foo
You are now connected to database "foo" as user "tushar".
foo=# CREATE GLOBAL TEMPORARY TABLE bar(c1 bigint, c2 bigserial) on commit PRESERVE rows;
CREATE TABLE
foo=# \q

[tushar@localhost bin]$ ./pg_dump -Fp foo > /tmp/rf2
pg_dump: error: query to get data of sequence "bar_c2_seq" returned 0 rows (expected 1)
[tushar@localhost bin]$

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


2020年3月27日 下午6:06,tushar <tushar.ahuja@enterprisedb.com> 写道:

On 3/27/20 10:55 AM, 曾文旌 wrote:
Hi Wenjing,
This patch (gtt_v21_pg13.patch) does not apply on PG HEAD; I suspect you prepared it on top of some previous commit.
Could you please rebase the patch so that we can apply it on HEAD?
Yes, It looks like the built-in functions are in conflict with new code.

In this below scenario, pg_dump is failing -

test=# CREATE database foo;
CREATE DATABASE
test=# \c foo
You are now connected to database "foo" as user "tushar".
foo=# CREATE GLOBAL TEMPORARY TABLE bar(c1 bigint, c2 bigserial) on commit PRESERVE rows;
CREATE TABLE
foo=# \q

[tushar@localhost bin]$ ./pg_dump -Fp foo > /tmp/rf2
pg_dump: error: query to get data of sequence "bar_c2_seq" returned 0 rows (expected 1)
[tushar@localhost bin]$

Thanks for the review.
Fixed in global_temporary_table_v23-pg13.patch



Wenjing





-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


2020年3月27日 下午5:21,tushar <tushar.ahuja@enterprisedb.com> 写道:

On 3/27/20 10:55 AM, 曾文旌 wrote:
Hi Wenjing,
This patch (gtt_v21_pg13.patch) does not apply on PG HEAD; I suspect you prepared it on top of some previous commit.
Could you please rebase the patch so that we can apply it on HEAD?
Yes, It looks like the built-in functions are in conflict with new code.


This error message looks wrong  to me-

postgres=# reindex table concurrently t ;
ERROR:  cannot create indexes on global temporary tables using concurrent mode
postgres=# 

Better message would be-

ERROR:  cannot reindex global temporary tables concurrently

I found that the local temp table automatically disables concurrent mode,
so I made some improvements: REINDEX on a GTT now behaves the same as on a local temp table.
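For reference, the REINDEX documentation makes the same choice for local temporary relations: rebuilding their indexes is always done non-concurrently, since no other session can access them and a non-concurrent rebuild is cheaper. A quick sketch with a local temp table (behavior per the stock docs, not this patch):

create temp table t (n int);
create index t_idx on t (n);
reindex table concurrently t;   -- performed non-concurrently for a temporary relation, without error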


Wenjing




-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,
Thanks for the new patch.
I see that with the patch (gtt_v23.patch) we now support a new concept, the "global temporary sequence" (i.e. a session-specific sequence). Is this intentional?

postgres=# create global temporary sequence gt_seq;
CREATE SEQUENCE
postgres=# create sequence seq;
CREATE SEQUENCE
postgres=# \d+
                              List of relations
 Schema |  Name  |   Type   | Owner | Persistence |    Size    | Description
--------+--------+----------+-------+-------------+------------+-------------
 public | gt_seq | sequence | edb   | session     | 8192 bytes |
 public | seq    | sequence | edb   | permanent   | 8192 bytes |
(2 rows)

postgres=# select nextval('gt_seq'), nextval('seq');
 nextval | nextval
---------+---------
       1 |       1
(1 row)

postgres=# select nextval('gt_seq'), nextval('seq');
 nextval | nextval
---------+---------
       2 |       2
(1 row)

-- Exit and re-connect to psql prompt:
postgres=# \q
[edb@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# select nextval('gt_seq'), nextval('seq');
 nextval | nextval
---------+---------
       1 |       3
(1 row)

postgres=# select nextval('gt_seq'), nextval('seq');
 nextval | nextval
---------+---------
       2 |       4
(1 row)

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On March 31, 2020, at 9:59 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing,
Thanks for the new patch.
I see that with the patch (gtt_v23.patch) we now support a new concept, the "global temporary sequence" (i.e. a session-specific sequence). Is this intentional?
It was supported in earlier versions as well.
A sequence built into a GTT automatically becomes a "global temp sequence",
such as with create global temp table t1 (a serial);
Like the GTT itself, the global temp sequence is used independently by each session.

Recently, I added the global temp sequence syntax so that such a sequence can also be created on its own.
The purpose of this is to let sequences built into a GTT support pg_dump and pg_restore.
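For example (a sketch of that behavior; table and column names are illustrative):

create global temp table t1 (id serial, v text);
-- Session 1:
insert into t1 (v) values ('a');   -- id = 1
-- Session 2 (a new connection): the built-in sequence starts over
insert into t1 (v) values ('b');   -- id = 1 again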


Wenjing





Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:

On Wed, Apr 1, 2020 at 8:52 AM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


On March 31, 2020, at 9:59 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing,
Thanks for the new patch.
I see that with the patch (gtt_v23.patch) we now support a new concept, the "global temporary sequence" (i.e. a session-specific sequence). Is this intentional?
It was supported in earlier versions as well.
yes.

A sequence built into a GTT automatically becomes a "global temp sequence",
such as with create global temp table t1 (a serial);
Like the GTT itself, the global temp sequence is used independently by each session.

Recently, I added the global temp sequence syntax so that such a sequence can also be created on its own.
The purpose of this is to let sequences built into a GTT support pg_dump and pg_restore.

Thanks for the explanation.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,
I think we need to change the below error message.

postgres=# create global temporary table gtt(c1 int) on commit preserve rows;
CREATE TABLE

postgres=# create materialized view mvw as select * from gtt;
ERROR: materialized views must not use global temporary tables or views


Anyway, since we do not allow creating a "global temporary view",
the above ERROR message should change (i.e. " or view" needs to be removed from the error message) to something like:
"ERROR: materialized views must not use global temporary tables"

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi All,

I have noted down a few behavioral differences in our GTT implementation in PG as compared to Oracle DB:
As per my understanding, the behavior of DROP TABLE for "normal table and GTT" in Oracle DB is as below:
  1. Any table (normal table / GTT) without data in a session can be dropped from another session.
  2. For a completed transaction on a normal table having data, we can drop the table from another session. If the transaction is not yet complete and we try to drop the table from another session, we get an error. (Working as expected.)
  3. For a completed transaction on a GTT with (on commit delete rows) (i.e. no data in the GTT) in a session, we can drop the table from another session.
  4. For a completed transaction on a GTT with (on commit preserve rows) with data in a session, we cannot drop the table from any session (not even from the session in which the GTT was created); we first need to truncate the table data in every session (session1, session2) that has data.
1. Any table (normal table / GTT) without data in a session can be dropped from another session.
Session1:
create table t1 (c1 integer);
create global temporary table gtt1 (c1 integer) on commit delete rows;
create global temporary table gtt2 (c1 integer) on commit preserve rows;


Session2:
drop table t1;
drop table gtt1;
drop table gtt2;


-- Issue 1: But we are able to drop the simple table, while dropping the GTTs fails, as below.
postgres=# drop table t1;
DROP TABLE
postgres=# drop table gtt1;
ERROR:  can not drop relation gtt1 when other backend attached this global temp table
postgres=# drop table gtt2;
ERROR:  can not drop relation gtt2 when other backend attached this global temp table

3. For a completed transaction on a GTT with (on commit delete rows) (i.e. no data in the GTT) in a session, we can drop the table from another session.
Session1:
create global temporary table gtt1 (c1 integer) on commit delete rows;

Session2:
drop table gtt1;

-- Issue 2: But we get an error for the GTT with (on commit delete rows) even without data.
postgres=# drop table gtt1;
ERROR:  can not drop relation gtt1 when other backend attached this global temp table

4. For a completed transaction on a GTT with (on commit preserve rows) with data in any session, we cannot drop the table from any session (not even from the session in which the GTT was created).

Case1:
create global temporary table gtt2 (c1 integer) on commit preserve rows;
insert into gtt2 values(100);
drop table gtt2;


SQL> drop table gtt2;
drop table gtt2
  *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already in use


-- Issue 3: But we are able to drop the GTT (having data) that we created in the same session.
postgres=# drop table gtt2;
DROP TABLE

Case2: GTT with(on commit preserve rows) having data in both session1 and session2
Session1:
create global temporary table gtt2 (c1 integer) on commit preserve rows;
insert into gtt2 values(100);


Session2:
insert into gtt2 values(200);

-- If we try to drop the table from either session we should get an error; this works fine.
drop table gtt2;
SQL> drop table gtt2;
drop table gtt2
  *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already in use

postgres=# drop table gtt2 ;
ERROR:  can not drop relation gtt2 when other backend attached this global temp table

-- To drop the table gtt2 from session1/session2, we need to truncate the table data first in every session (session1, session2) that has data.
Session1:
truncate table gtt2;
-- Session2:
truncate table gtt2;

Session 2:
SQL> drop table gtt2;

Table dropped.


-- Issue 4: But we are not able to drop the GTT even after TRUNCATE of the table in all the sessions.
-- truncate in all sessions where the GTT has data.
postgres=# truncate gtt2 ;
TRUNCATE TABLE


-- if we try to DROP the GTT, we still get the error.
postgres=# drop table gtt2 ;
ERROR:  can not drop relation gtt2 when other backend attached this global temp table

To drop the GTT from any session, we need to exit all other sessions first.
postgres=# drop table gtt2 ;
DROP TABLE


Kindly let me know if I am missing something.


--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Apr 2, 2020 at 10:45 AM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi All,

I have noted down a few behavioral differences in our GTT implementation in PG as compared to Oracle DB:
As per my understanding, the behavior of DROP TABLE for "normal table and GTT" in Oracle DB is as below:
  1. Any table (normal table / GTT) without data in a session can be dropped from another session.
  2. For a completed transaction on a normal table having data, we can drop the table from another session. If the transaction is not yet complete and we try to drop the table from another session, we get an error. (Working as expected.)
  3. For a completed transaction on a GTT with (on commit delete rows) (i.e. no data in the GTT) in a session, we can drop the table from another session.
  4. For a completed transaction on a GTT with (on commit preserve rows) with data in a session, we cannot drop the table from any session (not even from the session in which the GTT was created); we first need to truncate the table data in every session (session1, session2) that has data.
1. Any table (normal table / GTT) without data in a session can be dropped from another session.
Session1:
create table t1 (c1 integer);
create global temporary table gtt1 (c1 integer) on commit delete rows;
create global temporary table gtt2 (c1 integer) on commit preserve rows;


Session2:
drop table t1;
drop table gtt1;
drop table gtt2;


-- Issue 1: But we are able to drop the simple table, while dropping the GTTs fails, as below.
postgres=# drop table t1;
DROP TABLE
postgres=# drop table gtt1;
ERROR:  can not drop relation gtt1 when other backend attached this global temp table
postgres=# drop table gtt2;
ERROR:  can not drop relation gtt2 when other backend attached this global temp table

I think this is expected behavior. It was proposed for the first release - and for later releases there can be support for a DROP TABLE force option, like DROP DATABASE (FORCE).

Regards

Pavel



Re: [Proposal] Global temporary tables

From
曾文旌
Date:
In my opinion:
1 We are developing GTT according to the SQL standard, not Oracle.

2 The implementation differences you listed come from the differences between the PG and Oracle storage modules and DDL implementations.

2.1 Issue 1 and issue 2
Creating a normal table/GTT defines the catalog entry and initializes the data storage file; in the case of a GTT, it initializes the storage file for the current session.
In Oracle, creation appears to only define the catalog entry.
This is why other sessions cannot drop the GTT in PostgreSQL.
That is the reason for issue 1 and issue 2; I think the behavior is reasonable.

2.2 Issue 3
The logic of dropping a GTT, as I see it, is:
when only the current session is using the GTT, it is safe to drop it,
because the GTT's definition and storage files can be completely deleted from the db.
But if multiple sessions are using the GTT, it is hard to drop it from session A, because removing the local buffers and data files of the GTT in the other sessions is difficult.
I am not sure why Oracle has this limitation.
So issue 3 is reasonable.

2.3 TRUNCATE normal table/GTT
TRUNCATE on a normal table / GTT cleans up the logical data but does not unlink the data storage file; in the case of a GTT, that is the storage file for the current session.
In Oracle, it looks as if the data storage file itself is cleaned up.
PostgreSQL storage is obviously different from Oracle; in other words, the session is detached from the storage.
That is the reason for issue 4; I think it is reasonable.

All in all, I think the current implementation is sufficient for a DBA to manage GTTs.
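To illustrate the drop rule with a sketch (session labels are mine; the behavior is as described above and in your tests):

-- Session A:
create global temp table gtt_demo (c1 int) on commit preserve rows;
insert into gtt_demo values (1);   -- initializes session A's storage file
-- Session B:
insert into gtt_demo values (2);   -- initializes session B's own storage file
drop table gtt_demo;               -- fails: another backend is attached to this GTT
-- After session A exits, again from session B:
drop table gtt_demo;               -- succeeds: only the current session is attached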



Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Fri, Apr 3, 2020 at 9:52 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
In my opinion:
1 We are developing GTT according to the SQL standard, not Oracle.

2 The implementation differences you listed come from the differences between the PG and Oracle storage modules and DDL implementations.

2.1 Issue 1 and issue 2
Creating a normal table/GTT defines the catalog entry and initializes the data storage file; in the case of a GTT, it initializes the storage file for the current session.
In Oracle, creation appears to only define the catalog entry.
This is why other sessions cannot drop the GTT in PostgreSQL.
That is the reason for issue 1 and issue 2; I think the behavior is reasonable.

2.2 Issue 3
The logic of dropping a GTT, as I see it, is:
when only the current session is using the GTT, it is safe to drop it,
because the GTT's definition and storage files can be completely deleted from the db.
But if multiple sessions are using the GTT, it is hard to drop it from session A, because removing the local buffers and data files of the GTT in the other sessions is difficult.
I am not sure why Oracle has this limitation.
So issue 3 is reasonable.

2.3 TRUNCATE normal table/GTT
TRUNCATE on a normal table / GTT cleans up the logical data but does not unlink the data storage file; in the case of a GTT, that is the storage file for the current session.
In Oracle, it looks as if the data storage file itself is cleaned up.
PostgreSQL storage is obviously different from Oracle; in other words, the session is detached from the storage.
That is the reason for issue 4; I think it is reasonable.

Although the implementation of GTT is different, I think TRUNCATE on Postgres (when it is really finalized) could remove the session metadata of the GTT too (and decrement the usage counter). It is not a critical feature, but it should not be hard to implement. For practical reasons it would be nice to have a tool to refresh a GTT without having to close the session; TRUNCATE could be that tool.
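A sketch of the workflow that this would enable (hypothetical future behavior, not what the current patch does):

-- Session 1:
truncate gtt2;     -- proposed: would also remove this session's GTT storage and detach it
-- Session 2:
drop table gtt2;   -- would then succeed without session 1 having to disconnect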

Regards

Pavel




Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,

Please check the allowed values for the boolean parameter "on_commit_delete_rows".

postgres=# create global temp table gtt1(c1 int) with(on_commit_delete_rows='true');
CREATE TABLE
Similarly, we can successfully create a GTT using the values 'true', 'false', true, false, 'ON', 'OFF', ON, OFF, 1, 0 for the boolean parameter "on_commit_delete_rows".

But we get an error when using the boolean values '1', '0', 't', 'f', 'yes', 'no', 'y', 'n', as below.
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='1');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='0');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='t');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='f');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='yes');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='no');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='y');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='n');
ERROR:  on_commit_delete_rows requires a Boolean value

-- As per the error message "ERROR:  on_commit_delete_rows requires a Boolean value", either we should allow all the boolean values,
Example: CREATE VIEW view1 WITH (security_barrier = 'true') AS SELECT 5;
The VIEW syntax allows all of the above boolean values for the boolean parameter "security_barrier".

-- or else we should change the error message to something like
"ERROR:  on_commit_delete_rows requires 'true','false','ON','OFF',1,0 as Boolean value".

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On April 3, 2020, at 8:43 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing,

Please check the allowed values for the boolean parameter "on_commit_delete_rows".

postgres=# create global temp table gtt1(c1 int) with(on_commit_delete_rows='true');
CREATE TABLE
Similarly, we can successfully create a GTT using the values 'true', 'false', true, false, 'ON', 'OFF', ON, OFF, 1, 0 for the boolean parameter "on_commit_delete_rows".

But we get an error when using the boolean values '1', '0', 't', 'f', 'yes', 'no', 'y', 'n', as below.
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='1');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='0');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='t');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='f');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='yes');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='no');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='y');
ERROR:  on_commit_delete_rows requires a Boolean value
postgres=# create global temp table gtt11(c1 int) with(on_commit_delete_rows='n');
ERROR:  on_commit_delete_rows requires a Boolean value
Thanks for the review.
This parameter should accept all the spellings of the bool type, like the parameter autovacuum_enabled.
So I fixed it in global_temporary_table_v24-pg13.patch.


Wenjing




-- As per the error message "ERROR:  on_commit_delete_rows requires a Boolean value", either we should allow all the boolean values,
Example: CREATE VIEW view1 WITH (security_barrier = 'true') AS SELECT 5;
The VIEW syntax allows all of the above boolean values for the boolean parameter "security_barrier".

-- or else we should change the error message to something like
"ERROR:  on_commit_delete_rows requires 'true','false','ON','OFF',1,0 as Boolean value".

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On February 15, 2020, at 6:06 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:


postgres=# insert into foo select generate_series(1,10000);
INSERT 0 10000
postgres=# \dt+ foo
                          List of relations
┌────────┬──────┬───────┬───────┬─────────────┬────────┬─────────────┐
│ Schema │ Name │ Type  │ Owner │ Persistence │  Size  │ Description │
╞════════╪══════╪═══════╪═══════╪═════════════╪════════╪═════════════╡
│ public │ foo  │ table │ pavel │ session     │ 384 kB │             │
└────────┴──────┴───────┴───────┴─────────────┴────────┴─────────────┘
(1 row)

postgres=# truncate foo;
TRUNCATE TABLE
postgres=# \dt+ foo
                          List of relations
┌────────┬──────┬───────┬───────┬─────────────┬───────┬─────────────┐
│ Schema │ Name │ Type  │ Owner │ Persistence │ Size  │ Description │
╞════════╪══════╪═══════╪═══════╪═════════════╪═══════╪═════════════╡
│ public │ foo  │ table │ pavel │ session     │ 16 kB │             │
└────────┴──────┴───────┴───────┴─────────────┴───────┴─────────────┘
(1 row)

I expect zero size after truncate.
Thanks for the review.

I can explain; I don't think it's a bug.
The current implementation of TRUNCATE on a GTT retains two blocks of FSM pages.
The same is true for truncating a regular table in a subtransaction.
This implementation truncates the table without changing its relfilenode.


This is not an extra important feature - it is just a little bit of a surprise, because I was not inside a transaction.

Changing the relfilenode, I think, is necessary, minimally for future VACUUM FULL support.
Hi all,

VACUUM FULL GTT and CLUSTER GTT are already supported in global_temporary_table_v24-pg13.patch.
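For example (a sketch of the newly supported commands; object names are illustrative):

postgres=# create global temp table gtt_vf(c1 int);
CREATE TABLE
postgres=# create index gtt_vf_idx on gtt_vf(c1);
CREATE INDEX
postgres=# vacuum full gtt_vf;
VACUUM
postgres=# cluster gtt_vf using gtt_vf_idx;
CLUSTER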



Wenjing






Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Thanks for the review.
This parameter should accept all the spellings of the bool type, like the parameter autovacuum_enabled.
So I fixed it in global_temporary_table_v24-pg13.patch.

Thank you, Wenjing, for the new patch with the fix and the "VACUUM FULL GTT" support.
I have verified the above issue; it is resolved now.

Please check the below findings on VACUUM FULL.

postgres=# create global temporary table  gtt(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# vacuum FULL ;
WARNING:  global temp table oldest FrozenXid is far in the past
HINT:  please truncate them or kill those sessions that use them.
VACUUM


--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Erik Rijkers
Date:
On 2020-04-07 10:57, 曾文旌 wrote:

> [global_temporary_table_v24-pg13.patch ]

Hi,

With gcc 9.3.0 (debian stretch), I see some low-key protests during the 
build:

index.c: In function ‘index_drop’:
index.c:2051:8: warning: variable ‘rel_persistence’ set but not used 
[-Wunused-but-set-variable]
  2051 |  char  rel_persistence;
       |        ^~~~~~~~~~~~~~~
storage_gtt.c: In function ‘gtt_force_enable_index’:
storage_gtt.c:1252:6: warning: unused variable ‘indexOid’ 
[-Wunused-variable]
  1252 |  Oid indexOid = RelationGetRelid(index);
       |      ^~~~~~~~
cluster.c: In function ‘copy_table_data’:
cluster.c:780:2: warning: this ‘if’ clause does not guard... 
[-Wmisleading-indentation]
   780 |  if (RELATION_IS_GLOBAL_TEMP(OldHeap));
       |  ^~
cluster.c:781:3: note: ...this statement, but the latter is misleadingly 
indented as if it were guarded by the ‘if’
   781 |   is_gtt = true;
       |   ^~~~~~


Erik



Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/7/20 2:27 PM, 曾文旌 wrote:
> VACUUM FULL GTT and CLUSTER GTT are already
> supported in global_temporary_table_v24-pg13.patch.

Here, it is skipping the GTT:

postgres=# \c foo
You are now connected to database "foo" as user "tushar".
foo=# create global temporary table  g123( c1 int) ;
CREATE TABLE
foo=# \q
[tushar@localhost bin]$ ./vacuumdb --full  foo
vacuumdb: vacuuming database "foo"
WARNING:  skipping vacuum global temp table "g123" because storage is 
not initialized for current session

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On April 7, 2020, at 6:22 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Thanks for the review.
This parameter should accept all the spellings of the bool type, like the parameter autovacuum_enabled.
So I fixed it in global_temporary_table_v24-pg13.patch.

Thank you, Wenjing, for the new patch with the fix and the "VACUUM FULL GTT" support.
I have verified the above issue; it is resolved now.

Please check the below findings on VACUUM FULL.

postgres=# create global temporary table  gtt(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# vacuum FULL ;
WARNING:  global temp table oldest FrozenXid is far in the past
HINT:  please truncate them or kill those sessions that use them.
VACUUM


This is expected.
It indicates that the GTT's frozenxid is the oldest in the entire db, and the DBA should vacuum the GTT if they want to advance the db's datfrozenxid.
They can also use the function pg_list_gtt_relfrozenxids() to check which sessions hold "too old" data and then truncate those GTTs or kill those sessions.
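For example (a sketch; pg_list_gtt_relfrozenxids() is the function shipped with the patch, but the output suggested in the comment below is my assumption):

postgres=# select * from pg_list_gtt_relfrozenxids();
-- expected: one row per backend that holds GTT storage, with its oldest
-- relfrozenxid, so the DBA can decide which GTT data to truncate or which
-- sessions to kill.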




--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/7/20 2:27 PM, 曾文旌 wrote:
> VACUUM FULL GTT and CLUSTER GTT are already
> supported in global_temporary_table_v24-pg13.patch.
Please refer to the below scenario, where pg_upgrade fails:
1)Server is up and running (./pg_ctl -D data status)
2)Stop the server ( ./pg_ctl -D data stop)
3)Connect to server using single user mode ( ./postgres --single -D data 
postgres) and create a global temp table
[tushar@localhost bin]$ ./postgres --single -D data1233 postgres

PostgreSQL stand-alone backend 13devel
backend> create global temp table t(n int);

--Press Ctl+D to exit

4)Perform initdb ( ./initdb -D data123)
5.Run pg_upgrade
[tushar@localhost bin]$ ./pg_upgrade -d data -D data123 -b . -B .
--
--
--
Restoring database schemas in the new cluster
   postgres
*failure*
Consult the last few lines of "pg_upgrade_dump_13592.log" for
the probable cause of the failure.
Failure, exiting

log file content  -

[tushar@localhost bin]$ tail -20   pg_upgrade_dump_13592.log
pg_restore: error: could not execute query: ERROR:  pg_type array OID 
value not set when in binary upgrade mode
Command was:
-- For binary upgrade, must preserve pg_type oid
SELECT 
pg_catalog.binary_upgrade_set_next_pg_type_oid('13594'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_class oids
SELECT 
pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('13593'::pg_catalog.oid);

CREATE GLOBAL TEMPORARY TABLE "public"."t" (
     "n" integer
)
WITH ("on_commit_delete_rows"='false');

-- For binary upgrade, set heap's relfrozenxid and relminmxid
UPDATE pg_catalog.pg_class
SET relfrozenxid = '0', relminmxid = '0'
WHERE oid = '"public"."t"'::pg_catalog.regclass;

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:


On Wed, Apr 8, 2020 at 1:48 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


This is expected.
It indicates that the GTT's frozenxid is the oldest in the entire db, and the DBA should vacuum the GTT if they want to advance the db's datfrozenxid.
They can also use the function pg_list_gtt_relfrozenxids() to check which sessions hold "too old" data and then truncate those GTTs or kill those sessions.
 
Again, as per the HINT given ("HINT:  please truncate them or kill those sessions that use them."):
there is only a single session.
If we try "TRUNCATE" and then "VACUUM FULL", the behavior is still the same, as below.

postgres=# truncate gtt ;
TRUNCATE TABLE
postgres=# vacuum full;
WARNING: global temp table oldest FrozenXid is far in the past
HINT: please truncate them or kill those sessions that use them.
VACUUM

I have one more finding, related to "CLUSTER table USING index"; please check the below issue.
postgres=# create global temporary table gtt(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# create index idx1 ON gtt (c1);
CREATE INDEX

-- exit and re-connect the psql prompt
postgres=# \q
[edb@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# cluster gtt using idx1;
WARNING:  relcache reference leak: relation "gtt" not closed
CLUSTER
 
--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/7/20 2:27 PM, 曾文旌 wrote:
> VACUUM FULL GTT and CLUSTER GTT are already
> supported in global_temporary_table_v24-pg13.patch.

Hi Wenjing,

Please refer this scenario , where reindex   message is not coming next 
time ( after reconnecting to database) for GTT

A)
--normal table
postgres=# create table nt(n int primary key);
CREATE TABLE
--GTT table
postgres=# create global temp table gtt(n int primary key);
CREATE TABLE
B)
--Reindex  , normal table
postgres=# REINDEX (VERBOSE) TABLE  nt;
INFO:  index "nt_pkey" was reindexed
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
REINDEX
--reindex GTT table
postgres=# REINDEX (VERBOSE) TABLE  gtt;
INFO:  index "gtt_pkey" was reindexed
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
REINDEX
C)
--Reconnect  to database
postgres=# \c
You are now connected to database "postgres" as user "tushar".
D) again perform step B)

postgres=# REINDEX (VERBOSE) TABLE  nt;
INFO:  index "nt_pkey" was reindexed
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
REINDEX
postgres=# REINDEX (VERBOSE) TABLE  gtt;   <-- message  not coming
REINDEX

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On April 8, 2020, at 6:55 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:



Again, as per the HINT given ("HINT:  please truncate them or kill those sessions that use them."):
there is only a single session.
If we try "TRUNCATE" and then "VACUUM FULL", the behavior is still the same, as below.

postgres=# truncate gtt ;
TRUNCATE TABLE
postgres=# vacuum full;
WARNING: global temp table oldest FrozenXid is far in the past
HINT: please truncate them or kill those sessions that use them.
VACUUM

If the GTT is vacuumed before all the other tables, the warning message is still raised,
so I improved the message to make it more reasonable.


I have one more finding, related to "CLUSTER table USING index"; please check the below issue.
postgres=# create global temporary table gtt(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# create index idx1 ON gtt (c1);
CREATE INDEX

-- exit and re-connect the psql prompt
postgres=# \q
[edb@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# cluster gtt using idx1;
WARNING:  relcache reference leak: relation "gtt" not closed
CLUSTER
It is a bug. I fixed it; please check global_temporary_table_v25-pg13.patch.


Wenjing



 
--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On April 7, 2020, at 7:25 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 4/7/20 2:27 PM, 曾文旌 wrote:
>> VACUUM FULL GTT and CLUSTER GTT are already supported in global_temporary_table_v24-pg13.patch.
>
> Here, it is skipping the GTT:
>
> postgres=# \c foo
> You are now connected to database "foo" as user "tushar".
> foo=# create global temporary table  g123( c1 int) ;
> CREATE TABLE
> foo=# \q
> [tushar@localhost bin]$ ./vacuumdb --full  foo
> vacuumdb: vacuuming database "foo"
> WARNING:  skipping vacuum global temp table "g123" because storage is not initialized for current session
The message was inappropriate in some cases, so I removed it.


Wenjing


> 
> -- 
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On April 9, 2020, at 7:46 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 4/7/20 2:27 PM, 曾文旌 wrote:
>> VACUUM FULL GTT and CLUSTER GTT are already supported in global_temporary_table_v24-pg13.patch.
>
> Hi Wenjing,
>
> Please refer this scenario , where reindex   message is not coming next time ( after reconnecting to database) for
GTT
>
> A)
> --normal table
> postgres=# create table nt(n int primary key);
> CREATE TABLE
> --GTT table
> postgres=# create global temp table gtt(n int primary key);
> CREATE TABLE
> B)
> --Reindex  , normal table
> postgres=# REINDEX (VERBOSE) TABLE  nt;
> INFO:  index "nt_pkey" was reindexed
> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> REINDEX
> --reindex GTT table
> postgres=# REINDEX (VERBOSE) TABLE  gtt;
> INFO:  index "gtt_pkey" was reindexed
> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> REINDEX
> C)
> --Reconnect  to database
> postgres=# \c
> You are now connected to database "postgres" as user "tushar".
> D) again perform step B)
>
> postgres=# REINDEX (VERBOSE) TABLE  nt;
> INFO:  index "nt_pkey" was reindexed
> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> REINDEX
> postgres=# REINDEX (VERBOSE) TABLE  gtt;   <-- message  not coming
> REINDEX
Yes. In the newly established connection the GTT's storage file has not been initialized in this session, so REINDEX has nothing to do and there is no INFO
message.
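A sketch of what I would expect once the storage is initialized in the new session (an assumption based on the explanation above):

postgres=# insert into gtt values (1);    -- initializes this session's GTT storage
postgres=# REINDEX (VERBOSE) TABLE gtt;   -- the INFO message should appear again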

>
> --
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On April 7, 2020, at 6:40 PM, Erik Rijkers <er@xs4all.nl> wrote:
>
> On 2020-04-07 10:57, 曾文旌 wrote:
>
>> [global_temporary_table_v24-pg13.patch ]
>
> Hi,
>
> With gcc 9.3.0 (debian stretch), I see some low-key protests during the build:
>
> index.c: In function ‘index_drop’:
> index.c:2051:8: warning: variable ‘rel_persistence’ set but not used [-Wunused-but-set-variable]
> 2051 |  char  rel_persistence;
>      |        ^~~~~~~~~~~~~~~
> storage_gtt.c: In function ‘gtt_force_enable_index’:
> storage_gtt.c:1252:6: warning: unused variable ‘indexOid’ [-Wunused-variable]
> 1252 |  Oid indexOid = RelationGetRelid(index);
>      |      ^~~~~~~~
> cluster.c: In function ‘copy_table_data’:
> cluster.c:780:2: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
>  780 |  if (RELATION_IS_GLOBAL_TEMP(OldHeap));
>      |  ^~
> cluster.c:781:3: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
>  781 |   is_gtt = true;
>      |   ^~~~~~
>

Part of the problem was that some variables are only used by assert statements; I fixed those warnings.
Please provide your configure parameters, and I will verify it again.


Wenjing



> 
> Erik
> 
> 


Attachment

Re: [Proposal] Global temporary tables

From
Erik Rijkers
Date:
On 2020-04-09 15:28, 曾文旌 wrote:
> [global_temporary_table_v25-pg13.patch]

> Part of the problem was that some variables are only used by assert
> statements; I fixed those warnings.
> Please provide your configure parameters, and I will verify it again.


Hi,

Just now I compiled the newer version of your patch (v25), and the
warnings/notes that I saw earlier are now gone. Thank you.


In case you still want it here is the configure:

-- [2020.04.09 15:06:45 global_temp_tables/1] ./configure  
--prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.global_temp_tables 
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.global_temp_tables/bin.fast 
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.global_temp_tables/lib.fast 
--with-pgport=6975 --quiet --enable-depend --with-openssl --with-perl 
--with-libxml --with-libxslt --with-zlib  --enable-tap-tests  
--with-extra-version=_0409

-- [2020.04.09 15:07:13 global_temp_tables/1] make core: make --quiet -j 
4
partbounds.c: In function ‘partition_bounds_merge’:
partbounds.c:1024:21: warning: unused variable ‘inner_binfo’ 
[-Wunused-variable]
  1024 |  PartitionBoundInfo inner_binfo = inner_rel->boundinfo;
       |                     ^~~~~~~~~~~
All of PostgreSQL successfully made. Ready to install.


Thanks,

Erik Rijkers








Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On April 8, 2020, at 6:34 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 4/7/20 2:27 PM, 曾文旌 wrote:
>> VACUUM FULL GTT and CLUSTER GTT are already supported in global_temporary_table_v24-pg13.patch.
> Please refer to the below scenario, where pg_upgrade fails:
> 1)Server is up and running (./pg_ctl -D data status)
> 2)Stop the server ( ./pg_ctl -D data stop)
> 3)Connect to server using single user mode ( ./postgres --single -D data postgres) and create a global temp table
> [tushar@localhost bin]$ ./postgres --single -D data1233 postgres
>
> PostgreSQL stand-alone backend 13devel
> backend> create global temp table t(n int);
>
> --Press Ctl+D to exit
>
> 4)Perform initdb ( ./initdb -D data123)
> 5.Run pg_upgrade
> [tushar@localhost bin]$ ./pg_upgrade -d data -D data123 -b . -B .
> --
> --
> --
> Restoring database schemas in the new cluster
>   postgres
> *failure*
> Consult the last few lines of "pg_upgrade_dump_13592.log" for
> the probable cause of the failure.
> Failure, exiting
>
> log file content  -
>
> [tushar@localhost bin]$ tail -20   pg_upgrade_dump_13592.log
> pg_restore: error: could not execute query: ERROR:  pg_type array OID value not set when in binary upgrade mode
I found that a regular table also has this problem. I am very unfamiliar with this part, so I started another email thread to
ask about this problem.



Attachment

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/13/20 1:57 PM, 曾文旌 wrote:
[tushar@localhost bin]$ tail -20   pg_upgrade_dump_13592.log
pg_restore: error: could not execute query: ERROR:  pg_type array OID value not set when in binary upgrade mode
I found that a regular table also has this problem. I am very unfamiliar with this part, so I started another email thread to ask about this problem.

ohh. Thanks.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/9/20 6:26 PM, 曾文旌 wrote:
On 4/7/20 2:27 PM, 曾文旌 wrote:
VACUUM FULL GTT and CLUSTER GTT are already supported in global_temporary_table_v24-pg13.patch.
Here, it is skipping the GTT:

postgres=# \c foo
You are now connected to database "foo" as user "tushar".
foo=# create global temporary table  g123( c1 int) ;
CREATE TABLE
foo=# \q
[tushar@localhost bin]$ ./vacuumdb --full  foo
vacuumdb: vacuuming database "foo"
WARNING:  skipping vacuum global temp table "g123" because storage is not initialized for current session
The warning message was inappropriate here, so I removed it.

Thanks Wenjing. Please see if the below behavior is correct.

X terminal -

postgres=# create global temp table foo1(n int);
CREATE TABLE
postgres=# insert into foo1 values (generate_series(1,10));
INSERT 0 10
postgres=# vacuum full ;
VACUUM

Y Terminal -

[tushar@localhost bin]$ ./vacuumdb -f  postgres
vacuumdb: vacuuming database "postgres"
WARNING:  global temp table oldest relfrozenxid 3276 is the oldest in the entire db
DETAIL:  The oldest relfrozenxid in pg_class is 3277
HINT:  If they differ greatly, please consider cleaning up the data in global temp table.
WARNING:  global temp table oldest relfrozenxid 3276 is the oldest in the entire db
DETAIL:  The oldest relfrozenxid in pg_class is 3277
HINT:  If they differ greatly, please consider cleaning up the data in global temp table.


-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On 13 Apr 2020, at 18:32, tushar <tushar.ahuja@enterprisedb.com> wrote:

On 4/9/20 6:26 PM, 曾文旌 wrote:
On 4/7/20 2:27 PM, 曾文旌 wrote:
VACUUM FULL on a GTT and CLUSTER on a GTT are already supported in global_temporary_table_v24-pg13.patch.
Here, it is skipping the GTT:

postgres=# \c foo
You are now connected to database "foo" as user "tushar".
foo=# create global temporary table  g123( c1 int) ;
CREATE TABLE
foo=# \q
[tushar@localhost bin]$ ./vacuumdb --full  foo
vacuumdb: vacuuming database "foo"
WARNING:  skipping vacuum global temp table "g123" because storage is not initialized for current session
The warning message was inappropriate here, so I removed it.

Thanks Wenjing. Please see if the below behavior is correct.

X terminal -

postgres=# create global temp table foo1(n int);
CREATE TABLE
postgres=# insert into foo1 values (generate_series(1,10));
INSERT 0 10
postgres=# vacuum full ;
VACUUM

Y Terminal -

[tushar@localhost bin]$ ./vacuumdb -f  postgres
vacuumdb: vacuuming database "postgres"
WARNING:  global temp table oldest relfrozenxid 3276 is the oldest in the entire db
DETAIL:  The oldest relfrozenxid in pg_class is 3277
HINT:  If they differ greatly, please consider cleaning up the data in global temp table.
WARNING:  global temp table oldest relfrozenxid 3276 is the oldest in the entire db
DETAIL:  The oldest relfrozenxid in pg_class is 3277
HINT:  If they differ greatly, please consider cleaning up the data in global temp table.

I improved the logic of the warning message so that when the gap between the GTT's relfrozenxid and the rest of the database is small, the warning is no longer raised.
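For reference, the database-wide side of that comparison can be checked with a plain catalog query (standard SQL, nothing patch-specific); a GTT's per-session relfrozenxid is tracked in session memory rather than in pg_class, which is why the warning is needed at all. A minimal example:

select relname, age(relfrozenxid) as xid_age
  from pg_class
 where relkind in ('r', 't', 'm')
 order by age(relfrozenxid) desc
 limit 5;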



Wenjing





-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
On Fri, Apr 17, 2020 at 2:44 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

I improved the logic of the warning message so that when the gap between the GTT's relfrozenxid and the rest of the database is small, the warning is no longer raised.

Hi Wenjing,
Thanks for the patch (v26). I have verified the previously reported issues, and they are working fine now.
Please check the below scenario: VACUUM from a non-superuser.

-- Create user "test_gtt", connect it , create gtt, VACUUM gtt and VACUUM / VACUUM FULL
postgres=# CREATE USER test_gtt;
CREATE ROLE
postgres=# \c postgres test_gtt
You are now connected to database "postgres" as user "test_gtt".
postgres=> CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int);
CREATE TABLE

-- VACUUM on the GTT works fine, whereas we get a flood of WARNINGs for VACUUM / VACUUM FULL, as below:
postgres=> VACUUM gtt1 ;
VACUUM
postgres=> VACUUM;
WARNING:  skipping "pg_statistic" --- only superuser or database owner can vacuum it
WARNING:  skipping "pg_type" --- only superuser or database owner can vacuum it
WARNING:  skipping "pg_toast_2600" --- only table or database owner can vacuum it
WARNING:  skipping "pg_toast_2600_index" --- only table or database owner can vacuum it

... ...
... ...

WARNING:  skipping "_pg_foreign_tables" --- only table or database owner can vacuum it
WARNING:  skipping "foreign_table_options" --- only table or database owner can vacuum it
WARNING:  skipping "user_mapping_options" --- only table or database owner can vacuum it
WARNING:  skipping "user_mappings" --- only table or database owner can vacuum it
VACUUM 

-- 

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,

Please check the below scenario; we are getting a server crash with ALTER TABLE when adding a column whose default value is a sequence:

-- Create a GTT, exit and re-connect to the psql prompt, create a sequence, then ALTER TABLE to add a column with the sequence as its default.

postgres=# create global temporary table gtt1 (c1 int);
CREATE TABLE
postgres=# \q
[edb@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create sequence seq;
CREATE SEQUENCE
postgres=# alter table gtt1 add c2 int default nextval('seq');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?> \q


-- Stack trace:
[edb@localhost bin]$ gdb -q -c data/core.70358 postgres
Reading symbols from /home/edb/PG/PGsrcNew/postgresql/inst/bin/postgres...done.
[New LWP 70358]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: edb postgres [local] ALTER TABLE                '.
Program terminated with signal 6, Aborted.
#0  0x00007f150223b337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007f150223b337 in raise () from /lib64/libc.so.6
#1  0x00007f150223ca28 in abort () from /lib64/libc.so.6
#2  0x0000000000ab2cdd in ExceptionalCondition (conditionName=0xc03ab8 "OidIsValid(relfilenode1) && OidIsValid(relfilenode2)",
    errorType=0xc0371f "FailedAssertion", fileName=0xc03492 "cluster.c", lineNumber=1637) at assert.c:67
#3  0x000000000065e200 in gtt_swap_relation_files (r1=16384, r2=16390, target_is_pg_class=false, swap_toast_by_content=false, is_internal=true,
    frozenXid=490, cutoffMulti=1, mapped_tables=0x7ffd841f7ee0) at cluster.c:1637
#4  0x000000000065dcd9 in finish_heap_swap (OIDOldHeap=16384, OIDNewHeap=16390, is_system_catalog=false, swap_toast_by_content=false,
    check_constraints=true, is_internal=true, frozenXid=490, cutoffMulti=1, newrelpersistence=103 'g') at cluster.c:1395
#5  0x00000000006bca18 in ATRewriteTables (parsetree=0x1deab80, wqueue=0x7ffd841f80c8, lockmode=8, context=0x7ffd841f8260) at tablecmds.c:4991
#6  0x00000000006ba890 in ATController (parsetree=0x1deab80, rel=0x7f150378f330, cmds=0x1deab28, recurse=true, lockmode=8, context=0x7ffd841f8260)
    at tablecmds.c:3991
#7  0x00000000006ba4f8 in AlterTable (stmt=0x1deab80, lockmode=8, context=0x7ffd841f8260) at tablecmds.c:3644
#8  0x000000000093b62a in ProcessUtilitySlow (pstate=0x1e0d6d0, pstmt=0x1deac48,
    queryString=0x1de9b30 "alter table gtt1 add c2 int default nextval('seq');", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1deaf28,
    qc=0x7ffd841f8830) at utility.c:1267
#9  0x000000000093b141 in standard_ProcessUtility (pstmt=0x1deac48, queryString=0x1de9b30 "alter table gtt1 add c2 int default nextval('seq');",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1deaf28, qc=0x7ffd841f8830) at utility.c:1067
#10 0x000000000093a22b in ProcessUtility (pstmt=0x1deac48, queryString=0x1de9b30 "alter table gtt1 add c2 int default nextval('seq');",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1deaf28, qc=0x7ffd841f8830) at utility.c:522
#11 0x000000000093909d in PortalRunUtility (portal=0x1e4fba0, pstmt=0x1deac48, isTopLevel=true, setHoldSnapshot=false, dest=0x1deaf28, qc=0x7ffd841f8830)
    at pquery.c:1157
#12 0x00000000009392b3 in PortalRunMulti (portal=0x1e4fba0, isTopLevel=true, setHoldSnapshot=false, dest=0x1deaf28, altdest=0x1deaf28, qc=0x7ffd841f8830)
    at pquery.c:1303
#13 0x00000000009387d1 in PortalRun (portal=0x1e4fba0, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x1deaf28, altdest=0x1deaf28,
    qc=0x7ffd841f8830) at pquery.c:779
#14 0x000000000093298b in exec_simple_query (query_string=0x1de9b30 "alter table gtt1 add c2 int default nextval('seq');") at postgres.c:1239
#15 0x0000000000936997 in PostgresMain (argc=1, argv=0x1e13b80, dbname=0x1e13a78 "postgres", username=0x1e13a58 "edb") at postgres.c:4315
#16 0x00000000008868b3 in BackendRun (port=0x1e0bb50) at postmaster.c:4510
#17 0x00000000008860a8 in BackendStartup (port=0x1e0bb50) at postmaster.c:4202
#18 0x0000000000882626 in ServerLoop () at postmaster.c:1727
#19 0x0000000000881efd in PostmasterMain (argc=3, argv=0x1de4460) at postmaster.c:1400
#20 0x0000000000789288 in main (argc=3, argv=0x1de4460) at main.c:210
(gdb) 


--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On 17 Apr 2020, at 20:59, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing,

Please check the below scenario; we are getting a server crash with ALTER TABLE when adding a column whose default value is a sequence:

-- Create a GTT, exit and re-connect to the psql prompt, create a sequence, then ALTER TABLE to add a column with the sequence as its default.

postgres=# create global temporary table gtt1 (c1 int);
CREATE TABLE
postgres=# \q
[edb@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create sequence seq;
CREATE SEQUENCE
postgres=# alter table gtt1 add c2 int default nextval('seq');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?> \q
Fixed in global_temporary_table_v27-pg13.patch


Wenjing


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On 17 Apr 2020, at 20:59, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing,

Please check the below scenario; we are getting a server crash with ALTER TABLE when adding a column whose default value is a sequence:

-- Create a GTT, exit and re-connect to the psql prompt, create a sequence, then ALTER TABLE to add a column with the sequence as its default.

postgres=# create global temporary table gtt1 (c1 int);
CREATE TABLE
postgres=# \q
[edb@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create sequence seq;
CREATE SEQUENCE
postgres=# alter table gtt1 add c2 int default nextval('seq');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?> \q
Please check my new patch.



Wenjing


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On 17 Apr 2020, at 19:26, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

On Fri, Apr 17, 2020 at 2:44 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

I improved the logic of the warning message so that when the gap between the GTT's relfrozenxid and the rest of the database is small, the warning is no longer raised.

Hi Wenjing,
Thanks for the patch (v26). I have verified the previously reported issues, and they are working fine now.
Please check the below scenario: VACUUM from a non-superuser.

-- Create user "test_gtt", connect it , create gtt, VACUUM gtt and VACUUM / VACUUM FULL
postgres=# CREATE USER test_gtt;
CREATE ROLE
postgres=# \c postgres test_gtt
You are now connected to database "postgres" as user "test_gtt".
postgres=> CREATE GLOBAL TEMPORARY TABLE gtt1(c1 int);
CREATE TABLE

-- VACUUM on the GTT works fine, whereas we get a flood of WARNINGs for VACUUM / VACUUM FULL, as below:
postgres=> VACUUM gtt1 ;
VACUUM
postgres=> VACUUM;
WARNING:  skipping "pg_statistic" --- only superuser or database owner can vacuum it
WARNING:  skipping "pg_type" --- only superuser or database owner can vacuum it
WARNING:  skipping "pg_toast_2600" --- only table or database owner can vacuum it
WARNING:  skipping "pg_toast_2600_index" --- only table or database owner can vacuum it

... ...
... ...

WARNING:  skipping "_pg_foreign_tables" --- only table or database owner can vacuum it
WARNING:  skipping "foreign_table_options" --- only table or database owner can vacuum it
WARNING:  skipping "user_mapping_options" --- only table or database owner can vacuum it
WARNING:  skipping "user_mappings" --- only table or database owner can vacuum it
VACUUM 
I think this is expected: user test_gtt does not have permission to vacuum the system tables.
This has nothing to do with GTT.
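As a side note, the warnings can be avoided by restricting the vacuum to the tables the user owns; this is a standard vacuumdb option, nothing patch-specific:

./vacuumdb --table=gtt1 postgres
vacuumdb: vacuuming database "postgres"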


Wenjing


-- 

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
I think this is expected: user test_gtt does not have permission to vacuum the system tables.
This has nothing to do with GTT.

Hi Wenjing, thanks for the explanation.
Thanks also for the new patch. I have verified the crash; it is now resolved.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/20/20 2:59 PM, 曾文旌 wrote:
> Please check my new patch.

Thanks Wenjing. Please refer to the below scenario, which hits this error:
ERROR:  could not read block 0 in file "base/16466/t4_16472": read only 0 of 8192 bytes

Steps to reproduce

Connect to a psql terminal and create a table (create global temp table t2 (n int primary key) on commit delete rows;),
exit from the psql terminal and execute (./clusterdb -t t2 -d postgres -v),
connect to a psql terminal again and execute the below SQL statements one by one
(
cluster verbose t2 using t2_pkey;
cluster verbose t2 ;
alter table t2 add column i int;
cluster verbose t2 ;
cluster verbose t2 using t2_pkey;
create unique index ind on t2(n);
create unique index concurrently  ind1 on t2(n);
select * from t2;
)
The last SQL statement will throw this error:
ERROR:  could not read block 0 in file "base/16466/t4_16472": read only 0 of 8192 bytes

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On 20 Apr 2020, at 21:15, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 4/20/20 2:59 PM, 曾文旌 wrote:
>> Please check my new patch.
>
> Thanks Wenjing. Please refer to the below scenario, which hits this error: ERROR:  could not read block 0 in file "base/16466/t4_16472": read only 0 of 8192 bytes
>
> Steps to reproduce
>
> Connect to a psql terminal and create a table (create global temp table t2 (n int primary key) on commit delete rows;),
> exit from the psql terminal and execute (./clusterdb -t t2 -d postgres -v),
> connect to a psql terminal again and execute the below SQL statements one by one
> (
> cluster verbose t2 using t2_pkey;
> cluster verbose t2 ;
> alter table t2 add column i int;
> cluster verbose t2 ;
> cluster verbose t2 using t2_pkey;
> create unique index ind on t2(n);
> create unique index concurrently  ind1 on t2(n);
> select * from t2;
> )
> The last SQL statement will throw this error: ERROR:  could not read block 0 in file "base/16466/t4_16472": read only 0 of 8192 bytes
Fixed in global_temporary_table_v29-pg13.patch
Please check.



Wenjing




> 
> -- 
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On 3 Apr 2020, at 16:38, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Fri, 3 Apr 2020 at 09:52, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
In my opinion:
1 We are developing GTT according to the SQL standard, not Oracle.

2 The implementation differences you listed come from the differences between the PG and Oracle storage modules and DDL implementations.

2.1 Issues 1 and 2
Creating a normal table or a GTT defines the catalog entries and initializes the data storage file; in the case of a GTT, the storage file is initialized for the current session.
In Oracle, by contrast, creation appears to define only the catalog entries.
This is why other sessions cannot drop the GTT in PostgreSQL.
That is the reason for issues 1 and 2, and I think it is reasonable.

2.2 Issue 3
My logic for dropping a GTT is: when only the current session is using the GTT, it is safe to drop it, because the GTT's definition and storage files can be completely deleted from the database.
But if multiple sessions are using the GTT, it is hard to drop it from session A, because removing the GTT's local buffers and data files in the other sessions is difficult (see the sketch after point 2.3).
I am not sure why Oracle has this limitation.
So, issue 3 is reasonable.

2.3 TRUNCATE on a normal table / GTT
TRUNCATE on a normal table or a GTT cleans up the logical data but does not unlink the data storage file; in the case of a GTT, that is the storage file for the current session.
In Oracle, by contrast, the data storage file itself appears to be cleaned up.
PostgreSQL storage is obviously different from Oracle's; in other words, the session is detached from the storage.
This is the reason for issue 4, and I think it is reasonable.
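To illustrate the check behind point 2.2 (the sketch referenced there): before attempting a DROP, a DBA can look at which backends are attached to the GTT with the pg_gtt_attached_pids helper that the patch provides (also used later in this thread); a minimal sketch, table name only illustrative:

select * from pg_gtt_attached_pids;
-- if no other backend is attached to the GTT, the drop is safe:
drop table gtt2;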

Although the implementation of GTT is different, I think TRUNCATE on Postgres (when it is really finalized) can remove the session metadata of the GTT too (and reduce the usage counter). It is not a critical feature, but I think it should not be hard to implement. For practical reasons it would be nice to have a tool to refresh a GTT without having to close the session. TRUNCATE can be this tool.
Yes, I think we need a way to delete the GTT's local storage without closing the session.

I provide TRUNCATE tablename DROP to clear the data in the GTT and delete the storage files.
This feature requires the current transaction to commit immediately after the truncate finishes.
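A minimal usage sketch of the proposed syntax, following the behavior described above (table and data only illustrative):

create global temporary table gtt2 (c1 integer) on commit preserve rows;
insert into gtt2 values (100);
truncate gtt2 drop;
-- clears this session's rows and removes this session's storage files;
-- the transaction commits immediately after the truncate finishes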



Wenjing



Regards

Pavel


All in all, I think the current implementation is sufficient for a DBA to manage GTTs.

On 2 Apr 2020, at 16:45, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi All,

I have noted down a few behavioral differences between our GTT implementation in PG and Oracle DB.
As per my understanding, the behavior of DROP TABLE for normal tables and GTTs in Oracle DB is as below:
  1. Any table (normal table / GTT) without data in a session can be dropped from another session.
  2. For a completed transaction on a normal table having data, we can drop the table from another session. If the transaction is not yet complete and we try to drop the table from another session, we get an error. (Working as expected.)
  3. For a completed transaction on a GTT with (on commit delete rows) (i.e. no data in the GTT) in a session, we can drop the table from another session.
  4. For a completed transaction on a GTT with (on commit preserve rows) with data in a session, we cannot drop the table from any session (not even from the session in which the GTT was created); we first need to truncate the table data in every session (session1, session2) that has data.
1. Any table (normal table / GTT) without data in a session can be dropped from another session.
Session1:
create table t1 (c1 integer);
create global temporary table gtt1 (c1 integer) on commit delete rows;
create global temporary table gtt2 (c1 integer) on commit preserve rows;


Session2:
drop table t1;
drop table gtt1;
drop table gtt2;


-- Issue 1: But in PG we are able to drop the simple table yet fail to drop the GTTs, as below.
postgres=# drop table t1;
DROP TABLE
postgres=# drop table gtt1;
ERROR:  can not drop relation gtt1 when other backend attached this global temp table
postgres=# drop table gtt2;
ERROR:  can not drop relation gtt2 when other backend attached this global temp table

3. For a completed transaction on a GTT with (on commit delete rows) (i.e. no data in the GTT) in a session, we can drop the table from another session.
Session1:
create global temporary table gtt1 (c1 integer) on commit delete rows;

Session2:
drop table gtt1;

-- Issue 2: But in PG we get an error for the GTT with (on commit delete rows) even without data.
postgres=# drop table gtt1;
ERROR:  can not drop relation gtt1 when other backend attached this global temp table

4. For a completed transaction on a GTT with (on commit preserve rows) with data in any session, we cannot drop the table from any session (not even from the session in which the GTT was created).

Case1:
create global temporary table gtt2 (c1 integer) on commit preserve rows;
insert into gtt2 values(100);
drop table gtt2;


SQL> drop table gtt2;
drop table gtt2
  *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already in use


-- Issue 3: But in PG we are able to drop the GTT (having data) that we created in the same session.
postgres=# drop table gtt2;
DROP TABLE

Case2: GTT with (on commit preserve rows) having data in both session1 and session2
Session1:
create global temporary table gtt2 (c1 integer) on commit preserve rows;
insert into gtt2 values(100);


Session2:
insert into gtt2 values(200);

-- If we try to drop the table from either session we should get an error; this works fine.
drop table gtt2;
SQL> drop table gtt2;
drop table gtt2
  *
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already in use

postgres=# drop table gtt2 ;
ERROR:  can not drop relation gtt2 when other backend attached this global temp table

-- To drop gtt2 from session1/session2, we first need to truncate the table data in every session (session1, session2) that has data.
Session1:
truncate table gtt2;
-- Session2:
truncate table gtt2;

Session 2:
SQL> drop table gtt2;

Table dropped.


-- Issue 4: But in PG we are not able to drop the GTT even after truncating the table in all sessions.
-- Truncate in all sessions where the GTT has data.
postgres=# truncate gtt2 ;
TRUNCATE TABLE


-- Trying to drop the GTT still gives an error.
postgres=# drop table gtt2 ;
ERROR:  can not drop relation gtt2 when other backend attached this global temp table

To drop the GTT from any session, we need to exit from all other sessions.
postgres=# drop table gtt2 ;
DROP TABLE


Kindly let me know if I am missing something.


On Wed, Apr 1, 2020 at 6:26 PM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi Wenjing,
I think we need to change the below error message.

postgres=# create global temporary table gtt(c1 int) on commit preserve rows;
CREATE TABLE

postgres=# create materialized view mvw as select * from gtt;
ERROR: materialized views must not use global temporary tables or views


Anyway, we are not allowed to create a "global temporary view",
so the above ERROR message should change (i.e. " or views" should be removed) to something like:
"ERROR: materialized views must not use global temporary tables"

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:


On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

Although the implementation of GTT is different, I think TRUNCATE on Postgres (when it is really finalized) can remove the session metadata of the GTT too (and reduce the usage counter). It is not a critical feature, but I think it should not be hard to implement. For practical reasons it would be nice to have a tool to refresh a GTT without having to close the session. TRUNCATE can be this tool.
Yes, I think we need a way to delete the GTT's local storage without closing the session.

I provide TRUNCATE tablename DROP to clear the data in the GTT and delete the storage files.
This feature requires the current transaction to commit immediately after the truncate finishes.

Hi Wenjing,
Thanks for the patch (v30) with the new syntax support (TRUNCATE table_name DROP) for deleting storage files after TRUNCATE on a GTT.

Please check the below scenarios:

Case1:
-- session1:
postgres=# create global temporary table gtt2 (c1 integer) on commit preserve rows;
CREATE TABLE
postgres=# create index  idx1 on gtt2 (c1);
CREATE INDEX
postgres=# create index  idx2 on gtt2 (c1) where c1%2 =0;
CREATE INDEX
postgres=#
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
ERROR:  cannot cluster on partial index "idx2"        

Case2:
-- Session2:
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
CLUSTER

postgres=# insert into gtt2 values(1);
INSERT 0 1
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
ERROR:  cannot cluster on partial index "idx2"

Case3:
-- Session2:
postgres=# TRUNCATE gtt2 DROP;
TRUNCATE TABLE
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
CLUSTER

In Case2 and Case3 we can observe that, with no data in the GTT, we are able to run "CLUSTER gtt2 USING idx2;" (which uses the partial index).
But why does the same query fail in Case1 (also with no data)?

Thanks,
Prabhat Sahu


--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Wed, 22 Apr 2020 at 16:38, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:


On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

Although the implementation of GTT is different, I think TRUNCATE on Postgres (when it is really finalized) can remove the session metadata of the GTT too (and reduce the usage counter). It is not a critical feature, but I think it should not be hard to implement. For practical reasons it would be nice to have a tool to refresh a GTT without having to close the session. TRUNCATE can be this tool.
Yes, I think we need a way to delete the GTT's local storage without closing the session.

I provide TRUNCATE tablename DROP to clear the data in the GTT and delete the storage files.
This feature requires the current transaction to commit immediately after the truncate finishes.

Hi Wenjing,
Thanks for the patch (v30) with the new syntax support (TRUNCATE table_name DROP) for deleting storage files after TRUNCATE on a GTT.

This syntax looks strange, and I don't think it solves anything in practical life, because without a lock the table will be used within a few seconds by other sessions.

This is the same topic as when we talked about ALTER: when and where the changes should be applied.

The CLUSTER command works only on session-private data, so it should not need any special lock or any special cleaning beforehand.

Regards

Pavel

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On 22 Apr 2020, at 22:38, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:



On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

Although the implementation of GTT is different, I think TRUNCATE on Postgres (when it is really finalized) can remove the session metadata of the GTT too (and reduce the usage counter). It is not a critical feature, but I think it should not be hard to implement. For practical reasons it would be nice to have a tool to refresh a GTT without having to close the session. TRUNCATE can be this tool.
Yes, I think we need a way to delete the GTT's local storage without closing the session.

I provide TRUNCATE tablename DROP to clear the data in the GTT and delete the storage files.
This feature requires the current transaction to commit immediately after the truncate finishes.

Hi Wenjing,
Thanks for the patch (v30) with the new syntax support (TRUNCATE table_name DROP) for deleting storage files after TRUNCATE on a GTT.
 
Please check the below scenarios:

Case1:
-- session1:
postgres=# create global temporary table gtt2 (c1 integer) on commit preserve rows;
CREATE TABLE
postgres=# create index  idx1 on gtt2 (c1);
CREATE INDEX
postgres=# create index  idx2 on gtt2 (c1) where c1%2 =0;
CREATE INDEX
postgres=#
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
ERROR:  cannot cluster on partial index "idx2"        

Case2:
-- Session2:
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
CLUSTER

postgres=# insert into gtt2 values(1);
INSERT 0 1
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
ERROR:  cannot cluster on partial index "idx2"

Case3:
-- Session2:
postgres=# TRUNCATE gtt2 DROP;
TRUNCATE TABLE
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
CLUSTER

In Case2 and Case3 we can observe that, with no data in the GTT, we are able to run "CLUSTER gtt2 USING idx2;" (which uses the partial index).
But why does the same query fail in Case1 (also with no data)?
This is expected.
Because of TRUNCATE gtt2 DROP, the local storage file was deleted, so CLUSTER saw that there were no local files for this session and ended the process early, before reaching the partial-index check. (In Case1 the creating session already has local storage, which is initialized at CREATE, so the check is reached and fails.)


Wenjing




Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On 22 Apr 2020, at 22:50, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Wed, 22 Apr 2020 at 16:38, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:


On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

Although the implementation of GTT is different, I think TRUNCATE on Postgres (when it is really finalized) can remove the session metadata of the GTT too (and reduce the usage counter). It is not a critical feature, but I think it should not be hard to implement. For practical reasons it would be nice to have a tool to refresh a GTT without having to close the session. TRUNCATE can be this tool.
Sorry, I don't quite understand what you mean; could you describe it in detail?
In my opinion, TRUNCATE on a GTT cannot clean up data in other sessions, and in particular cannot clean up the local buffers in other sessions.


Yes, I think we need a way to delete the GTT's local storage without closing the session.

I provide TRUNCATE tablename DROP to clear the data in the GTT and delete the storage files.
This feature requires the current transaction to commit immediately after the truncate finishes.

Hi Wenjing,
Thanks for the patch (v30) with the new syntax support (TRUNCATE table_name DROP) for deleting storage files after TRUNCATE on a GTT.

This syntax looks strange, and I don't think it solves anything in practical life, because without a lock the table will be used within a few seconds by other sessions.

If a DBA wants to delete or modify a GTT, he can use locks to help make the change.

postgres=# begin;
BEGIN
postgres=*# LOCK TABLE gtt2 IN ACCESS EXCLUSIVE MODE;
postgres=*# select * from pg_gtt_attached_pids ;

Kill those sessions, or let each session run TRUNCATE tablename DROP

postgres=*# drop table gtt2;
DROP TABLE
postgres=*# commit;
COMMIT


This is same topic when we talked about ALTER - when and where the changes should be applied.

The CLUSTER commands works only on session private data, so it should not to need some special lock or some special cleaning before.

Regards

Pavel
 
 
Please check below scenarios:

Case1:
-- session1:
postgres=# create global temporary table gtt2 (c1 integer) on commit preserve rows;
CREATE TABLE
postgres=# create index  idx1 on gtt2 (c1);
CREATE INDEX
postgres=# create index  idx2 on gtt2 (c1) where c1%2 =0;
CREATE INDEX
postgres=#
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
ERROR:  cannot cluster on partial index "idx2"        

Case2:
-- Session2:
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
CLUSTER

postgres=# insert into gtt2 values(1);
INSERT 0 1
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
ERROR:  cannot cluster on partial index "idx2"

Case3:
-- Session2:
postgres=# TRUNCATE gtt2 DROP;
TRUNCATE TABLE
postgres=# CLUSTER gtt2 USING idx1;
CLUSTER
postgres=# CLUSTER gtt2 USING idx2;
CLUSTER

In Case2 and Case3 we can observe that, in the absence of data in the GTT, we are able to run "CLUSTER gtt2 USING idx2;" (which uses the partial index).
But why does the same command fail in Case1, where the GTT also has no data?

Thanks,
Prabhat Sahu

 


Wenjing



Regards

Pavel


All in all, I think the current implementation is sufficient for a DBA to manage GTTs.

On Apr 2, 2020, at 4:45 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
[...]


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Thu, Apr 23, 2020 at 9:10 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


On Apr 22, 2020, at 10:50 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Wed, Apr 22, 2020 at 4:38 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:


On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

[...]
Sorry, I don't quite understand what you mean, could you describe it in detail?
In my opinion, TRUNCATE on a GTT cannot clean up data in other sessions, and especially cannot clean up local buffers in other sessions.

It is about the possibility to force-reset a GTT to its default empty state for all sessions.

Maybe it is somewhat like what your TRUNCATE DROP does, but I don't think this design (TRUNCATE DROP) is good, because then the user has to know an implementation detail.

I prefer something like TRUNCATE tab WITH OPTION (GLOBAL, FORCE) - "GLOBAL" .. apply to all sessions; "FORCE" .. try to do it without waiting on a global lock, immediately, with the possibility of cancelling some statements and rolling back some sessions.

Instead of GLOBAL maybe we can use "ALLSESSION", or "ALL SESSION", or something else.

But I like the possible LOCAL x GLOBAL terminology for GTT. What do I mean? Some statements like TRUNCATE could work (by default) in "local" mode .. affecting the current session only, but could sometimes be executed in "global" mode with effect on all sessions, as sketched below.
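To make that concrete, a minimal sketch of the proposed syntax (illustrative only - neither the WITH OPTION clause nor these option names exist in the patch):

-- default: "local" mode, truncates only the current session's data
TRUNCATE gtt;

-- proposed: "global" mode; FORCE would cancel statements and roll back
-- sessions that would otherwise block the immediate reset
TRUNCATE gtt WITH OPTION (GLOBAL, FORCE);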



[...]

Yes, the user can lock the tables. But I don't think that is a user-friendly design. I don't remember any statement in Postgres where I have to use table locks explicitly.

For built-in commands it should be done transparently (for the user).

Regards

Pavel
 


[...]


Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,

Please check, the server is crashing with the below scenario (CLUSTER gtt USING index).

-- Session1:
postgres=# create global temporary table gtt (c1 integer) on commit preserve rows;
CREATE TABLE
postgres=# create index idx1 on gtt (c1);
CREATE INDEX

-- Session2:
postgres=# create index idx2 on gtt (c1);
CREATE INDEX

-- Session1:
postgres=# cluster gtt using idx1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?>

-- Below is the stacktrace:
[edb@localhost bin]$ gdb -q -c data/core.95690 postgres
Reading symbols from /home/edb/PG/PGsrcNew/postgresql/inst/bin/postgres...done.
[New LWP 95690]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: edb postgres [local] CLUSTER                    '.
Program terminated with signal 6, Aborted.
#0  0x00007f9c574ee337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007f9c574ee337 in raise () from /lib64/libc.so.6
#1  0x00007f9c574efa28 in abort () from /lib64/libc.so.6
#2  0x0000000000ab3a3c in ExceptionalCondition (conditionName=0xb5e2e8 "!ReindexIsProcessingIndex(indexOid)", errorType=0xb5d365 "FailedAssertion",
    fileName=0xb5d4e9 "index.c", lineNumber=3825) at assert.c:67
#3  0x00000000005b0412 in reindex_relation (relid=16384, flags=2, options=0) at index.c:3825
#4  0x000000000065e36d in finish_heap_swap (OIDOldHeap=16384, OIDNewHeap=16389, is_system_catalog=false, swap_toast_by_content=false,
    check_constraints=false, is_internal=true, frozenXid=491, cutoffMulti=1, newrelpersistence=103 'g') at cluster.c:1448
#5  0x000000000065ccef in rebuild_relation (OldHeap=0x7f9c589adef0, indexOid=16387, verbose=false) at cluster.c:602
#6  0x000000000065c757 in cluster_rel (tableOid=16384, indexOid=16387, options=0) at cluster.c:418
#7  0x000000000065c2cf in cluster (stmt=0x2cd1600, isTopLevel=true) at cluster.c:180
#8  0x000000000093b213 in standard_ProcessUtility (pstmt=0x2cd16c8, queryString=0x2cd0b30 "cluster gtt using idx1;", context=PROCESS_UTILITY_TOPLEVEL,
    params=0x0, queryEnv=0x0, dest=0x2cd19a8, qc=0x7ffcd32604b0) at utility.c:819
#9  0x000000000093aa50 in ProcessUtility (pstmt=0x2cd16c8, queryString=0x2cd0b30 "cluster gtt using idx1;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
    queryEnv=0x0, dest=0x2cd19a8, qc=0x7ffcd32604b0) at utility.c:522
#10 0x00000000009398c2 in PortalRunUtility (portal=0x2d36ba0, pstmt=0x2cd16c8, isTopLevel=true, setHoldSnapshot=false, dest=0x2cd19a8, qc=0x7ffcd32604b0)
    at pquery.c:1157
#11 0x0000000000939ad8 in PortalRunMulti (portal=0x2d36ba0, isTopLevel=true, setHoldSnapshot=false, dest=0x2cd19a8, altdest=0x2cd19a8, qc=0x7ffcd32604b0)
    at pquery.c:1303
#12 0x0000000000938ff6 in PortalRun (portal=0x2d36ba0, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2cd19a8, altdest=0x2cd19a8,
    qc=0x7ffcd32604b0) at pquery.c:779
#13 0x00000000009331b0 in exec_simple_query (query_string=0x2cd0b30 "cluster gtt using idx1;") at postgres.c:1239
#14 0x00000000009371bc in PostgresMain (argc=1, argv=0x2cfab80, dbname=0x2cfaa78 "postgres", username=0x2cfaa58 "edb") at postgres.c:4315
#15 0x00000000008872a9 in BackendRun (port=0x2cf2b50) at postmaster.c:4510
#16 0x0000000000886a9e in BackendStartup (port=0x2cf2b50) at postmaster.c:4202
#17 0x000000000088301c in ServerLoop () at postmaster.c:1727
#18 0x00000000008828f3 in PostmasterMain (argc=3, argv=0x2ccb460) at postmaster.c:1400
#19 0x0000000000789c54 in main (argc=3, argv=0x2ccb460) at main.c:210
(gdb) 


--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing, 

With the new patch (v30), as you mentioned, there is new syntax support for "TRUNCATE TABLE gtt DROP", but we also observe that the syntax "DROP TABLE gtt DROP" works, as below:

postgres=# create global temporary table gtt(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# DROP TABLE gtt DROP;
DROP TABLE

Is this syntax intentional? If not, we should get a syntax error.

On Fri, Apr 24, 2020 at 10:25 AM Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
[...]



--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/22/20 2:49 PM, 曾文旌 wrote:
>
> I provide the TRUNCATE tablename DROP to clear the data in the GTT and 
> delete the storage files.
> This feature requires the current transaction to commit immediately 
> after it finishes truncate.
>
Thanks Wenjing, please refer to this scenario:

postgres=# create global temp table testing (a int);
CREATE TABLE
postgres=# begin;
BEGIN
postgres=*# truncate testing;      -- working   [1]
TRUNCATE TABLE
postgres=*# truncate testing drop;
ERROR:  Truncate global temporary table cannot run inside a transaction block
-- that is throwing an error claiming something which I did successfully in [1]
postgres=!#

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Apr 23, 2020, at 3:43 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Thu, Apr 23, 2020 at 9:10 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


On Apr 22, 2020, at 10:50 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Wed, Apr 22, 2020 at 4:38 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:


On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

[...]

But I like the possible LOCAL x GLOBAL terminology for GTT. What do I mean? Some statements like TRUNCATE could work (by default) in "local" mode .. affecting the current session only, but could sometimes be executed in "global" mode with effect on all sessions.
TRUNCATE GTT GLOBAL is like the DROP GTT FORCE you mentioned before.
I think this requires identifying the sessions that have initialized the storage file but hold no actual data.
Handling local buffers and locks in other sessions is also difficult.
It may be even harder than a forced DROP GTT, which can kill other sessions, whereas TRUNCATE GTT would prefer not to.
The basic conditions for this do not seem to be in place yet; it is not easy.
So I want to postpone this feature to a later release, along with DROP GTT FORCE.
Also, in view of your comments, I have rolled back the TRUNCATE GTT DROP feature.
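In the meantime, a DBA who needs an all-session reset can approximate one by hand; a rough sketch, reusing the patch's pg_gtt_attached_pids (this thread does not show its output columns, so the <pid> placeholder below is an assumption):

postgres=# select * from pg_gtt_attached_pids;   -- list the backends attached to the GTT
-- then, for each attached backend other than our own:
postgres=# select pg_terminate_backend(<pid>);   -- core function; ends that session,
-- and the session's GTT storage files are removed when it exits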



Wenjing




[...]

Yes, the user can lock the tables. But I don't think that is a user-friendly design. I don't remember any statement in Postgres where I have to use table locks explicitly.

For built-in commands it should be done transparently (for the user).
It can be improved, like DROP GTT FORCE.


[...]



Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Hi Wenjing,

Please check the below scenario, which shows different error messages from "DROP TABLE gtt;" for a GTT with and without an index.
-- Session1:
postgres=# create global temporary table gtt1 (c1 int);
CREATE TABLE
postgres=# create global temporary table gtt2 (c1 int);
CREATE TABLE
postgres=# create index idx2 on gtt2(c1);
CREATE INDEX

-- Session2:
postgres=# drop table gtt1;
ERROR:  can not drop relation gtt1 when other backend attached this global temp table
postgres=# drop table gtt2;
ERROR:  can not drop index gtt2 when other backend attached this global temp table.

--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Apr 24, 2020, at 12:55 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing,

Please check, the server is crashing with the below scenario (CLUSTER gtt USING index).

[...]
Thanks for the review; I fixed this in v31.


Wenjing




Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Apr 24, 2020, at 3:28 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing, 

With the new patch (v30), as you mentioned, there is new syntax support for "TRUNCATE TABLE gtt DROP", but we also observe that the syntax "DROP TABLE gtt DROP" works, as below:

postgres=# create global temporary table gtt(c1 int) on commit preserve rows;
CREATE TABLE
postgres=# DROP TABLE gtt DROP;
DROP TABLE
Fixed in v31.
The TRUNCATE tablename DROP syntax was also removed.


Wenjing






Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On Apr 24, 2020, at 9:03 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>
> On 4/22/20 2:49 PM, 曾文旌 wrote:
>>
>> I provide the TRUNCATE tablename DROP to clear the data in the GTT and delete the storage files.
>> This feature requires the current transaction to commit immediately after it finishes truncate.
>>
> Thanks Wenjing, please refer to this scenario:
>
> postgres=# create global temp table testing (a int);
> CREATE TABLE
> postgres=# begin;
> BEGIN
> postgres=*# truncate testing;      -- working   [1]
> TRUNCATE TABLE
> postgres=*# truncate testing drop;
> ERROR:  Truncate global temporary table cannot run inside a transaction block
> -- that is throwing an error claiming something which I did successfully in [1]
The TRUNCATE tablename DROP syntax was removed, so the problem goes away.


Wenjing


> postgres=!#
>
> --
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Apr 27, 2020, at 5:26 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Hi Wenjing,

Please check the below scenario, which shows different error messages from "DROP TABLE gtt;" for a GTT with and without an index.
-- Session1:
postgres=# create global temporary table gtt1 (c1 int);
CREATE TABLE
postgres=# create global temporary table gtt2 (c1 int);
CREATE TABLE
postgres=# create index idx2 on gtt2(c1);
CREATE INDEX

-- Session2:
postgres=# drop table gtt1;
ERROR:  can not drop relation gtt1 when other backend attached this global temp table
postgres=# drop table gtt2;
ERROR:  can not drop index gtt2 when other backend attached this global temp table.
For DROP of a GTT, we need to drop the indexes on the table first, so the indexes on the GTT are checked first.
But the error message needs to be fixed.
Fixed in v32.
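For reference, the DROP TABLE error before and after this fix (both messages as they appear elsewhere in this thread):

-- before the fix:
postgres=# drop table gtt2;
ERROR:  can not drop index gtt2 when other backend attached this global temp table.

-- after the fix (v32):
postgres=# drop table gtt2;
ERROR:  cannot drop index idx2 on global temp table gtt2 when other backend attached it.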


wenjing





--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:
Thanks Wenjing, for the fix patch for the previous issues.
I have verified the issues, and the fixes now look good to me.
But the below error message is confusing (for gtt2).

postgres=# drop table gtt1;
ERROR:  cannot drop global temp table gtt1 when other backend attached it.

postgres=# drop table gtt2;
ERROR:  cannot drop index idx2 on global temp table gtt2 when other backend attached it.


I feel the above error message shown for "DROP TABLE gtt2;" is a bit confusing (it looks like the result of a DROP INDEX).
If possible, can we keep the error message as simple as "ERROR:  cannot drop global temp table gtt2 when other backend attached it."?
I mean, without the extra information about the index attached to that GTT.

On Mon, Apr 27, 2020 at 5:34 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


[...]




--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Apr 27, 2020, at 9:48 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

[...]
Fixed the error message to make the wording more accurate, in v33.


Wenjing






Attachment

Re: [Proposal] Global temporary tables

From
tushar
Date:
On 4/29/20 8:52 AM, 曾文旌 wrote:
> Fixed the error message to make the expression more accurate. In v33.

Thanks Wenjing.

Please refer to this scenario, where we get an error while performing CLUSTER:

1)

X terminal -

postgres=# create global temp table f(n int);
CREATE TABLE

Y Terminal -

postgres=# create index index12 on f(n);
CREATE INDEX
postgres=# \q

X terminal -

postgres=# reindex index  index12;
REINDEX
postgres=#  cluster f using index12;
ERROR:  cannot cluster on invalid index "index12"
postgres=# drop index index12;
DROP INDEX

If this is expected, could we try to make the error message simpler, if possible?

Another issue  -

X terminal -

postgres=# create global temp table f11(n int);
CREATE TABLE
postgres=# create index ind1 on f11(n);
CREATE INDEX
postgres=# create index ind2 on f11(n);
CREATE INDEX
postgres=#

Y terminal -

postgres=# drop table f11;
ERROR:  cannot drop index ind2 or global temporary table f11
HINT:  Because the index is created on the global temporary table and 
other backend attached it.
postgres=#

It only mentions the ind2 index, but what about ind1, and what if there are lots of indexes?
I think we should not mix index information into the table-drop message, and vice versa.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Apr 29, 2020, at 7:46 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:

[...]

Another issue  -

X terminal -

postgres=# create global temp table f11(n int);
CREATE TABLE
postgres=# create index ind1 on f11(n);
CREATE INDEX
postgres=# create index ind2 on f11(n);
CREATE INDEX
postgres=#

Y terminal -

postgres=# drop table f11;
ERROR:  cannot drop index ind2 or global temporary table f11
HINT:  Because the index is created on the global temporary table and other backend attached it.
postgres=#

It only mentions the ind2 index, but what about ind1, and what if there are lots of indexes?
I think we should not mix index information into the table-drop message, and vice versa.
postgres=# drop index index12;
ERROR:  cannot drop index index12 or global temporary table f
HINT:  Because the index is created on the global temporary table and other backend attached it.

postgres=# drop table f;
ERROR:  cannot drop index index12 or global temporary table f
HINT:  Because the index is created on the global temporary table and other backend attached it.
postgres=#

Dropping an index on a GTT and dropping a GTT that has an index can both trigger this message, so the message looks like this; it feels like there is no better way to do it.



Wenjing





Attachment

Re: [Proposal] Global temporary tables

From
Prabhat Sahu
Date:


On Wed, Apr 29, 2020 at 8:52 AM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
[...]
Thanks Wenjing. We verified your latest patch (gtt_v33), focusing on all the reported issues, and they work fine.
Thanks.
--

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

On Jun 9, 2020, at 8:15 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:



[...]
Thanks Wenjing. We verified your latest patch (gtt_v33), focusing on all the reported issues, and they work fine.
Thanks.

I'm very glad to hear such good news.
I am especially grateful for your professional work on GTT.
Please feel free to let me know if there is anything you think could be improved.


Thanks.


Wenjing

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

On Thu, Jun 11, 2020 at 4:13 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

On Jun 9, 2020, at 8:15 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:



On Wed, Apr 29, 2020 at 8:52 AM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
On Apr 27, 2020, at 9:48 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Thanks Wenjing, for the fix patch for the previous issues.
I have verified the issues, and the fixes now look good to me.
But the error message below is confusing (for gtt2).

postgres=# drop table gtt1;
ERROR:  cannot drop global temp table gtt1 when other backend attached it.

postgres=# drop table gtt2;
ERROR:  cannot drop index idx2 on global temp table gtt2 when other backend attached it.


I feel the above error message shown for "DROP TABLE gtt2;" is a bit confusing (it looks similar to DROP INDEX gtt2;).
If possible, can we keep the error message as simple as "ERROR:  cannot drop global temp table gtt2 when other backend attached it."?
I mean, without giving extra information about the index attached to that GTT.
I fixed the error message to make the wording more accurate, in v33.

Thanks Wenjing. We verified your latest patch (gtt_v33), focusing on all reported issues, and they work fine.
Thanks.
--

I'm very glad to hear such good news.
I am especially grateful for your professional work on GTT.
Please feel free to let me know if there is anything you think could be improved.


Thanks.


Wenjing

This patch needs a rebase.

Regards

Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Jul 6, 2020, at 11:31 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Thu, Jun 11, 2020 at 4:13 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:

On Jun 9, 2020, at 8:15 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:



On Wed, Apr 29, 2020 at 8:52 AM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
On Apr 27, 2020, at 9:48 PM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:

Thanks Wenjing, for the fix patch for the previous issues.
I have verified the issues, and the fixes now look good to me.
But the error message below is confusing (for gtt2).

postgres=# drop table gtt1;
ERROR:  cannot drop global temp table gtt1 when other backend attached it.

postgres=# drop table gtt2;
ERROR:  cannot drop index idx2 on global temp table gtt2 when other backend attached it.


I feel the above error message shown for "DROP TABLE gtt2;" is a bit confusing (it looks similar to DROP INDEX gtt2;).
If possible, can we keep the error message as simple as "ERROR:  cannot drop global temp table gtt2 when other backend attached it."?
I mean, without giving extra information about the index attached to that GTT.
I fixed the error message to make the wording more accurate, in v33.

Thanks Wenjing. We verified your latest patch (gtt_v33), focusing on all reported issues, and they work fine.
Thanks.
--

I'm very glad to hear such good news.
I am especially grateful for your professional work on GTT.
Please feel free to let me know if there is anything you think could be improved.


Thanks.


Wenjing

This patch needs a rebase.

I merged the latest PG master into the GTT patch and resolved the conflicts.


Wenjing











Regards

Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi
 
I merged the latest PG master into the GTT patch and resolved the conflicts.



I tested it and it looks fine. I think it is very usable in its current form, but there are still some issues:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# insert into foo values(10);
INSERT 0 1
postgres=# alter table foo add column x int;
ALTER TABLE
postgres=# analyze foo;
WARNING:  reloid 16400 not support update attstat after add colunm
WARNING:  reloid 16400 not support update attstat after add colunm
ANALYZE

Please, can you summarize what is done, what the limits are, what would be hard to implement, and what could be implemented easily?



I found one open question: how should table locks be implemented? Because the data is physically separated, we don't need table locks as protection against race conditions.

Now, table locks are implemented at the global level, so an exclusive lock on a GTT in one session blocks insertion in a second session. Is that the expected behaviour? It is safe, but maybe it is too strict.

We should define what a table lock means on a GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com



Re: [Proposal] Global temporary tables

From
wenjing zeng
Date:
Hi all,

I have started using my personal email to respond to community issues.



On Jul 7, 2020, at 6:05 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi
 
I merged the latest PG master into the GTT patch and resolved the conflicts.



I tested it and it looks fine. I think it is very usable in its current form, but there are still some issues:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# insert into foo values(10);
INSERT 0 1
postgres=# alter table foo add column x int;
ALTER TABLE
postgres=# analyze foo;
WARNING:  reloid 16400 not support update attstat after add colunm
WARNING:  reloid 16400 not support update attstat after add colunm
ANALYZE
This is a limitation that we can completely eliminate.


Please, can you summarize what is done, what the limits are, what would be hard to implement, and what could be implemented easily?
Sure.

The current version of the GTT implementation supports all regular table operations.
1 What is done
1.1 INSERT/UPDATE/DELETE on GTT.
1.2 GTT supports all types of indexes, and queries can use GTT indexes to speed up reads of GTT data.
1.3 GTT statistics are kept as a session-local copy, which is provided to the optimizer to choose the best query plan.
1.4 ANALYZE and VACUUM on GTT.
1.5 TRUNCATE and CLUSTER on GTT.
1.6 All DDL on GTT.
1.7 A GTT can use a GTT sequence or a regular sequence.
1.8 Support for creating views on GTT.
1.9 Support for creating foreign keys on GTT.
1.10 Support for global temp partitioned tables.

I feel this covers all the necessary GTT requirements; a brief psql sketch of these points follows below.
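To make the list concrete, here is a minimal psql sketch exercising several of the points above (gtt_demo, seq1, and gtt_demo_v are made-up names for illustration, and the command-tag output assumes the patch behaves like regular tables for these commands):

postgres=# create sequence seq1;                -- 1.7: a regular sequence used by a GTT
CREATE SEQUENCE
postgres=# create global temp table gtt_demo(id int default nextval('seq1'), note text) on commit preserve rows;
CREATE TABLE
postgres=# create index on gtt_demo(id);        -- 1.2: indexes on GTT
CREATE INDEX
postgres=# insert into gtt_demo(note) values ('visible only in this session');  -- 1.1: DML
INSERT 0 1
postgres=# analyze gtt_demo;                    -- 1.3/1.4: session-local statistics
ANALYZE
postgres=# create view gtt_demo_v as select id, note from gtt_demo;  -- 1.8: views on GTT
CREATE VIEW

Each session that touches gtt_demo gets its own private data and its own statistics; only the schema definition is shared.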

For CLUSTER on a GTT, I think it's complicated.
I'm not sure the current implementation is entirely reasonable. Maybe you can help review it.





I found one open question: how should table locks be implemented? Because the data is physically separated, we don't need table locks as protection against race conditions.
Yes, but GTT DML and DDL still require table locking.
1 DML takes table locks (RowExclusiveLock) to ensure that
the definition does not change at run time (DDL may modify or delete it).
This part of the implementation does not actually change the code,
because DML on a GTT does not block DML in other sessions.

2 TRUNCATE/ANALYZE/VACUUM/REINDEX/CLUSTER on a GTT are now treated like DML:
they only modify local data and do not modify the GTT definition.
So I lowered the table lock level taken on the GTT; only RowExclusiveLock is needed.

3 DDL that modifies the GTT definition (DROP GTT, ALTER GTT) requires
an exclusive table lock (AccessExclusiveLock),
as is the case for a regular table.
This part of the implementation also does not actually change the code.

Summary: what I have done is adjust the GTT lock levels for the different types of statements based on the reasoning above.
For example, for TRUNCATE on a GTT I reduced the table lock level to RowExclusiveLock,
so different sessions can truncate data in the same GTT at the same time, as sketched below.
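A two-session sketch of the intended lock levels, as I read the description above (gtt_demo is a made-up table name; the comments describe the expected behaviour, not something verified against the patch):

-- session 1:
postgres=# truncate gtt_demo;    -- takes only RowExclusiveLock on the shared definition
TRUNCATE TABLE
-- session 2, at the same time:
postgres=# truncate gtt_demo;    -- not blocked: each session truncates only its own data
TRUNCATE TABLE
-- session 1:
postgres=# drop table gtt_demo;  -- needs AccessExclusiveLock; rejected while session 2 is attached
ERROR:  cannot drop global temp table gtt_demo when other backend attached it.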

What do you think about table locks on GTT?


Wenjing



Now, table locks are implemented at the global level, so an exclusive lock on a GTT in one session blocks insertion in a second session. Is that the expected behaviour? It is safe, but maybe it is too strict.

We should define what a table lock means on a GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com




Re: [Proposal] Global temporary tables

From
wenjing zeng
Date:


On Jul 10, 2020, at 5:03 PM, wenjing zeng <wjzeng2012@gmail.com> wrote:

Hi all,

I have started using my personal email to respond to community issues.



On Jul 7, 2020, at 6:05 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I merged the latest PG master into the GTT patch and resolved the conflicts.



I tested it and it looks fine. I think it is very usable in its current form, but there are still some issues:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# insert into foo values(10);
INSERT 0 1
postgres=# alter table foo add column x int;
ALTER TABLE
postgres=# analyze foo;
WARNING:  reloid 16400 not support update attstat after add colunm
WARNING:  reloid 16400 not support update attstat after add colunm
ANALYZE
This is a limitation that we can completely eliminate.


Please, can you summarize what is done, what the limits are, what would be hard to implement, and what could be implemented easily?
Sure.

The current version of the GTT implementation supports all regular table operations.
1 What is done
1.1 INSERT/UPDATE/DELETE on GTT.
1.2 GTT supports all types of indexes, and queries can use GTT indexes to speed up reads of GTT data.
1.3 GTT statistics are kept as a session-local copy, which is provided to the optimizer to choose the best query plan.
1.4 ANALYZE and VACUUM on GTT.
1.5 TRUNCATE and CLUSTER on GTT.
1.6 All DDL on GTT.
1.7 A GTT can use a GTT sequence or a regular sequence.
1.8 Support for creating views on GTT.
1.9 Support for creating foreign keys on GTT.
1.10 Support for global temp partitioned tables.

I feel this covers all the necessary GTT requirements.

For CLUSTER on a GTT, I think it's complicated.
I'm not sure the current implementation is entirely reasonable. Maybe you can help review it.





I found one open question: how should table locks be implemented? Because the data is physically separated, we don't need table locks as protection against race conditions.
Yes, but GTT DML and DDL still require table locking.
1 DML takes table locks (RowExclusiveLock) to ensure that
the definition does not change at run time (DDL may modify or delete it).
This part of the implementation does not actually change the code,
because DML on a GTT does not block DML in other sessions.
As a side note, since the same row of GTT data cannot be modified by different sessions,
I don't see the need to maintain the GTT's pg_class.relminmxid.
What do you think?


Wenjing



2 TRUNCATE/ANALYZE/VACUUM/REINDEX/CLUSTER on a GTT are now treated like DML:
they only modify local data and do not modify the GTT definition.
So I lowered the table lock level taken on the GTT; only RowExclusiveLock is needed.

3 DDL that modifies the GTT definition (DROP GTT, ALTER GTT) requires
an exclusive table lock (AccessExclusiveLock),
as is the case for a regular table.
This part of the implementation also does not actually change the code.

Summary: what I have done is adjust the GTT lock levels for the different types of statements based on the reasoning above.
For example, for TRUNCATE on a GTT I reduced the table lock level to RowExclusiveLock,
so different sessions can truncate data in the same GTT at the same time.

What do you think about table locks on GTT?


Wenjing



Now, table locks are implemented at the global level, so an exclusive lock on a GTT in one session blocks insertion in a second session. Is that the expected behaviour? It is safe, but maybe it is too strict.

We should define what a table lock means on a GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com





Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Mon, Jul 13, 2020 at 1:59 PM, wenjing zeng <wjzeng2012@gmail.com> wrote:


On Jul 10, 2020, at 5:03 PM, wenjing zeng <wjzeng2012@gmail.com> wrote:

Hi all,

I have started using my personal email to respond to community issues.



On Jul 7, 2020, at 6:05 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I merged the latest PG master into the GTT patch and resolved the conflicts.



I tested it and it looks fine. I think it is very usable in its current form, but there are still some issues:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# insert into foo values(10);
INSERT 0 1
postgres=# alter table foo add column x int;
ALTER TABLE
postgres=# analyze foo;
WARNING:  reloid 16400 not support update attstat after add colunm
WARNING:  reloid 16400 not support update attstat after add colunm
ANALYZE
This is a limitation that we can completely eliminate.


Please, can you summarize what is done, what the limits are, what would be hard to implement, and what could be implemented easily?
Sure.

The current version of the GTT implementation supports all regular table operations.
1 What is done
1.1 INSERT/UPDATE/DELETE on GTT.
1.2 GTT supports all types of indexes, and queries can use GTT indexes to speed up reads of GTT data.
1.3 GTT statistics are kept as a session-local copy, which is provided to the optimizer to choose the best query plan.
1.4 ANALYZE and VACUUM on GTT.
1.5 TRUNCATE and CLUSTER on GTT.
1.6 All DDL on GTT.
1.7 A GTT can use a GTT sequence or a regular sequence.
1.8 Support for creating views on GTT.
1.9 Support for creating foreign keys on GTT.
1.10 Support for global temp partitioned tables.

I feel this covers all the necessary GTT requirements.

For CLUSTER on a GTT, I think it's complicated.
I'm not sure the current implementation is entirely reasonable. Maybe you can help review it.





I found one open question: how should table locks be implemented? Because the data is physically separated, we don't need table locks as protection against race conditions.
Yes, but GTT DML and DDL still require table locking.
1 DML takes table locks (RowExclusiveLock) to ensure that
the definition does not change at run time (DDL may modify or delete it).
This part of the implementation does not actually change the code,
because DML on a GTT does not block DML in other sessions.
As a side note, since the same row of GTT data cannot be modified by different sessions,
I don't see the need to maintain the GTT's pg_class.relminmxid.
What do you think?

Yes, it is probably not necessary.

Regards

Pavel


Wenjing



2 TRUNCATE/ANALYZE/VACUUM/REINDEX/CLUSTER on a GTT are now treated like DML:
they only modify local data and do not modify the GTT definition.
So I lowered the table lock level taken on the GTT; only RowExclusiveLock is needed.

3 DDL that modifies the GTT definition (DROP GTT, ALTER GTT) requires
an exclusive table lock (AccessExclusiveLock),
as is the case for a regular table.
This part of the implementation also does not actually change the code.

Summary: what I have done is adjust the GTT lock levels for the different types of statements based on the reasoning above.
For example, for TRUNCATE on a GTT I reduced the table lock level to RowExclusiveLock,
so different sessions can truncate data in the same GTT at the same time.

What do you think about table locks on GTT?


Wenjing



Now, table locks are implemented at the global level, so an exclusive lock on a GTT in one session blocks insertion in a second session. Is that the expected behaviour? It is safe, but maybe it is too strict.

We should define what a table lock means on a GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com





Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Fri, Jul 10, 2020 at 11:04 AM, wenjing zeng <wjzeng2012@gmail.com> wrote:
Hi all,

I have started using my personal email to respond to community issues.



On Jul 7, 2020, at 6:05 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I merged the latest PG master into the GTT patch and resolved the conflicts.



I tested it and it looks fine. I think it is very usable in its current form, but there are still some issues:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# insert into foo values(10);
INSERT 0 1
postgres=# alter table foo add column x int;
ALTER TABLE
postgres=# analyze foo;
WARNING:  reloid 16400 not support update attstat after add colunm
WARNING:  reloid 16400 not support update attstat after add colunm
ANALYZE
This is a limitation that we can completely eliminate.


Please, can you summarize what is done, what the limits are, what would be hard to implement, and what could be implemented easily?
Sure.

The current version of the GTT implementation supports all regular table operations.
1 What is done
1.1 INSERT/UPDATE/DELETE on GTT.
1.2 GTT supports all types of indexes, and queries can use GTT indexes to speed up reads of GTT data.
1.3 GTT statistics are kept as a session-local copy, which is provided to the optimizer to choose the best query plan.
1.4 ANALYZE and VACUUM on GTT.
1.5 TRUNCATE and CLUSTER on GTT.
1.6 All DDL on GTT.
1.7 A GTT can use a GTT sequence or a regular sequence.
1.8 Support for creating views on GTT.
1.9 Support for creating foreign keys on GTT.
1.10 Support for global temp partitioned tables.

I feel this covers all the necessary GTT requirements.

For CLUSTER on a GTT, I think it's complicated.
I'm not sure the current implementation is entirely reasonable. Maybe you can help review it.





I found one open question: how should table locks be implemented? Because the data is physically separated, we don't need table locks as protection against race conditions.
Yes, but GTT DML and DDL still require table locking.
1 DML takes table locks (RowExclusiveLock) to ensure that
the definition does not change at run time (DDL may modify or delete it).
This part of the implementation does not actually change the code,
because DML on a GTT does not block DML in other sessions.

2 TRUNCATE/ANALYZE/VACUUM/REINDEX/CLUSTER on a GTT are now treated like DML:
they only modify local data and do not modify the GTT definition.
So I lowered the table lock level taken on the GTT; only RowExclusiveLock is needed.

3 DDL that modifies the GTT definition (DROP GTT, ALTER GTT) requires
an exclusive table lock (AccessExclusiveLock),
as is the case for a regular table.
This part of the implementation also does not actually change the code.

Summary: what I have done is adjust the GTT lock levels for the different types of statements based on the reasoning above.
For example, for TRUNCATE on a GTT I reduced the table lock level to RowExclusiveLock,
so different sessions can truncate data in the same GTT at the same time.

What do you think about table locks on GTT?

I am thinking about explicit LOCK statements. Some applications use explicit locking for various reasons, typically as protection against race conditions.

But on a GTT, race conditions are not possible. So my question is: does an exclusive lock on a GTT prevent other sessions from inserting into their own instances of the same GTT?

At what level are table locks active: the shared part of the GTT, or the session-instance part of the GTT?





Wenjing



Now, table locks are implemented at the global level, so an exclusive lock on a GTT in one session blocks insertion in a second session. Is that the expected behaviour? It is safe, but maybe it is too strict.

We should define what a table lock means on a GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com




Re: [Proposal] Global temporary tables

From
wenjing zeng
Date:


On Jul 14, 2020, at 10:28 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Fri, Jul 10, 2020 at 11:04 AM, wenjing zeng <wjzeng2012@gmail.com> wrote:
Hi all,

I have started using my personal email to respond to community issues.



On Jul 7, 2020, at 6:05 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I merged the latest PG master into the GTT patch and resolved the conflicts.



I tested it and it looks fine. I think it is very usable in its current form, but there are still some issues:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# insert into foo values(10);
INSERT 0 1
postgres=# alter table foo add column x int;
ALTER TABLE
postgres=# analyze foo;
WARNING:  reloid 16400 not support update attstat after add colunm
WARNING:  reloid 16400 not support update attstat after add colunm
ANALYZE
This is a limitation that we can completely eliminate.


Please, can you summarize what is done, what the limits are, what would be hard to implement, and what could be implemented easily?
Sure.

The current version of the GTT implementation supports all regular table operations.
1 What is done
1.1 INSERT/UPDATE/DELETE on GTT.
1.2 GTT supports all types of indexes, and queries can use GTT indexes to speed up reads of GTT data.
1.3 GTT statistics are kept as a session-local copy, which is provided to the optimizer to choose the best query plan.
1.4 ANALYZE and VACUUM on GTT.
1.5 TRUNCATE and CLUSTER on GTT.
1.6 All DDL on GTT.
1.7 A GTT can use a GTT sequence or a regular sequence.
1.8 Support for creating views on GTT.
1.9 Support for creating foreign keys on GTT.
1.10 Support for global temp partitioned tables.

I feel this covers all the necessary GTT requirements.

For CLUSTER on a GTT, I think it's complicated.
I'm not sure the current implementation is entirely reasonable. Maybe you can help review it.





I found one open question: how should table locks be implemented? Because the data is physically separated, we don't need table locks as protection against race conditions.
Yes, but GTT DML and DDL still require table locking.
1 DML takes table locks (RowExclusiveLock) to ensure that
the definition does not change at run time (DDL may modify or delete it).
This part of the implementation does not actually change the code,
because DML on a GTT does not block DML in other sessions.

2 TRUNCATE/ANALYZE/VACUUM/REINDEX/CLUSTER on a GTT are now treated like DML:
they only modify local data and do not modify the GTT definition.
So I lowered the table lock level taken on the GTT; only RowExclusiveLock is needed.

3 DDL that modifies the GTT definition (DROP GTT, ALTER GTT) requires
an exclusive table lock (AccessExclusiveLock),
as is the case for a regular table.
This part of the implementation also does not actually change the code.

Summary: what I have done is adjust the GTT lock levels for the different types of statements based on the reasoning above.
For example, for TRUNCATE on a GTT I reduced the table lock level to RowExclusiveLock,
so different sessions can truncate data in the same GTT at the same time.

What do you think about table locks on GTT?

I am thinking about explicit LOCK statements. Some applications use explicit locking for various reasons, typically as protection against race conditions.

But on a GTT, race conditions are not possible. So my question is: does an exclusive lock on a GTT prevent other sessions from inserting into their own instances of the same GTT?
In my opinion, a GTT should always operate on the session's private data;
there is no need to do anything while holding the lock, so the LOCK statement should do nothing (the same is true for Oracle GTT).

What do you think?


At what level are table locks active: the shared part of the GTT, or the session-instance part of the GTT?
I don't quite understand what you mean; could you explain it a little bit?



Wenjing








Wenjing



Now, table locks are implemented at the global level, so an exclusive lock on a GTT in one session blocks insertion in a second session. Is that the expected behaviour? It is safe, but maybe it is too strict.

We should define what a table lock means on a GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com





Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:

I am thinking about explicit LOCK statements. Some applications use explicit locking for various reasons, typically as protection against race conditions.

But on a GTT, race conditions are not possible. So my question is: does an exclusive lock on a GTT prevent other sessions from inserting into their own instances of the same GTT?
In my opinion, a GTT should always operate on the session's private data;
there is no need to do anything while holding the lock, so the LOCK statement should do nothing (the same is true for Oracle GTT).

What do you think?


At what level are table locks active: the shared part of the GTT, or the session-instance part of the GTT?
I don't quite understand what you mean; could you explain it a little bit?

It is about perspective, how we should see GTT tables. A GTT is a mix of two concepts: session-private (the data) and session-shared (the catalog). Hypothetically, we can place locks on the private part (no effect) or on the shared part (the usual effect we know). Both can make sense, and both have advantages and disadvantages. I am a little afraid of the behaviour of naive ORM systems, but the most important part of a table is its data, so I prefer an empty lock implementation for GTT.
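A sketch of what an empty lock implementation would mean in practice (gtt_demo is a made-up name; the outcome shown is one reading of the proposal, not verified against the patch):

-- session 1:
postgres=# begin;
BEGIN
postgres=# lock table gtt_demo in access exclusive mode;  -- a no-op on the session-private part
LOCK TABLE
-- session 2, at the same time:
postgres=# insert into gtt_demo(note) values ('not blocked');  -- proceeds immediately
INSERT 0 1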

Regards

Pavel





Wenjing








Wenjing



Now, table locks are implemented on a global level. So exclusive lock on GTT in one session block insertion on the second session. Is it expected behaviour? It is safe, but maybe it is too strict.

We should define what table lock is meaning on GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com





Re: [Proposal] Global temporary tables

From
wenjing zeng
Date:


On Jul 23, 2020, at 2:54 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:


I am thinking about explicit LOCK statements. Some applications use explicit locking for various reasons, typically as protection against race conditions.

But on a GTT, race conditions are not possible. So my question is: does an exclusive lock on a GTT prevent other sessions from inserting into their own instances of the same GTT?
In my opinion, a GTT should always operate on the session's private data;
there is no need to do anything while holding the lock, so the LOCK statement should do nothing (the same is true for Oracle GTT).

What do you think?


At what level are table locks active: the shared part of the GTT, or the session-instance part of the GTT?
I don't quite understand what you mean; could you explain it a little bit?

It is about perspective, how we should see GTT tables. A GTT is a mix of two concepts: session-private (the data) and session-shared (the catalog). Hypothetically, we can place locks on the private part (no effect) or on the shared part (the usual effect we know). Both can make sense, and both have advantages and disadvantages. I am a little afraid of the behaviour of naive ORM systems, but the most important part of a table is its data, so I prefer an empty lock implementation for GTT.
This version implements the empty lock for GTT.

Please continue to review the code.

Thanks


Wenjing



Regards

Pavel





Wenjing








Wenjing



Now, table locks are implemented on a global level. So exclusive lock on GTT in one session block insertion on the second session. Is it expected behaviour? It is safe, but maybe it is too strict.

We should define what table lock is meaning on GTT.

Regards

Pavel
 
Pavel


With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com






Attachment

Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
> Please continue to review the code.

This patch is pretty light on comments. Many of the new functions have
no header comments, for example. There are comments here and there in
the body of the new functions that are added, and in places where
existing code is changed there are comments here and there, but
overall it's not a whole lot. There's no documentation and no README,
either. Since this adds a new feature and a bunch of new SQL-callable
functions that interact with that feature, the feature itself should
be documented, along with its limitations and the new SQL-callable
functions that interact with it. I think there should be either a
lengthy comment in some suitable file, or maybe various comments in
various files, or else a README file, that clearly sets out the major
design principles behind the patch, and explaining also what that
means in terms of features and limitations. Without that, it's really
hard for anyone to jump into reviewing this code, and it will be hard
for people who have to maintain it in the future to understand it,
either. Or for users, for that matter.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On Jul 31, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
>> Please continue to review the code.
>
> This patch is pretty light on comments. Many of the new functions have
> no header comments, for example. There are comments here and there in
> the body of the new functions that are added, and in places where
> existing code is changed there are comments here and there, but
> overall it's not a whole lot. There's no documentation and no README,
> either. Since this adds a new feature and a bunch of new SQL-callable
> functions that interact with that feature, the feature itself should
> be documented, along with its limitations and the new SQL-callable
> functions that interact with it. I think there should be either a
> lengthy comment in some suitable file, or maybe various comments in
> various files, or else a README file, that clearly sets out the major
> design principles behind the patch, and explaining also what that
> means in terms of features and limitations. Without that, it's really
> hard for anyone to jump into reviewing this code, and it will be hard
> for people who have to maintain it in the future to understand it,
> either. Or for users, for that matter.
Your suggestion is to the point. The patch does lack a lot of the necessary comments.
I'll work on this.


Wenjing



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:
I have written the README for the GTT, which contains the GTT requirements and design.
I found that, compared to my first email a year ago, many GTT limitations are now gone.
Now I'm adding comments to some of the necessary functions.


Wenjing





> On Jul 31, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
>> Please continue to review the code.
>
> This patch is pretty light on comments. Many of the new functions have
> no header comments, for example. There are comments here and there in
> the body of the new functions that are added, and in places where
> existing code is changed there are comments here and there, but
> overall it's not a whole lot. There's no documentation and no README,
> either. Since this adds a new feature and a bunch of new SQL-callable
> functions that interact with that feature, the feature itself should
> be documented, along with its limitations and the new SQL-callable
> functions that interact with it. I think there should be either a
> lengthy comment in some suitable file, or maybe various comments in
> various files, or else a README file, that clearly sets out the major
> design principles behind the patch, and explaining also what that
> means in terms of features and limitations. Without that, it's really
> hard for anyone to jump into reviewing this code, and it will be hard
> for people who have to maintain it in the future to understand it,
> either. Or for users, for that matter.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

On Fri, Sep 11, 2020 at 5:00 PM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
I have written the README for the GTT, which contains the GTT requirements and design.
I found that, compared to my first email a year ago, many GTT limitations are now gone.
Now I'm adding comments to some of the necessary functions.

There are problems applying the patch. Please, can you rebase it?

Regards

Pavel



Wenjing





> On Jul 31, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
>> Please continue to review the code.
>
> This patch is pretty light on comments. Many of the new functions have
> no header comments, for example. There are comments here and there in
> the body of the new functions that are added, and in places where
> existing code is changed there are comments here and there, but
> overall it's not a whole lot. There's no documentation and no README,
> either. Since this adds a new feature and a bunch of new SQL-callable
> functions that interact with that feature, the feature itself should
> be documented, along with its limitations and the new SQL-callable
> functions that interact with it. I think there should be either a
> lengthy comment in some suitable file, or maybe various comments in
> various files, or else a README file, that clearly sets out the major
> design principles behind the patch, and explaining also what that
> means in terms of features and limitations. Without that, it's really
> hard for anyone to jump into reviewing this code, and it will be hard
> for people who have to maintain it in the future to understand it,
> either. Or for users, for that matter.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Nov 21, 2020, at 02:28, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Fri, Sep 11, 2020 at 5:00 PM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
I have written the README for the GTT, which contains the GTT requirements and design.
I found that, compared to my first email a year ago, many GTT limitations are now gone.
Now I'm adding comments to some of the necessary functions.

There are problems applying the patch. Please, can you rebase it?
Sure.
I'm still working on sorting out the code and comments.
If you have any suggestions, please let me know.


Wenjing



Regards

Pavel



Wenjing





> On Jul 31, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
>> Please continue to review the code.
>
> This patch is pretty light on comments. Many of the new functions have
> no header comments, for example. There are comments here and there in
> the body of the new functions that are added, and in places where
> existing code is changed there are comments here and there, but
> overall it's not a whole lot. There's no documentation and no README,
> either. Since this adds a new feature and a bunch of new SQL-callable
> functions that interact with that feature, the feature itself should
> be documented, along with its limitations and the new SQL-callable
> functions that interact with it. I think there should be either a
> lengthy comment in some suitable file, or maybe various comments in
> various files, or else a README file, that clearly sets out the major
> design principles behind the patch, and explaining also what that
> means in terms of features and limitations. Without that, it's really
> hard for anyone to jump into reviewing this code, and it will be hard
> for people who have to maintain it in the future to understand it,
> either. Or for users, for that matter.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Mon, Nov 23, 2020 at 10:27 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


On Nov 21, 2020, at 02:28, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Fri, Sep 11, 2020 at 5:00 PM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
I have written the README for the GTT, which contains the GTT requirements and design.
I found that, compared to my first email a year ago, many GTT limitations are now gone.
Now I'm adding comments to some of the necessary functions.

There are problems applying the patch. Please, can you rebase it?
Sure.
I'm still working on sorting out the code and comments.
If you have any suggestions, please let me know.

It is broken again.

There is bad whitespace:

+   /*
+    * For global temp table only
+    * use ShareUpdateExclusiveLock for ensure safety
+    */
+   {
+       {
+           "on_commit_delete_rows",
+           "global temp table on commit options",
+           RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+           ShareUpdateExclusiveLock
+       },
+       true
+   },  <=================
    /* list terminator */
    {{NULL}}

+7 OTHERS
+Parallel query
+Planner does not produce parallel query plans for SQL related to GTT. Because <=================
+GTT private data cannot be accessed across processes.
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile


+/*
+ * Update global temp table relstats(relpage/reltuple/relallvisible) <========================
+ * to local hashtable
+ */
+void

+/*
+ * Search global temp table relstats(relpage/reltuple/relallvisible) <==============
+ * from lo

and there are a lot more places ...

I found another issue:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# create index on foo(a);
CREATE INDEX

Close the session, and in a new session:

postgres=# reindex index foo_a_idx ;
WARNING:  relcache reference leak: relation "foo" not closed
REINDEX

Regards

Pavel




Wenjing



Regards

Pavel



Wenjing





> On Jul 31, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
>> Please continue to review the code.
>
> This patch is pretty light on comments. Many of the new functions have
> no header comments, for example. There are comments here and there in
> the body of the new functions that are added, and in places where
> existing code is changed there are comments here and there, but
> overall it's not a whole lot. There's no documentation and no README,
> either. Since this adds a new feature and a bunch of new SQL-callable
> functions that interact with that feature, the feature itself should
> be documented, along with its limitations and the new SQL-callable
> functions that interact with it. I think there should be either a
> lengthy comment in some suitable file, or maybe various comments in
> various files, or else a README file, that clearly sets out the major
> design principles behind the patch, and explaining also what that
> means in terms of features and limitations. Without that, it's really
> hard for anyone to jump into reviewing this code, and it will be hard
> for people who have to maintain it in the future to understand it,
> either. Or for users, for that matter.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Nov 25, 2020, at 14:19, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Mon, Nov 23, 2020 at 10:27 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


On Nov 21, 2020, at 02:28, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Fri, Sep 11, 2020 at 5:00 PM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
I have written the README for the GTT, which contains the GTT requirements and design.
I found that, compared to my first email a year ago, many GTT limitations are now gone.
Now I'm adding comments to some of the necessary functions.

There are problems applying the patch. Please, can you rebase it?
Sure.
I'm still working on sorting out the code and comments.
If you have any suggestions, please let me know.

It is broken again.

There is bad whitespace:

+   /*
+    * For global temp table only
+    * use ShareUpdateExclusiveLock for ensure safety
+    */
+   {
+       {
+           "on_commit_delete_rows",
+           "global temp table on commit options",
+           RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+           ShareUpdateExclusiveLock
+       },
+       true
+   },  <=================
    /* list terminator */
    {{NULL}}

+7 OTHERS
+Parallel query
+Planner does not produce parallel query plans for SQL related to GTT. Because <=================
+GTT private data cannot be accessed across processes.
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile


+/*
+ * Update global temp table relstats(relpage/reltuple/relallvisible) <========================
+ * to local hashtable
+ */
+void

+/*
+ * Search global temp table relstats(relpage/reltuple/relallvisible) <==============
+ * from lo

and there are a lot more places ...

I found another issue:

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# create index on foo(a);
CREATE INDEX

Close the session, and in a new session:

postgres=# reindex index foo_a_idx ;
WARNING:  relcache reference leak: relation "foo" not closed
REINDEX

I fixed all the above issues and rebased the code.
Please review the new version again.


Wenjing




Regards

Pavel




Wenjing



Regards

Pavel



Wenjing





> On Jul 31, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
>> Please continue to review the code.
>
> This patch is pretty light on comments. Many of the new functions have
> no header comments, for example. There are comments here and there in
> the body of the new functions that are added, and in places where
> existing code is changed there are comments here and there, but
> overall it's not a whole lot. There's no documentation and no README,
> either. Since this adds a new feature and a bunch of new SQL-callable
> functions that interact with that feature, the feature itself should
> be documented, along with its limitations and the new SQL-callable
> functions that interact with it. I think there should be either a
> lengthy comment in some suitable file, or maybe various comments in
> various files, or else a README file, that clearly sets out the major
> design principles behind the patch, and explaining also what that
> means in terms of features and limitations. Without that, it's really
> hard for anyone to jump into reviewing this code, and it will be hard
> for people who have to maintain it in the future to understand it,
> either. Or for users, for that matter.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

I found that the new patch mail failed to register in the Commitfest.
I don't know what's wrong or how to check it.
Could you help me figure it out?



On Nov 25, 2020, at 14:19, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Mon, Nov 23, 2020 at 10:27 AM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:


On Nov 21, 2020, at 02:28, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Fri, Sep 11, 2020 at 5:00 PM, 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
I have written the README for the GTT, which contains the GTT requirements and design.
I found that, compared to my first email a year ago, many GTT limitations are now gone.
Now I'm adding comments to some of the necessary functions.

There are problems applying the patch. Please, can you rebase it?
Sure.
I'm still working on sorting out the code and comments.
If you have any suggestions, please let me know.

It is broken again.

There is bad whitespace:

+   /*
+    * For global temp table only
+    * use ShareUpdateExclusiveLock for ensure safety
+    */
+   {
+       {
+           "on_commit_delete_rows",
+           "global temp table on commit options",
+           RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+           ShareUpdateExclusiveLock
+       },
+       true
+   },  <=================
    /* list terminator */
    {{NULL}}

+7 OTHERS
+Parallel query
+Planner does not produce parallel query plans for SQL related to GTT. Because <=================
+GTT private data cannot be accessed across processes.
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile


+/*
+ * Update global temp table relstats(relpage/reltuple/relallvisible) <========================
+ * to local hashtable
+ */
+void

+/*
+ * Search global temp table relstats(relpage/reltuple/relallvisible) <==============
+ * from lo

and there are lot of more places ...

I found other issue

postgres=# create global temp table foo(a int);
CREATE TABLE
postgres=# create index on foo(a);
CREATE INDEX

close session and in new session

postgres=# reindex index foo_a_idx ;
WARNING:  relcache reference leak: relation "foo" not closed
REINDEX

Regards

Pavel




Wenjing



Regards

Pavel



Wenjing





> On Jul 31, 2020, at 4:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> 
> On Thu, Jul 30, 2020 at 8:09 AM wenjing zeng <wjzeng2012@gmail.com> wrote:
>> Please continue to review the code.
> 
> This patch is pretty light on comments. Many of the new functions have
> no header comments, for example. There are comments here and there in
> the body of the new functions that are added, and in places where
> existing code is changed there are comments here and there, but
> overall it's not a whole lot. There's no documentation and no README,
> either. Since this adds a new feature and a bunch of new SQL-callable
> functions that interact with that feature, the feature itself should
> be documented, along with its limitations and the new SQL-callable
> functions that interact with it. I think there should be either a
> lengthy comment in some suitable file, or maybe various comments in
> various files, or else a README file, that clearly sets out the major
> design principles behind the patch, and explaining also what that
> means in terms of features and limitations. Without that, it's really
> hard for anyone to jump into reviewing this code, and it will be hard
> for people who have to maintain it in the future to understand it,
> either. Or for users, for that matter.
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Attachment

Re: [Proposal] Global temporary tables

From
Julien Rouhaud
Date:
On Thu, Nov 26, 2020 at 4:05 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
>
> I found that the new patch mail failed to register in the Commitfest
> https://commitfest.postgresql.org/28/2349/#
> I don't know what's wrong or how to check it.
> Could you help me figure it out?

Apparently the attachment in
https://www.postgresql.org/message-id/A3F1EBD9-E694-4384-8049-37B09308491B@alibaba-inc.com
wasn't detected.  I have no idea why, maybe Magnus will know.
Otherwise you could try to ask on -www.



Re: [Proposal] Global temporary tables

From
Magnus Hagander
Date:
On Thu, Nov 26, 2020 at 11:16 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Nov 26, 2020 at 4:05 PM 曾文旌 <wenjing.zwj@alibaba-inc.com> wrote:
> >
> > I found that the new patch mail failed to register in the Commitfest
> > https://commitfest.postgresql.org/28/2349/#
> > I don't know what's wrong or how to check it.
> > Could you help me figure it out?
>
> Apparently the attachment in
> https://www.postgresql.org/message-id/A3F1EBD9-E694-4384-8049-37B09308491B@alibaba-inc.com
> wasn't detected.  I have no idea why, maybe Magnus will know.
> Otherwise you could try to ask on -www.

Not offhand. The email appears to have a fairly complex nested MIME
structure, so something in the Python library that parses the MIME
decides that it's not there. For some reason the email is 7 parts: one
is the signature, and the rest are complexly nested. The attachment
seems to be squeezed in between two different HTML parts.

Basically, at the top it's multipart/alternative, which says there are
two choices. One is text/plain, which is what the archives use. The
other is a combination of text/html followed by
application/octet-stream (the patch) followed by another text/html.

The archives pick the first alternative, which is text/plain and
does not contain the attachment. The attachment only exists in the
HTML view.

I think the easiest solution is to re-send it as a plain text email with
the attachment, which would then put the attachment on the email
itself instead of embedding it in the HTML, I would guess.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

On Wed, Mar 17, 2021 at 12:59 PM, wenjing <wjzeng2012@gmail.com> wrote:
ok

The cause of the problem is that the name of the dependent function (readNextTransactionID) has changed. I fixed it.

This patch (V43) is based on 9fd2952cf4920d563e9cea51634c5b364d57f71a

Wenjing

I tested this patch, and make check-world fails:

make[2]: Entering directory '/home/pavel/src/postgresql.master/src/test/recovery'
rm -rf '/home/pavel/src/postgresql.master/src/test/recovery'/tmp_check
/usr/bin/mkdir -p '/home/pavel/src/postgresql.master/src/test/recovery'/tmp_check
cd . && TESTDIR='/home/pavel/src/postgresql.master/src/test/recovery' PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/bin:$PATH" LD_LIBRARY_PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/lib"  PGPORT='65432' PG_REGRESS='/home/pavel/src/postgresql.master/src/test/recovery/../../../src/test/regress/pg_regress' REGRESS_SHLIB='/home/pavel/src/postgresql.master/src/test/regress/regress.so' /usr/bin/prove -I ../../../src/test/perl/ -I .  t/*.pl
t/001_stream_rep.pl .................. ok    
t/002_archiving.pl ................... ok  
t/003_recovery_targets.pl ............ ok  
t/004_timeline_switch.pl ............. ok  
t/005_replay_delay.pl ................ ok  
t/006_logical_decoding.pl ............ ok    
t/007_sync_rep.pl .................... ok    
t/008_fsm_truncation.pl .............. ok  
t/009_twophase.pl .................... ok    
t/010_logical_decoding_timelines.pl .. ok    
t/011_crash_recovery.pl .............. ok  
t/012_subtransactions.pl ............. ok    
t/013_crash_restart.pl ............... ok    
t/014_unlogged_reinit.pl ............. ok    
t/015_promotion_pages.pl ............. ok  
t/016_min_consistency.pl ............. ok  
t/017_shm.pl ......................... skipped: SysV shared memory not supported by this platform
t/018_wal_optimize.pl ................ ok    
t/019_replslot_limit.pl .............. ok    
t/020_archive_status.pl .............. ok    
t/021_row_visibility.pl .............. ok    
t/022_crash_temp_files.pl ............ 1/9
#   Failed test 'one temporary file'
#   at t/022_crash_temp_files.pl line 231.
#          got: '0'
#     expected: '1'
t/022_crash_temp_files.pl ............ 9/9 # Looks like you failed 1 test of 9.
t/022_crash_temp_files.pl ............ Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/9 subtests
t/023_pitr_prepared_xact.pl .......... ok  

Test Summary Report
-------------------
t/022_crash_temp_files.pl          (Wstat: 256 Tests: 9 Failed: 1)
  Failed test:  8
  Non-zero exit status: 1
Files=23, Tests=259, 115 wallclock secs ( 0.21 usr  0.06 sys + 28.57 cusr 18.01 csys = 46.85 CPU)
Result: FAIL
make[2]: *** [Makefile:19: check] Error 1
make[2]: Leaving directory '/home/pavel/src/postgresql.master/src/test/recovery'
make[1]: *** [Makefile:49: check-recovery-recurse] Error 2
make[1]: Leaving directory '/home/pavel/src/postgresql.master/src/test'
make: *** [GNUmakefile:71: check-world-src/test-recurse] Error 2

Regards

Pavel

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

I wrote simple benchmarks to check the possible slowdown of connections to Postgres when GTT is used.

/usr/local/pgsql/master/bin/pgbench -c 10 -C -f script4.sql -t 1000

Each script has one line, just an INSERT or a SELECT with LIMIT 1; a plausible reconstruction follows below.
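The scripts themselves were not posted; based on the description, they were presumably something like the following (gtt_demo is an assumed table name):

-- script4.sql, INSERT variant:
insert into gtt_demo(note) values ('x');

-- SELECT variant:
select * from gtt_demo limit 1;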

PATCH
insert into global temp table (with connect) -- 349 tps (10 clients: 443 tps)
select from GTT (with connect) -- 370 tps (10 clients: 446 tps)
insert into normal table (with connect) -- 115 tps (10 clients: 417 tps)
select from normal table (with connect) -- 358 tps (10 clients: 445 tps)

MASTER
insert into temp table (with connect) -- 58 tps (10 clients: 352 tps) -- after the test, pg_attribute bloated to 11 MB
insert into normal table (with connect) -- 118 tps (10 clients: 385 tps)
select from normal table (with connect) -- 346 tps (10 clients: 449 tps)

The measurement doesn't show anything interesting: it is not possible to see any impact of GTT usage on connect time.

It is interesting to see the overhead of local temp tables compared to global temp tables: the performance is about 6x worse, and there is significant bloat of the pg_attribute table, even though the tested table had only one column. So the concept of global temp tables is very good, and the implementation looks good (from a performance perspective).
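For reference, the catalog bloat can be checked with a standard size query, e.g.:

postgres=# select pg_size_pretty(pg_relation_size('pg_attribute'));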

I haven't checked the code yet; I just tested the behaviour, and I think it is very satisfactory for a first stage and first release. The patch is long now, and for the first step it is good to stop at the implemented features.

The next step should be supporting DDL on actively used GTTs. This topic is pretty complex, and there are several possible scenarios. I think GTT behaviour should be the same as normal table behaviour (by default), but I see advantages in other possibilities, so I don't want to open that discussion now. The current implementation should not block any possible implementation in the future.

Regards

Pavel


Re: [Proposal] Global temporary tables

From
Andrew Dunstan
Date:
On 3/17/21 7:59 AM, wenjing wrote:
> ok
>
> The cause of the problem is that the name of the dependent function
> (readNextTransactionID) has changed. I fixed it.
>
> This patch (V43) is based on 9fd2952cf4920d563e9cea51634c5b364d57f71a
>
> Wenjing
>
>

I have fixed this patch so that

a) it applies cleanly

b) it uses project best practice for catalog Oid assignment.

However, as noted elsewhere it fails the recovery TAP test.

I also note this:


diff --git a/src/test/regress/parallel_schedule
b/src/test/regress/parallel_schedule
index 312c11a4bd..d44fa62f4e 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -129,3 +129,10 @@ test: fast_default
 
 # run stats by itself because its delay may be insufficient under heavy
load
 test: stats
+
+# global temp table test
+test: gtt_stats
+test: gtt_function
+test: gtt_prepare
+test: gtt_parallel_1 gtt_parallel_2
+test: gtt_clean


Tests that need to run in parallel should use either the isolation
tester framework (which is explicitly for testing things concurrently)
or the TAP test framework.

Adding six test files to the regression test suite for this one feature
is not a good idea. You should have one regression test script ideally,
and it should be added as appropriate to both the parallel and serial
schedules (and not at the end). Any further tests should be added using
the other frameworks mentioned.


cheers


andrew


-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Sun, Mar 28, 2021 at 3:07 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 3/17/21 7:59 AM, wenjing wrote:
> ok
>
> The cause of the problem is that the name of the dependent function
> (readNextTransactionID) has changed. I fixed it.
>
> This patch (V43) is based on 9fd2952cf4920d563e9cea51634c5b364d57f71a
>
> Wenjing
>
>

I have fixed this patch so that

a) it applies cleanly

b) it uses project best practice for catalog Oid assignment.

However, as noted elsewhere it fails the recovery TAP test.

I also note this:


diff --git a/src/test/regress/parallel_schedule
b/src/test/regress/parallel_schedule
index 312c11a4bd..d44fa62f4e 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -129,3 +129,10 @@ test: fast_default
 
 # run stats by itself because its delay may be insufficient under heavy
load
 test: stats
+
+# global temp table test
+test: gtt_stats
+test: gtt_function
+test: gtt_prepare
+test: gtt_parallel_1 gtt_parallel_2
+test: gtt_clean


Tests that need to run in parallel should use either the isolation
tester framework (which is explicitly for testing things concurrently)
or the TAP test framework.

Adding six test files to the regression test suite for this one feature
is not a good idea. You should have one regression test script ideally,
and it should be added as appropriate to both the parallel and serial
schedules (and not at the end). Any further tests should be added using
the other frameworks mentioned.


* bad name of GTT-README - the convention is README.gtt

* Typo - "ofa"

2) Use beforeshmemexit to ensure that all files ofa session GTT are deleted when
the session exits.

* Typo "nd"

3) GTT storage file cleanup during abnormal situations
When a backend exits abnormally (such as oom kill), the startup process starts
recovery before accepting client connection. The same startup process checks
nd removes all GTT files before redo WAL.

* This comment is wrong

  /*
+ * Global temporary table is allowed to be dropped only when the
+ * current session is using it.
+ */
+ if (RELATION_IS_GLOBAL_TEMP(rel))
+ {
+ if (is_other_backend_use_gtt(RelationGetRelid(rel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST),
+ errmsg("cannot drop global temporary table %s when other backend attached it.",
+ RelationGetRelationName(rel))));
+ }

* same wrong comment

  /*
+ * Global temporary table is allowed to be dropped only when the
+ * current session is using it.
+ */
+ if (RELATION_IS_GLOBAL_TEMP(rel))
+ {
+ if (is_other_backend_use_gtt(RelationGetRelid(rel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST),
+ errmsg("cannot drop global temporary table %s when other backend attached it.",
+ RelationGetRelationName(rel))));
+ }

* typo "backand"

+/*
+ * Check if there are other backends using this GTT besides the current backand.
+ */

There is no user documentation

Regards

Pavel

 

cheers


andrew


--

Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Mar 28, 2021, at 15:27, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

On Wed, Mar 17, 2021 at 12:59 PM wenjing <wjzeng2012@gmail.com> wrote:
ok

The cause of the problem is that the name of the dependent function (readNextTransactionID) has changed. I fixed it.

This patch(V43) is base on 9fd2952cf4920d563e9cea51634c5b364d57f71a

Wenjing

I tested this patch, and make check-world fails:

make[2]: Entering directory '/home/pavel/src/postgresql.master/src/test/recovery'
rm -rf '/home/pavel/src/postgresql.master/src/test/recovery'/tmp_check
/usr/bin/mkdir -p '/home/pavel/src/postgresql.master/src/test/recovery'/tmp_check
cd . && TESTDIR='/home/pavel/src/postgresql.master/src/test/recovery' PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/bin:$PATH" LD_LIBRARY_PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/lib"  PGPORT='65432' PG_REGRESS='/home/pavel/src/postgresql.master/src/test/recovery/../../../src/test/regress/pg_regress' REGRESS_SHLIB='/home/pavel/src/postgresql.master/src/test/regress/regress.so' /usr/bin/prove -I ../../../src/test/perl/ -I .  t/*.pl
t/001_stream_rep.pl .................. ok    
t/002_archiving.pl ................... ok  
t/003_recovery_targets.pl ............ ok  
t/004_timeline_switch.pl ............. ok  
t/005_replay_delay.pl ................ ok  
t/006_logical_decoding.pl ............ ok    
t/007_sync_rep.pl .................... ok    
t/008_fsm_truncation.pl .............. ok  
t/009_twophase.pl .................... ok    
t/010_logical_decoding_timelines.pl .. ok    
t/011_crash_recovery.pl .............. ok  
t/012_subtransactions.pl ............. ok    
t/013_crash_restart.pl ............... ok    
t/014_unlogged_reinit.pl ............. ok    
t/015_promotion_pages.pl ............. ok  
t/016_min_consistency.pl ............. ok  
t/017_shm.pl ......................... skipped: SysV shared memory not supported by this platform
t/018_wal_optimize.pl ................ ok    
t/019_replslot_limit.pl .............. ok    
t/020_archive_status.pl .............. ok    
t/021_row_visibility.pl .............. ok    
t/022_crash_temp_files.pl ............ 1/9
#   Failed test 'one temporary file'
#   at t/022_crash_temp_files.pl line 231.
#          got: '0'
#     expected: '1'
t/022_crash_temp_files.pl ............ 9/9 # Looks like you failed 1 test of 9.
t/022_crash_temp_files.pl ............ Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/9 subtests
t/023_pitr_prepared_xact.pl .......... ok  

Test Summary Report
-------------------
t/022_crash_temp_files.pl          (Wstat: 256 Tests: 9 Failed: 1)
  Failed test:  8
  Non-zero exit status: 1
Files=23, Tests=259, 115 wallclock secs ( 0.21 usr  0.06 sys + 28.57 cusr 18.01 csys = 46.85 CPU)
Result: FAIL
make[2]: *** [Makefile:19: check] Error 1
make[2]: Leaving directory '/home/pavel/src/postgresql.master/src/test/recovery'
make[1]: *** [Makefile:49: check-recovery-recurse] Error 2
make[1]: Leaving directory '/home/pavel/src/postgresql.master/src/test'
make: *** [GNUmakefile:71: check-world-src/test-recurse] Error 2

This is because part of the GTT logic duplicates the new commit cd91de0d17952b5763466cfa663e98318f26d357,
committed by Tomas Vondra 11 days ago: "Remove Temporary Files after Backend Crash".
"Remove Temporary Files after Backend Crash" is exactly what GTT needs, or even better.
Therefore, I chose to delete the temporary file cleanup logic from the GTT path.

Let me update a new version.


Wenjing


Regards

Pavel

Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:

> On Mar 28, 2021, at 21:07, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 3/17/21 7:59 AM, wenjing wrote:
>> ok
>>
>> The cause of the problem is that the name of the dependent function
>> (readNextTransactionID) has changed. I fixed it.
>>
>> This patch(V43) is base on 9fd2952cf4920d563e9cea51634c5b364d57f71a
>>
>> Wenjing
>>
>>
>
> I have fixed this patch so that
>
> a) it applies cleanly
>
> b) it uses project best practice for catalog Oid assignment.
>
> However, as noted elsewhere it fails the recovery TAP test.
>
> I also note this:
>
>
> diff --git a/src/test/regress/parallel_schedule
> b/src/test/regress/parallel_schedule
> index 312c11a4bd..d44fa62f4e 100644
> --- a/src/test/regress/parallel_schedule
> +++ b/src/test/regress/parallel_schedule
> @@ -129,3 +129,10 @@ test: fast_default
>
>  # run stats by itself because its delay may be insufficient under heavy
> load
>  test: stats
> +
> +# global temp table test
> +test: gtt_stats
> +test: gtt_function
> +test: gtt_prepare
> +test: gtt_parallel_1 gtt_parallel_2
> +test: gtt_clean
>
>
> Tests that need to run in parallel should use either the isolation
> tester framework (which is explicitly for testing things concurrently)
> or the TAP test framework.
>
> Adding six test files to the regression test suite for this one feature
> is not a good idea. You should have one regression test script ideally,
> and it should be added as appropriate to both the parallel and serial
> schedules (and not at the end). Any further tests should be added using
> the other frameworks mentioned.
You're right, it doesn't look good.
I'll reorganize them and put them in the proper place.


Wenjing.

>
>
> cheers
>
>
> andrew
>
>
> --
>
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
> <global_temporary_table_v44-pg14.patch.gz>


Attachment

Re: [Proposal] Global temporary tables

From
曾文旌
Date:


On Mar 29, 2021, at 16:37, Pavel Stehule <pavel.stehule@gmail.com> wrote:



On Sun, Mar 28, 2021 at 3:07 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 3/17/21 7:59 AM, wenjing wrote:
> ok
>
> The cause of the problem is that the name of the dependent function
> (readNextTransactionID) has changed. I fixed it.
>
> This patch(V43) is base on 9fd2952cf4920d563e9cea51634c5b364d57f71a
>
> Wenjing
>
>

I have fixed this patch so that

a) it applies cleanly

b) it uses project best practice for catalog Oid assignment.

However, as noted elsewhere it fails the recovery TAP test.

I also note this:


diff --git a/src/test/regress/parallel_schedule
b/src/test/regress/parallel_schedule
index 312c11a4bd..d44fa62f4e 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -129,3 +129,10 @@ test: fast_default
 
 # run stats by itself because its delay may be insufficient under heavy
load
 test: stats
+
+# global temp table test
+test: gtt_stats
+test: gtt_function
+test: gtt_prepare
+test: gtt_parallel_1 gtt_parallel_2
+test: gtt_clean


Tests that need to run in parallel should use either the isolation
tester framework (which is explicitly for testing things concurrently)
or the TAP test framework.

Adding six test files to the regression test suite for this one feature
is not a good idea. You should have one regression test script ideally,
and it should be added as appropriate to both the parallel and serial
schedules (and not at the end). Any further tests should be added using
the other frameworks mentioned.


* bad name of GTT-README - the convention is README.gtt

* Typo - "ofa"

2) Use beforeshmemexit to ensure that all files ofa session GTT are deleted when
the session exits.

* Typo "nd"

3) GTT storage file cleanup during abnormal situations
When a backend exits abnormally (such as oom kill), the startup process starts
recovery before accepting client connection. The same startup process checks
nd removes all GTT files before redo WAL.

* This comment is wrong

  /*
+ * Global temporary table is allowed to be dropped only when the
+ * current session is using it.
+ */
+ if (RELATION_IS_GLOBAL_TEMP(rel))
+ {
+ if (is_other_backend_use_gtt(RelationGetRelid(rel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST),
+ errmsg("cannot drop global temporary table %s when other backend attached it.",
+ RelationGetRelationName(rel))));
+ }

* same wrong comment

  /*
+ * Global temporary table is allowed to be dropped only when the
+ * current session is using it.
+ */
+ if (RELATION_IS_GLOBAL_TEMP(rel))
+ {
+ if (is_other_backend_use_gtt(RelationGetRelid(rel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST),
+ errmsg("cannot drop global temporary table %s when other backend attached it.",
+ RelationGetRelationName(rel))));
+ }

* typo "backand"

+/*
+ * Check if there are other backends using this GTT besides the current backand.
+ */

There is no user documentation
This is necessary, and I will prepare a separate documentation patch.


Wenjing.



Regards

Pavel

 

cheers


andrew


--

Andrew Dunstan
EDB: https://www.enterprisedb.com


Attachment

Re: [Proposal] Global temporary tables

From
wenjing
Date:
Hi all

I fixed the document description error and the regression test bug mentioned by Pavel.
This patch (V45) is based on 30aaab26e52144097a1a5bbb0bb66ea1ebc0cb81
Please give me feedback.


Wenjing


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Mon, Mar 29, 2021 at 1:45 PM wenjing <wjzeng2012@gmail.com> wrote:
Hi all

I fixed the document description error and the regression test bug mentioned by Pavel.
This patch (V45) is based on 30aaab26e52144097a1a5bbb0bb66ea1ebc0cb81
Please give me feedback.

Yes, it is working.

So please, can you write some user documentation?




Wenjing


Re: [Proposal] Global temporary tables

From
wenjing
Date:
Hi Pavel

I added user documentation.
Please give me feedback.


Wenjing

Attachment

Re: [Proposal] Global temporary tables

From
shawn wang
Date:
On Thu, Apr 15, 2021 at 3:26 PM wenjing <wjzeng2012@gmail.com> wrote:
Hi Pavel

I added user documentation.
Please give me feedback.


Wenjing

 
Hi, Wenjing,

I have checked your documentation section, fixed a spelling mistake, and adjusted some sentences for you.
All the modified content is in the new patch, and please check it.

Regards

Shawn

 
Attachment

Re: [Proposal] Global temporary tables

From
wenjing
Date:
On Thu, Apr 15, 2021 at 4:49 PM shawn wang <shawn.wang.pg@gmail.com> wrote:
On Thu, Apr 15, 2021 at 3:26 PM wenjing <wjzeng2012@gmail.com> wrote:
Hi Pavel

I added user documentation.
Please give me feedback.


Wenjing

 
Hi, Wenjing,

I have checked your documentation section and fixed a spelling mistake, adjusted some sentences for you.
All the modified content is in the new patch, and please check it.
Thank you for your comments.
I made some repairs and fixed a bug.
Looking forward to your feedback.


Wenjing

Regards

Shawn



Attachment

Re: [Proposal] Global temporary tables

From
Dilip Kumar
Date:
On Thu, Apr 22, 2021 at 1:11 PM wenjing <wjzeng2012@gmail.com> wrote:
>

I have briefly looked into the design comments added by the patch.  I
have a few questions.

+Feature description
+--------------------------------
+
+Previously, temporary tables are defined once and automatically
+created (starting with empty contents) in every session before using them.


I don't think this statement is correct; I mean, if we define a temp
table in one session, it doesn't automatically get created in all the
sessions.


+
+Like local temporary table, Global Temporary Table supports ON COMMIT
PRESERVE ROWS
+or ON COMMIT DELETE ROWS clause, so that data in the temporary table can be
+cleaned up or reserved automatically when a session exits or a
transaction COMMITs.

/reserved/preserved


I was trying to look into the “Main design idea” section.

+1) CATALOG
+GTTs store session-specific data. The storage information of GTTs'data, their
+transaction information, and their statistics are not stored in the catalog.

I did not understand what you mean by "transaction information" not
being stored in the catalog. I mean, what transaction information is
stored in the catalog for a normal table that is not stored for a GTT?

+Changes to the GTT's metadata affect all sessions.
+The operations making those changes include truncate GTT, Vacuum/Cluster GTT,
+and Lock GTT.

How do truncate or vacuum affect all the sessions? I mean, truncate
should only truncate the data of the current session, and the same is
true for vacuum, no?

I will try to do a more detailed review.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Mon, May 10, 2021 at 6:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Apr 22, 2021 at 1:11 PM wenjing <wjzeng2012@gmail.com> wrote:
>

I have briefly looked into the design comments added by the patch.  I
have a few questions.

+Feature description
+--------------------------------
+
+Previously, temporary tables are defined once and automatically
+created (starting with empty contents) in every session before using them.


I don’t think this statement is correct, I mean if we define a temp
table in one session then it doesn’t automatically create in all the
sessions.
The point is that the schema definition of a GTT is shared between sessions.
When a session creates a GTT, once the transaction for the CREATE TABLE is committed, other sessions can see the GTT and can use it.
So I modified the description as follows:
"automatically exist in every session that needs them."
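
For example (a sketch of the intended behaviour; the table name is illustrative):

-- session 1
CREATE GLOBAL TEMP TABLE gtt_demo (id int);
INSERT INTO gtt_demo VALUES (1);

-- session 2, once session 1's CREATE has committed, sees the table
-- definition but none of session 1's rows:
SELECT count(*) FROM gtt_demo;  -- returns 0: the schema is shared, the data is not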

What do you think?


+
+Like local temporary table, Global Temporary Table supports ON COMMIT
PRESERVE ROWS
+or ON COMMIT DELETE ROWS clause, so that data in the temporary table can be
+cleaned up or reserved automatically when a session exits or a
transaction COMMITs.

/reserved/preserved

OK, I fixed it.
 

I was trying to look into the “Main design idea” section.

+1) CATALOG
+GTTs store session-specific data. The storage information of GTTs'data, their
+transaction information, and their statistics are not stored in the catalog.

I did not understand what do you mean by “transaction information” is
not stored in the catalog?  Mean what transaction information are
stored in catalog in the normal table which is not stored for GTT?
"Transaction Information" refers to the GTT's relfrozenXID,
The relfrozenxid of a normal table is stored in pg_class, but GTT is not.
 
Each row of the data (the tuple header) contains transaction information (such as xmin xmax).
At the same time, for regular table we record the oldest XID (as relfrozenXID) in each piece of data into the pg_class, which is used to clean up the data and clog and reuse transactional resources.
My design is: 
Each session in GTT has a local copy of data (session level relfrozenXID), which is stored in memory (local hashtable). and vacuum will refer to this information.
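
For comparison (a sketch against the standard catalog; the table name is illustrative):

-- for a regular table, the frozen XID is visible in the shared catalog:
SELECT relname, relfrozenxid FROM pg_class WHERE relname = 'normal_table';
-- for a GTT, under this design, the session-level relfrozenxid lives only in
-- backend-local memory, so pg_class cannot hold a meaningful value for it.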


+Changes to the GTT's metadata affect all sessions.
+The operations making those changes include truncate GTT, Vacuum/Cluster GTT,
+and Lock GTT.

How does Truncate or Vacuum affect all the sessions, I mean truncate
should only truncate the data of the current session and the same is
true for the vacuum no?
Your understanding is correct.
TRUNCATE GTT, VACUUM/CLUSTER GTT, and LOCK GTT affect only the current session and do not take exclusive locks.
"Changes to the GTT's metadata affect all sessions." was not meant to describe the lock behavior, so I deleted it.


I will try to do a more detailed review.
Thank you very much for your careful review. We are getting closer to success.
 

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




I updated the code and passed the regression tests.

Regards,
wjzeng
 
Attachment

Re: [Proposal] Global temporary tables

From
wenjing
Date:
Rebased the code on the latest version.

Regards,
wenjing

Attachment

Re: [Proposal] Global temporary tables

From
Ming Li
Date:
Hi Wenjing,

Some suggestions that may help:

1) It seems that no test case covers the below scenario: 2 sessions attach the same gtt, and insert/update/select concurrently. It is better to use the test framework in src/test/isolation like the code changes in https://commitfest.postgresql.org/24/2233/.

2) CREATE GLOBAL TEMP SEQUENCE also needs to be supported in src/bin/psql/tab-complete.c


On Wed, Jul 14, 2021 at 10:36 AM wenjing <wjzeng2012@gmail.com> wrote:
Rebase code based on the latest version.

Regards,
wenjing

Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Wed, Jul 14, 2021 at 10:56 AM Ming Li <mli@apache.org> wrote:
Hi Wenjing,

Some suggestions may help:

1) It seems that no test case covers the below scenario: 2 sessions attach the same gtt, and insert/update/select concurrently. It is better to use the test framework in src/test/isolation like the code changes in https://commitfest.postgresql.org/24/2233/.

Thanks for pointing this out, I am working on this issue.
 

2) CREATE GLOBAL TEMP SEQUENCE also needs to be supported in src/bin/psql/tab-complete.c
It has been fixed in V51, please check.

Regards,
wenjing


On Wed, Jul 14, 2021 at 10:36 AM wenjing <wjzeng2012@gmail.com> wrote:
Rebase code based on the latest version.

Regards,
wenjing







Attachment

Re: [Proposal] Global temporary tables

From
Tony Zhu
Date:
Hi Wenjing

would you please rebase the code?

Thank you very much
Tony

The new status of this patch is: Waiting on Author

Re: [Proposal] Global temporary tables

From
wenjing zeng
Date:


On Jul 28, 2021, at 23:09, Tony Zhu <tony.zhu@ww-it.cn> wrote:

Hi Wenjing

would you please rebase the code?
Thank you for your attention.
According to my tests, the latest patch applies cleanly to the latest master code and passes the tests.
If you have any questions, please give me feedback.


Wenjing



Thank you very much
Tony

The new status of this patch is: Waiting on Author

Re: [Proposal] Global temporary tables

From
ZHU XIAN WEN
Date:
Hi WenJing


Thanks for the feedback,

I have tested the code, it seems okay, and the regression tests pass

and I have reviewed the code, and I don't find any issues anymore


Hello all


Review of and comments on the V51 patches are welcome.


If there is no feedback, I'm going to change the status to 'Ready for
Committer' on Aug 25


big thanks

Tony



On 2021/7/29 23:19, wenjing zeng wrote:
>
>> On Jul 28, 2021, at 23:09, Tony Zhu <tony.zhu@ww-it.cn> wrote:
>>
>> Hi Wenjing
>>
>> would you please rebase the code?
> Thank you for your attention.
> According to the test, the latest pgmaster code can merge the latest patch and pass the test.
> https://www.travis-ci.com/github/wjzeng/postgres/builds <https://www.travis-ci.com/github/wjzeng/postgres/builds>
> If you have any questions, please give me feedback.
>
>
> Wenjing
>
>
>> Thank you very much
>> Tony
>>
>> The new status of this patch is: Waiting on Author
>


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

It looks like this patch is broken again. Please, can you do a rebase?

Regards

Pavel

On Thu, Sep 16, 2021 at 8:28 AM wenjing <wjzeng2012@gmail.com> wrote:








Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Thu, Sep 16, 2021 at 2:30 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

It looks like this patch is broken again. Please, can you do a rebase?
GTT is updated to V52 and merged with the latest code.

Wenjing

Regards

Pavel

On Thu, Sep 16, 2021 at 8:28 AM wenjing <wjzeng2012@gmail.com> wrote:










 
Attachment

Re: [Proposal] Global temporary tables

From
wenjing
Date:



On Jul 14, 2021, at 10:56, Ming Li <mli@apache.org> wrote:

Hi Wenjing,

Some suggestions may help:

1) It seems that no test case covers the below scenario: 2 sessions attach the same gtt, and insert/update/select concurrently. It is better to use the test framework in src/test/isolation like the code changes in https://commitfest.postgresql.org/24/2233/.

I rewrote the cases under regress to make them easier to read,
and I used the isolation module to add some concurrent cases and fix some bugs.

Please check the code (v52) and give me feedback.


Wenjing

2) CREATE GLOBAL TEMP SEQUENCE also needs to be supported in src/bin/psql/tab-complete.c


On Wed, Jul 14, 2021 at 10:36 AM wenjing <wjzeng2012@gmail.com> wrote:
Rebase code based on the latest version.

Regards,
wenjing


Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Sun, Mar 28, 2021 at 9:07 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 3/17/21 7:59 AM, wenjing wrote:
> ok
>
> The cause of the problem is that the name of the dependent function
> (readNextTransactionID) has changed. I fixed it.
>
> This patch(V43) is base on 9fd2952cf4920d563e9cea51634c5b364d57f71a
>
> Wenjing
>
>

I have fixed this patch so that

a) it applies cleanly

b) it uses project best practice for catalog Oid assignment.

However, as noted elsewhere it fails the recovery TAP test.

I also note this:


diff --git a/src/test/regress/parallel_schedule
b/src/test/regress/parallel_schedule
index 312c11a4bd..d44fa62f4e 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -129,3 +129,10 @@ test: fast_default
 
 # run stats by itself because its delay may be insufficient under heavy
load
 test: stats
+
+# global temp table test
+test: gtt_stats
+test: gtt_function
+test: gtt_prepare
+test: gtt_parallel_1 gtt_parallel_2
+test: gtt_clean


Tests that need to run in parallel should use either the isolation
tester framework (which is explicitly for testing things concurrently)
or the TAP test framework.

Adding six test files to the regression test suite for this one feature
is not a good idea. You should have one regression test script ideally,
and it should be added as appropriate to both the parallel and serial
schedules (and not at the end). Any further tests should be added using
the other frameworks mentioned.
Thank you for your advice.
I have simplified the cases in regress and put further tests into the isolation tester framework based on your suggestion.
And I found a few bugs and fixed them.

Please review GTT v52 and give me feedback.


Wenjing

 


cheers


andrew


--

Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: [Proposal] Global temporary tables

From
Tony Zhu
Date:
Hi Wenjing

we have reviewed the code and run the regression tests; all tests pass. We believe the feature's code quality is
ready for production, and I will change the status to "Ready for Committer"

Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Sep 23, 2021, at 21:55, Tony Zhu <tony.zhu@ww-it.cn> wrote:

Hi Wenjing

we have reviewed the code and run the regression tests; all tests pass. We believe the feature's code quality is ready for production, and I will change the status to "Ready for Committer"
Thank you very much for your attention and testing.
As we communicated, I fixed several issues and attached the latest patch.


Wenjing


Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
hi

On Sun, Sep 26, 2021 at 6:05 AM wenjing <wjzeng2012@gmail.com> wrote:


On Sep 23, 2021, at 21:55, Tony Zhu <tony.zhu@ww-it.cn> wrote:

Hi Wenjing

we have reviewed the code and run the regression tests; all tests pass. We believe the feature's code quality is ready for production, and I will change the status to "Ready for Committer"
Thank you very much for your attention and testing.
As we communicated, I fixed several issues and attached the latest patch.

It looks like the Windows build is broken


Regards

Pavel


Wenjing


Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Wed, Sep 29, 2021 at 1:53 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
hi

On Sun, Sep 26, 2021 at 6:05 AM wenjing <wjzeng2012@gmail.com> wrote:


On Sep 23, 2021, at 21:55, Tony Zhu <tony.zhu@ww-it.cn> wrote:

Hi Wenjing

we have reviewed the code and run the regression tests; all tests pass. We believe the feature's code quality is ready for production, and I will change the status to "Ready for Committer"
Thank you very much for your attention and testing.
As we communicated, I fixed several issues and attached the latest patch.

It looks like the Windows build is broken

This is indeed a problem, and it has been fixed in the new version (v54).
Thank you for pointing it out; please review the code again.


Wenjing

Regards

Pavel


Wenjing




Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
On master with the v54 patches applied, the following script leads to a crash:
export ASAN_OPTIONS=detect_leaks=0:abort_on_error=1:disable_coredump=0:strict_string_checks=1:check_initialization_order=1:strict_init_order=1
initdb -D data
pg_ctl -w -t 5 -D data -l server.log start
psql -c "create global temp table tmp_table_test_statistics(a int); insert into temp_table_test_statistics values(generate_series(1,1000000000));" &
sleep 1
pg_ctl -w -t 5 -D data -l server.log stop

and I got an error
=================================================================
==1022892==ERROR: AddressSanitizer: heap-use-after-free on address 0x62500004c640 at pc 0x562435348750 bp 0x7ffee8487e60 sp 0x7ffee8487e50
READ of size 8 at 0x62500004c640 thread T0
---

with backtrace:

Core was generated by `postgres: andrew regression [local] INSERT                                    '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fa8fd008859 in __GI_abort () at abort.c:79
#2  0x000056243471eae2 in __sanitizer::Abort() ()
#3  0x000056243472968c in __sanitizer::Die() ()
#4  0x000056243470ad1c in __asan::ScopedInErrorReport::~ScopedInErrorReport() ()
#5  0x000056243470a793 in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) ()
#6  0x000056243470b5db in __asan_report_load8 ()
#7  0x0000562435348750 in DropRelFileNodesAllBuffers (smgr_reln=smgr_reln@entry=0x62500004c640, nnodes=nnodes@entry=1) at bufmgr.c:3211
#8  0x00005624353ec8a8 in smgrdounlinkall (rels=rels@entry=0x62500004c640, nrels=nrels@entry=1, isRedo=isRedo@entry=false) at smgr.c:397
#9  0x0000562434aa76e1 in gtt_storage_removeall (code=<optimized out>, arg=<optimized out>) at storage_gtt.c:726
#10 0x0000562435371962 in shmem_exit (code=code@entry=1) at ipc.c:236
#11 0x0000562435371d4f in proc_exit_prepare (code=code@entry=1) at ipc.c:194
#12 0x0000562435371f74 in proc_exit (code=code@entry=1) at ipc.c:107
#13 0x000056243581e35c in errfinish (filename=<optimized out>, filename@entry=0x562435b800e0 "postgres.c", lineno=lineno@entry=3191, funcname=funcname@entry=0x562435b836a0 <__func__.26025> "ProcessInterrupts") at elog.c:666
#14 0x00005624353f5f86 in ProcessInterrupts () at postgres.c:3191
#15 0x0000562434eb26d6 in ExecProjectSet (pstate=0x62500003f150) at nodeProjectSet.c:51
#16 0x0000562434eaae8e in ExecProcNode (node=0x62500003f150) at ../../../src/include/executor/executor.h:257
#17 ExecModifyTable (pstate=0x62500003ec98) at nodeModifyTable.c:2429
#18 0x0000562434df5755 in ExecProcNodeFirst (node=0x62500003ec98) at execProcnode.c:463
#19 0x0000562434dd678a in ExecProcNode (node=0x62500003ec98) at ../../../src/include/executor/executor.h:257
#20 ExecutePlan (estate=estate@entry=0x62500003ea20, planstate=0x62500003ec98, use_parallel_mode=<optimized out>, use_parallel_mode@entry=false, operation=operation@entry=CMD_INSERT, sendTuples=false, numberTuples=numberTuples@entry=0, direction=ForwardScanDirection,
    dest=0x625000045550, execute_once=true) at execMain.c:1555
#21 0x0000562434dd9867 in standard_ExecutorRun (queryDesc=0x6190000015a0, direction=ForwardScanDirection, count=0, execute_once=execute_once@entry=true) at execMain.c:361
#22 0x0000562434dd9a83 in ExecutorRun (queryDesc=queryDesc@entry=0x6190000015a0, direction=direction@entry=ForwardScanDirection, count=count@entry=0, execute_once=execute_once@entry=true) at execMain.c:305
#23 0x0000562435401be6 in ProcessQuery (plan=plan@entry=0x625000045480, sourceText=0x625000005220 "insert into temp_table_test_statistics values(generate_series(1,1000000000));", params=0x0, queryEnv=0x0, dest=dest@entry=0x625000045550, qc=qc@entry=0x7ffee84886d0)
    at pquery.c:160
#24 0x0000562435404a32 in PortalRunMulti (portal=portal@entry=0x625000020a20, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x625000045550, altdest=altdest@entry=0x625000045550, qc=qc@entry=0x7ffee84886d0)
    at pquery.c:1274
#25 0x000056243540598d in PortalRun (portal=portal@entry=0x625000020a20, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x625000045550, altdest=altdest@entry=0x625000045550, qc=<optimized out>)
    at pquery.c:788
#26 0x00005624353fa917 in exec_simple_query (query_string=query_string@entry=0x625000005220 "insert into temp_table_test_statistics values(generate_series(1,1000000000));") at postgres.c:1214
#27 0x00005624353ff61d in PostgresMain (dbname=dbname@entry=0x629000011278 "regression", username=username@entry=0x629000011258 "andrew") at postgres.c:4497
#28 0x00005624351f65c7 in BackendRun (port=port@entry=0x615000002d80) at postmaster.c:4560
#29 0x00005624351ff1c5 in BackendStartup (port=port@entry=0x615000002d80) at postmaster.c:4288
#30 0x00005624351ff970 in ServerLoop () at postmaster.c:1801
#31 0x0000562435201da4 in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1473
#32 0x0000562434f3ab2d in main (argc=3, argv=0x603000000280) at main.c:198
---

I've built the server with sanitizers using gcc 9 as follows:
CPPFLAGS="-Og -fsanitize=address -fsanitize=undefined -fno-sanitize=nonnull-attribute  -fno-sanitize-recover -fno-sanitize=alignment -fstack-protector" LDFLAGS='-fsanitize=address -fsanitize=undefined -static-libasan' ./configure --enable-tap-tests --enable-debug

Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Thu, Oct 7, 2021 at 12:30 AM Andrew Bille <andrewbille@gmail.com> wrote:
On master with the v54 patches applied the following script leads to crash:
Thank you for pointing it out.
This is a bug that occurs during transaction rollback and process exit; I fixed it, please confirm.
 

Wenjing

export ASAN_OPTIONS=detect_leaks=0:abort_on_error=1:disable_coredump=0:strict_string_checks=1:check_initialization_order=1:strict_init_order=1
initdb -D data
pg_ctl -w -t 5 -D data -l server.log start
psql -c "create global temp table tmp_table_test_statistics(a int); insert into temp_table_test_statistics values(generate_series(1,1000000000));" &
sleep 1
pg_ctl -w -t 5 -D data -l server.log stop

and i got error
=================================================================
==1022892==ERROR: AddressSanitizer: heap-use-after-free on address 0x62500004c640 at pc 0x562435348750 bp 0x7ffee8487e60 sp 0x7ffee8487e50
READ of size 8 at 0x62500004c640 thread T0
---

with backtrace:

Core was generated by `postgres: andrew regression [local] INSERT                                    '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fa8fd008859 in __GI_abort () at abort.c:79
#2  0x000056243471eae2 in __sanitizer::Abort() ()
#3  0x000056243472968c in __sanitizer::Die() ()
#4  0x000056243470ad1c in __asan::ScopedInErrorReport::~ScopedInErrorReport() ()
#5  0x000056243470a793 in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) ()
#6  0x000056243470b5db in __asan_report_load8 ()
#7  0x0000562435348750 in DropRelFileNodesAllBuffers (smgr_reln=smgr_reln@entry=0x62500004c640, nnodes=nnodes@entry=1) at bufmgr.c:3211
#8  0x00005624353ec8a8 in smgrdounlinkall (rels=rels@entry=0x62500004c640, nrels=nrels@entry=1, isRedo=isRedo@entry=false) at smgr.c:397
#9  0x0000562434aa76e1 in gtt_storage_removeall (code=<optimized out>, arg=<optimized out>) at storage_gtt.c:726
#10 0x0000562435371962 in shmem_exit (code=code@entry=1) at ipc.c:236
#11 0x0000562435371d4f in proc_exit_prepare (code=code@entry=1) at ipc.c:194
#12 0x0000562435371f74 in proc_exit (code=code@entry=1) at ipc.c:107
#13 0x000056243581e35c in errfinish (filename=<optimized out>, filename@entry=0x562435b800e0 "postgres.c", lineno=lineno@entry=3191, funcname=funcname@entry=0x562435b836a0 <__func__.26025> "ProcessInterrupts") at elog.c:666
#14 0x00005624353f5f86 in ProcessInterrupts () at postgres.c:3191
#15 0x0000562434eb26d6 in ExecProjectSet (pstate=0x62500003f150) at nodeProjectSet.c:51
#16 0x0000562434eaae8e in ExecProcNode (node=0x62500003f150) at ../../../src/include/executor/executor.h:257
#17 ExecModifyTable (pstate=0x62500003ec98) at nodeModifyTable.c:2429
#18 0x0000562434df5755 in ExecProcNodeFirst (node=0x62500003ec98) at execProcnode.c:463
#19 0x0000562434dd678a in ExecProcNode (node=0x62500003ec98) at ../../../src/include/executor/executor.h:257
#20 ExecutePlan (estate=estate@entry=0x62500003ea20, planstate=0x62500003ec98, use_parallel_mode=<optimized out>, use_parallel_mode@entry=false, operation=operation@entry=CMD_INSERT, sendTuples=false, numberTuples=numberTuples@entry=0, direction=ForwardScanDirection,
    dest=0x625000045550, execute_once=true) at execMain.c:1555
#21 0x0000562434dd9867 in standard_ExecutorRun (queryDesc=0x6190000015a0, direction=ForwardScanDirection, count=0, execute_once=execute_once@entry=true) at execMain.c:361
#22 0x0000562434dd9a83 in ExecutorRun (queryDesc=queryDesc@entry=0x6190000015a0, direction=direction@entry=ForwardScanDirection, count=count@entry=0, execute_once=execute_once@entry=true) at execMain.c:305
#23 0x0000562435401be6 in ProcessQuery (plan=plan@entry=0x625000045480, sourceText=0x625000005220 "insert into temp_table_test_statistics values(generate_series(1,1000000000));", params=0x0, queryEnv=0x0, dest=dest@entry=0x625000045550, qc=qc@entry=0x7ffee84886d0)
    at pquery.c:160
#24 0x0000562435404a32 in PortalRunMulti (portal=portal@entry=0x625000020a20, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x625000045550, altdest=altdest@entry=0x625000045550, qc=qc@entry=0x7ffee84886d0)
    at pquery.c:1274
#25 0x000056243540598d in PortalRun (portal=portal@entry=0x625000020a20, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x625000045550, altdest=altdest@entry=0x625000045550, qc=<optimized out>)
    at pquery.c:788
#26 0x00005624353fa917 in exec_simple_query (query_string=query_string@entry=0x625000005220 "insert into temp_table_test_statistics values(generate_series(1,1000000000));") at postgres.c:1214
#27 0x00005624353ff61d in PostgresMain (dbname=dbname@entry=0x629000011278 "regression", username=username@entry=0x629000011258 "andrew") at postgres.c:4497
#28 0x00005624351f65c7 in BackendRun (port=port@entry=0x615000002d80) at postmaster.c:4560
#29 0x00005624351ff1c5 in BackendStartup (port=port@entry=0x615000002d80) at postmaster.c:4288
#30 0x00005624351ff970 in ServerLoop () at postmaster.c:1801
#31 0x0000562435201da4 in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1473
#32 0x0000562434f3ab2d in main (argc=3, argv=0x603000000280) at main.c:198
---

I've built the server with sanitizers using gcc 9 as following:
CPPFLAGS="-Og -fsanitize=address -fsanitize=undefined -fno-sanitize=nonnull-attribute  -fno-sanitize-recover -fno-sanitize=alignment -fstack-protector" LDFLAGS='-fsanitize=address -fsanitize=undefined -static-libasan' ./configure --enable-tap-tests --enable-debug



 
Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
Thanks for the fix. It works for me.

Now I'm exploring another crash related to GTT, but I need a few days to present a simple repro.

On Sat, Oct 9, 2021 at 2:41 PM wenjing <wjzeng2012@gmail.com> wrote:

Thank you for pointing it out. 
This is a bug that occurs during transaction rollback and process exit, I fixed it, please confirm it.

Wenjing

Re: [Proposal] Global temporary tables

From
wenjing zeng
Date:


2021年10月13日 13:08,Andrew Bille <andrewbille@gmail.com> 写道:

Thanks for the fix. It works for me.

Now I'm exploring another crash related to GTT, but I need a few days to present a simple repro.

I would be deeply grateful.
Perhaps you can provide the stack trace of the problem so that I can start analyzing it as soon as possible.


Wenjing


On Sat, Oct 9, 2021 at 2:41 PM wenjing <wjzeng2012@gmail.com> wrote:

Thank you for pointing it out. 
This is a bug that occurs during transaction rollback and process exit, I fixed it, please confirm it.

Wenjing

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
On master with the v55 patches applied, the following script leads to a crash:
initdb -D data
pg_ctl -w -t 5 -D data -l server.log start

psql -t -c "begin; create global temp table gtt_with_index(a int primary key, b text); commit; select pg_sleep(5);" >psql1.log &
psql -t -c "select pg_sleep(1); create index idx_b on gtt_with_index(b);" >psql2.log &
for i in `seq 40`; do (psql -t -c "select pg_sleep(1); insert into gtt_with_index values(1,'test');" &); done

sleep 10


and I got a crash
INSERT 0 1
...
INSERT 0 1
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

and some coredumps with the following stack:

[New LWP 1821493]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew regression [local] INSERT                                    '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f021d809859 in __GI_abort () at abort.c:79
#2  0x0000564dc1bd22e8 in ExceptionalCondition (conditionName=conditionName@entry=0x564dc1c5c957 "index->rd_index->indisvalid", errorType=errorType@entry=0x564dc1c2a00b "FailedAssertion", fileName=fileName@entry=0x564dc1c5c854 "storage_gtt.c",
    lineNumber=lineNumber@entry=1381) at assert.c:69
#3  0x0000564dc185778b in init_gtt_storage (operation=operation@entry=CMD_INSERT, resultRelInfo=resultRelInfo@entry=0x564dc306f6c0) at storage_gtt.c:1381
#4  0x0000564dc194c888 in ExecInsert (mtstate=0x564dc306f4a8, resultRelInfo=0x564dc306f6c0, slot=0x564dc30706d0, planSlot=0x564dc306fca0, estate=0x564dc306f230, canSetTag=<optimized out>) at nodeModifyTable.c:638
#5  0x0000564dc194d945 in ExecModifyTable (pstate=<optimized out>) at nodeModifyTable.c:2565
#6  0x0000564dc191ca83 in ExecProcNode (node=0x564dc306f4a8) at ../../../src/include/executor/executor.h:257
#7  ExecutePlan (execute_once=<optimized out>, dest=0x564dc310ed80, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_INSERT, use_parallel_mode=<optimized out>, planstate=0x564dc306f4a8, estate=0x564dc306f230) at execMain.c:1555
#8  standard_ExecutorRun (queryDesc=0x564dc306bce0, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:361
#9  0x0000564dc1ab47a0 in ProcessQuery (plan=<optimized out>, sourceText=0x564dc3049a30 "select pg_sleep(1); insert into gtt_with_index values(1,'test');", params=0x0, queryEnv=0x0, dest=0x564dc310ed80, qc=0x7ffd3a6cf2e0) at pquery.c:160
#10 0x0000564dc1ab52e2 in PortalRunMulti (portal=portal@entry=0x564dc30acd80, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x564dc310ed80, altdest=altdest@entry=0x564dc310ed80, qc=qc@entry=0x7ffd3a6cf2e0)
    at pquery.c:1274
#11 0x0000564dc1ab5861 in PortalRun (portal=portal@entry=0x564dc30acd80, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x564dc310ed80, altdest=altdest@entry=0x564dc310ed80, qc=0x7ffd3a6cf2e0)
    at pquery.c:788
#12 0x0000564dc1ab1522 in exec_simple_query (query_string=0x564dc3049a30 "select pg_sleep(1); insert into gtt_with_index values(1,'test');") at postgres.c:1214
#13 0x0000564dc1ab327a in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4497
#14 0x0000564dc1a1f539 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4560
#15 BackendStartup (port=<optimized out>) at postmaster.c:4288
#16 ServerLoop () at postmaster.c:1801
#17 0x0000564dc1a2053c in PostmasterMain (argc=<optimized out>, argv=0x564dc3043fc0) at postmaster.c:1473
#18 0x0000564dc1750180 in main (argc=3, argv=0x564dc3043fc0) at main.c:198
(gdb) q


I've built the server using gcc 9 as follows:
./configure --enable-debug --enable-cassert

Thanks to Alexander Lakhin for simplifying the repro.

On Thu, Oct 14, 2021 at 3:29 PM wenjing zeng <wjzeng2012@gmail.com> wrote:

Be deeply grateful.
Perhaps you can give the stack of problems so that you can start analyzing them as soon as possible.

Wenjing

Re: [Proposal] Global temporary tables

From
wenjing
Date:

On Fri, Oct 15, 2021 at 3:44 PM Andrew Bille <andrewbille@gmail.com> wrote:
On master with the v55 patches applied the following script leads to crash:
initdb -D data
pg_ctl -w -t 5 -D data -l server.log start

psql -t -c "begin; create global temp table gtt_with_index(a int primary key, b text); commit; select pg_sleep(5);" >psql1.log &
psql -t -c "select pg_sleep(1); create index idx_b on gtt_with_index(b);" >psql2.log &
for i in `seq 40`; do (psql -t -c "select pg_sleep(1); insert into gtt_with_index values(1,'test');" &); done

sleep 10


and I got crash
INSERT 0 1
...
INSERT 0 1
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

and some coredumps with the following stack:

[New LWP 1821493]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew regression [local] INSERT                                    '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f021d809859 in __GI_abort () at abort.c:79
#2  0x0000564dc1bd22e8 in ExceptionalCondition (conditionName=conditionName@entry=0x564dc1c5c957 "index->rd_index->indisvalid", errorType=errorType@entry=0x564dc1c2a00b "FailedAssertion", fileName=fileName@entry=0x564dc1c5c854 "storage_gtt.c",
    lineNumber=lineNumber@entry=1381) at assert.c:69
#3  0x0000564dc185778b in init_gtt_storage (operation=operation@entry=CMD_INSERT, resultRelInfo=resultRelInfo@entry=0x564dc306f6c0) at storage_gtt.c:1381
#4  0x0000564dc194c888 in ExecInsert (mtstate=0x564dc306f4a8, resultRelInfo=0x564dc306f6c0, slot=0x564dc30706d0, planSlot=0x564dc306fca0, estate=0x564dc306f230, canSetTag=<optimized out>) at nodeModifyTable.c:638
#5  0x0000564dc194d945 in ExecModifyTable (pstate=<optimized out>) at nodeModifyTable.c:2565
#6  0x0000564dc191ca83 in ExecProcNode (node=0x564dc306f4a8) at ../../../src/include/executor/executor.h:257
#7  ExecutePlan (execute_once=<optimized out>, dest=0x564dc310ed80, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_INSERT, use_parallel_mode=<optimized out>, planstate=0x564dc306f4a8, estate=0x564dc306f230) at execMain.c:1555
#8  standard_ExecutorRun (queryDesc=0x564dc306bce0, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:361
#9  0x0000564dc1ab47a0 in ProcessQuery (plan=<optimized out>, sourceText=0x564dc3049a30 "select pg_sleep(1); insert into gtt_with_index values(1,'test');", params=0x0, queryEnv=0x0, dest=0x564dc310ed80, qc=0x7ffd3a6cf2e0) at pquery.c:160
#10 0x0000564dc1ab52e2 in PortalRunMulti (portal=portal@entry=0x564dc30acd80, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x564dc310ed80, altdest=altdest@entry=0x564dc310ed80, qc=qc@entry=0x7ffd3a6cf2e0)
    at pquery.c:1274
#11 0x0000564dc1ab5861 in PortalRun (portal=portal@entry=0x564dc30acd80, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x564dc310ed80, altdest=altdest@entry=0x564dc310ed80, qc=0x7ffd3a6cf2e0)
    at pquery.c:788
#12 0x0000564dc1ab1522 in exec_simple_query (query_string=0x564dc3049a30 "select pg_sleep(1); insert into gtt_with_index values(1,'test');") at postgres.c:1214
#13 0x0000564dc1ab327a in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4497
#14 0x0000564dc1a1f539 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4560
#15 BackendStartup (port=<optimized out>) at postmaster.c:4288
#16 ServerLoop () at postmaster.c:1801
#17 0x0000564dc1a2053c in PostmasterMain (argc=<optimized out>, argv=0x564dc3043fc0) at postmaster.c:1473
#18 0x0000564dc1750180 in main (argc=3, argv=0x564dc3043fc0) at main.c:198
(gdb) q


I've built the server using gcc 9 as following:
./configure --enable-debug --enable-cassert

Thanks to Alexander Lakhin for simplifying the repro.

On Thu, Oct 14, 2021 at 3:29 PM wenjing zeng <wjzeng2012@gmail.com> wrote:

Be deeply grateful.
Perhaps you can give the stack of problems so that you can start analyzing them as soon as possible.

Wenjing


Hi Andrew
I fixed the problem, please confirm again.
Thanks

Wenjing


Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
Another thanks for the fix. It works for me.

But I found another crash!

On master with the v56 patches applied:

initdb -D data
pg_ctl -w -t 5 -D data -l server.log start
echo "create global temp table t(i int4); insert into t values (1); vacuum t;" > tmp.sql
psql < tmp.sql

CREATE TABLE
INSERT 0 1
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

with following stack:
[New LWP 2192409]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew regression [local] VACUUM                                    '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fb26b558859 in __GI_abort () at abort.c:79
#2  0x00005627ddd8466c in ExceptionalCondition (conditionName=conditionName@entry=0x5627dde153d0 "TransactionIdIsNormal(relfrozenxid)", errorType=errorType@entry=0x5627ddde100b "FailedAssertion", fileName=fileName@entry=0x5627dddfa697 "vacuum.c", lineNumber=lineNumber@entry=1170) at assert.c:69
#3  0x00005627dda70808 in vacuum_xid_failsafe_check (relfrozenxid=<optimized out>, relminmxid=<optimized out>) at vacuum.c:1170
#4  0x00005627dd8db7ee in lazy_check_wraparound_failsafe (vacrel=vacrel@entry=0x5627df5c9680) at vacuumlazy.c:2607
#5  0x00005627dd8ded18 in lazy_scan_heap (vacrel=vacrel@entry=0x5627df5c9680, params=params@entry=0x7fffb3d36100, aggressive=aggressive@entry=true) at vacuumlazy.c:978
#6  0x00005627dd8e019a in heap_vacuum_rel (rel=0x7fb26218af70, params=0x7fffb3d36100, bstrategy=<optimized out>) at vacuumlazy.c:644
#7  0x00005627dda70033 in table_relation_vacuum (bstrategy=<optimized out>, params=0x7fffb3d36100, rel=0x7fb26218af70) at ../../../src/include/access/tableam.h:1678
#8  vacuum_rel (relid=16385, relation=<optimized out>, params=params@entry=0x7fffb3d36100) at vacuum.c:2124
#9  0x00005627dda71624 in vacuum (relations=0x5627df610598, params=params@entry=0x7fffb3d36100, bstrategy=<optimized out>, bstrategy@entry=0x0, isTopLevel=isTopLevel@entry=true) at vacuum.c:476
#10 0x00005627dda71eb1 in ExecVacuum (pstate=pstate@entry=0x5627df567440, vacstmt=vacstmt@entry=0x5627df545e70, isTopLevel=isTopLevel@entry=true) at vacuum.c:269
#11 0x00005627ddc4a8cc in standard_ProcessUtility (pstmt=0x5627df5461c0, queryString=0x5627df545380 "vacuum t;", readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x5627df5462b0, qc=0x7fffb3d36470) at utility.c:858
#12 0x00005627ddc4ada1 in ProcessUtility (pstmt=pstmt@entry=0x5627df5461c0, queryString=<optimized out>, readOnlyTree=<optimized out>, context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>, queryEnv=<optimized out>, dest=0x5627df5462b0, qc=0x7fffb3d36470) at utility.c:527
#13 0x00005627ddc4822d in PortalRunUtility (portal=portal@entry=0x5627df5a67e0, pstmt=pstmt@entry=0x5627df5461c0, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5627df5462b0, qc=qc@entry=0x7fffb3d36470) at pquery.c:1155
#14 0x00005627ddc48551 in PortalRunMulti (portal=portal@entry=0x5627df5a67e0, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5627df5462b0, altdest=altdest@entry=0x5627df5462b0, qc=qc@entry=0x7fffb3d36470) at pquery.c:1312
#15 0x00005627ddc4896c in PortalRun (portal=portal@entry=0x5627df5a67e0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x5627df5462b0, altdest=altdest@entry=0x5627df5462b0, qc=0x7fffb3d36470) at pquery.c:788
#16 0x00005627ddc44afb in exec_simple_query (query_string=query_string@entry=0x5627df545380 "vacuum t;") at postgres.c:1214
#17 0x00005627ddc469df in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4497
#18 0x00005627ddb9fe7d in BackendRun (port=port@entry=0x5627df566580) at postmaster.c:4560
#19 0x00005627ddba3001 in BackendStartup (port=port@entry=0x5627df566580) at postmaster.c:4288
#20 0x00005627ddba3248 in ServerLoop () at postmaster.c:1801
#21 0x00005627ddba482a in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1473
#22 0x00005627ddae4d1d in main (argc=3, argv=0x5627df53f750) at main.c:198

On Mon, Oct 18, 2021 at 7:00 PM wenjing <wjzeng2012@gmail.com> wrote:
Hi Andrew
I fixed the problem, please confirm again.
Thanks

Wenjing

Re: [Proposal] Global temporary tables

From
wenjing
Date:


On Wed, Oct 20, 2021 at 2:59 AM Andrew Bille <andrewbille@gmail.com> wrote:
Another thanks for the fix. It works for me.

But I found another crash!
This check code was added this year, but it did find a real problem, and I fixed it.
Please review the new code (v57) again.


Wenjing
 

On master with the v56 patches applied:

initdb -D data
pg_ctl -w -t 5 -D data -l server.log start
echo "create global temp table t(i int4); insert into t values (1); vacuum t;" > tmp.sql
psql < tmp.sql

CREATE TABLE
INSERT 0 1
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

with following stack:
[New LWP 2192409]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew regression [local] VACUUM                                    '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fb26b558859 in __GI_abort () at abort.c:79
#2  0x00005627ddd8466c in ExceptionalCondition (conditionName=conditionName@entry=0x5627dde153d0 "TransactionIdIsNormal(relfrozenxid)", errorType=errorType@entry=0x5627ddde100b "FailedAssertion", fileName=fileName@entry=0x5627dddfa697 "vacuum.c", lineNumber=lineNumber@entry=1170) at assert.c:69
#3  0x00005627dda70808 in vacuum_xid_failsafe_check (relfrozenxid=<optimized out>, relminmxid=<optimized out>) at vacuum.c:1170
#4  0x00005627dd8db7ee in lazy_check_wraparound_failsafe (vacrel=vacrel@entry=0x5627df5c9680) at vacuumlazy.c:2607
#5  0x00005627dd8ded18 in lazy_scan_heap (vacrel=vacrel@entry=0x5627df5c9680, params=params@entry=0x7fffb3d36100, aggressive=aggressive@entry=true) at vacuumlazy.c:978
#6  0x00005627dd8e019a in heap_vacuum_rel (rel=0x7fb26218af70, params=0x7fffb3d36100, bstrategy=<optimized out>) at vacuumlazy.c:644
#7  0x00005627dda70033 in table_relation_vacuum (bstrategy=<optimized out>, params=0x7fffb3d36100, rel=0x7fb26218af70) at ../../../src/include/access/tableam.h:1678
#8  vacuum_rel (relid=16385, relation=<optimized out>, params=params@entry=0x7fffb3d36100) at vacuum.c:2124
#9  0x00005627dda71624 in vacuum (relations=0x5627df610598, params=params@entry=0x7fffb3d36100, bstrategy=<optimized out>, bstrategy@entry=0x0, isTopLevel=isTopLevel@entry=true) at vacuum.c:476
#10 0x00005627dda71eb1 in ExecVacuum (pstate=pstate@entry=0x5627df567440, vacstmt=vacstmt@entry=0x5627df545e70, isTopLevel=isTopLevel@entry=true) at vacuum.c:269
#11 0x00005627ddc4a8cc in standard_ProcessUtility (pstmt=0x5627df5461c0, queryString=0x5627df545380 "vacuum t;", readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x5627df5462b0, qc=0x7fffb3d36470) at utility.c:858
#12 0x00005627ddc4ada1 in ProcessUtility (pstmt=pstmt@entry=0x5627df5461c0, queryString=<optimized out>, readOnlyTree=<optimized out>, context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>, queryEnv=<optimized out>, dest=0x5627df5462b0, qc=0x7fffb3d36470) at utility.c:527
#13 0x00005627ddc4822d in PortalRunUtility (portal=portal@entry=0x5627df5a67e0, pstmt=pstmt@entry=0x5627df5461c0, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5627df5462b0, qc=qc@entry=0x7fffb3d36470) at pquery.c:1155
#14 0x00005627ddc48551 in PortalRunMulti (portal=portal@entry=0x5627df5a67e0, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5627df5462b0, altdest=altdest@entry=0x5627df5462b0, qc=qc@entry=0x7fffb3d36470) at pquery.c:1312
#15 0x00005627ddc4896c in PortalRun (portal=portal@entry=0x5627df5a67e0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x5627df5462b0, altdest=altdest@entry=0x5627df5462b0, qc=0x7fffb3d36470) at pquery.c:788
#16 0x00005627ddc44afb in exec_simple_query (query_string=query_string@entry=0x5627df545380 "vacuum t;") at postgres.c:1214
#17 0x00005627ddc469df in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4497
#18 0x00005627ddb9fe7d in BackendRun (port=port@entry=0x5627df566580) at postmaster.c:4560
#19 0x00005627ddba3001 in BackendStartup (port=port@entry=0x5627df566580) at postmaster.c:4288
#20 0x00005627ddba3248 in ServerLoop () at postmaster.c:1801
#21 0x00005627ddba482a in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1473
#22 0x00005627ddae4d1d in main (argc=3, argv=0x5627df53f750) at main.c:198

On Mon, Oct 18, 2021 at 7:00 PM wenjing <wjzeng2012@gmail.com> wrote:
Hi Andrew
I fixed the problem, please confirm again.
Thanks

Wenjing



 
Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
Thanks, the vacuum is fixed

But I found another crash (on v57 patches), reproduced with:

psql -t -c "create global temp table t (a integer); insert into t values (1); select count(*) from t group by t;"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

with trace:

[New LWP 2580215]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew postgres [local] SELECT                                      '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f258d482859 in __GI_abort () at abort.c:79
#2  0x000055ad0be8878f in ExceptionalCondition (conditionName=conditionName@entry=0x55ad0bf19743 "gtt_rnode->att_stat_tups[i]", errorType=errorType@entry=0x55ad0bee500b "FailedAssertion", fileName=fileName@entry=0x55ad0bf1966b "storage_gtt.c", lineNumber=lineNumber@entry=902) at assert.c:69
#3  0x000055ad0ba9379f in get_gtt_att_statistic (reloid=<optimized out>, attnum=0, inh=<optimized out>) at storage_gtt.c:902
#4  0x000055ad0be35625 in examine_simple_variable (root=root@entry=0x55ad0c498748, var=var@entry=0x55ad0c498c68, vardata=vardata@entry=0x7fff06c9ebf0) at selfuncs.c:5391
#5  0x000055ad0be36a89 in examine_variable (root=root@entry=0x55ad0c498748, node=node@entry=0x55ad0c498c68, varRelid=varRelid@entry=0, vardata=vardata@entry=0x7fff06c9ebf0) at selfuncs.c:4990
#6  0x000055ad0be3ad64 in estimate_num_groups (root=root@entry=0x55ad0c498748, groupExprs=<optimized out>, input_rows=input_rows@entry=255, pgset=pgset@entry=0x0, estinfo=estinfo@entry=0x0) at selfuncs.c:3455
#7  0x000055ad0bc50835 in get_number_of_groups (root=root@entry=0x55ad0c498748, path_rows=255, gd=gd@entry=0x0, target_list=0x55ad0c498bb8) at planner.c:3241
#8  0x000055ad0bc5576f in create_ordinary_grouping_paths (root=root@entry=0x55ad0c498748, input_rel=input_rel@entry=0x55ad0c3ce148, grouped_rel=grouped_rel@entry=0x55ad0c4983f0, agg_costs=agg_costs@entry=0x7fff06c9edb0, gd=gd@entry=0x0, extra=extra@entry=0x7fff06c9ede0, partially_grouped_rel_p=0x7fff06c9eda8)
    at planner.c:3628
#9  0x000055ad0bc55a72 in create_grouping_paths (root=root@entry=0x55ad0c498748, input_rel=input_rel@entry=0x55ad0c3ce148, target=target@entry=0x55ad0c4c95d8, target_parallel_safe=target_parallel_safe@entry=true, gd=gd@entry=0x0) at planner.c:3377
#10 0x000055ad0bc5686d in grouping_planner (root=root@entry=0x55ad0c498748, tuple_fraction=<optimized out>, tuple_fraction@entry=0) at planner.c:1592
#11 0x000055ad0bc57910 in subquery_planner (glob=glob@entry=0x55ad0c497880, parse=parse@entry=0x55ad0c3cdbb8, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at planner.c:1025
#12 0x000055ad0bc57f36 in standard_planner (parse=0x55ad0c3cdbb8, query_string=<optimized out>, cursorOptions=2048, boundParams=0x0) at planner.c:406
#13 0x000055ad0bc584d4 in planner (parse=parse@entry=0x55ad0c3cdbb8, query_string=query_string@entry=0x55ad0c3cc470 "create global temp table t (a integer); insert into t values (1); select count(*) from t group by t;", cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:277
#14 0x000055ad0bd4855f in pg_plan_query (querytree=querytree@entry=0x55ad0c3cdbb8, query_string=query_string@entry=0x55ad0c3cc470 "create global temp table t (a integer); insert into t values (1); select count(*) from t group by t;", cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0)
    at postgres.c:847
#15 0x000055ad0bd4863b in pg_plan_queries (querytrees=0x55ad0c4986f0, query_string=query_string@entry=0x55ad0c3cc470 "create global temp table t (a integer); insert into t values (1); select count(*) from t group by t;", cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:939
#16 0x000055ad0bd48b20 in exec_simple_query (query_string=query_string@entry=0x55ad0c3cc470 "create global temp table t (a integer); insert into t values (1); select count(*) from t group by t;") at postgres.c:1133
#17 0x000055ad0bd4aaf3 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4497
#18 0x000055ad0bca3f91 in BackendRun (port=port@entry=0x55ad0c3f1020) at postmaster.c:4560
#19 0x000055ad0bca7115 in BackendStartup (port=port@entry=0x55ad0c3f1020) at postmaster.c:4288
#20 0x000055ad0bca735c in ServerLoop () at postmaster.c:1801
#21 0x000055ad0bca893e in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1473
#22 0x000055ad0bbe8e31 in main (argc=3, argv=0x55ad0c3c6660) at main.c:198

On Thu, Oct 21, 2021 at 4:25 PM wenjing <wjzeng2012@gmail.com> wrote:


Andrew Bille <andrewbille@gmail.com> 于2021年10月20日周三 上午2:59写道:
Another thanks for the fix. It works for me.

But I found another crash!
This is an assertion check that was added this year, but it did find a real problem, and I fixed it.
Please review the new code (v57) again.


Re: [Proposal] Global temporary tables

From
wenjing
Date:


Andrew Bille <andrewbille@gmail.com> 于2021年10月23日周六 下午9:22写道:
Thanks, the vacuum is fixed

But I found another crash (on v57 patches), reproduced with:

psql -t -c "create global temp table t (a integer); insert into t values (1); select count(*) from t group by t;"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

I missed whole-row and system columns. This has been fixed in v58.
Please review the new code (v58) again.


Wenjing

 
Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
Thanks, the "group by" is fixed

Yet another crash (on v58 patches), reproduced with:

psql -t -c "create global temp table t(b text) with(on_commit_delete_rows=true); create index idx_b on t (b); insert into t values('test'); alter table t alter b type varchar;"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

with trace:

[New LWP 569199]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew postgres [local] ALTER TABLE                                 '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f197493f859 in __GI_abort () at abort.c:79
#2  0x00005562b3306fb9 in ExceptionalCondition (conditionName=0x5562b34dd740 "reln->md_num_open_segs[forkNum] == 0", errorType=0x5562b34dd72c "FailedAssertion", fileName=0x5562b34dd727 "md.c", lineNumber=187) at assert.c:69
#3  0x00005562b3148f15 in mdcreate (reln=0x5562b41abdc0, forkNum=MAIN_FORKNUM, isRedo=false) at md.c:187
#4  0x00005562b314b73f in smgrcreate (reln=0x5562b41abdc0, forknum=MAIN_FORKNUM, isRedo=false) at smgr.c:335
#5  0x00005562b2d88b23 in RelationCreateStorage (rnode=..., relpersistence=103 'g', rel=0x7f196b597270) at storage.c:154
#6  0x00005562b2d5a408 in index_build (heapRelation=0x7f196b58dc40, indexRelation=0x7f196b597270, indexInfo=0x5562b4167d60, isreindex=true, parallel=false) at index.c:3038
#7  0x00005562b2d533c1 in RelationTruncateIndexes (heapRelation=0x7f196b58dc40, lockmode=1) at heap.c:3354
#8  0x00005562b2d5360b in heap_truncate_one_rel (rel=0x7f196b58dc40) at heap.c:3452
#9  0x00005562b2d53544 in heap_truncate (relids=0x5562b4167c58, is_global_temp=true) at heap.c:3410
#10 0x00005562b2ea09fc in PreCommit_on_commit_actions () at tablecmds.c:16495
#11 0x00005562b2d0d4ee in CommitTransaction () at xact.c:2140
#12 0x00005562b2d0e320 in CommitTransactionCommand () at xact.c:2979
#13 0x00005562b3151b7e in finish_xact_command () at postgres.c:2721
#14 0x00005562b314f340 in exec_simple_query (query_string=0x5562b40c2170 "create global temp table t(b text) with(on_commit_delete_rows=true); create index idx_b on t (b); insert into t values('test'); alter table t alter b type varchar;") at postgres.c:1239
#15 0x00005562b3153f0a in PostgresMain (dbname=0x5562b40ed6e8 "postgres", username=0x5562b40ed6c8 "andrew") at postgres.c:4497
#16 0x00005562b307df6e in BackendRun (port=0x5562b40e4500) at postmaster.c:4560
#17 0x00005562b307d853 in BackendStartup (port=0x5562b40e4500) at postmaster.c:4288
#18 0x00005562b3079a1d in ServerLoop () at postmaster.c:1801
#19 0x00005562b30791b6 in PostmasterMain (argc=3, argv=0x5562b40bc5b0) at postmaster.c:1473
#20 0x00005562b2f6d98e in main (argc=3, argv=0x5562b40bc5b0) at main.c:198

On Mon, Oct 25, 2021 at 7:13 PM wenjing <wjzeng2012@gmail.com> wrote:

I missed whole-row and system columns. This has been fixed in v58.
Please review the new code (v58) again.
 

Re: [Proposal] Global temporary tables

From
wenjing
Date:


Andrew Bille <andrewbille@gmail.com> 于2021年10月28日周四 下午6:30写道:
Thanks, the "group by" is fixed

Yet another crash (on v58 patches), reproduced with:

psql -t -c "create global temp table t(b text) with(on_commit_delete_rows=true); create index idx_b on t (b); insert into t values('test'); alter table t alter b type varchar;"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
Thank you for pointing that out.
This is caused by an optimization: ALTER TABLE reuses the relfilenode of the old index.
I have disabled this optimization for GTT. I am not entirely sure that is appropriate; maybe you can give some suggestions.
Please review the new code (v59).


Wenjing

 

Attachment

Re: [Proposal] Global temporary tables

From
wenjing
Date:


Hi Andrew

I fixed a problem found during testing.
GTT version updated to v60.


Wenjing.


 
Attachment

Re: [Proposal] Global temporary tables

From
wenjing
Date:


Fixed a bug in function pg_gtt_attached_pid.
Looking forward to your reply.


Wenjing


 
Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
Thanks for the patches. The feature has become much more stable.
However, there is another simple case that generates an error:
Master with v61 patches

CREATE GLOBAL TEMPORARY TABLE t AS SELECT 1 AS a;
ERROR:  could not open file "base/13560/t3_16384": No such file or directory
Andrew


Re: [Proposal] Global temporary tables

From
wenjing
Date:


Andrew Bille <andrewbille@gmail.com> 于2021年11月15日周一 下午6:34写道:
Thanks for the patches. The feature has become much more stable.
However, there is another simple case that generates an error:
Master with v61 patches

CREATE GLOBAL TEMPORARY TABLE t AS SELECT 1 AS a;
ERROR:  could not open file "base/13560/t3_16384": No such file or directory
Thank you for pointing that out - the handling here was not reasonable enough.
This issue has been fixed in v62.
Looking forward to your reply.


Wenjing

 
Attachment

Re: [Proposal] Global temporary tables

From
wenjing zeng
Date:
Post GTT v63 to fix conflicts with the latest code.



Hi Andrew

Have you found any new bugs recently?



Wenjing




Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
Hi!
Thanks for new patches.
Yet another crash reproduced on master with v63 patches:

CREATE TABLESPACE ts LOCATION '/tmp/ts';
CREATE GLOBAL TEMP TABLE tbl (num1 bigint);
INSERT INTO tbl (num1) values (1);
CREATE INDEX tbl_idx ON tbl (num1);
REINDEX (TABLESPACE ts) TABLE tbl;

Got error:
CREATE TABLESPACE
CREATE TABLE
INSERT 0 1
CREATE INDEX
WARNING:  AbortTransaction while in COMMIT state
ERROR:  gtt relfilenode 16388 not found in rel 16388
PANIC:  cannot abort transaction 726, it was already committed
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
connection to server was lost

in log:
2021-12-21 12:54:08.273 +07 [208725] ERROR:  gtt relfilenode 16388 not found in rel 16388
2021-12-21 12:54:08.273 +07 [208725] STATEMENT:  REINDEX (TABLESPACE ts) TABLE tbl;
2021-12-21 12:54:08.273 +07 [208725] WARNING:  AbortTransaction while in COMMIT state
2021-12-21 12:54:08.273 +07 [208725] PANIC:  cannot abort transaction 726, it was already committed
2021-12-21 12:54:08.775 +07 [208716] LOG:  server process (PID 208725) was terminated by signal 6: Aborted
2021-12-21 12:54:08.775 +07 [208716] DETAIL:  Failed process was running: REINDEX (TABLESPACE ts) TABLE tbl;
2021-12-21 12:54:08.775 +07 [208716] LOG:  terminating any other active server processes
2021-12-21 12:54:08.775 +07 [208716] LOG:  all server processes terminated; reinitializing

with dump:
[New LWP 208725]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew postgres [local] REINDEX              '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007feadfac7859 in __GI_abort () at abort.c:79
#2  0x000055e36b6d9ec7 in errfinish (filename=0x55e36b786e20 "xact.c", lineno=1729, funcname=0x55e36b788660 <__func__.29619> "RecordTransactionAbort") at elog.c:680
#3  0x000055e36b0d6e37 in RecordTransactionAbort (isSubXact=false) at xact.c:1729
#4  0x000055e36b0d7f64 in AbortTransaction () at xact.c:2787
#5  0x000055e36b0d88fa in AbortCurrentTransaction () at xact.c:3315
#6  0x000055e36b524f33 in PostgresMain (dbname=0x55e36d4d97b8 "postgres", username=0x55e36d4d9798 "andrew") at postgres.c:4252
#7  0x000055e36b44d1e0 in BackendRun (port=0x55e36d4d1020) at postmaster.c:4594
#8  0x000055e36b44cac5 in BackendStartup (port=0x55e36d4d1020) at postmaster.c:4322
#9  0x000055e36b448bad in ServerLoop () at postmaster.c:1802
#10 0x000055e36b448346 in PostmasterMain (argc=3, argv=0x55e36d4a84d0) at postmaster.c:1474
#11 0x000055e36b33b5ca in main (argc=3, argv=0x55e36d4a84d0) at main.c:198

Regards!




Re: [Proposal] Global temporary tables

From
wenjing
Date:


Andrew Bille <andrewbille@gmail.com> 于2021年12月21日周二 14:00写道:
Hi!
Thanks for new patches.
Yet another crash reproduced on master with v63 patches:

CREATE TABLESPACE ts LOCATION '/tmp/ts';
CREATE GLOBAL TEMP TABLE tbl (num1 bigint);
INSERT INTO tbl (num1) values (1);
CREATE INDEX tbl_idx ON tbl (num1);
REINDEX (TABLESPACE ts) TABLE tbl;
This relates to a feature added in PG14 that allows REINDEX to change the tablespace.
Thank you for pointing that out; I fixed it in v64.
Waiting for your feedback.

regards

Wenjing
 


Attachment

Re: [Proposal] Global temporary tables

From
Andrew Bille
Date:
Hi!

I could not detect crashes with your last patch, so I think the patch is ready for review.
Please also consider fixing the error messages, as the existing ones don't follow the message writing guidelines: https://www.postgresql.org/docs/14/error-style-guide.html

Regards, Andrew


Re: [Proposal] Global temporary tables

From
Wenjing Zeng
Date:
Very glad to see your reply.
Thank you very much for reviewing the code and finding so many problems.
There was a conflict between the latest code and the patch; I have corrected it and provided a new patch (v65).
Waiting for your feedback.


Regards, Wenjing.


Andrew Bille <andrewbille@gmail.com> 于2022年1月10日周一 17:17写道:
Hi!

I could not detect crashes with your last patch, so I think the patch is ready for review.
Please also consider fixing the error messages, as the existing ones don't follow the message writing guidelines: https://www.postgresql.org/docs/14/error-style-guide.html

I corrected GTT's ERROR messages according to that link and the existing error messages.
Some comments and code refactoring were also done.
 

Attachment

Re: [Proposal] Global temporary tables

From
Wenjing Zeng
Date:
Update GTT v66 to fix conflicts with the latest code.

Regards, Wenjing.


Attachment

Re: [Proposal] Global temporary tables

From
Wenjing Zeng
Date:
Update GTT v67 to fix conflicts with the latest code.

Regards, Wenjing.

Attachment

Re: [Proposal] Global temporary tables

From
Andres Freund
Date:
Hi,


This is a huge thread. Realistically reviewers and committers can't reread
it. I think there needs to be more of a description of how this works included
in the patchset and *why* it works that way. The readme does a bit of that,
but not particularly well.


On 2022-02-25 14:26:47 +0800, Wenjing Zeng wrote:
> +++ b/README.gtt.txt
> @@ -0,0 +1,172 @@
> +Global Temporary Table(GTT)
> +=========================================
> +
> +Feature description
> +-----------------------------------------
> +
> +Previously, temporary tables are defined once and automatically
> +exist (starting with empty contents) in every session before using them.

I think for a README "previously" etc. isn't good language - if it were
committed, it'd not be understandable anymore. It makes more sense for commit
messages etc.


> +Main design ideas
> +-----------------------------------------
> +In general, GTT and LTT use the same storage and buffer design and
> +implementation. The storage files for both types of temporary tables are named
> +as t_backendid_relfilenode, and the local buffer is used to cache the data.

What does "named ast_backendid_relfilenode" mean?


> +The schema of GTTs is shared among sessions while their data are not. We build
> +a new mechanisms to manage those non-shared data and their statistics.
> +Here is the summary of changes:
> +
> +1) CATALOG
> +GTTs store session-specific data. The storage information of GTTs'data, their
> +transaction information, and their statistics are not stored in the catalog.
> +
> +2) STORAGE INFO & STATISTICS INFO & TRANSACTION INFO
> +In order to maintain durability and availability of GTTs'session-specific data,
> +their storage information, statistics, and transaction information is managed
> +in a local hash table tt_storage_local_hash.

"maintain durability"? Durable across what? In the context of databases it's
typically about crash safety, but that can't be the case here.


> +3) DDL
> +Currently, GTT supports almost all table'DDL except CLUSTER/VACUUM FULL.
> +Part of the DDL behavior is limited by shared definitions and multiple copies of
> +local data, and we added some structures to handle this.

> +A shared hash table active_gtt_shared_hash is added to track the state of the
> +GTT in a different session. This information is recorded in the hash table
> +during the DDL execution of the GTT.

> +The data stored in a GTT can only be modified or accessed by owning session.
> +The statements that only modify data in a GTT do not need a high level of
> +table locking. The operations making those changes include truncate GTT,
> +reindex GTT, and lock GTT.

I think you need to introduce a bit more terminology for any of this to make
sense. Sometimes GTT means the global catalog entity, sometimes, like here, it
appears to mean the session specific contents of a GTT.

What is the state of a GTT in another session?


How do GTTs handle something like BEGIN; TRUNCATE some_gtt_table; ROLLBACK;?


> +1.2 on commit clause
> +LTT's status associated with on commit DELETE ROWS and on commit PRESERVE ROWS
> +is not stored in catalog. Instead, GTTs need a bool value on_commit_delete_rows
> +in reloptions which is shared among sessions.

Why?



> +2.3 statistics info
> +1) relpages reltuples relallvisible relfilenode

?


> +3 DDL
> +3.1. active_gtt_shared_hash
> +This is the hash table created in shared memory to trace the GTT files initialized
> +in each session. Each hash entry contains a bitmap that records the backendid of
> +the initialized GTT file. With this hash table, we know which backend/session
> +is using this GTT. Such information is used during GTT's DDL operations.

So there's a separate locking protocol for GTTs that doesn't use the normal
locking infrastructure? Why?


> +3.7 CLUSTER GTT/VACUUM FULL GTT
> +The current version does not support.

Why?


> +4 MVCC commit log(clog) cleanup
> +
> +The GTT storage file contains transaction information. Queries for GTT data rely
> +on transaction information such as clog. The transaction information required by
> +each session may be completely different.

Why is transaction information different between sessions? Or does this just
mean that different transaction ids will be accessed?



0003-gtt-v67-implementation.patch
 71 files changed, 3167 insertions(+), 195 deletions(-)

This needs to be broken into smaller chunks to be reviewable.


> @@ -677,6 +678,14 @@ _bt_getrootheight(Relation rel)
>      {
>          Buffer        metabuf;
>  
> +        /*
> +         * If a global temporary table storage file is not initialized in the
> +         * this session, its index does not have a root page, just returns 0.
> +         */
> +        if (RELATION_IS_GLOBAL_TEMP(rel) &&
> +            !gtt_storage_attached(RelationGetRelid(rel)))
> +            return 0;
> +
>          metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
>          metad = _bt_getmeta(rel, metabuf);

Stuff like this seems not acceptable. Accesses would have to be prevented much
earlier. Otherwise each index method is going to need copies of this logic. I
also doubt that _bt_getrootheight() is the only place that'd need this.


>  static void
>  index_update_stats(Relation rel,
>                     bool hasindex,
> -                   double reltuples)
> +                   double reltuples,
> +                   bool isreindex)
>  {
>      Oid            relid = RelationGetRelid(rel);
>      Relation    pg_class;
> @@ -2797,6 +2824,13 @@ index_update_stats(Relation rel,
>      Form_pg_class rd_rel;
>      bool        dirty;
>  
> +    /*
> +     * Most of the global Temp table data is updated to the local hash, and reindex
> +     * does not refresh relcache, so call a separate function.
> +     */
> +    if (RELATION_IS_GLOBAL_TEMP(rel))
> +        return index_update_gtt_relstats(rel, hasindex, reltuples, isreindex);
> +

So basically every single place in the code that does catalog accesses is
going to need a completely separate implementation for GTTs? That seems
unmaintainable.



> +/*-------------------------------------------------------------------------
> + *
> + * storage_gtt.c
> + *      The body implementation of Global temparary table.
> + *
> + * IDENTIFICATION
> + *      src/backend/catalog/storage_gtt.c
> + *
> + *      See src/backend/catalog/GTT_README for Global temparary table's
> + *      requirements and design.
> + *
> + *-------------------------------------------------------------------------
> + */

I don't think that path to the readme is correct.

Greetings,

Andres Freund



Re: [Proposal] Global temporary tables

From
Justin Pryzby
Date:
I read through this.
Find attached some language fixes.  You should be able to apply each "fix"
patch on top of your own local branch with git am, and then squish them
together.  Let me know if you have trouble with that.

I think get_seqence_start_value() should be static.  (Or otherwise, it should
be in lsyscache.c).

The include added to execPartition.c seems to be unused.

+#define RELATION_IS_TEMP_ON_CURRENT_SESSION(relation) \
+#define RELATION_IS_TEMP(relation) \
+#define RelpersistenceTsTemp(relpersistence) \
+#define RELATION_GTT_ON_COMMIT_DELETE(relation)    \

=> These macros can evaluate their arguments multiple times.
You should add a comment to warn about that.  And maybe avoid passing them a
function argument, like: RelpersistenceTsTemp(get_rel_persistence(rte->relid))
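
For example, a minimal sketch of the hazard (the macro body here is hypothetical,
not the one from your patch, and RELPERSISTENCE_GLOBAL_TEMP stands in for whatever
constant the patch actually defines):

/* Hypothetical two-test macro: the argument is evaluated twice. */
#define RelpersistenceTsTemp(relpersistence) \
    ((relpersistence) == RELPERSISTENCE_TEMP || \
     (relpersistence) == RELPERSISTENCE_GLOBAL_TEMP)

/*
 * RelpersistenceTsTemp(get_rel_persistence(rte->relid)) would therefore run
 * the syscache lookup in get_rel_persistence() twice - wasteful here, and
 * outright wrong if the argument ever has side effects.
 */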

+list_all_backend_gtt_frozenxids should return TransactionId not int.
The function name should say "oldest" and not "all" ?

I think the GUC should have a longer name.  max_active_gtt is too short for a
global var.

+#define    MIN_NUM_ACTIVE_GTT          0
+#define    DEFAULT_NUM_ACTIVE_GTT          1000
+#define    MAX_NUM_ACTIVE_GTT          1000000

+int        max_active_gtt = MIN_NUM_ACTIVE_GTT

It's being initialized to MIN, but then the GUC machinery sets it to DEFAULT.
By convention, it should be initialized to default.
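That is, something like this (sketch):

/* Initialize to the default; the GUC machinery applies it anyway. */
int        max_active_gtt = DEFAULT_NUM_ACTIVE_GTT;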

fout->remoteVersion >= 140000

=> should say 15

describe.c has gettext_noop("session"), which is a half-truth.  The data is
per-session but the table definition is persistent.

You redirect stats from pg_class and pg_statistics to a local hash table.
This is pretty hairy :(
I guess you'd also need to handle pg_statistic_ext and ext_data.
pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to look
at pg_get_gtt_statistics.

I wonder if there's a better way to do it, like updating pg_statistic but
forcing the changes to be rolled back when the session ends...  But I think
that would make long-running sessions behave badly, the same as "long-running
transactions".

Have you looked at Gilles Darold's GTT extension ?

Attachment

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:
Hi

You redirect stats from pg_class and pg_statistics to a local hash table.
This is pretty hairy :(
I guess you'd also need to handle pg_statistic_ext and ext_data.
pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to look
at pg_get_gtt_statistics.

Without this, GTT will be terribly slow, like current temporary tables, with a lot of problems from bloating of pg_class, pg_attribute and pg_depend tables.

Regards

Pavel


Re: [Proposal] Global temporary tables

From
Andres Freund
Date:
Hi,

On 2022-02-27 04:17:52 +0100, Pavel Stehule wrote:
> > You redirect stats from pg_class and pg_statistics to a local hash table.
> > This is pretty hairy :(

As is I think the patch is architecturally completely unacceptable. Having
code everywhere to redirect to manually written in-memory catalog table code
isn't maintainable.


> > I guess you'd also need to handle pg_statistic_ext and ext_data.
> > pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to
> > look
> > at pg_get_gtt_statistics.
>
> Without this, the GTT will be terribly slow like current temporary tables
> with a lot of problems with bloating of pg_class, pg_attribute and
> pg_depend tables.

I think it's not a great idea to solve multiple complicated problems at
once...

Greetings,

Andres Freund



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


ne 27. 2. 2022 v 5:13 odesílatel Andres Freund <andres@anarazel.de> napsal:
Hi,

On 2022-02-27 04:17:52 +0100, Pavel Stehule wrote:
> > You redirect stats from pg_class and pg_statistics to a local hash table.
> > This is pretty hairy :(

As is I think the patch is architecturally completely unacceptable. Having
code everywhere to redirect to manually written in-memory catalog table code
isn't maintainable.


> > I guess you'd also need to handle pg_statistic_ext and ext_data.
> > pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to
> > look
> > at pg_get_gtt_statistics.
>
> Without this, the GTT will be terribly slow like current temporary tables
> with a lot of problems with bloating of pg_class, pg_attribute and
> pg_depend tables.

I think it's not a great idea to solve multiple complicated problems at
once...

I thought about this issue for a very long time, and I didn't find anything better (without a more significant rewrite of PG storage). In a lot of projects that I know, temporary tables are strictly prohibited because of the possible devastating impact of system catalog bloat. It is a serious problem. So any implementation of GTT should answer these questions: a) how to reduce catalog bloat, b) how to allow session-related statistics for GTT. I agree that an implementation of GTT as template-based LTT (local temporary tables) can be very simple (it is possible as an extension), but it has the same unhappy performance impacts.

I don't say the current design should be accepted without any discussion and without changes. Maybe GTT based on LTT can be better than nothing (what we have now), and can be good enough for a lot of projects where the load is not too high (and almost all projects have low load). Unfortunately, it can be a trap for a lot of projects in the future, so there should be discussion and proposed solutions to fix the related issues. The performance of GTT should be fixable, so any discussion about this topic should cover protections against catalog bloat and the cost of frequent catalog updates.

But anyway, I (and probably not just me) welcome any discussion on how to implement this feature, how to solve the performance issues, and how to divide the implementation into smaller steps. I am sure a fast GTT implementation can be used for a fast implementation of LTT too, and maybe of all other temporary objects.

Regards

Pavel


Greetings,

Andres Freund

Re: [Proposal] Global temporary tables

From
Wenjing Zeng
Date:


2022年2月25日 15:45,Andres Freund <andres@anarazel.de> 写道:

Hi,


This is a huge thread. Realistically reviewers and committers can't reread
it. I think there needs to be more of a description of how this works included
in the patchset and *why* it works that way. The readme does a bit of that,
but not particularly well.
Thank you for your review of the design and code.
I'm always trying to improve it. If you are confused or need clarification on something, please point it out.




On 2022-02-25 14:26:47 +0800, Wenjing Zeng wrote:
+++ b/README.gtt.txt
@@ -0,0 +1,172 @@
+Global Temporary Table(GTT)
+=========================================
+
+Feature description
+-----------------------------------------
+
+Previously, temporary tables are defined once and automatically
+exist (starting with empty contents) in every session before using them.

I think for a README "previously" etc isn't good language - if it were
commited, it'd not be understandable anymore. It makes more sense for commit
messages etc.
Thanks for pointing it out. I will adjust the description.



+Main design ideas
+-----------------------------------------
+In general, GTT and LTT use the same storage and buffer design and
+implementation. The storage files for both types of temporary tables are named
+as t_backendid_relfilenode, and the local buffer is used to cache the data.

What does "named ast_backendid_relfilenode" mean?
This is the storage file naming format for temporary tables: 't', followed by the backend ID and the relfilenode, joined by underscores.
For example, a GTT attached in backend 3 with relfilenode 16384 is stored in the file t3_16384.
The naming rules are the same as for LTT, and the data in the file is no different from that of regular tables and LTT.
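For concreteness, here is a sketch of how such a path is formed, modeled on GetRelationPath() in src/common/relpath.c; the GTT patch reuses the LTT convention, so treat this as an illustration with example values, not the patch's code:

char path[MAXPGPATH];
Oid  dboid = 13090;         /* database OID (example value) */
int  backendid = 3;         /* owning backend's ID */
Oid  relfilenode = 16384;   /* session-private relfilenode */

/* Main fork of a temporary table, as GetRelationPath() formats it: */
snprintf(path, sizeof(path), "base/%u/t%d_%u",
         dboid, backendid, relfilenode);
/* => "base/13090/t3_16384" */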



+The schema of GTTs is shared among sessions while their data are not. We build
+new mechanisms to manage this non-shared data and its statistics.
+Here is the summary of changes:
+
+1) CATALOG
+GTTs store session-specific data. The storage information of GTTs' data, their
+transaction information, and their statistics are not stored in the catalog.
+
+2) STORAGE INFO & STATISTICS INFO & TRANSACTION INFO
+In order to maintain durability and availability of GTTs' session-specific data,
+their storage information, statistics, and transaction information is managed
+in a local hash table tt_storage_local_hash.

"maintain durability"? Durable across what? In the context of databases it's
typically about crash safety, but that can't be the case here.
It means that a GTT's transaction information (relfrozenxid/relminmxid), storage information (relfilenode),
and statistics (relpages, etc.) are maintained in the hash table, not in pg_class.
This is to allow a GTT to store its own local data in different sessions and to avoid frequent catalog changes.
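To make this concrete, here is a minimal sketch of the kind of entry tt_storage_local_hash could keep in place of the corresponding pg_class columns; the struct and field names are illustrative assumptions, not the patch's actual definitions:

typedef struct GttLocalEntry
{
    Oid             relid;          /* hash key: the GTT's pg_class OID */

    /* storage info, in place of pg_class.relfilenode */
    Oid             relfilenode;

    /* transaction info, in place of relfrozenxid/relminmxid */
    TransactionId   relfrozenxid;
    MultiXactId     relminmxid;

    /* statistics, in place of relpages/reltuples/relallvisible */
    BlockNumber     relpages;
    double          reltuples;
    BlockNumber     relallvisible;
} GttLocalEntry;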



+3) DDL
+Currently, GTT supports almost all table DDL except CLUSTER/VACUUM FULL.
+Part of the DDL behavior is limited by shared definitions and multiple copies of
+local data, and we added some structures to handle this.

+A shared hash table active_gtt_shared_hash is added to track the state of the
+GTT in a different session. This information is recorded in the hash table
+during the DDL execution of the GTT.

+The data stored in a GTT can only be modified or accessed by the owning session.
+The statements that only modify data in a GTT do not need a high level of
+table locking. The operations making those changes include truncate GTT,
+reindex GTT, and lock GTT.

I think you need to introduce a bit more terminology for any of this to make
sense. Sometimes GTT means the global catalog entity, sometimes, like here, it
appears to mean the session specific contents of a GTT.

What state of a GTT in another session?


How do GTTs handle something like BEGIN; TRUNCATE some_gtt_table; ROLLBACK;?

A GTT behaves exactly like a regular table here.
Specifically, the latest relfilenode for the current session is stored in the session-local hash table, and TRUNCATE may replace it with a new one.
If the transaction rolls back, the old relfilenode becomes active again, just as it would if it were stored in pg_class.
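Roughly, the idea looks like this (hypothetical names, not the patch's code): TRUNCATE swaps a new relfilenode into the session-local entry and remembers the old one, and transaction abort swaps it back, mirroring the pg_class-based relfilenode swap for regular tables:

typedef struct GttTruncateUndo
{
    Oid     relid;              /* which GTT was truncated */
    Oid     old_relfilenode;    /* pre-TRUNCATE storage, kept until commit */
    Oid     new_relfilenode;    /* empty storage created by TRUNCATE */
} GttTruncateUndo;

/* Called during transaction-abort cleanup for each pending truncation. */
static void
gtt_atabort_restore_relfilenode(GttLocalEntry *entry, GttTruncateUndo *undo)
{
    Assert(entry->relid == undo->relid);

    /*
     * Make the pre-TRUNCATE file the table's storage again, exactly as if
     * a pg_class.relfilenode update had been rolled back.  (Unlinking the
     * abandoned new file happens in the same cleanup pass; omitted here.)
     */
    entry->relfilenode = undo->old_relfilenode;
}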



+1.2 on commit clause
+An LTT's status associated with ON COMMIT DELETE ROWS and ON COMMIT PRESERVE
+ROWS is not stored in the catalog. GTTs, by contrast, need a bool value
+on_commit_delete_rows in reloptions, which is shared among sessions.

Why?
An LTT is always created and used within the current session, so its ON COMMIT property
does not need to be shared with other sessions. This is why LTT does not record the ON COMMIT
clause in the catalog.
However, a GTT's table definition is shared between sessions, including the ON COMMIT clause,
so it needs to be saved in the catalog.
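A minimal sketch of how the shared reloption could be consulted at commit time, assuming the patch extends StdRdOptions with an on_commit_delete_rows flag (the macro and the helper below are illustrative, not the patch's):

/* True if this GTT was created with ON COMMIT DELETE ROWS. */
#define GttOnCommitDeleteRows(rel) \
    ((rel)->rd_options != NULL && \
     ((StdRdOptions *) (rel)->rd_options)->on_commit_delete_rows)

/*
 * At transaction commit, truncate the session-local storage of each
 * attached GTT whose shared reloption requests it.
 */
static void
gtt_on_commit_cleanup(Relation rel)
{
    if (GttOnCommitDeleteRows(rel))
        gtt_truncate_session_storage(rel);  /* hypothetical helper */
}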





+2.3 statistics info
+1) relpages reltuples relallvisible relfilenode

?
It was mentioned above.


+3 DDL
+3.1. active_gtt_shared_hash
+This is the hash table created in shared memory to trace the GTT files initialized
+in each session. Each hash entry contains a bitmap that records the backendid of
+the initialized GTT file. With this hash table, we know which backend/session
+is using this GTT. Such information is used during GTT's DDL operations.

So there's a separate locking protocol for GTTs that doesn't use the normal
locking infrastructure? Why?


+3.7 CLUSTER GTT/VACUUM FULL GTT
+The current version does not support them.

Why?
Currently, GTT cannot reuse the CLUSTER code path for regular tables, so I chose not to support it for now.
Also, I can't think of any scenario that would require clustering a temporary table, which
is another reason not to support CLUSTER first.




+4 MVCC commit log(clog) cleanup
+
+The GTT storage file contains transaction information. Queries for GTT data rely
+on transaction information such as clog. The transaction information required by
+each session may be completely different.

Why is transaction information different between sessions? Or does this just
mean that different transaction ids will be accessed?

It has the same meaning as pg_class.relfrozenxid.
For the same GTT, the first transaction to write data is different in each session, and
the data sets are independent of each other, so each session has its own relfrozenxid for the table.
Clog truncation during vacuum needs to take these into account.
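For illustration, a sketch of how a session could derive the oldest relfrozenxid across its attached GTTs, so vacuum can clamp the global oldest xid before truncating the clog (gtt_local_hash and GttLocalEntry are the hypothetical session-local hash sketched earlier, not the patch's code):

static HTAB *gtt_local_hash;    /* the session-local hash table */

static TransactionId
gtt_session_oldest_frozenxid(void)
{
    HASH_SEQ_STATUS status;
    GttLocalEntry  *entry;
    TransactionId   oldest = InvalidTransactionId;

    hash_seq_init(&status, gtt_local_hash);
    while ((entry = (GttLocalEntry *) hash_seq_search(&status)) != NULL)
    {
        if (!TransactionIdIsValid(oldest) ||
            TransactionIdPrecedes(entry->relfrozenxid, oldest))
            oldest = entry->relfrozenxid;
    }

    /* Published (e.g. via MyProc) so vacuum can take it into account. */
    return oldest;
}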




0003-gtt-v67-implementation.patch
71 files changed, 3167 insertions(+), 195 deletions(-)

This needs to be broken into smaller chunks to be reviewable.


@@ -677,6 +678,14 @@ _bt_getrootheight(Relation rel)
{
Buffer metabuf;

+ /*
+ * If a global temporary table's storage file is not initialized in
+ * this session, its index does not have a root page; just return 0.
+ */
+ if (RELATION_IS_GLOBAL_TEMP(rel) &&
+ !gtt_storage_attached(RelationGetRelid(rel)))
+ return 0;
+
metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
metad = _bt_getmeta(rel, metabuf);

Stuff like this seems not acceptable. Accesses would have to be prevented much
earlier. Otherwise each index method is going to need copies of this logic. I
also doubt that _bt_getrootheight() is the only place that'd need this.
You are right. This was done to handle querying a GTT that has no storage yet in the session. I don't need it anymore,
so I'll get rid of it.



static void
index_update_stats(Relation rel,
  bool hasindex,
-   double reltuples)
+   double reltuples,
+   bool isreindex)
{
Oid relid = RelationGetRelid(rel);
Relation pg_class;
@@ -2797,6 +2824,13 @@ index_update_stats(Relation rel,
Form_pg_class rd_rel;
bool dirty;

+ /*
+ * Most of a global temp table's stats are kept in the local hash, and reindex
+ * does not refresh the relcache, so call a separate function.
+ */
+ if (RELATION_IS_GLOBAL_TEMP(rel))
+ return index_update_gtt_relstats(rel, hasindex, reltuples, isreindex);
+

So basically every single place in the code that does catalog accesses is
going to need a completely separate implementation for GTTs? That seems
unmaintainable.
CREATE INDEX on a GTT and VACUUM of a GTT do this.
Some table info (relhasindex) needs to be updated in pg_class,
while other info (relpages…) does not.
Would you prefer that I extend the original function instead?





+/*-------------------------------------------------------------------------
+ *
+ * storage_gtt.c
+ *  The implementation of global temporary tables.
+ *
+ * IDENTIFICATION
+ *  src/backend/catalog/storage_gtt.c
+ *
+ *  See src/backend/catalog/GTT_README for global temporary tables'
+ *  requirements and design.
+ *
+ *-------------------------------------------------------------------------
+ */

I don't think that path to the readme is correct.
I tried to reorganize it.


Regards, Wenjing.




Greetings,

Andres Freund



Re: [Proposal] Global temporary tables

From
Wenjing Zeng
Date:


On Feb 27, 2022, at 08:21, Justin Pryzby <pryzby@telsasoft.com> wrote:

I read through this.
Find attached some language fixes.  You should be able to apply each "fix"
patch on top of your own local branch with git am, and then squash them
together.  Let me know if you have trouble with that.

I think get_seqence_start_value() should be static.  (Or otherwise, it should
be in lsyscache.c).

The include added to execPartition.c seems to be unused.

+#define RELATION_IS_TEMP_ON_CURRENT_SESSION(relation) \
+#define RELATION_IS_TEMP(relation) \
+#define RelpersistenceTsTemp(relpersistence) \
+#define RELATION_GTT_ON_COMMIT_DELETE(relation)    \

=> These macros can evaluate their arguments multiple times.
You should add a comment to warn about that.  And maybe avoid passing them a
function argument, like: RelpersistenceTsTemp(get_rel_persistence(rte->relid))
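To illustrate the hazard (the macro body here is a guess at its shape; RELPERSISTENCE_GLOBAL_TEMP is assumed to be the patch's new persistence value):

#define RelpersistenceTsTemp(relpersistence) \
    ((relpersistence) == RELPERSISTENCE_TEMP || \
     (relpersistence) == RELPERSISTENCE_GLOBAL_TEMP)

static bool
is_temp_rte(RangeTblEntry *rte)
{
    /*
     * Passing a function call expands it twice, costing two syscache
     * lookups per invocation:
     *
     *     return RelpersistenceTsTemp(get_rel_persistence(rte->relid));
     *
     * Evaluating the argument once first avoids that:
     */
    char    persistence = get_rel_persistence(rte->relid);

    return RelpersistenceTsTemp(persistence);
}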

+list_all_backend_gtt_frozenxids should return TransactionId not int.
The function name should say "oldest" and not "all" ?

I think the GUC should have a longer name.  max_active_gtt is too short for a
global var.

+#define    MIN_NUM_ACTIVE_GTT          0
+#define    DEFAULT_NUM_ACTIVE_GTT          1000
+#define    MAX_NUM_ACTIVE_GTT          1000000

+int        max_active_gtt = MIN_NUM_ACTIVE_GTT

It's being initialized to MIN, but then the GUC machinery sets it to DEFAULT.
By convention, it should be initialized to default.
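A sketch of the usual convention (the longer GUC name follows the suggestion above; it is not the patch's current name):

#define MIN_NUM_ACTIVE_GTT      0
#define DEFAULT_NUM_ACTIVE_GTT  1000
#define MAX_NUM_ACTIVE_GTT      1000000

/* Initialize the C variable to the same value the GUC boots to. */
int max_active_global_temporary_tables = DEFAULT_NUM_ACTIVE_GTT;

/* ... and the matching guc.c ConfigureNamesInt[] entry: */
{
    {"max_active_global_temporary_tables", PGC_POSTMASTER, RESOURCES_MEM,
        gettext_noop("Sets the maximum number of global temporary tables "
                     "that can hold data concurrently."),
        NULL
    },
    &max_active_global_temporary_tables,
    DEFAULT_NUM_ACTIVE_GTT, MIN_NUM_ACTIVE_GTT, MAX_NUM_ACTIVE_GTT,
    NULL, NULL, NULL
},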

fout->remoteVersion >= 140000

=> should say 15

describe.c has gettext_noop("session"), which is a half-truth.  The data is
per-session but the table definition is persistent.
Thanks for your advice. I will try to merge these fixes into the code.


You redirect stats from pg_class and pg_statistics to a local hash table.
This is pretty hairy :(
I guess you'd also need to handle pg_statistic_ext and ext_data.
pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to look
at pg_get_gtt_statistics.

I wonder if there's a better way to do it, like updating pg_statistic but
forcing the changes to be rolled back when the session ends...  But I think
that would make longrunning sessions behave badly, the same as "longrunning
transactions".

There are three pieces of session-level GTT data that need to be managed:
1. session-level storage info, like the relfilenode
2. session-level transaction info, like the relfrozenxid
3. session-level statistics, like relpages or column stats

I think 1 and 2 are necessary, but the statistics are not. In a previous
email it was suggested that GTT statistics not be processed at all, meaning
they would be recorded neither in the local hash nor in the catalog. In my
observation, very few users need an accurate query plan for temporary tables
badly enough to run a manual ANALYZE. Of course, doing this would also avoid
the catalog bloat and performance problems.



Have you looked at Gilles Darold's GTT extension?
If you are referring to https://github.com/darold/pgtt , yes.
It is clever to use an unlogged table as a template and then use an LTT to read and write the data.
About that implementation, I want to point out two things:
1. On the first insert into the GTT in each session, CREATE TABLE or CREATE INDEX is implicitly executed.
2. The catalog bloat caused by LTT still exists.


Regards, Wenjing.


<0002-f-0002-gtt-v64-doc.txt><0004-f-0003-gtt-v64-implementation.txt><0006-f-0004-gtt-v64-regress.txt>

Re: [Proposal] Global temporary tables

From
Wenjing Zeng
Date:

> On Feb 27, 2022, at 12:13, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2022-02-27 04:17:52 +0100, Pavel Stehule wrote:
>>> You redirect stats from pg_class and pg_statistics to a local hash table.
>>> This is pretty hairy :(
>
> As is I think the patch is architecturally completely unacceptable. Having
> code everywhere to redirect to manually written in-memory catalog table code
> isn't maintainable.
>
>
>>> I guess you'd also need to handle pg_statistic_ext and ext_data.
>>> pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to
>>> look
>>> at pg_get_gtt_statistics.
>>
>> Without this, the GTT will be terribly slow like current temporary tables
>> with a lot of problems with bloating of pg_class, pg_attribute and
>> pg_depend tables.
>
> I think it's not a great idea to solve multiple complicated problems at
> once...

I'm trying to break down the entire implementation into multiple sub-patches.


Regards, Wenjing.


>
> Greetings,
>
> Andres Freund
>
>




Re: [Proposal] Global temporary tables

From
Adam Brusselback
Date:
>In my observation, very few users need an accurate query plan for temporary
>tables badly enough to run a manual ANALYZE.

Absolutely not true in my observations or personal experience. It's one of the main reasons I have needed to use (local) temporary tables rather than just materializing a CTE when decomposing queries that are too complex for Postgres to handle.

I wish I could use GTT to avoid the catalog bloat in those instances, but that will only be possible if the query plans are accurate.

Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:


On Wed, Mar 2, 2022 at 19:02, Adam Brusselback <adambrusselback@gmail.com> wrote:
> >In my observation, very few users need an accurate query plan for temporary
> >tables badly enough to run a manual ANALYZE.
>
> Absolutely not true in my observations or personal experience. It's one of
> the main reasons I have needed to use (local) temporary tables rather than
> just materializing a CTE when decomposing queries that are too complex for
> Postgres to handle.
>
> I wish I could use GTT to avoid the catalog bloat in those instances, but
> that will only be possible if the query plans are accurate.

This strongly depends on usage. Very common patterns ported from MSSQL don't need statistics. On the other hand, sometimes a query has to be divided up, with temp tables storing intermediate results; in that case, you cannot live without statistics. In the first case, the temp tables can be replaced by arrays. In the second case, the temp tables are not replaceable.

Regards

Pavel

Re: [Proposal] Global temporary tables

From
Andres Freund
Date:
Hi,

On 2022-02-27 06:09:54 +0100, Pavel Stehule wrote:
> On Sun, Feb 27, 2022 at 5:13, Andres Freund <andres@anarazel.de> wrote:
> > On 2022-02-27 04:17:52 +0100, Pavel Stehule wrote:
> > > Without this, the GTT will be terribly slow like current temporary tables
> > > with a lot of problems with bloating of pg_class, pg_attribute and
> > > pg_depend tables.
> >
> > I think it's not a great idea to solve multiple complicated problems at
> > once...

> I have thought about this issue for a very long time, and I haven't found
> anything better (short of a more significant rewrite of PG storage). In a
> lot of projects that I know of, temporary tables are strictly prohibited
> because of their possible devastating impact on system catalog bloat. It
> is a serious problem. So any implementation of GTT should answer these
> questions: a) how to reduce catalog bloat, and b) how to allow
> session-related statistics for GTT. I agree that an implementation of GTT
> as a template-based LTT (local temporary tables) can be very simple (it
> is possible as an extension), but it comes with the same unhappy
> performance impact.

> I am not saying the current design should be accepted without any
> discussion or changes. Maybe a GTT based on LTT is better than nothing
> (which is what we have now), and good enough for a lot of projects where
> the load is not too high (and almost all projects have low load).

I think there's just no way that it can be merged with anything close to the
current design - it's unmaintainable. The need for the feature doesn't change
that.

That's not to say it's impossible to come up with a workable design. But it's
definitely not easy. If I were to work on this - which I am not planning to -
I'd try to solve the problems of "LTT" first, with an eye towards using the
infrastructure for GTT.

I think you'd basically have to come up with a generic design for partitioning
catalog tables into local / non-local storage, without needing explicit code
for each catalog. That could also be used to store the default catalog
contents separately from user defined ones (e.g. pg_proc is pretty large).

Greetings,

Andres Freund



Re: [Proposal] Global temporary tables

From
Pavel Stehule
Date:

Hi


> I think you'd basically have to come up with a generic design for partitioning
> catalog tables into local / non-local storage, without needing explicit code
> for each catalog. That could also be used to store the default catalog
> contents separately from user defined ones (e.g. pg_proc is pretty large).

There is still a risk of bloat in the local storage, but, mainly, you would probably have to modify a lot of code, because the system cache doesn't support partitioning.

Regards

Pavel
 


Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Wed, Mar 2, 2022 at 4:18 PM Andres Freund <andres@anarazel.de> wrote:
> I think there's just no way that it can be merged with anything close to the
> current design - it's unmaintainable. The need for the feature doesn't change
> that.

I don't know whether the design is right or wrong, but I agree that a
bad design isn't OK just because we need the feature. I'm not entirely
convinced that the change to _bt_getrootheight() is a red flag,
although I agree that there is a need to explain and justify why
similar changes aren't needed in other places. But I think overall
this patch is just too big and too unpolished to be seriously
considered. It clearly needs to be broken down into incremental
patches that are not just separated by topic but potentially
independently committable, with proposed commit messages for each.

And, like, there's a long history on this thread of people pointing
out particular crash bugs and particular problems with code comments
or whatever and I guess those are getting fixed as they are reported,
but I do not have the feeling that the overall code quality is
terribly high, because people just keep finding more stuff. Like look
at this:

+ uint8 flags = 0;
+
+ /* return 0 if feature is disabled */
+ if (max_active_gtt <= 0)
+ return InvalidTransactionId;
+
+ /* Disable in standby node */
+ if (RecoveryInProgress())
+ return InvalidTransactionId;
+
+ flags |= PROC_IS_AUTOVACUUM;
+ flags |= PROC_IN_LOGICAL_DECODING;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+ arrayP = procArray;
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ PGPROC    *proc = &allProcs[pgprocno];
+ uint8 statusFlags = ProcGlobal->statusFlags[index];
+ TransactionId gtt_frozenxid = InvalidTransactionId;
+
+ if (statusFlags & flags)
+ continue;

This looks like code someone wrote, modified multiple times as they
found problems, and never cleaned up. 'flags' gets set to 0, and then
unconditionally gets two bits or'd in, and then we test it against
statusFlags. Probably there shouldn't be a local variable at all, and
if there is, the value should be set properly from the start instead
of constructed incrementally as we go along. And there should be
comments. Why is it OK to return InvalidTransactionId in standby mode?
Why is it OK to pass that flags value? And, if we look at this
function a little further down, is it really OK to hold ProcArrayLock
across an operation that could perform multiple memory allocation
operations? I bet it's not, unless calls are very infrequent in
practice.

I'm not asking for this particular part of the code to be cleaned up.
I'm asking for the whole patch to be cleaned up. Like, nobody who is a
committer is going to have enough time to go through the patch
function by function and point out issues on this level of detail in
every place where they occur. Worse, discussing all of those issues is
just a distraction from the real task of figuring out whether the
design needs adjustment. Because the patch is one massive code drop,
and with not-really-that-clean code and not-that-great comments, it's
almost impossible to review. I don't plan to try unless the quality
improves a lot. I'm not saying it's the worst code ever written, but I
think it's kind of at a level of "well, it seems to work for me," and
the standard around here is higher than that. It's not the job of the
community or of individual committers to prove that problems exist in
this patch and therefore it shouldn't be committed. It's the job of
the author to prove that there aren't and it should be. And I don't
think we're close to that at all.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: [Proposal] Global temporary tables

From
Greg Stark
Date:
It doesn't look like this is going to get committed this release
cycle. I understand more feedback could be valuable, especially on the
overall design, but as this is the last commitfest of the release we
should focus on other patches for now and spend that time in the next
release cycle.

I'm going to bump this one now as Waiting on Author for the design
documentation Robert asks for and probably a plan for how to separate
that design into multiple separable features as Andres suggested.

I'm still hopeful we get to advance this early in 16 because I think
everyone agrees the feature would be great.



Re: [Proposal] Global temporary tables

From
Robert Haas
Date:
On Thu, Mar 3, 2022 at 3:29 PM Greg Stark <stark@mit.edu> wrote:
> I'm still hopeful we get to advance this early in 16 because I think
> everyone agrees the feature would be great.

I'm not saying this patch can't make progress, but I think the chances
of this being ready to commit any time in the v16 release cycle, let
alone at the beginning, are low. This patch set has been around since
2019, and here Andres and I are saying it's not even really reviewable
in the shape that it's in. I have done some review of it previously,
BTW, but eventually I gave up because it just didn't seem like we were
making any progress. And then a long time after that people were still
finding many server crashes with relatively simple test cases.

I agree that the feature is desirable, but I think getting there is
going to require a huge amount of effort that may amount to a total
rewrite of the patch.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: [Proposal] Global temporary tables

From
Andres Freund
Date:
Hi,

On 2022-03-03 16:07:37 -0500, Robert Haas wrote:
> On Thu, Mar 3, 2022 at 3:29 PM Greg Stark <stark@mit.edu> wrote:
> > I'm still hopeful we get to advance this early in 16 because I think
> > everyone agrees the feature would be great.
> 
> I'm not saying this patch can't make progress, but I think the chances
> of this being ready to commit any time in the v16 release cycle, let
> alone at the beginning, are low. This patch set has been around since
> 2019, and here Andres and I are saying it's not even really reviewable
> in the shape that it's in. I have done some review of it previously,
> BTW, but eventually I gave up because it just didn't seem like we were
> making any progress. And then a long time after that people were still
> finding many server crashes with relatively simple test cases.
> 
> I agree that the feature is desirable, but I think getting there is
> going to require a huge amount of effort that may amount to a total
> rewrite of the patch.

Agreed. I think this needs very fundamental design work, and the patch itself
isn't worth reviewing until that's tackled.

Greetings,

Andres Freund



Re: [Proposal] Global temporary tables

From
Jacob Champion
Date:
On 3/3/22 13:20, Andres Freund wrote:
> On 2022-03-03 16:07:37 -0500, Robert Haas wrote:
>> I agree that the feature is desirable, but I think getting there is
>> going to require a huge amount of effort that may amount to a total
>> rewrite of the patch.
> 
> Agreed. I think this needs very fundamental design work, and the patch itself
> isn't worth reviewing until that's tackled.

Given two opinions that the patch can't be effectively reviewed as-is, I
will mark this Returned with Feedback for this commitfest. Anyone up for
shepherding the design conversations going forward?

--Jacob