Thread: Implementation of global temporary tables?

Hello,

Six years ago we discussed global temporary tables - persistent schema, ephemeral data:
http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html

I am thinking some reasons why to implement this feature are valid:

* we can get some performance benefit against current temp tables - less catalogue bloating,
* we can simplify static validation of plpgsql functions when temp tables are used; global temp tables are a little more comfortable for developers,
* we can simplify migration from some other databases, where global temp tables are the default.

Some things have changed since 2009:

* We have unlogged tables
* We have updatable CTEs

Open questions:

1. Name and conformance of this feature with the standard - because we don't support modules, a mapping ANSI -> PG will not be trivial.

2. Implementation. I see three possible ways how to implement it:

2.a - using on-demand created temp tables - the simplest solution, but it doesn't help with catalogue bloating
2.b - using unlogged tables for holding statistics, relfilenode, and all necessary data
3.c - store ephemeral metadata only in memory, without MVCC

Is there still interest in this feature?

Comments, notes?

Pavel
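To make the proposal concrete, here is a sketch of the syntax under discussion next to today's workaround. The GLOBAL TEMPORARY form below follows the SQL standard and databases such as Oracle and DB2; PostgreSQL does not accept it, and the table and column names are invented for illustration:

```sql
-- Persistent definition, created once (hypothetical in PostgreSQL;
-- syntax modeled on the SQL standard):
CREATE GLOBAL TEMPORARY TABLE session_cart (
    user_id    integer,
    product_id integer
) ON COMMIT PRESERVE ROWS;

-- What every session has to run today instead:
CREATE TEMP TABLE session_cart (
    user_id    integer,
    product_id integer
);
```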
2015-02-02 11:15 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

> Six years ago we discussed global temporary tables - persistent schema,
> ephemeral data:
> http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html

I forgot another possible benefit:

* using temp tables on slaves (needs the 3.c implementation)

Comments, notes?

Pavel
> Some things have changed since 2009:
>
> * We have unlogged tables
> * We have updatable CTEs
>
> Open questions:
>
> 1. Name and conformance of this feature with the standard - because we
> don't support modules, a mapping ANSI -> PG will not be trivial.
>
> 2. Implementation. I see three possible ways how to implement it:
>
> 2.a - using on-demand created temp tables - the simplest solution, but it
> doesn't help with catalogue bloating
> 2.b - using unlogged tables for holding statistics, relfilenode, and all
> necessary data
> 3.c - store ephemeral metadata only in memory, without MVCC
With 2.a, essentially, we are defining a global definition of a temp table, but the actual per-session objects still follow the same rules as our current temp tables do?
2.b seems like a lot of new data, and if we are defining new paths for e.g. statistics for global temp tables, we might end up adding new logic in planner to use those tables. I am not seeing how this will work.
Could you elaborate a bit on 3.c please?
Something that really bothers me here, on a different note, is the catalog churn this could cause. Of course, you mentioned it as a point in your email, but I feel that 2.a's showstopper could be the massive catalog churn it causes. Maybe have a way to manage such tables without making pg_class bloat pretty quickly (I am assuming you do not mean the catalog metadata when you say "metadata" in 3.c).
Regards,
Atri
2015-02-02 11:51 GMT+01:00 Atri Sharma <atri.jiit@gmail.com>:
> With 2.a, essentially, we are defining a global definition of a temp table,
> but the actual per-session objects still follow the same rules as our
> current temp tables do?
Yes - it means "a global temp table is a template for a local temp table".
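The "template" semantics can be sketched as follows; the table name is invented, and the behavior shown is the proposed one, not anything PostgreSQL implements today:

```sql
-- Defined once; only the definition is shared:
CREATE GLOBAL TEMPORARY TABLE session_state (k text, v text);

-- Session A:
INSERT INTO session_state VALUES ('user', 'alice');
SELECT count(*) FROM session_state;   -- 1: A sees only its own rows

-- Session B, concurrently, with no CREATE of its own:
SELECT count(*) FROM session_state;   -- 0: B works on a fresh private copy
```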
> 2.b seems like a lot of new data, and if we are defining new paths for e.g.
> statistics for global temp tables, we might end up adding new logic in the
> planner to use those tables. I am not seeing how this will work.

The advantage of this method is transactional behaviour, and it moves some of the bloating content into specific smaller, unlogged tables.

> Could you elaborate a bit on 3.c please?
>
> Something that really bothers me here, on a different note, is the catalog
> churn this could cause. Of course, you mentioned it as a point in your
> email, but I feel that 2.a's showstopper could be the massive catalog churn
> it causes. Maybe have a way to manage such tables without making pg_class
> bloat pretty quickly (I am assuming you do not mean the catalog metadata
> when you say "metadata" in 3.c).

3.c is good protection against catalog bloating; on the other hand, the implementation will probably be more complex.
Hi,

On 2015-02-02 11:15:22 +0100, Pavel Stehule wrote:
> Six years ago we discussed global temporary tables - persistent schema,
> ephemeral data.
>
> http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html
>
> I am thinking some reasons why to implement this feature are valid:
>
> * we can get some performance benefit against current temp tables - less
> catalogue bloating,
> * we can simplify static validation of plpgsql functions when temp tables
> are used; global temp tables are a little more comfortable for developers,
> * we can simplify migration from some other databases, where global temp
> tables are the default.

I agree that the feature would be interesting.

> 2. Implementation
>
> I see three possible ways how to implement it:
>
> 2.a - using on-demand created temp tables - the simplest solution, but it
> doesn't help with catalogue bloating

Yea, that's no good.

> 2.b - using unlogged tables for holding statistics, relfilenode, and all
> necessary data

I can't follow why that'd achieve anything?

> 3.c - store ephemeral metadata only in memory without MVCC

I think that's not an option. That'd end up being a massive amount of duplication at a low rate of functionality.

I think the more realistic way to implement this is to have a separate 'relpersistence' setting for global temp tables. The first access to such a table in a session (or xact if truncate on commit) copies the table from the _init fork. By having the backend id in all filenames (besides the init fork) they're unique between sessions.

Or something roughly like that.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
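On disk, the naming scheme sketched above would look roughly like this, extrapolating from PostgreSQL's existing temp-relation convention (t<backendId>_<relfilenode>); the database and relfilenode oids are invented:

```
base/12345/16994_init    <- shared template copy (init fork, no backend id)
base/12345/t3_16994      <- session with backend id 3, made on first access
base/12345/t7_16994      <- session with backend id 7; names never collide
```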
2015-02-02 12:04 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

> Hi,
>
> On 2015-02-02 11:15:22 +0100, Pavel Stehule wrote:
> > Six years ago we discussed global temporary tables - persistent schema,
> > ephemeral data.
> >
> > http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html
> >
> > I am thinking some reasons why to implement this feature are valid:
> >
> > * we can get some performance benefit against current temp tables - less
> > catalogue bloating,
> > * we can simplify static validation of plpgsql functions when temp tables
> > are used; global temp tables are a little more comfortable for developers,
> > * we can simplify migration from some other databases, where global temp
> > tables are the default.
>
> I agree that the feature would be interesting.
>
> > 2. Implementation
> >
> > I see three possible ways how to implement it:
> >
> > 2.a - using on-demand created temp tables - the simplest solution, but it
> > doesn't help with catalogue bloating
>
> Yea, that's no good.
>
> > 2.b - using unlogged tables for holding statistics, relfilenode, and all
> > necessary data
>
> I can't follow why that'd achieve anything?

1. The main catalogue will be stable.

2. It is not necessary to implement new storage, and it can help with transaction support.

> > 3.c - store ephemeral metadata only in memory without MVCC
>
> I think that's not an option. That'd end up being a massive amount of
> duplication at a low rate of functionality.

I don't plan to implement storage - I expect only a few functions to store/read data from the session memory context.

> I think the more realistic way to implement this is to have a separate
> 'relpersistence' setting for global temp tables. The first access to such a
> table in a session (or xact if truncate on commit) copies the table from
> the _init fork. By having the backend id in all filenames (besides the init
> fork) they're unique between sessions.

If I understand well, it is similar to my quick implementation from 2008. It works only partially, because it doesn't solve the other (session) properties - like relpages, reltuples and the related data from pg_statistic.

> Or something roughly like that.
>
> Greetings,
>
> Andres Freund
On 2015-02-02 12:24:44 +0100, Pavel Stehule wrote:
> > > 2.b - using unlogged tables for holding statistics, relfilenode, and all
> > > necessary data
> >
> > I can't follow why that'd achieve anything?
>
> 1. The main catalogue will be stable.
> 2. It is not necessary to implement new storage, and it can help with
> transaction support.

The amount of complexity that'd be involved to store catalog data in a separate relation around the caches and accesses would be significant. I don't think that's a realistic option.

> > > 3.c - store ephemeral metadata only in memory without MVCC
> >
> > I think that's not an option. That'd end up being a massive amount of
> > duplication at a low rate of functionality.
>
> I don't plan to implement storage - I expect only a few functions to
> store/read data from the session memory context.

What does it have to do with temp tables then?

> > I think the more realistic way to implement this is to have a separate
> > 'relpersistence' setting for global temp tables. The first access to such
> > a table in a session (or xact if truncate on commit) copies the table
> > from the _init fork. By having the backend id in all filenames (besides
> > the init fork) they're unique between sessions.
>
> If I understand well, it is similar to my quick implementation from 2008.
> It works only partially, because it doesn't solve the other (session)
> properties - like relpages, reltuples and the related data from
> pg_statistic.

I'm honestly not particularly concerned about that problem. For one, we don't auto-analyze/vacuum temp tables. For another, it'd be comparatively easy to gather reltuples/relpages/stats from session-local state if necessary. Those are all only accessed from a few places.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> > 1. The main catalogue will be stable.
> > 2. It is not necessary to implement new storage, and it can help with
> > transaction support.
>
> The amount of complexity that'd be involved to store catalog data in a
> separate relation around the caches and accesses would be significant. I
> don't think that's a realistic option.

Not to mention the problems we might end up in. We still have corner cases in our cache code, and a new heap on top of it all might be just too painful.

> > I don't plan to implement storage - I expect only a few functions to
> > store/read data from the session memory context.
>
> What does it have to do with temp tables then?

I think what Pavel means here is that we do not need a full-fledged heap layer, rather only a minimal API over a per-session memory context. However, that might still be just as painful, because we will eventually end up inventing mechanisms for syscache and typcache to work with this storage, which IMO is the biggest pain point around this idea.

Regards,

Atri
l'apprenant
2015-02-02 13:15 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

> On 2015-02-02 12:24:44 +0100, Pavel Stehule wrote:
> > 1. The main catalogue will be stable.
> > 2. It is not necessary to implement new storage, and it can help with
> > transaction support.
>
> The amount of complexity that'd be involved to store catalog data in a
> separate relation around the caches and accesses would be significant. I
> don't think that's a realistic option.
>
> > I don't plan to implement storage - I expect only a few functions to
> > store/read data from the session memory context.
>
> What does it have to do with temp tables then?

It is the mechanism for storing the session metadata related to global temp tables.

> > If I understand well, it is similar to my quick implementation from 2008.
> > It works only partially, because it doesn't solve the other (session)
> > properties - like relpages, reltuples and the related data from
> > pg_statistic.
>
> I'm honestly not particularly concerned about that problem. For one, we
> don't auto-analyze/vacuum temp tables. For another, it'd be comparatively
> easy to gather reltuples/relpages/stats from session-local state if
> necessary. Those are all only accessed from a few places.

So I don't see a big difference from 3.c - all session metadata will be stored in session memory.

I didn't write the code, so I don't know the details.
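What "session-local stats" could look like to the user can be sketched like this, assuming (hypothetically - none of this is implemented) that ANALYZE on a global temp table wrote only to per-session memory; the table name is invented:

```sql
-- Hypothetical behavior: ANALYZE updates only this session's private copy
-- of relpages/reltuples/pg_statistic entries, never the shared catalog row.
ANALYZE my_gtt;

-- Each session would then get plans based on its own contents, while the
-- template's catalog entry stays untouched.
EXPLAIN SELECT * FROM my_gtt WHERE id = 1;
```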
2015-02-02 13:36 GMT+01:00 Atri Sharma <atri.jiit@gmail.com>:

> > The amount of complexity that'd be involved to store catalog data in a
> > separate relation around the caches and accesses would be significant. I
> > don't think that's a realistic option.
>
> Not to mention the problems we might end up in. We still have corner cases
> in our cache code, and a new heap on top of it all might be just too
> painful.
>
> > What does it have to do with temp tables then?
>
> I think what Pavel means here is that we do not need a full-fledged heap
> layer, rather only a minimal API over a per-session memory context.
> However, that might still be just as painful, because we will eventually
> end up inventing mechanisms for syscache and typcache to work with this
> storage, which IMO is the biggest pain point around this idea.

It should be solvable. I see another risk - if we accelerate work with temp tables, then a 4-byte oid might not be enough.
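A back-of-envelope calculation for the oid concern; the workload numbers below are invented purely for illustration:

```sql
-- 1000 sessions, each burning ~10 catalog oids per second
-- (temp table + indexes + toast, say), against a 32-bit oid counter:
SELECT (2.0 ^ 32) / (1000 * 10) / 86400 AS days_until_wraparound;
-- roughly 5 days before the counter wraps
```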
On Mon, Feb 2, 2015 at 6:34 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> It should be solvable. I see another risk - if we accelerate work with temp
> tables, then a 4-byte oid might not be enough.

Hrm, that might well be true. It might be worth the effort to find a better way to materialize global temp tables then, like having a single OID and only materializing a relfilenode for a session when the session inserts into the temp table. Not sure here at all...

Regards,

Atri
l'apprenant
Hi,

2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

> There have been several threads here discussing global temp tables since
> 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
> metadata of the session copy in the catalog. However, it seems that none of
> them has been implemented, or even has a feasible design. So why don't we
> implement it in an unoptimized way first?

I am not sure that it is not useless work.

Now I am thinking the best implementation of global temp tables is to enhance unlogged tables to have local content. All local data can be saved in session memory - usually it is less than 2KB including statistics - and you don't need to store it in the catalogue. Whenever anybody works with a table, the related data are copied to the system cache, and an implementation of global temp tables can be injected there.

regards

Pavel Stehule
> more global temp tables are little bit comfortable for developers,
I'd like to emphasize this point. This feature does much more than saving a developer from issuing a CREATE TEMP TABLE statement in every session. Here are two common use cases and I'm sure there are more.
(1)
Imagine in a web application scenario, a developer wants to cache some session information in a temp table. What's more, he also wants to specify some rules which reference the session information. Without this feature, the rules will be removed at the end of every session since they depend on a temporary object. Global temp tables will allow the developer to define the temp table and the rules once.
(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about global temp tables.
(http://www.postgresql.org/message-id/9319.1272130283@sss.pgh.pa.us)
"The context that I've seen it come up in is that people don't want to clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the current behavior of temp tables."
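The boilerplate being referred to looks like this (table and column names invented; since PostgreSQL 9.1 the IF NOT EXISTS form shortens it, but every session or function still has to run it):

```sql
-- Inside any function that uses a work table, once per session:
CREATE TEMP TABLE IF NOT EXISTS work_queue (
    item_id integer,
    payload text
);
-- ...only now is it safe to INSERT/SELECT against work_queue.
```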
> 2.a - using on demand created temp tables - most simple solution, but
> doesn't help with catalogue bloating
I've read the thread, and people disapprove of this approach because of the potential catalog bloat. However, I'd like to champion it. Indeed, this approach may have a bloat issue. But users who need global temp tables currently have to create a new temp table in every session, which means they already have the bloat problem and presumably already have some policy to deal with it. In other words, implementing global temp tables this way gives users the same performance, plus the convenience the feature brings.
The root question here is "whether having the unoptimized feature is better than having no feature at all". Actually, there was a very similar discussion back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom Lane's arguments here.
Kevin Grittner's argument:
http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary tables in a way that performs better than current temporary tables, that's cool. That would be a nice "bonus" in addition to the application programmer convenience and having another tick-mark on the standards compliance charts. Do you think that's feasible? If not, the feature would be useful to some with the same performance that temporary tables currently provide."
Tom Lane's arguments:
http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do that. I don't think that you get to veto implementation of the feature until we can find a way to optimize it better. The question is not about whether having the optimization would be better than not having it --- it's about whether having the unoptimized feature is better than having no feature at all (which means people have to implement the same behavior by hand, and they'll *still* not get the optimization)."
There have been several threads here discussing global temp tables since 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the metadata of the session copy in the catalog. However, it seems that none of them has been implemented, or even has a feasible design. So why don't we implement it in an unoptimized way first?
> Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about an implementation similar to Pavel's 2009 proposal (http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com). Here are the major ideas of my design:
"The context that I've seen it come up in is that people don't want to clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the current behavior of temp tables."
> 2.a - using on demand created temp tables - most simple solution, but
> doesn't help with catalogue bloating
I've read the thread and people disapprove this approach because of the potential catalog bloat. However, I'd like to champion it. Indeed, this approach may have a bloat issue. But for users who needs global temp tables, they now have to create a new temp table in every session, which means they already have the bloat problem and presumably they have some policy to deal with it. In other words, implementing global temp tables by this approach gives users the same performance, plus the convenience the feature brings.
The root problem here is that whether "whether having the unoptimized feature is better than
having no feature at all". Actually, there was a very similar discussion back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom Lane's arguments here.
Kevin Grittner's argument:
http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary tables in a way that performs better than current temporary tables, that's cool. That would be a nice "bonus" in addition to the application programmer convenience and having another tick-mark on the standards compliance charts. Do you think that's feasible? If not, the feature would be useful to some with the same performance that temporary tables currently provide."
Tom Lane's arguments:
http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do that. I don't think that you get to veto implementation of the feature until we can find a way to optimize it better. The question is not about whether having the optimization would be better than not having it --- it's about whether having the unoptimized feature is better than having no feature at all (which means people have to implement the same behavior by hand, and they'll *still* not get the optimization)."
There have been several threads here discussing global temp table since 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the metadata of the session copy in the catalog. However, it seems that none of them has been implemented, or even has a feasible design. So why don't we implement it in a unoptimized way first?
> Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about one implementation which is similar to Pavel's 2009 proposal (http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com). Here are the major ideas of my design:
(1)
Create the cross-session persistent schema as a regular table, and create session-private temp tables when a session first accesses it.
(2)
For DML queries, the global temp table is overloaded by its session copy after the relation is opened by an oid or a rangevar. For DDL queries, which copy is used depends on whether the query needs to access the data or the metadata of the global temp table.
There are more differences between this design and Pavel's 2009 proposal, and I'd like to send a detailed proposal to the mailing list, but first I want to know whether our community would accept a global temp table implementation that provides the same performance as current temp tables do.
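Point (2) can be sketched with a hypothetical table; which copy a statement sees would depend on whether it needs data or only metadata (illustrative behavior under this design, not anything implemented):

```sql
CREATE GLOBAL TEMPORARY TABLE gtt_events (id int);  -- writes the shared definition

-- DML opens the session-private copy (materialized on first access):
INSERT INTO gtt_events VALUES (1);

-- DDL touching only metadata would operate on the shared definition:
ALTER TABLE gtt_events ADD COLUMN note text;

-- DDL that must rewrite data would need the session copies as well:
ALTER TABLE gtt_events ALTER COLUMN id TYPE bigint;
```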
Thanks,
Zhaomo
2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
> > I am not sure that it is not useless work.
>
> I don't understand why an implementation taking approach 2.a would be
> useless. As I said, its performance will be no worse than current temp
> tables, and it will provide a lot of convenience to users who need to
> create temp tables in every session.

Surely it would be a step forward. But you will have to solve lots of problems with "duplicated" tables in the system catalogue, and it still doesn't solve the main problem with temporary tables - the bloating catalogue - and the related performance degradation.

Although global temp tables are a nice-to-have feature (for PLpgSQL developers), we can live without them - and with some patterns and extensions we are living well. But the performance issue cannot be fixed by any pattern, so the major motivation for introducing global temp tables is performance - 90% of it. Fixing the bloat should be the primary target for merging this feature into upstream. I believe that once the bloating is solved, the chance of this patch being accepted will be pretty high.
Regards
Pavel
Thanks,ZhaomoOn Tue, Jul 7, 2015 at 11:53 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hi2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:> more global temp tables are little bit comfortable for developers,
I'd like to emphasize this point. This feature does much more than saving a developer from issuing a CREATE TEMP TABLE statement in every session. Here are two common use cases and I'm sure there are more.
(1)
Imagine a web application scenario where a developer wants to cache some session information in a temp table. What's more, he also wants to specify some rules which reference the session information. Without this feature, the rules are removed at the end of every session since they depend on a temporary object. Global temp tables would allow the developer to define the temp table and the rules once.
(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about global temp tables (http://www.postgresql.org/message-id/9319.1272130283@sss.pgh.pa.us):
"The context that I've seen it come up in is that people don't want to clutter their functions with create-it-if-it-doesn't-exist logic, which you have to have given the current behavior of temp tables."
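To make this concrete, here is a rough sketch of the boilerplate in question (table and column names are made up for illustration):

```sql
-- Guard logic that today must run in every session (or at the top of every
-- function) before the temp table can be used:
CREATE TEMP TABLE IF NOT EXISTS session_cache (
    key text PRIMARY KEY,
    val text
);

-- With global temp tables, the definition would instead be created once,
-- e.g. (proposed syntax, not implemented):
--   CREATE GLOBAL TEMPORARY TABLE session_cache (key text PRIMARY KEY, val text);
-- and every session could reference it directly, each seeing only its own rows.
```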
> 2.a - using on demand created temp tables - most simple solution, but
> doesn't help with catalogue bloating
I've read the thread, and people disapproved of this approach because of the potential catalog bloat. However, I'd like to champion it. Indeed, this approach may have a bloat issue. But users who need global temp tables currently have to create a new temp table in every session, which means they already have the bloat problem and presumably have some policy to deal with it. In other words, implementing global temp tables this way gives users the same performance, plus the convenience the feature brings.
The root question here is "whether having the unoptimized feature is better than having no feature at all". Actually, there was a very similar discussion back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom Lane's arguments here.
Kevin Grittner's argument:
http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary tables in a way that performs better than current temporary tables, that's cool. That would be a nice "bonus" in addition to the application programmer convenience and having another tick-mark on the standards compliance charts. Do you think that's feasible? If not, the feature would be useful to some with the same performance that temporary tables currently provide."
Tom Lane's arguments:
http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do that. I don't think that you get to veto implementation of the feature until we can find a way to optimize it better. The question is not about whether having the optimization would be better than not having it --- it's about whether having the unoptimized feature is better than having no feature at all (which means people have to implement the same behavior by hand, and they'll *still* not get the optimization)."
There have been several threads here discussing global temp tables since 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the metadata of the session copy in the catalog. However, it seems that none of them has been implemented, or even has a feasible design. So why don't we implement it in an unoptimized way first?

I am not sure, if it is not useless work.

Now I think the best implementation of global temp tables is to enhance unlogged tables to have local content. All local data can be kept in session memory - usually it is less than 2KB including statistics - and you don't need to store it in the catalog. Whenever anybody works with a table, the related data are copied to the system cache, and an implementation of global temp tables can be injected there.
Regards
Pavel Stehule
> Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about one implementation which is similar to Pavel's 2009 proposal (http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com). Here are the major ideas of my design:
(1) Creating the cross-session persistent schema as a regular table and creating session-private temp tables when a session first accesses it.
(2) For DML queries, the global temp table is overloaded by its session copy after the relation is opened by an oid or a rangevar. For DDL queries, which copy is used depends on whether the query needs to access the data or metadata of the global temp table.
There are more differences between this design and Pavel's 2009 proposal, and I'd like to send a detailed proposal to the mailing list, but first I want to know if the community would accept a global temp table implementation which provides the same performance as current temp tables do.
Thanks,
Zhaomo
On 7/9/15 12:45 AM, Pavel Stehule wrote:
> 2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
>> > I am not sure, if it is not useless work.
>> I don't understand why an implementation taking approach 2.a would be useless. As I said, its performance will be no worse than current temp tables and it will provide a lot of convenience to users who need to create temp tables in every session.
> Surely it should be step forward. But you will to have to solve lot of problems with "duplicated" tables in system catalogue, and still it doesn't solve the main problem with temporary tables - the bloating catalogue - and related performance degradation.
That being the "main" problem is strictly a matter of opinion based on your experience. Many people don't have a performance problem today, but do have to deal with all the pain of handling this manually (as well as all the limitations that go with that).
If it's easy to fix the bloat problem at the same time as adding GLOBAL TEMP then great! But there's no reason to reject this just because it doesn't fix that issue.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
--
On 7/9/15 12:45 AM, Pavel Stehule wrote:
2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu
<mailto:zhy001@cs.ucsd.edu>>:
> I am not sure, if it is not useless work.
I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.
Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.
That being the "main" problem is strictly a matter of opinion based on your experience. Many people don't have a performance problem today, but do have to deal with all the pain of handling this manually (as well as all the limitations that go with that).
If it's easy to fix the bloat problem at the same time as adding GLOBAL TEMP then great! But there's no reason to reject this just because it doesn't fix that issue.
Agreed
There are some good arguments for why we need this feature.
Pavel's original description of how to do this seems valid, and from the link, Tom agreed in 2009.
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
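Expressed with commands available today, the proposed behavior would be roughly equivalent to the following sketch (names are hypothetical, and an ordinary table stands in for the proposed global entry):

```sql
-- The persistent definition, created once. Under the proposal this would be
-- CREATE GLOBAL TEMP TABLE with its own relkind; here an ordinary table
-- stands in as the template:
CREATE TABLE app_scratch_template (id int, payload text);

-- What each backend would implicitly do on its first INSERT/UPDATE/DELETE/
-- SELECT against the table, using the global pg_class entry as a template:
CREATE TEMP TABLE app_scratch (LIKE app_scratch_template INCLUDING ALL);
```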
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/15/2015 07:58 AM, Simon Riggs wrote:
> For me the design summary is this
> * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
> * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
> That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
> The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
Sounds fine in general. I'm a bit curious to know what are the locking implications of vivifying the table on access.
cheers
andrew
2015-07-15 13:58 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:
On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 7/9/15 12:45 AM, Pavel Stehule wrote:
2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu
<mailto:zhy001@cs.ucsd.edu>>:
> I am not sure, if it is not useless work.
I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.
Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.
That being the "main" problem is strictly a matter of opinion based on your experience. Many people don't have a performance problem today, but do have to deal with all the pain of handling this manually (as well as all the limitations that go with that).
If it's easy to fix the bloat problem at the same time as adding GLOBAL TEMP then great! But there's no reason to reject this just because it doesn't fix that issue.
Agreed
There are some good arguments for why we need this feature.
Pavel's original description of how to do this seem valid, and from the link Tom agreed in 2009.
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
The optimization of local temp tables is a little bit harder - you cannot share pg_class and pg_attribute - although some in-memory entries can be used too.
Regards
Pavel
--Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2015-07-15 15:21 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
> Sounds fine in general. I'm a bit curious to know what are the locking implications of
> vivifying the table on access.
The locking implications depend on how we interpret the existing commands in the context of global temp tables, and I think we should discuss and agree on the behaviors of the commands with global temp tables. But I think in general we can follow these rules:
If the command executes on the global temp table's metadata, for example an ALTER TABLE command, then we lock the global copy at the same level as we do for a regular table.
There is another question - what is the effect of ALTER TABLE on a global temp table's instances in active sessions?
If the command executes on the global temp table's data (which is actually stored in the session copy), for example a DML command, then the global copy is locked at the AccessShareLock level to prevent concurrent modifications to the global temp table's definition from other sessions.
Thanks,
Zhaomo
On Wed, Jul 15, 2015 at 4:26 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 07/15/2015 07:58 AM, Simon Riggs wrote:
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
Sounds fine in general. I'm a bit curious to know what are the locking implications of vivifying the table on access.
cheers
andrew
On 15 July 2015 at 13:26, Andrew Dunstan <andrew@dunslane.net> wrote:
--
On 07/15/2015 07:58 AM, Simon Riggs wrote:
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
Sounds fine in general. I'm a bit curious to know what are the locking implications of vivifying the table on access.
We would lock the Global Temp Table at the same lock level for the activity, just as we do for INSERT, SELECT etc.. That prevents concurrent DDL like DROP or ALTER on the Global Temp Table.
The Local temp table created is a copy of the Global Temp Table. The Local temp table is only locally locked, so no worries.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2015-07-15 15:53 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
> there is other question - what is effect of ALTER TABLE of global temp table on
> instances of this table in active sessions?
As I said, we need to first agree on the behaviors of the existing commands. I can think of two options now for ALTER TABLE: 1) only allow ALTER TABLE when there are no other active sessions (this is how Oracle deals with it); 2) handle it as if session copies inherit from the global copy and ALTER TABLE executes on the global copy.
There are two possible kinds of GLOBAL TEMP tables - session-scoped and transaction-scoped. Transaction-scoped tables have a very short life, so option 1 needs an outage, and option 2 requires stronger locks and can be slow and less effective, because a) some changes can be invisible in other transactions (depending on the isolation level), b) the structure can be changed but function code not (regardless of the isolation level), so it can be inconsistent, and c) why change a table that will be dropped in the next millisecond? For this case, behavior like that of PL functions would be very practical - a third option for ALTER TABLE.
Regards
Pavel
Thanks,
Zhaomo
On 2015-07-15 12:58:51 +0100, Simon Riggs wrote:
> On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Pavel's original description of how to do this seem valid, and from the
> link Tom agreed in 2009.
> For me the design summary is this
> * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
> with different relkind
> * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it
> does not exist we create it as a TEMP table of the same name, using the
> Global's pg_class entry as a template
Why do we need to create that copy? We can just use the relfilenode in all backends by having the backendid in the filename? Yes, there's some amount of additional code needed, but it's not that much. I actually think it might end up being less additional code than having a copy, because with the copy you'll have two different oids for the global entry and the local copy.
Regards,
Andres
On 2015-07-15 16:52:49 +0200, Andres Freund wrote:
> Why do we need to create that copy? We can just use the relfilenode in
> all backends by having the backendid in the filename? Yes, there's a
> some amount of additional code needed, but it's not that much? I
> actually think it might end up being less additional code than having a
> copy, because with the copy you'll have two different oids for global
> entry and the local copy.
Hm, yes. Brainfart. Transaction table rewrites/truncations need to change the relfilenode.
To fix, we could add a backend-local mapping table from global temp table id to the backend-local relfilenode. The code to look up the relfilenode is already mostly isolated.
On 15 July 2015 at 15:57, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-07-15 16:52:49 +0200, Andres Freund wrote:
> Why do we need to create that copy? We can just use the relfilenode in
> all backends by having the backendid in the filename? Yes, there's a
> some amount of additional code needed, but it's not that much? I
> actually think it might end up being less additional code than having a
> copy, because with the copy you'll have two different oids for global
> entry and the local copy.
Hm, yes. Brainfart. Transaction table rewrites/truncations need to
change the relfilenode.
To fix, we could add a backend-local mapping table from global temp table
id to the backend-local relfilenode. The code to look up the relfilenode
is already mostly isolated.
It may be possible to do this, though I'm sure there's a wrinkle somewhere. But there doesn't seem to be a need to overload the main feature request with additional requirements. Doing that is just scope creep that prevents us getting features out. Nice, simple patches from newer developers. Later tuning and tweaking from more expert community members.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
> It may be possible to do this, though I'm sure there's a wrinkle somewhere.
> But there doesn't seem to be a need to overload the main feature request
> with additional requirements. Doing that is just scope creep that prevents
> us getting features out. Nice, simple patches from newer developers. Later
> tuning and tweaking from more expert community members.
I think that's generally a fair point. But here we're discussing adding a fair amount of wrinkles with the copy approach. The fact alone that the oid is different will have some ugly consequences. So we add complexity, just to shift it into different places later? I'm not sure that's a good idea.
On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
> It may be possible to do this, though I'm sure there's a wrinkle somewhere.
> But there doesn't seem to be a need to overload the main feature request
> with additional requirements. Doing that is just scope creep that prevents
> us getting features out. Nice, simple patches from newer developers. Later
> tuning and tweaking from more expert community members.
I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.
Why? We are creating a local temp table LIKE the global temp table. That is already a supported operation. So there is no "different oid".
So we add complexity, just to shift it into different places later? I'm
not sure that's a good idea.
There's no complexity in a simple temp table LIKE. We can do this now with triggers.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andres Freund <andres@anarazel.de> writes:
> On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
>> It may be possible to do this, though I'm sure there's a wrinkle somewhere.
>> But there doesn't seem to be a need to overload the main feature request
>> with additional requirements. Doing that is just scope creep that prevents
>> us getting features out. Nice, simple patches from newer developers. Later
>> tuning and tweaking from more expert community members.
> I think that's generally a fair point. But here we're discussing to add
> a fair amount of wrinkles with the copy approach. The fact alone that
> the oid is different will have some ugly consequences.
> So we add complexity, just to shift it into different places later? I'm
> not sure that's a good idea.
With all due respect, there are features that are beyond the abilities of some "newer developers", and reducing the scope isn't a good way to fix that. It just leaves a bigger mess to be cleaned up later.
I think Andres' idea of a per-backend filenode mapping table might work. The existing relfilenode mapper solves a somewhat related problem, namely how do you replace the filenode for shared system catalogs whose pg_class entries can't be changed.
regards, tom lane
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
>> I think that's generally a fair point. But here we're discussing to add
>> a fair amount of wrinkles with the copy approach. The fact alone that
>> the oid is different will have some ugly consequences.
> Why? We are creating a local temp table LIKE the global temp table. That is
> already a supported operation. So there is no "different oid".
You're presuming a specific implementation decision, one that has not been made yet, and isn't all that attractive because of the catalog bloat issues.
regards, tom lane
On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:
> On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
>> I think that's generally a fair point. But here we're discussing to add
>> a fair amount of wrinkles with the copy approach. The fact alone that
>> the oid is different will have some ugly consequences.
> Why? We are creating a local temp table LIKE the global temp table. That is
> already a supported operation. So there is no "different oid".
Then your locking against ALTER, DROP etc. isn't going to work.
On 15 July 2015 at 16:44, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:
> On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
> > I think that's generally a fair point. But here we're discussing to add
> > a fair amount of wrinkles with the copy approach. The fact alone that
> > the oid is different will have some ugly consequences.
> >
>
> Why? We are creating a local temp table LIKE the global temp table. That is
> already a supported operation. So there is no "different oid".
Then your locking against ALTER, DROP etc. isn't going to work.
There would be two objects, both locked. The temp table is just nice and simple. No problem.
Your optimization may work; I hope it does. My approach definitely will. So we could choose either.
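Simon's two-object claim can be sketched with a toy lock manager (all names invented, not PostgreSQL's lock code): a session using the global temp table holds locks on both the persistent definition and its session-local copy, so a concurrent ALTER/DROP on the definition still conflicts.

```python
# Toy model of the "two objects, both locked" argument. Hypothetical
# oids and a deliberately minimal lock manager; real PostgreSQL locking
# is far richer (lock modes, queues, deadlock detection).
from dataclasses import dataclass, field

@dataclass
class LockManager:
    held: dict = field(default_factory=dict)  # oid -> set of session ids

    def acquire(self, oid, session, exclusive=False):
        holders = self.held.setdefault(oid, set())
        # An exclusive request conflicts with any other session's lock.
        if exclusive and holders - {session}:
            raise RuntimeError(f"lock conflict on oid {oid}")
        holders.add(session)

lm = LockManager()
GLOBAL_DEF_OID, LOCAL_COPY_OID = 16384, 16999  # invented oids

# Session s1 uses the global temp table: it locks the persistent
# definition AND its private temp copy.
lm.acquire(GLOBAL_DEF_OID, "s1")
lm.acquire(LOCAL_COPY_OID, "s1")

# Session s2 attempts DROP on the global definition -> exclusive lock
# request, which conflicts with s1's lock on the definition.
try:
    lm.acquire(GLOBAL_DEF_OID, "s2", exclusive=True)
except RuntimeError as e:
    print("blocked:", e)
```

This is only the locking story; Andres's objection elsewhere in the thread is about the other consequences of the copy having a different oid (dependencies, plan invalidation, and so on), which this sketch does not address.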
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Pavel, All:

Just to be clear, the idea of a global temp table is that the table def
is available to all users, but the data is private to each session?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
2015-07-19 21:39 GMT+02:00 Josh Berkus <josh@agliodbs.com>:
> Just to be clear, the idea of a global temp table is that the table def
> is available to all users, but the data is private to each session?

yes.

Pavel
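The semantics just confirmed - one persistent definition visible to everyone, rows private per session - can be shown with a toy model (plain Python, invented names; not how PostgreSQL stores anything):

```python
# Toy model of "persistent schema, ephemeral data": a single shared
# table definition, with row storage keyed by session so each session
# only ever sees its own data.
class GlobalTempTable:
    def __init__(self, name, columns):
        self.name = name        # persistent definition, shared by all
        self.columns = columns
        self._rows = {}         # session id -> that session's private rows

    def insert(self, session, row):
        self._rows.setdefault(session, []).append(row)

    def select(self, session):
        return self._rows.get(session, [])

t = GlobalTempTable("work_queue", ["id", "payload"])
t.insert("session_a", (1, "x"))

print(t.select("session_a"))  # [(1, 'x')]
print(t.select("session_b"))  # [] - sees the definition, not A's data
```

In the real feature the per-session storage would of course be on-disk temp relations dropped at session end, not an in-memory dict; the sketch only captures the visibility rule.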
On 20/07/15 15:00, Pavel Stehule wrote:
> 2015-07-19 21:39 GMT+02:00 Josh Berkus <josh@agliodbs.com>:
>> Just to be clear, the idea of a global temp table is that the table def
>> is available to all users, but the data is private to each session?
>
> yes.
>
> Pavel

Just wondering...

Would it be difficult to add the ability for one user to share the
contents with a list of named other users (roles)?

-Gavin
2015-07-20 5:33 GMT+02:00 Gavin Flower <GavinFlower@archidevsys.co.nz>:
> Would it be difficult to add the ability for one user to share the
> contents with a list of named other users (roles)?
Probably it is possible, but not for temporary data - temporary data are
stored in process-local memory, so they are not accessible from other
sessions.
Sharing tables would need:

1. some infrastructure to hold data about the sharing - who can share what
with whom
2. a cleanup policy - temporary data are cleaned at the end of the
transaction or the end of the session; who cleans shared data?
3. data saved in shared memory instead of process memory

So it is possible, but it is a partially different feature.
Pavel Stehule wrote:
> 2015-07-20 5:33 GMT+02:00 Gavin Flower <GavinFlower@archidevsys.co.nz>:
>> Would it be difficult to add the ability for one user to share the
>> contents with a list of named other users (roles)?
>
> Probably it is possible, but not for temporary data - short data are in
> process memory, so it are not accessible from other sessions.
>
> This sharing tables needs:
>
> 1. some infrastructure to hold data about sharing - who can share with what
> 2. who will clean data? temporary data are cleaned on end of transaction or
> end of session
> 3. data should be saved in shared memory instead process memory
>
> So it is possible, but partially different

To me this gets in the "crazy ideas" list. Please add it to the TODO
page in the wiki, so that we're sure we never implement it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-07-20 15:33:32 +1200, Gavin Flower wrote:
> Would it be difficult to add the ability for one user to share the contents
> with a list of named other users (roles)?

No need. That feature is called unlogged tables and grants.

Doing this for temporary tables would be horrible. They live in
process-local memory and not shared memory - because that provides higher
isolation, not in spite of it.

Andres
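Andres's distinction can be illustrated with a toy model (invented names, not PostgreSQL internals): temp-table contents live in a backend's own process memory, which no other backend can reach, while unlogged tables live in shared storage, which is why "unlogged table plus GRANT" is the existing answer for shared-but-not-crash-safe data.

```python
# Toy illustration of process-local vs shared storage. The classes and
# names are made up; only the visibility difference is being modeled.
shared_storage = {}  # models shared buffers / disk (e.g. an unlogged table)

class BackendProcess:
    def __init__(self):
        self.local_memory = {}  # models process-local temp-table storage

    def write_temp(self, key, value):
        # Temp-table write: goes only into this process's memory.
        self.local_memory[key] = value

    def write_unlogged(self, key, value):
        # Unlogged-table write: goes into storage all backends can see.
        shared_storage[key] = value

b1, b2 = BackendProcess(), BackendProcess()
b1.write_temp("row", "private to b1")
b1.write_unlogged("row", "shared via unlogged table")

print("row" in b2.local_memory)     # False - b2 cannot see b1's temp data
print(shared_storage["row"])        # shared via unlogged table
```

Sharing temp-table contents would therefore mean moving the data out of process-local memory, giving up exactly the isolation (and avoidance of shared-memory locking) that makes temp tables cheap.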
2015-07-20 11:07 GMT+02:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> To me this gets in the "crazy ideas" list. Please add it to the TODO
> page in the wiki, so that we're sure we never implement it.
yes, it is pretty crazy - I have no plan to implement it :)
Pavel
> Just to be clear, the idea of a global temp table is that the table def
> is available to all users, but the data is private to each session?

The table def is visible to all sessions and persistent, but the data is
private to each session and temporary.
Thanks,
Zhaomo
On Wed, Jul 15, 2015 at 11:52 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 15 July 2015 at 16:44, Andres Freund <andres@anarazel.de> wrote:
>> Then your locking against ALTER, DROP etc. isn't going to work.
>
> There would be two objects, both locked. The temp table is just nice and
> simple. No problem.
>
> Your optimization may work; I hope it does. My approach definitely will. So
> we could choose either.

It's not really an optimization; it's a whole different approach. I
looked at the create-a-temp-table-on-the-fly idea back when I implemented
unlogged tables and concluded it was an unworkable mess. Deep down in the
guts of name resolution code is not the place where you want to suddenly
decide that you need to run some DDL. So I believe in what Andres is
proposing. I'm not necessarily going to shout it down if somebody finds a
way to make the temp-table-on-the-fly approach work, but my view is that
making that work, although it may look superficially appealing, will
eventually make whoever has to do it hate their life; and that even if
they get it to where it sorta works, it's going to have ugly corner cases
that are almost impossible to file down.

Another advantage of Andres's approach, BTW, is that it could potentially
eventually be extended to work on Hot Standby machines. For that to work,
we'd need a separate XID space for temporary tables, but Noah proposed
that before, and I don't think it's a completely crazy idea (just mostly
crazy).

Now, maybe nobody's going to care about that any more in 5 years if we
have full-blown logical replication deeply integrated into core, but
there's a lot to like about a design that keeps our options in that area
open.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company