Thread: Implementation of global temporary tables?

Hello,

Six years ago we discussed global temporary tables - persistent schema, ephemeral data:
http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html

I am thinking some reasons why to implement this feature are valid:

* we can get some performance benefit against current temp tables - less catalogue bloating,
* we can simplify static validation of plpgsql functions when temp tables are used; global temp tables are a little more comfortable for developers,
* we can simplify migration from some other databases, where global temp tables are the default.

Some things have changed since 2009:

* We have unlogged tables
* We have updatable CTEs

Open questions:

1. Name and conformance of this feature with the standard - because we don't support modules, a mapping ANSI -> PG will not be trivial.

2. Implementation. I see three possible ways how to implement it:

2.a - using on-demand created temp tables - the simplest solution, but it doesn't help with catalogue bloating
2.b - using unlogged tables for holding statistics, relfilenode, and all necessary data
3.c - store ephemeral metadata only in memory, without MVCC

Is there still interest in this feature?

Comments, notes?

Pavel
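To make the proposal concrete, here is a sketch of the syntax under discussion next to today's workaround. The GLOBAL TEMPORARY form below follows the SQL standard and databases such as Oracle and DB2; PostgreSQL does not accept it, and the table and column names are invented for illustration:

```sql
-- Persistent definition, created once (hypothetical in PostgreSQL;
-- syntax modeled on the SQL standard):
CREATE GLOBAL TEMPORARY TABLE session_cart (
    user_id    integer,
    product_id integer
) ON COMMIT PRESERVE ROWS;

-- What every session has to run today instead:
CREATE TEMP TABLE session_cart (
    user_id    integer,
    product_id integer
);
```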
2015-02-02 11:15 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

> Six years ago we discussed global temporary tables - persistent schema,
> ephemeral data:
> http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html

I forgot another possible benefit:

* using temp tables on slaves (needs the 3.c implementation)

Comments, notes?

Pavel
> Some things have changed since 2009:
>
> * We have unlogged tables
> * We have updatable CTEs
>
> Open questions:
>
> 1. Name and conformance of this feature with the standard - because we
> don't support modules, a mapping ANSI -> PG will not be trivial.
>
> 2. Implementation. I see three possible ways how to implement it:
>
> 2.a - using on-demand created temp tables - the simplest solution, but it
> doesn't help with catalogue bloating
> 2.b - using unlogged tables for holding statistics, relfilenode, and all
> necessary data
> 3.c - store ephemeral metadata only in memory, without MVCC
With 2.a, essentially, we are defining a global definition of a temp table, but the actual per-session objects still follow the same rules as our current temp tables do?
2.b seems like a lot of new data, and if we are defining new paths for e.g. statistics for global temp tables, we might end up adding new logic in planner to use those tables. I am not seeing how this will work.
Could you elaborate a bit on 3.c please?
Something that really bothers me here, on a different note, is the catalog churn this could cause. Of course, you mentioned it as a point in your email, but I feel that 2.a's showstopper could be the massive catalog churn it causes. Maybe have a way to manage such tables without making pg_class bloat pretty quickly (I am assuming you do not mean the catalog metadata when you say "metadata" in 3.c).
Regards,
Atri
2015-02-02 11:51 GMT+01:00 Atri Sharma <atri.jiit@gmail.com>:
> With 2.a, essentially, we are defining a global definition of a temp table,
> but the actual per-session objects still follow the same rules as our
> current temp tables do?
Yes - it means "a global temp table is a template for a local temp table".
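The "template" semantics can be sketched as follows; the table name is invented, and the behavior shown is the proposed one, not anything PostgreSQL implements today:

```sql
-- Defined once; only the definition is shared:
CREATE GLOBAL TEMPORARY TABLE session_state (k text, v text);

-- Session A:
INSERT INTO session_state VALUES ('user', 'alice');
SELECT count(*) FROM session_state;   -- 1: A sees only its own rows

-- Session B, concurrently, with no CREATE of its own:
SELECT count(*) FROM session_state;   -- 0: B works on a fresh private copy
```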
> 2.b seems like a lot of new data, and if we are defining new paths for e.g.
> statistics for global temp tables, we might end up adding new logic in the
> planner to use those tables. I am not seeing how this will work.

The advantage of this method is transactional behaviour, and it moves some of the bloating content into specific smaller, unlogged tables.

> Could you elaborate a bit on 3.c please?
>
> Something that really bothers me here, on a different note, is the catalog
> churn this could cause. Of course, you mentioned it as a point in your
> email, but I feel that 2.a's showstopper could be the massive catalog churn
> it causes. Maybe have a way to manage such tables without making pg_class
> bloat pretty quickly (I am assuming you do not mean the catalog metadata
> when you say "metadata" in 3.c).

3.c is good protection against catalog bloating; on the other hand, the implementation will probably be more complex.
Hi,

On 2015-02-02 11:15:22 +0100, Pavel Stehule wrote:
> Six years ago we discussed global temporary tables - persistent schema,
> ephemeral data.
>
> http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html
>
> I am thinking some reasons why to implement this feature are valid:
>
> * we can get some performance benefit against current temp tables - less
> catalogue bloating,
> * we can simplify static validation of plpgsql functions when temp tables
> are used; global temp tables are a little more comfortable for developers,
> * we can simplify migration from some other databases, where global temp
> tables are the default.

I agree that the feature would be interesting.

> 2. Implementation
>
> I see three possible ways how to implement it:
>
> 2.a - using on-demand created temp tables - the simplest solution, but it
> doesn't help with catalogue bloating

Yea, that's no good.

> 2.b - using unlogged tables for holding statistics, relfilenode, and all
> necessary data

I can't follow why that'd achieve anything?

> 3.c - store ephemeral metadata only in memory without MVCC

I think that's not an option. That'd end up being a massive amount of duplication at a low rate of functionality.

I think the more realistic way to implement this is to have a separate 'relpersistence' setting for global temp tables. The first access to such a table in a session (or xact if truncate on commit) copies the table from the _init fork. By having the backend id in all filenames (besides the init fork) they're unique between sessions.

Or something roughly like that.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
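On disk, the naming scheme sketched above would look roughly like this, extrapolating from PostgreSQL's existing temp-relation convention (t<backendId>_<relfilenode>); the database and relfilenode oids are invented:

```
base/12345/16994_init    <- shared template copy (init fork, no backend id)
base/12345/t3_16994      <- session with backend id 3, made on first access
base/12345/t7_16994      <- session with backend id 7; names never collide
```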
2015-02-02 12:04 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

> Hi,
>
> On 2015-02-02 11:15:22 +0100, Pavel Stehule wrote:
> > Six years ago we discussed global temporary tables - persistent schema,
> > ephemeral data.
> >
> > http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html
> >
> > I am thinking some reasons why to implement this feature are valid:
> >
> > * we can get some performance benefit against current temp tables - less
> > catalogue bloating,
> > * we can simplify static validation of plpgsql functions when temp tables
> > are used; global temp tables are a little more comfortable for developers,
> > * we can simplify migration from some other databases, where global temp
> > tables are the default.
>
> I agree that the feature would be interesting.
>
> > 2. Implementation
> >
> > I see three possible ways how to implement it:
> >
> > 2.a - using on-demand created temp tables - the simplest solution, but it
> > doesn't help with catalogue bloating
>
> Yea, that's no good.
>
> > 2.b - using unlogged tables for holding statistics, relfilenode, and all
> > necessary data
>
> I can't follow why that'd achieve anything?

1. The main catalogue will be stable.

2. It is not necessary to implement new storage, and it can help with transaction support.

> > 3.c - store ephemeral metadata only in memory without MVCC
>
> I think that's not an option. That'd end up being a massive amount of
> duplication at a low rate of functionality.

I don't plan to implement storage - I expect only a few functions to store/read data from the session memory context.

> I think the more realistic way to implement this is to have a separate
> 'relpersistence' setting for global temp tables. The first access to such a
> table in a session (or xact if truncate on commit) copies the table from
> the _init fork. By having the backend id in all filenames (besides the init
> fork) they're unique between sessions.

If I understand well, it is similar to my quick implementation from 2008. It works only partially, because it doesn't solve the other (session) properties - like relpages, reltuples and the related data from pg_statistic.

> Or something roughly like that.
>
> Greetings,
>
> Andres Freund
On 2015-02-02 12:24:44 +0100, Pavel Stehule wrote:
> > > 2.b - using unlogged tables for holding statistics, relfilenode, and all
> > > necessary data
> >
> > I can't follow why that'd achieve anything?
>
> 1. The main catalogue will be stable.
> 2. It is not necessary to implement new storage, and it can help with
> transaction support.

The amount of complexity that'd be involved to store catalog data in a separate relation around the caches and accesses would be significant. I don't think that's a realistic option.

> > > 3.c - store ephemeral metadata only in memory without MVCC
> >
> > I think that's not an option. That'd end up being a massive amount of
> > duplication at a low rate of functionality.
>
> I don't plan to implement storage - I expect only a few functions to
> store/read data from the session memory context.

What does it have to do with temp tables then?

> > I think the more realistic way to implement this is to have a separate
> > 'relpersistence' setting for global temp tables. The first access to such
> > a table in a session (or xact if truncate on commit) copies the table
> > from the _init fork. By having the backend id in all filenames (besides
> > the init fork) they're unique between sessions.
>
> If I understand well, it is similar to my quick implementation from 2008.
> It works only partially, because it doesn't solve the other (session)
> properties - like relpages, reltuples and the related data from
> pg_statistic.

I'm honestly not particularly concerned about that problem. For one, we don't auto-analyze/vacuum temp tables. For another, it'd be comparatively easy to gather reltuples/relpages/stats from session-local state if necessary. Those are all only accessed from a few places.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> > 1. The main catalogue will be stable.
> > 2. It is not necessary to implement new storage, and it can help with
> > transaction support.
>
> The amount of complexity that'd be involved to store catalog data in a
> separate relation around the caches and accesses would be significant. I
> don't think that's a realistic option.

Not to mention the problems we might end up in. We still have corner cases in our cache code, and a new heap on top of it all might be just too painful.

> > I don't plan to implement storage - I expect only a few functions to
> > store/read data from the session memory context.
>
> What does it have to do with temp tables then?

I think what Pavel means here is that we do not need a full-fledged heap layer, rather only a minimal API over a per-session memory context. However, that might still be just as painful, because we will eventually end up inventing mechanisms for syscache and typcache to work with this storage, which IMO is the biggest pain point around this idea.

Regards,

Atri
l'apprenant
2015-02-02 13:15 GMT+01:00 Andres Freund <andres@2ndquadrant.com>:

> On 2015-02-02 12:24:44 +0100, Pavel Stehule wrote:
> > 1. The main catalogue will be stable.
> > 2. It is not necessary to implement new storage, and it can help with
> > transaction support.
>
> The amount of complexity that'd be involved to store catalog data in a
> separate relation around the caches and accesses would be significant. I
> don't think that's a realistic option.
>
> > I don't plan to implement storage - I expect only a few functions to
> > store/read data from the session memory context.
>
> What does it have to do with temp tables then?

It is the mechanism for storing the session metadata related to global temp tables.

> > If I understand well, it is similar to my quick implementation from 2008.
> > It works only partially, because it doesn't solve the other (session)
> > properties - like relpages, reltuples and the related data from
> > pg_statistic.
>
> I'm honestly not particularly concerned about that problem. For one, we
> don't auto-analyze/vacuum temp tables. For another, it'd be comparatively
> easy to gather reltuples/relpages/stats from session-local state if
> necessary. Those are all only accessed from a few places.

So I don't see a big difference from 3.c - all session metadata will be stored in session memory.

I didn't write the code, so I don't know the details.
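What "session-local stats" could look like to the user can be sketched like this, assuming (hypothetically - none of this is implemented) that ANALYZE on a global temp table wrote only to per-session memory; the table name is invented:

```sql
-- Hypothetical behavior: ANALYZE updates only this session's private copy
-- of relpages/reltuples/pg_statistic entries, never the shared catalog row.
ANALYZE my_gtt;

-- Each session would then get plans based on its own contents, while the
-- template's catalog entry stays untouched.
EXPLAIN SELECT * FROM my_gtt WHERE id = 1;
```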
2015-02-02 13:36 GMT+01:00 Atri Sharma <atri.jiit@gmail.com>:

> > The amount of complexity that'd be involved to store catalog data in a
> > separate relation around the caches and accesses would be significant. I
> > don't think that's a realistic option.
>
> Not to mention the problems we might end up in. We still have corner cases
> in our cache code, and a new heap on top of it all might be just too
> painful.
>
> > What does it have to do with temp tables then?
>
> I think what Pavel means here is that we do not need a full-fledged heap
> layer, rather only a minimal API over a per-session memory context.
> However, that might still be just as painful, because we will eventually
> end up inventing mechanisms for syscache and typcache to work with this
> storage, which IMO is the biggest pain point around this idea.

It should be solvable. I see another risk - if we accelerate work with temp tables, then a 4-byte oid might not be enough.
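A back-of-envelope calculation for the oid concern; the workload numbers below are invented purely for illustration:

```sql
-- 1000 sessions, each burning ~10 catalog oids per second
-- (temp table + indexes + toast, say), against a 32-bit oid counter:
SELECT (2.0 ^ 32) / (1000 * 10) / 86400 AS days_until_wraparound;
-- roughly 5 days before the counter wraps
```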
On Mon, Feb 2, 2015 at 6:34 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> It should be solvable. I see another risk - if we accelerate work with temp
> tables, then a 4-byte oid might not be enough.

Hrm, that might well be true. It might be worth the effort to find a better way to materialize global temp tables then, like having a single OID and only materializing a relfilenode for a session when the session inserts into the temp table. Not sure here at all...

Regards,

Atri
l'apprenant
Hi,

2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

> There have been several threads here discussing global temp tables since
> 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
> metadata of the session copy in the catalog. However, it seems that none of
> them has been implemented, or even has a feasible design. So why don't we
> implement it in an unoptimized way first?

I am not sure that it is not useless work.

Now I am thinking the best implementation of global temp tables is to enhance unlogged tables to have local content. All local data can be saved in session memory - usually it is less than 2KB including statistics - and you don't need to store it in the catalogue. Whenever anybody works with a table, the related data are copied to the system cache, and an implementation of global temp tables can be injected there.

regards

Pavel Stehule
> more global temp tables are little bit comfortable for developers,
I'd like to emphasize this point. This feature does much more than saving a developer from issuing a CREATE TEMP TABLE statement in every session. Here are two common use cases and I'm sure there are more.
(1)
Imagine in a web application scenario, a developer wants to cache some session information in a temp table. What's more, he also wants to specify some rules which reference the session information. Without this feature, the rules will be removed at the end of every session since they depend on a temporary object. Global temp tables will allow the developer to define the temp table and the rules once.
(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about global temp tables.
(http://www.postgresql.org/message-id/9319.1272130283@sss.pgh.pa.us)
"The context that I've seen it come up in is that people don't want to clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the current behavior of temp tables."
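The boilerplate being referred to looks like this (table and column names invented; since PostgreSQL 9.1 the IF NOT EXISTS form shortens it, but every session or function still has to run it):

```sql
-- Inside any function that uses a work table, once per session:
CREATE TEMP TABLE IF NOT EXISTS work_queue (
    item_id integer,
    payload text
);
-- ...only now is it safe to INSERT/SELECT against work_queue.
```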
> 2.a - using on demand created temp tables - most simple solution, but
> doesn't help with catalogue bloating
I've read the thread, and people disapprove of this approach because of the potential catalog bloat. However, I'd like to champion it. Indeed, this approach may have a bloat issue. But users who need global temp tables currently have to create a new temp table in every session, which means they already have the bloat problem and presumably already have some policy to deal with it. In other words, implementing global temp tables this way gives users the same performance, plus the convenience the feature brings.
The root question here is "whether having the unoptimized feature is better than having no feature at all". Actually, there was a very similar discussion back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom Lane's arguments here.
Kevin Grittner's argument:
http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary tables in a way that performs better than current temporary tables, that's cool. That would be a nice "bonus" in addition to the application programmer convenience and having another tick-mark on the standards compliance charts. Do you think that's feasible? If not, the feature would be useful to some with the same performance that temporary tables currently provide."
Tom Lane's arguments:
http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do that. I don't think that you get to veto implementation of the feature until we can find a way to optimize it better. The question is not about whether having the optimization would be better than not having it --- it's about whether having the unoptimized feature is better than having no feature at all (which means people have to implement the same behavior by hand, and they'll *still* not get the optimization)."
There have been several threads here discussing global temp tables since 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the metadata of the session copy in the catalog. However, it seems that none of them has been implemented, or even has a feasible design. So why don't we implement it in an unoptimized way first?
> Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about an implementation similar to Pavel's 2009 proposal (http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com). Here are the major ideas of my design:
"The context that I've seen it come up in is that people don't want to clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the current behavior of temp tables."
> 2.a - using on demand created temp tables - most simple solution, but
> doesn't help with catalogue bloating
I've read the thread and people disapprove this approach because of the potential catalog bloat. However, I'd like to champion it. Indeed, this approach may have a bloat issue. But for users who needs global temp tables, they now have to create a new temp table in every session, which means they already have the bloat problem and presumably they have some policy to deal with it. In other words, implementing global temp tables by this approach gives users the same performance, plus the convenience the feature brings.
The root problem here is that whether "whether having the unoptimized feature is better than
having no feature at all". Actually, there was a very similar discussion back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom Lane's arguments here.
Kevin Grittner's argument:
http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary tables in a way that performs better than current temporary tables, that's cool. That would be a nice "bonus" in addition to the application programmer convenience and having another tick-mark on the standards compliance charts. Do you think that's feasible? If not, the feature would be useful to some with the same performance that temporary tables currently provide."
Tom Lane's arguments:
http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do that. I don't think that you get to veto implementation of the feature until we can find a way to optimize it better. The question is not about whether having the optimization would be better than not having it --- it's about whether having the unoptimized feature is better than having no feature at all (which means people have to implement the same behavior by hand, and they'll *still* not get the optimization)."
There have been several threads here discussing global temp table since 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the metadata of the session copy in the catalog. However, it seems that none of them has been implemented, or even has a feasible design. So why don't we implement it in a unoptimized way first?
> Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about one implementation which is similar to Pavel's 2009 proposal (http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com). Here are the major ideas of my design:
(1)
Create the cross-session persistent schema as a regular table, and create session-private temp tables when a session first accesses it.
(2)
For DML queries, the global temp table is overloaded by its session copy after the relation is opened by an oid or a rangevar. For DDL queries, which copy is used depends on whether the query needs to access the data or the metadata of the global temp table.
There are more differences between this design and Pavel's 2009 proposal, and I'd like to send a detailed proposal to the mailing list, but first I want to know whether our community would accept a global temp table implementation that provides the same performance as current temp tables do.
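Point (2) can be sketched with a hypothetical table; which copy a statement sees would depend on whether it needs data or only metadata (illustrative behavior under this design, not anything implemented):

```sql
CREATE GLOBAL TEMPORARY TABLE gtt_events (id int);  -- writes the shared definition

-- DML opens the session-private copy (materialized on first access):
INSERT INTO gtt_events VALUES (1);

-- DDL touching only metadata would operate on the shared definition:
ALTER TABLE gtt_events ADD COLUMN note text;

-- DDL that must rewrite data would need the session copies as well:
ALTER TABLE gtt_events ALTER COLUMN id TYPE bigint;
```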
Thanks,
Zhaomo
2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
> > I am not sure that it is not useless work.
>
> I don't understand why an implementation taking approach 2.a would be
> useless. As I said, its performance will be no worse than current temp
> tables, and it will provide a lot of convenience to users who need to
> create temp tables in every session.

Surely it would be a step forward. But you will have to solve lots of problems with "duplicated" tables in the system catalogue, and it still doesn't solve the main problem with temporary tables - the bloating catalogue - and the related performance degradation.

Although global temp tables are a nice-to-have feature (for PLpgSQL developers), we can live without them - and with some patterns and extensions we are living well. But the performance issue cannot be fixed by any pattern, so the major motivation for introducing global temp tables is performance - 90% of it. Fixing the bloat should be the primary target for merging this feature into upstream. I believe that once the bloating is solved, the chance of this patch being accepted will be pretty high.
Regards
Pavel
Thanks,ZhaomoOn Tue, Jul 7, 2015 at 11:53 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hi2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:> more global temp tables are little bit comfortable for developers,
I'd like to emphasize this point. This feature does much more than saving a developer from issuing a CREATE TEMP TABLE statement in every session. Here are two common use cases and I'm sure there are more.
(1)
Imagine a web application scenario where a developer wants to cache some session information in a temp table. What's more, he also wants to specify some rules which reference the session information. Without this feature, the rules are removed at the end of every session since they depend on a temporary object. Global temp tables would allow the developer to define the temp table and the rules once.
(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about global temp tables (http://www.postgresql.org/message-id/9319.1272130283@sss.pgh.pa.us):
"The context that I've seen it come up in is that people don't want to clutter their functions with create-it-if-it-doesn't-exist logic, which you have to have given the current behavior of temp tables."
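To make this concrete, here is a rough sketch of the boilerplate in question (table and column names are made up for illustration):

```sql
-- Guard logic that today must run in every session (or at the top of every
-- function) before the temp table can be used:
CREATE TEMP TABLE IF NOT EXISTS session_cache (
    key text PRIMARY KEY,
    val text
);

-- With global temp tables, the definition would instead be created once,
-- e.g. (proposed syntax, not implemented):
--   CREATE GLOBAL TEMPORARY TABLE session_cache (key text PRIMARY KEY, val text);
-- and every session could reference it directly, each seeing only its own rows.
```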
> 2.a - using on demand created temp tables - most simple solution, but
> doesn't help with catalogue bloating
I've read the thread, and people disapproved of this approach because of the potential catalog bloat. However, I'd like to champion it. Indeed, this approach may have a bloat issue. But users who need global temp tables currently have to create a new temp table in every session, which means they already have the bloat problem and presumably have some policy to deal with it. In other words, implementing global temp tables this way gives users the same performance, plus the convenience the feature brings.
The root question here is "whether having the unoptimized feature is better than having no feature at all". Actually, there was a very similar discussion back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom Lane's arguments here.
Kevin Grittner's argument:
http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary tables in a way that performs better than current temporary tables, that's cool. That would be a nice "bonus" in addition to the application programmer convenience and having another tick-mark on the standards compliance charts. Do you think that's feasible? If not, the feature would be useful to some with the same performance that temporary tables currently provide."
Tom Lane's arguments:
http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do that. I don't think that you get to veto implementation of the feature until we can find a way to optimize it better. The question is not about whether having the optimization would be better than not having it --- it's about whether having the unoptimized feature is better than having no feature at all (which means people have to implement the same behavior by hand, and they'll *still* not get the optimization)."
There have been several threads here discussing global temp tables since 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the metadata of the session copy in the catalog. However, it seems that none of them has been implemented, or even has a feasible design. So why don't we implement it in an unoptimized way first?

I am not sure, if it is not useless work.

Now I think the best implementation of global temp tables is to enhance unlogged tables to have local content. All local data can be kept in session memory - usually it is less than 2KB including statistics - and you don't need to store it in the catalog. Whenever anybody works with a table, the related data are copied to the system cache, and an implementation of global temp tables can be injected there.
Regards
Pavel Stehule
> Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about one implementation which is similar to Pavel's 2009 proposal (http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com). Here are the major ideas of my design:
(1) Creating the cross-session persistent schema as a regular table and creating session-private temp tables when a session first accesses it.
(2) For DML queries, the global temp table is overloaded by its session copy after the relation is opened by an oid or a rangevar. For DDL queries, which copy is used depends on whether the query needs to access the data or metadata of the global temp table.
There are more differences between this design and Pavel's 2009 proposal, and I'd like to send a detailed proposal to the mailing list, but first I want to know if the community would accept a global temp table implementation which provides the same performance as current temp tables do.
Thanks,
Zhaomo
On 7/9/15 12:45 AM, Pavel Stehule wrote:
> 2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
>> > I am not sure, if it is not useless work.
>> I don't understand why an implementation taking approach 2.a would be useless. As I said, its performance will be no worse than current temp tables and it will provide a lot of convenience to users who need to create temp tables in every session.
> Surely it should be step forward. But you will to have to solve lot of problems with "duplicated" tables in system catalogue, and still it doesn't solve the main problem with temporary tables - the bloating catalogue - and related performance degradation.
That being the "main" problem is strictly a matter of opinion based on your experience. Many people don't have a performance problem today, but do have to deal with all the pain of handling this manually (as well as all the limitations that go with that).
If it's easy to fix the bloat problem at the same time as adding GLOBAL TEMP then great! But there's no reason to reject this just because it doesn't fix that issue.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
--
On 7/9/15 12:45 AM, Pavel Stehule wrote:
2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu
<mailto:zhy001@cs.ucsd.edu>>:
> I am not sure, if it is not useless work.
I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.
Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.
That being the "main" problem is strictly a matter of opinion based on your experience. Many people don't have a performance problem today, but do have to deal with all the pain of handling this manually (as well as all the limitations that go with that).
If it's easy to fix the bloat problem at the same time as adding GLOBAL TEMP then great! But there's no reason to reject this just because it doesn't fix that issue.
Agreed
There are some good arguments for why we need this feature.
Pavel's original description of how to do this seems valid, and from the link, Tom agreed in 2009.
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
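Expressed with commands available today, the proposed behavior would be roughly equivalent to the following sketch (names are hypothetical, and an ordinary table stands in for the proposed global entry):

```sql
-- The persistent definition, created once. Under the proposal this would be
-- CREATE GLOBAL TEMP TABLE with its own relkind; here an ordinary table
-- stands in as the template:
CREATE TABLE app_scratch_template (id int, payload text);

-- What each backend would implicitly do on its first INSERT/UPDATE/DELETE/
-- SELECT against the table, using the global pg_class entry as a template:
CREATE TEMP TABLE app_scratch (LIKE app_scratch_template INCLUDING ALL);
```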
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/15/2015 07:58 AM, Simon Riggs wrote:
> For me the design summary is this
> * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
> * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
> That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
> The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
Sounds fine in general. I'm a bit curious to know what are the locking implications of vivifying the table on access.
cheers
andrew
2015-07-15 13:58 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:
On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 7/9/15 12:45 AM, Pavel Stehule wrote:
2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu
<mailto:zhy001@cs.ucsd.edu>>:
> I am not sure, if it is not useless work.
I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.
Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.
That being the "main" problem is strictly a matter of opinion based on your experience. Many people don't have a performance problem today, but do have to deal with all the pain of handling this manually (as well as all the limitations that go with that).
If it's easy to fix the bloat problem at the same time as adding GLOBAL TEMP then great! But there's no reason to reject this just because it doesn't fix that issue.
Agreed
There are some good arguments for why we need this feature.
Pavel's original description of how to do this seem valid, and from the link Tom agreed in 2009.
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
The optimization of local temp tables is a little bit harder - you cannot share pg_class and pg_attribute - although some in-memory entries can be used too.
Regards
Pavel
--Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2015-07-15 15:21 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
> Sounds fine in general. I'm a bit curious to know what are the locking implications of
> vivifying the table on access.
The locking implications depend on how we interpret the existing commands in the context of global temp tables, and I think we should discuss and agree on the behaviors of the commands with global temp tables. But I think in general we can follow these rules:
If the command executes on the global temp table's metadata, for example an ALTER TABLE command, then we lock the global copy at the same level as we do for a regular table.
There is another question - what is the effect of ALTER TABLE on a global temp table's instances in active sessions?
If the command executes on the global temp table's data (which is actually stored in the session copy), for example a DML command, then the global copy is locked at the AccessShareLock level to prevent concurrent modifications to the global temp table's definition from other sessions.
Thanks,
Zhaomo
On Wed, Jul 15, 2015 at 4:26 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 07/15/2015 07:58 AM, Simon Riggs wrote:
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
Sounds fine in general. I'm a bit curious to know what are the locking implications of vivifying the table on access.
cheers
andrew
On 15 July 2015 at 13:26, Andrew Dunstan <andrew@dunslane.net> wrote:
--
On 07/15/2015 07:58 AM, Simon Riggs wrote:
For me the design summary is this
* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it does not exist we create it as a TEMP table of the same name, using the Global's pg_class entry as a template
That meets the SQL Standard and doesn't contain any visibility problems or need for new internals.
The purpose of this feature is to automatically create a temp table with the same definition whenever needed. The discussion of "bloat" is just wrong. We create exactly the same amount of bloat as if we had typed CREATE TEMP TABLE. Optimising temp table entries in the catalog is another, separate patch, if we care.
Sounds fine in general. I'm a bit curious to know what are the locking implications of vivifying the table on access.
We would lock the Global Temp Table at the same lock level for the activity, just as we do for INSERT, SELECT etc.. That prevents concurrent DDL like DROP or ALTER on the Global Temp Table.
The Local temp table created is a copy of the Global Temp Table. The Local temp table is only locally locked, so no worries.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2015-07-15 15:53 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:
> there is other question - what is effect of ALTER TABLE of global temp table on
> instances of this table in active sessions?
As I said, we need to first agree on the behaviors of the existing commands. I can think of two options now for ALTER TABLE: 1) only allow ALTER TABLE when there are no other active sessions (this is how Oracle deals with it); 2) handle it as if session copies inherit from the global copy and ALTER TABLE executes on the global copy.
There are two possible kinds of GLOBAL TEMP tables - session-scoped and transaction-scoped. Transaction-scoped tables have a very short life, so option 1 needs an outage, and option 2 requires stronger locks and can be slow and less effective, because a) some changes can be invisible in other transactions (depending on the isolation level), b) the structure can be changed but function code not (regardless of the isolation level), so it can be inconsistent, and c) why change a table that will be dropped in the next millisecond? For this case, behavior like that of PL functions would be very practical - a third option for ALTER TABLE.
Regards
Pavel
Thanks,
Zhaomo
On 2015-07-15 12:58:51 +0100, Simon Riggs wrote:
> On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Pavel's original description of how to do this seem valid, and from the
> link Tom agreed in 2009.
> For me the design summary is this
> * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
> with different relkind
> * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it
> does not exist we create it as a TEMP table of the same name, using the
> Global's pg_class entry as a template
Why do we need to create that copy? We can just use the relfilenode in all backends by having the backendid in the filename? Yes, there's some amount of additional code needed, but it's not that much. I actually think it might end up being less additional code than having a copy, because with the copy you'll have two different oids for the global entry and the local copy.
Regards,
Andres
On 2015-07-15 16:52:49 +0200, Andres Freund wrote:
> Why do we need to create that copy? We can just use the relfilenode in
> all backends by having the backendid in the filename? Yes, there's a
> some amount of additional code needed, but it's not that much? I
> actually think it might end up being less additional code than having a
> copy, because with the copy you'll have two different oids for global
> entry and the local copy.
Hm, yes. Brainfart. Transaction table rewrites/truncations need to change the relfilenode.
To fix, we could add a backend-local mapping table from global temp table id to the backend-local relfilenode. The code to look up the relfilenode is already mostly isolated.
On 15 July 2015 at 15:57, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-07-15 16:52:49 +0200, Andres Freund wrote:
> Why do we need to create that copy? We can just use the relfilenode in
> all backends by having the backendid in the filename? Yes, there's a
> some amount of additional code needed, but it's not that much? I
> actually think it might end up being less additional code than having a
> copy, because with the copy you'll have two different oids for global
> entry and the local copy.
Hm, yes. Brainfart. Transaction table rewrites/truncations need to
change the relfilenode.
To fix, we could add a backend-local mapping table from global temp table
id to the backend-local relfilenode. The code to look up the relfilenode
is already mostly isolated.
It may be possible to do this, though I'm sure there's a wrinkle somewhere. But there doesn't seem to be a need to overload the main feature request with additional requirements. Doing that is just scope creep that prevents us getting features out. Nice, simple patches from newer developers. Later tuning and tweaking from more expert community members.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
> It may be possible to do this, though I'm sure there's a wrinkle somewhere.
> But there doesn't seem to be a need to overload the main feature request
> with additional requirements. Doing that is just scope creep that prevents
> us getting features out. Nice, simple patches from newer developers. Later
> tuning and tweaking from more expert community members.
I think that's generally a fair point. But here we're discussing adding a fair amount of wrinkles with the copy approach. The fact alone that the oid is different will have some ugly consequences. So we add complexity, just to shift it into different places later? I'm not sure that's a good idea.
On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
> It may be possible to do this, though I'm sure there's a wrinkle somewhere.
> But there doesn't seem to be a need to overload the main feature request
> with additional requirements. Doing that is just scope creep that prevents
> us getting features out. Nice, simple patches from newer developers. Later
> tuning and tweaking from more expert community members.
I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.
Why? We are creating a local temp table LIKE the global temp table. That is already a supported operation. So there is no "different oid".
So we add complexity, just to shift it into different places later? I'm
not sure that's a good idea.
There's no complexity in a simple temp table LIKE. We can do this now with triggers.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andres Freund <andres@anarazel.de> writes:
> On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
>> It may be possible to do this, though I'm sure there's a wrinkle somewhere.
>> But there doesn't seem to be a need to overload the main feature request
>> with additional requirements. Doing that is just scope creep that prevents
>> us getting features out. Nice, simple patches from newer developers. Later
>> tuning and tweaking from more expert community members.
> I think that's generally a fair point. But here we're discussing to add
> a fair amount of wrinkles with the copy approach. The fact alone that
> the oid is different will have some ugly consequences.
> So we add complexity, just to shift it into different places later? I'm
> not sure that's a good idea.
With all due respect, there are features that are beyond the abilities of some "newer developers", and reducing the scope isn't a good way to fix that. It just leaves a bigger mess to be cleaned up later.
I think Andres' idea of a per-backend filenode mapping table might work. The existing relfilenode mapper solves a somewhat related problem, namely how do you replace the filenode for shared system catalogs whose pg_class entries can't be changed.
regards, tom lane
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
>> I think that's generally a fair point. But here we're discussing to add
>> a fair amount of wrinkles with the copy approach. The fact alone that
>> the oid is different will have some ugly consequences.
> Why? We are creating a local temp table LIKE the global temp table. That is
> already a supported operation. So there is no "different oid".
You're presuming a specific implementation decision, one that has not been made yet, and isn't all that attractive because of the catalog bloat issues.
regards, tom lane
On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:
> On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
>> I think that's generally a fair point. But here we're discussing to add
>> a fair amount of wrinkles with the copy approach. The fact alone that
>> the oid is different will have some ugly consequences.
> Why? We are creating a local temp table LIKE the global temp table. That is
> already a supported operation. So there is no "different oid".
Then your locking against ALTER, DROP etc. isn't going to work.
On 15 July 2015 at 16:44, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:
> On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:
> > I think that's generally a fair point. But here we're discussing to add
> > a fair amount of wrinkles with the copy approach. The fact alone that
> > the oid is different will have some ugly consequences.
> >
>
> Why? We are creating a local temp table LIKE the global temp table. That is
> already a supported operation. So there is no "different oid".
Then your locking against ALTER, DROP etc. isn't going to work.
There would be two objects, both locked. The temp table is just nice and simple. No problem.
Your optimization may work; I hope it does. My approach definitely will. So we could choose either.
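Simon's two-object claim can be sketched with a toy lock manager (all names invented, not PostgreSQL's lock code): a session using the global temp table holds locks on both the persistent definition and its session-local copy, so a concurrent ALTER/DROP on the definition still conflicts.

```python
# Toy model of the "two objects, both locked" argument. Hypothetical
# oids and a deliberately minimal lock manager; real PostgreSQL locking
# is far richer (lock modes, queues, deadlock detection).
from dataclasses import dataclass, field

@dataclass
class LockManager:
    held: dict = field(default_factory=dict)  # oid -> set of session ids

    def acquire(self, oid, session, exclusive=False):
        holders = self.held.setdefault(oid, set())
        # An exclusive request conflicts with any other session's lock.
        if exclusive and holders - {session}:
            raise RuntimeError(f"lock conflict on oid {oid}")
        holders.add(session)

lm = LockManager()
GLOBAL_DEF_OID, LOCAL_COPY_OID = 16384, 16999  # invented oids

# Session s1 uses the global temp table: it locks the persistent
# definition AND its private temp copy.
lm.acquire(GLOBAL_DEF_OID, "s1")
lm.acquire(LOCAL_COPY_OID, "s1")

# Session s2 attempts DROP on the global definition -> exclusive lock
# request, which conflicts with s1's lock on the definition.
try:
    lm.acquire(GLOBAL_DEF_OID, "s2", exclusive=True)
except RuntimeError as e:
    print("blocked:", e)
```

This is only the locking story; Andres's objection elsewhere in the thread is about the other consequences of the copy having a different oid (dependencies, plan invalidation, and so on), which this sketch does not address.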
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Pavel, All:

Just to be clear, the idea of a global temp table is that the table def
is available to all users, but the data is private to each session?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
2015-07-19 21:39 GMT+02:00 Josh Berkus <josh@agliodbs.com>:
> Just to be clear, the idea of a global temp table is that the table def
> is available to all users, but the data is private to each session?

yes.

Pavel
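The semantics just confirmed - one persistent definition visible to everyone, rows private per session - can be shown with a toy model (plain Python, invented names; not how PostgreSQL stores anything):

```python
# Toy model of "persistent schema, ephemeral data": a single shared
# table definition, with row storage keyed by session so each session
# only ever sees its own data.
class GlobalTempTable:
    def __init__(self, name, columns):
        self.name = name        # persistent definition, shared by all
        self.columns = columns
        self._rows = {}         # session id -> that session's private rows

    def insert(self, session, row):
        self._rows.setdefault(session, []).append(row)

    def select(self, session):
        return self._rows.get(session, [])

t = GlobalTempTable("work_queue", ["id", "payload"])
t.insert("session_a", (1, "x"))

print(t.select("session_a"))  # [(1, 'x')]
print(t.select("session_b"))  # [] - sees the definition, not A's data
```

In the real feature the per-session storage would of course be on-disk temp relations dropped at session end, not an in-memory dict; the sketch only captures the visibility rule.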
On 20/07/15 15:00, Pavel Stehule wrote:
> 2015-07-19 21:39 GMT+02:00 Josh Berkus <josh@agliodbs.com>:
>> Just to be clear, the idea of a global temp table is that the table def
>> is available to all users, but the data is private to each session?
>
> yes.
>
> Pavel

Just wondering...

Would it be difficult to add the ability for one user to share the
contents with a list of named other users (roles)?

-Gavin
2015-07-20 5:33 GMT+02:00 Gavin Flower <GavinFlower@archidevsys.co.nz>:
> Would it be difficult to add the ability for one user to share the
> contents with a list of named other users (roles)?
Probably it is possible, but not for temporary data - temporary data are
stored in process-local memory, so they are not accessible from other
sessions.
Sharing tables would need:

1. some infrastructure to hold data about the sharing - who can share what
with whom
2. a cleanup policy - temporary data are cleaned at the end of the
transaction or the end of the session; who cleans shared data?
3. data saved in shared memory instead of process memory

So it is possible, but it is a partially different feature.
Pavel Stehule wrote:
> 2015-07-20 5:33 GMT+02:00 Gavin Flower <GavinFlower@archidevsys.co.nz>:
>> Would it be difficult to add the ability for one user to share the
>> contents with a list of named other users (roles)?
>
> Probably it is possible, but not for temporary data - short data are in
> process memory, so it are not accessible from other sessions.
>
> This sharing tables needs:
>
> 1. some infrastructure to hold data about sharing - who can share with what
> 2. who will clean data? temporary data are cleaned on end of transaction or
> end of session
> 3. data should be saved in shared memory instead process memory
>
> So it is possible, but partially different

To me this gets in the "crazy ideas" list. Please add it to the TODO
page in the wiki, so that we're sure we never implement it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-07-20 15:33:32 +1200, Gavin Flower wrote:
> Would it be difficult to add the ability for one user to share the contents
> with a list of named other users (roles)?

No need. That feature is called unlogged tables and grants.

Doing this for temporary tables would be horrible. They live in
process-local memory and not shared memory - because that provides higher
isolation, not in spite of it.

Andres
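Andres's distinction can be illustrated with a toy model (invented names, not PostgreSQL internals): temp-table contents live in a backend's own process memory, which no other backend can reach, while unlogged tables live in shared storage, which is why "unlogged table plus GRANT" is the existing answer for shared-but-not-crash-safe data.

```python
# Toy illustration of process-local vs shared storage. The classes and
# names are made up; only the visibility difference is being modeled.
shared_storage = {}  # models shared buffers / disk (e.g. an unlogged table)

class BackendProcess:
    def __init__(self):
        self.local_memory = {}  # models process-local temp-table storage

    def write_temp(self, key, value):
        # Temp-table write: goes only into this process's memory.
        self.local_memory[key] = value

    def write_unlogged(self, key, value):
        # Unlogged-table write: goes into storage all backends can see.
        shared_storage[key] = value

b1, b2 = BackendProcess(), BackendProcess()
b1.write_temp("row", "private to b1")
b1.write_unlogged("row", "shared via unlogged table")

print("row" in b2.local_memory)     # False - b2 cannot see b1's temp data
print(shared_storage["row"])        # shared via unlogged table
```

Sharing temp-table contents would therefore mean moving the data out of process-local memory, giving up exactly the isolation (and avoidance of shared-memory locking) that makes temp tables cheap.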
2015-07-20 11:07 GMT+02:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> To me this gets in the "crazy ideas" list. Please add it to the TODO
> page in the wiki, so that we're sure we never implement it.
yes, it is pretty crazy - I have no plan to implement it :)
Pavel
> Just to be clear, the idea of a global temp table is that the table def
> is available to all users, but the data is private to each session?

The table def is visible to all sessions and persistent, but the data is
private to each session and temporary.
Thanks,
Zhaomo
On Wed, Jul 15, 2015 at 11:52 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 15 July 2015 at 16:44, Andres Freund <andres@anarazel.de> wrote:
>> Then your locking against ALTER, DROP etc. isn't going to work.
>
> There would be two objects, both locked. The temp table is just nice and
> simple. No problem.
>
> Your optimization may work; I hope it does. My approach definitely will. So
> we could choose either.

It's not really an optimization; it's a whole different approach. I
looked at the create-a-temp-table-on-the-fly idea back when I implemented
unlogged tables and concluded it was an unworkable mess. Deep down in the
guts of name resolution code is not the place where you want to suddenly
decide that you need to run some DDL. So I believe in what Andres is
proposing. I'm not necessarily going to shout it down if somebody finds a
way to make the temp-table-on-the-fly approach work, but my view is that
making that work, although it may look superficially appealing, will
eventually make whoever has to do it hate their life; and that even if
they get it to where it sorta works, it's going to have ugly corner cases
that are almost impossible to file down.

Another advantage of Andres's approach, BTW, is that it could potentially
eventually be extended to work on Hot Standby machines. For that to work,
we'd need a separate XID space for temporary tables, but Noah proposed
that before, and I don't think it's a completely crazy idea (just mostly
crazy).

Now, maybe nobody's going to care about that any more in 5 years if we
have full-blown logical replication deeply integrated into core, but
there's a lot to like about a design that keeps our options in that area
open.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company