Re: Implementation of global temporary tables? - Mailing list pgsql-hackers
From | Zhaomo Yang |
---|---|
Subject | Re: Implementation of global temporary tables? |
Date | |
Msg-id | CA+0EDdAQJfed6JnZ_CzyPXtAXXKVrEF8gtX25TJJ+U=hv9LquQ@mail.gmail.com Whole thread Raw |
In response to | Re: Implementation of global temporary tables? (Pavel Stehule <pavel.stehule@gmail.com>) |
List | pgsql-hackers |
> I am not sure, if it is not useless work.
I don't understand why an implementation taking approach 2.a would be useless. As I said, its performance will be no worse than current temp tables and it will provide a lot of convenience to users who need to create temp tables in every session.
Thanks,
Zhaomo
On Tue, Jul 7, 2015 at 11:53 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:> more global temp tables are little bit comfortable for developers,
I'd like to emphasize this point. This feature does much more than saving a developer from issuing a CREATE TEMP TABLE statement in every session. Here are two common use cases and I'm sure there are more.
(1)
Imagine in a web application scenario, a developer wants to cache some session information in a temp table. What's more, he also wants to specify some rules which reference the session information. Without this feature, the rules will be removed at the end of every session since they depend on a temporary object. Global temp tables will allow the developer to define the temp table and the rules once.
(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about global temp tables.(http://www.postgresql.org/message-id/9319.1272130283@sss.pgh.pa.us)
"The context that I've seen it come up in is that people don't want to clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the current behavior of temp tables."
> 2.a - using on demand created temp tables - most simple solution, but
> doesn't help with catalogue bloating
I've read the thread and people disapprove this approach because of the potential catalog bloat. However, I'd like to champion it. Indeed, this approach may have a bloat issue. But for users who needs global temp tables, they now have to create a new temp table in every session, which means they already have the bloat problem and presumably they have some policy to deal with it. In other words, implementing global temp tables by this approach gives users the same performance, plus the convenience the feature brings.
The root problem here is that whether "whether having the unoptimized feature is better than
having no feature at all". Actually, there was a very similar discussion back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom Lane's arguments here.
Kevin Grittner's argument:
http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary tables in a way that performs better than current temporary tables, that's cool. That would be a nice "bonus" in addition to the application programmer convenience and having another tick-mark on the standards compliance charts. Do you think that's feasible? If not, the feature would be useful to some with the same performance that temporary tables currently provide."
Tom Lane's arguments:
http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do that. I don't think that you get to veto implementation of the feature until we can find a way to optimize it better. The question is not about whether having the optimization would be better than not having it --- it's about whether having the unoptimized feature is better than having no feature at all (which means people have to implement the same behavior by hand, and they'll *still* not get the optimization)."
There have been several threads here discussing global temp table since 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the metadata of the session copy in the catalog. However, it seems that none of them has been implemented, or even has a feasible design. So why don't we implement it in a unoptimized way first?I am not sure, if it is not useless work.Now, I am thinking so best implementation of global temp tables is enhancing unlogged tables to have local content. All local data can be saved in session memory. Usually it is less than 2KB with statistic, and you don't need to store it in catalogue. When anybody is working with any table, related data are copied to system cache - and there can be injected a implementation of global temp tables.regardsPavel Stehule
> Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about one implementation which is similar to Pavel's 2009 proposal (http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com). Here are the major ideas of my design:(1)Creating the cross-session persistent schema as a regular table and creating session-private temp tables when a session first accesses it.(2)For DML queries, The global temp table is overloaded by its session copy after the relation is opened by an oid or a rangevar. For DDL queries, which copy is used depends on whether the query needs to access the data or metadata of the global temp table.There are more differences between this design and Pavel's 2009 proposal and I'd like to send a detailed proposal to the mailing list but first I want to know if our community would accept a global temp table implementation which provides the same performance as currently temp tables do.Thanks,Zhaomo
pgsql-hackers by date: