Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [Proposal] Global temporary tables |
Date | |
Msg-id | CA+TgmoY=aP4p_AR=z4O_kfMFHoSEg19Ed+f=U+6_0W1HKfJ2-Q@mail.gmail.com Whole thread Raw |
In response to | Re: [Proposal] Global temporary tables (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Responses |
Re: [Proposal] Global temporary tables
|
List | pgsql-hackers |
On Mon, Feb 3, 2020 at 3:08 AM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Thank you for explanation. > You convinced me that building indexes from _bt_getbuf is not good idea. > What do you think about idea to check and build indexes for GTT prior to > query execution? > > In this case we do not need to patch code of all indexes - it can be > done just in one place. > We can use build function of access method to initialize index and > populate it with data. > > So right now when building query execution plan, optimizer checks if > index is valid. > If index belongs to GTT, it an check that first page of the index is > initialized and if not - call build method for this index. > > If building index during building query plan is not desirable, we can > just construct list of indexes which should be checked and > perform check itself and building indexes somewhere after building plan > but for execution of the query. > > Do you seem some problems with such approach? My guess it that the right time to do this work is just after we acquire locks, at the end of parse analysis. I think trying to do it during execution is too late, since the planner looks at indexes, and trying to do it in the planner instead of before we start planning seems more likely to cause bugs and has no real advantages. It's just better to do complicated things (like creating indexes) separately rather than in the middle of some other complicated thing (like planning). I could tie my shoelaces the first time they get tangled up with my break pedal but it's better to do it before I get in the car. And I'm still inclined to do it by flat-copying files rather than calling ambuild. It will be slightly faster, but also importantly, it will guarantee that (1) every backend gets exactly the same initial state and (2) it has fewer ways to fail because it doesn't involve calling any user-defined code. Those seem like fairly compelling advantages, and I don't see what the disadvantages are. I think calling ambuild() at the point in time proposed in the preceding paragraph would be fairly safe and would probably work OK most of the time, but I can't think of any reason it would be better. Incidentally, what I'd be inclined to do is - if the session is running a query that does only read-only operations, let it continue to point to the "master" copy of the GTT and its indexes, which is stored in the relfilenodes indicated for those relations in pg_class. If it's going to acquire a lock heavier than AccessShareLock, then give it is own copies of the table and indexes, stored in a temporary relfilenode (tXXX_YYY) and redirect all future access to that GTT by this backend to there. Maybe there's some reason this won't work, but it seems nice to avoid saying that we've "attached" to the GTT if all we did is read the empty table. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: