Re: [Proposal] Global temporary tables - Mailing list pgsql-hackers
From | Konstantin Knizhnik |
---|---|
Subject | Re: [Proposal] Global temporary tables |
Date | |
Msg-id | d6eedce3-6b54-8167-7caa-b6b085156219@postgrespro.ru Whole thread Raw |
In response to | Re: [Proposal] Global temporary tables (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [Proposal] Global temporary tables
|
List | pgsql-hackers |
On 05.02.2020 00:38, Robert Haas wrote: > > My guess it that the right time to do this work is just after we > acquire locks, at the end of parse analysis. I think trying to do it > during execution is too late, since the planner looks at indexes, and > trying to do it in the planner instead of before we start planning > seems more likely to cause bugs and has no real advantages. It's just > better to do complicated things (like creating indexes) separately > rather than in the middle of some other complicated thing (like > planning). I could tie my shoelaces the first time they get tangled up > with my break pedal but it's better to do it before I get in the car. I have implemented this approach in my new patch https://www.postgresql.org/message-id/3e88b59f-73e8-685e-4983-9026f94c57c5%40postgrespro.ru I have added check whether index is initialized or not to plancat.c where optimizer checks if index is valid. Now it should work for all kinds of indexes (B-Tree, hash, user defined access methods...). > > And I'm still inclined to do it by flat-copying files rather than > calling ambuild. It will be slightly faster, but also importantly, it > will guarantee that (1) every backend gets exactly the same initial > state and (2) it has fewer ways to fail because it doesn't involve > calling any user-defined code. Those seem like fairly compelling > advantages, and I don't see what the disadvantages are. I think > calling ambuild() at the point in time proposed in the preceding > paragraph would be fairly safe and would probably work OK most of the > time, but I can't think of any reason it would be better. There is very important reason (from my point of view): allow other sessions to use created index and so provide compatible behavior with regular tables (and with Oracle). So we should be able to populate index with existed GTT data. And ambuild will do it. > > Incidentally, what I'd be inclined to do is - if the session is > running a query that does only read-only operations, let it continue > to point to the "master" copy of the GTT and its indexes, which is > stored in the relfilenodes indicated for those relations in pg_class. > If it's going to acquire a lock heavier than AccessShareLock, then > give it is own copies of the table and indexes, stored in a temporary > relfilenode (tXXX_YYY) and redirect all future access to that GTT by > this backend to there. Maybe there's some reason this won't work, but > it seems nice to avoid saying that we've "attached" to the GTT if all > we did is read the empty table. > Sorry, I do not understand the benefits of such optimization. It seems to be very rare situation when session will try to access temp table which was not previously filled with data. But even if it happen, keeping "master" copy will not safe much: we in any case have shared metadata and no data. Yes, with current approach, first access to GTT will cause creation of empty indexes. But It is just initialization of 1-3 pages. I do not think that delaying index initialization can be really useful. In any case, calling ambuild is the simplest and most universal approach, providing desired and compatible behavior. I really do not understand why we should try yo invent some alternative solution.
pgsql-hackers by date: