Re: [HACKERS] UPDATE of partition key - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [HACKERS] UPDATE of partition key |
Date | |
Msg-id | CA+TgmobVbxfkxEpZG=Zxumfg_b3q7PECOLBoHk2g4KtauYVx5Q@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] UPDATE of partition key (Amit Khandekar <amitdkhan.pg@gmail.com>) |
Responses |
Re: [HACKERS] UPDATE of partition key
|
List | pgsql-hackers |
On Thu, Jan 11, 2018 at 6:07 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > In the first paragraph of my explanation, I was explaining why two > Transition capture states does not look like a good idea to me : Oh, sorry. I didn't read what you wrote carefully enough, I guess. I see your points. I think that there is probably a general need for some refactoring here. AfterTriggerSaveEvent() got significantly more complicated and harder to understand with the arrival of transition tables, and this patch is adding more complexity still. It's also adding complexity in other places to make ExecInsert() and ExecDelete() usable for the semi-internal DELETE/INSERT operations being produced when we split a partition key update into a DELETE and INSERT pair. It would be awfully nice to have some better way to separate out each of the different things we might or might not want to do depending on the situation: capture old tuple, capture new tuple, fire before triggers, fire after triggers, count processed rows, set command tag, perform actual heap operation, update indexes, etc. However, I don't have a specific idea how to do it better, so maybe we should just get this committed for now and perhaps, with more eyes on the code, someone will have a good idea. > Slight correction; it was suggested by Amit Langote; not by David. Oh, OK, sorry. > So there are two independent optimizations we are talking about : > > 1. Create the map only when needed. > 2. In case of UPDATE, for partitions that take part in update scans, > there should be a single map; there should not be two separate maps, > one for accessing per-subplan and the other for accessing per-leaf. These optimizations aren't completely independent. Optimization #2 can be implemented in several different ways. The way you've chosen to do it is to index the same array in two different ways depending on whether per-leaf indexing is not needed, which I think is unacceptable. Another approach, which I proposed upthread, is to always built the per-leaf mapping, but you pointed out that this could involve doing a lot of unnecessary work in the case where most leaves were pruned. However, if you also implement #1, then that problem goes away. In other words, depending on the design you choose for #2, you may or may not need to also implement optimization #1 to get good performance. To put that another way, I think Amit's idea of keeping a subplan-offsets array is a pretty good one. From your comments, you do too. But if we want to keep that, then we need a way to avoid the expense of populating it for leaves that got pruned, except when we are doing update row movement. Otherwise, I don't see much choice but to jettison the subplan-offsets array and just maintain two separate arrays of mappings. > Regarding the array names ... > > Noting all this, I feel we can go with names according to the > structure of maps. Something like : mt_perleaf_tupconv_maps, and > mt_persubplan_tupconv_maps. Other suggestions welcome. I'd probably do mt_per_leaf_tupconv_maps, since inserting an underscore between some but not all words seems strange. But OK otherwise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: