Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: making update/delete of inheritance trees scale better
Date:
Msg-id: CA+TgmoZ98JHuMPvT7NNOz9-CZP9VMY_sR-fk1MGJGE_rM7yuWg@mail.gmail.com
In response to: making update/delete of inheritance trees scale better (Amit Langote <amitlangote09@gmail.com>)
Responses:
  Re: making update/delete of inheritance trees scale better
  Re: making update/delete of inheritance trees scale better
  Re: making update/delete of inheritance trees scale better
List: pgsql-hackers
On Wed, Mar 3, 2021 at 9:39 AM Amit Langote <amitlangote09@gmail.com> wrote:
> Just noticed that a test added by the recent 927f453a941 fails due to
> 0002. We no longer allow moving a row into a postgres_fdw partition
> if it is among the UPDATE's result relations, whereas previously we
> would if the UPDATE on that partition is already finished.
>
> To fix, I've adjusted the test case. Attached updated version.

I spent some time studying this patch this morning. As far as I can see, 0001 is a relatively faithful implementation of the design Tom proposed back in early 2019. I think it would be nice to either get this committed or else decide that we don't want it and what we're going to try to do instead, because we can't make UPDATEs and DELETEs stop sucking with partitioned tables until we settle on some solution to the problem that is inheritance_planner(), and that strikes me as an *extremely* important problem. Lots of people are trying to use partitioning in PostgreSQL, and they don't like finding out that, in many cases, it makes things slower rather than faster. Neither this nor any other patch is going to solve that problem in general, because there are *lots* of things that haven't been well-optimized for partitioning yet. But, this is a pretty important case that we should really try to do something about.

So, that said, here are some random comments:

- I think it would be interesting to repeat your earlier benchmarks using -Mprepared. One question I have is whether we're saving overhead here during planning at the price of execution-time overhead, or whether we're saving during both planning and execution. (A rough sketch of the kind of comparison I have in mind is in the PS at the end of this mail.)

- Until I went back and found your earlier remarks on this thread, I was confused as to why you were replacing a JunkFilter with a ProjectionInfo. I think it would be good to try to add some more explicit comments about that design choice, perhaps as header comments for ExecGetUpdateNewTuple, or maybe there's a better place. I'm still not sure why we need to do the same thing for the insert case, which seems to just be about removing junk columns. At least in the non-JIT case, it seems to me that ExecJunkFilter() should be cheaper than ExecProject(). Is it different enough to matter? Does the fact that we can JIT the ExecProject() work make it actually faster? These are things I don't know.

- There's a comment which you didn't write but just moved which I find to be quite confusing. It says "For UPDATE/DELETE, find the appropriate junk attr now. Typically, this will be a 'ctid' or 'wholerow' attribute, but in the case of a foreign data wrapper it might be a set of junk attributes sufficient to identify the remote row." But, the block of code which follows caters only to the 'ctid' and 'wholerow' cases, not anything else. Perhaps that's explained by the comment a bit further down which says "When there is a row-level trigger, there should be a wholerow attribute," but if the point is that this code is only reached when there's a row-level trigger, that's far from obvious. It *looks* like something we'd reach for ANY INSERT or UPDATE case. Maybe it's not your job to do anything about this confusion, but I thought it was worth highlighting.
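To make the junk-column business a bit more concrete for anyone following along, this is the sort of thing I mean (table name invented, and the exact output will depend on the version and on this patch): for an UPDATE on an ordinary heap relation the row identity travels as a 'ctid' junk column, and EXPLAIN (VERBOSE) shows it in the scan node's output list.

  CREATE TABLE heap_tab (a int, b int);

  EXPLAIN (VERBOSE, COSTS OFF)
  UPDATE heap_tab SET b = b + 1;
  -- The Seq Scan's "Output:" line should list heap_tab.ctid alongside the
  -- user columns.  For a foreign table, the row identity would instead be
  -- whatever junk attribute(s) the FDW asked for, e.g. a whole-row image,
  -- which is what the postgres_fdw output changes further down show.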
- The comment for filter_junk_tlist_entries(), needless to say, is of the highest quality, but would it make any sense to think about having an option for expand_tlist() to omit the junk entries itself, to avoid extra work? I'm unclear whether we want that behavior in all UPDATE cases or only some of them, because preprocess_targetlist() has a call to expand_tlist() to set parse->onConflict->onConflictSet that does not call filter_junk_tlist_entries() on the result. Does this patch need to make any changes to the handling of ON CONFLICT ... UPDATE? It looks to me like right now it doesn't, but I don't know whether that's an oversight or intentional.

- The output changes in the main regression test suite are pretty easy to understand: we're just seeing columns that no longer need to be fed through the executor getting dropped. The changes in the postgres_fdw results are harder to understand. In general, it appears that what's happening is that we're no longer outputting the non-updated columns individually -- which makes sense -- but now we're outputting a whole-row var that wasn't there before, e.g.:

-   Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.ctid
+   Output: (foreign_tbl.b + 15), foreign_tbl.ctid, foreign_tbl.*

Since this is postgres_fdw, we can re-fetch the row using CTID, so it's not clear to me why we need foreign_tbl.* when we didn't before. Maybe the comments need improvement.

- Specifically, I think the comments in preptlist.c need some work. You've edited the top-of-file comment, but I don't think it's really accurate. The first sentence says "For INSERT and UPDATE, the targetlist must contain an entry for each attribute of the target relation in the correct order," but I don't think that's really true any more. It's certainly not what you see in the EXPLAIN output. The paragraph goes on to explain that UPDATE has a second target list, but (a) that seems to contradict the first sentence and (b) that second target list isn't what you see when you run EXPLAIN. Also, there's no mention of what happens for FDWs here, but it's evidently different, as per the previous review comment.

- The comments atop fix_join_expr() should be updated. Maybe you can just adjust the wording for case #2.

OK, that's all I've got based on a first read-through. Thanks for your work on this.

--
Robert Haas
EDB: http://www.enterprisedb.com
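PS. On the -Mprepared suggestion above, the kind of comparison I have in mind is roughly the following (the table, partition count, and script file name are all made up; the real thing should of course mirror whatever setup produced the earlier numbers):

  -- one-time setup: a small hash-partitioned table
  CREATE TABLE pt (a int, b int) PARTITION BY HASH (a);
  CREATE TABLE pt0 PARTITION OF pt FOR VALUES WITH (MODULUS 4, REMAINDER 0);
  CREATE TABLE pt1 PARTITION OF pt FOR VALUES WITH (MODULUS 4, REMAINDER 1);
  CREATE TABLE pt2 PARTITION OF pt FOR VALUES WITH (MODULUS 4, REMAINDER 2);
  CREATE TABLE pt3 PARTITION OF pt FOR VALUES WITH (MODULUS 4, REMAINDER 3);
  INSERT INTO pt SELECT g, 0 FROM generate_series(1, 100000) g;
  ANALYZE pt;

  -- update.sql, run as:  pgbench -n -f update.sql -M simple   -T 30 <db>
  --            and as:   pgbench -n -f update.sql -M prepared -T 30 <db>
  \set id random(1, 100000)
  UPDATE pt SET b = b + 1 WHERE a = :id;

Comparing the -M simple and -M prepared numbers for patched vs. unpatched would tell us whether the win is confined to planning or shows up at execution time too.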