Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: making update/delete of inheritance trees scale better
Msg-id: CA+TgmobYSpQKo3wJOZM3LUGtOp_OZr4+sB2ehUxMwHp67BDi1Q@mail.gmail.com
In response to: Re: making update/delete of inheritance trees scale better (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: making update/delete of inheritance trees scale better
List: pgsql-hackers
On Fri, Feb 5, 2021 at 12:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> You do realize that we're just copying Datums from one level to the
> next? For pass-by-ref data, the Datums generally all point at the
> same physical data in some disk buffer ... or if they don't, it's
> because the join method had a good reason to want to copy data.

I am older and dumber than I used to be, but I'm amused at the idea that I might be old enough and dumb enough not to understand this. To be honest, given that we are just copying the datums, I find it kind of surprising that it causes us pain, but it clearly does. If you think it's not an issue, then what of the email from Amit Langote to which I was responding, or his earlier message at http://postgr.es/m/CA+HiwqHUkwcy84uFfUA3qVsyU2pgTwxVkJx1uwPQFSHfPz4rsA@mail.gmail.com which contains benchmark results?

As to why it causes us pain, I don't have a full picture of that. Target list construction is one problem: we build all these target lists for intermediate nodes during planning, and they're long enough -- if the user has a bunch of columns -- and planning is cheap enough for some queries that the sheer time to construct the lists shows up noticeably in profiles. I've seen that be a problem even for query planning problems that involve just one table: a query that takes the "physical tlist" path can be slower just because the time to construct the longer tlist is significant and the savings from postponing tuple deforming isn't. It seems impossible to believe that it can't also hurt us on join queries that actually make use of a lot of columns, so that they've all got to be included in tlists at every level of the join tree.

I believe that the execution-time overhead isn't entirely trivial either. Sure, copying an 8-byte quantity is pretty cheap, but if you have a lot of columns and you copy them a lot of times for each of a lot of tuples, it adds up. Queries that do enough "real work", e.g. calling expensive functions or forcing disk I/O, will make the effect of a bunch of x[i] = y[j] stuff unnoticeable, but there are plenty of queries that don't really do anything expensive: they're doing simple joins of data that's already in memory. Even there, accessing buffers figures to be more expensive because it's shared memory with locking and cache line contention, but I don't think that means we can completely ignore the performance impact of backend-local computation. Commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755 is a good example of getting a significant gain by refactoring to reduce seemingly trivial overheads -- in that case, AIUI, the benefits come from fewer function calls and better CPU branch prediction.

> If we didn't have the intermediate tuple slots, we'd have to have
> some other scheme for identifying which data to examine in intermediate
> join levels' quals. Maybe you can devise a scheme that has less overhead,
> but it's not immediately obvious that any huge win would be available.

I agree. I'm inclined to suspect that some benefit is possible, but that might be wrong and it sure doesn't look easy.

--
Robert Haas
EDB: http://www.enterprisedb.com
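To make the "x[i] = y[j] stuff" discussed above concrete, here is a minimal, hypothetical C sketch -- not PostgreSQL executor code; the LEVELS, COLUMNS, and TUPLES constants and the slot array are invented purely for illustration -- of how cheap per-column Datum copies multiply across intermediate plan levels:

#include <stdint.h>
#include <stdio.h>

typedef uintptr_t Datum;            /* a pass-by-value Datum is pointer-sized */

enum { LEVELS = 4, COLUMNS = 40, TUPLES = 1000000 };

int
main(void)
{
    static Datum slot[LEVELS][COLUMNS]; /* one "tuple slot" per plan level */
    Datum        sink = 0;

    for (int t = 0; t < TUPLES; t++)
    {
        /* the scan node fills the bottom-level slot */
        for (int c = 0; c < COLUMNS; c++)
            slot[0][c] = (Datum) (t + c);

        /*
         * Each intermediate level re-projects every column upward:
         * the seemingly trivial x[i] = y[j] work described above.
         */
        for (int lv = 1; lv < LEVELS; lv++)
            for (int c = 0; c < COLUMNS; c++)
                slot[lv][c] = slot[lv - 1][c];

        sink += slot[LEVELS - 1][COLUMNS - 1];  /* keep the copies observable */
    }

    /* (LEVELS - 1) * COLUMNS * TUPLES = 120 million 8-byte copies */
    printf("%ld datum copies, sink=%lu\n",
           (long) (LEVELS - 1) * COLUMNS * TUPLES,
           (unsigned long) sink);
    return 0;
}

Even if each copy costs on the order of a nanosecond, that inner loop is pure per-tuple overhead that a plan with fewer projection steps would not pay, which is consistent with the profile observations above.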