Re: partition routing layering in nodeModifyTable.c - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: partition routing layering in nodeModifyTable.c |
Date | |
Msg-id | CA+HiwqH+5EYSTZzOQ8FXhySYzQgUb5efqeuh0EeyugAe2=132A@mail.gmail.com Whole thread Raw |
In response to | Re: partition routing layering in nodeModifyTable.c (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: partition routing layering in nodeModifyTable.c
|
List | pgsql-hackers |
On Fri, Oct 23, 2020 at 4:04 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 22/10/2020 16:49, Amit Langote wrote: > > On Tue, Oct 20, 2020 at 9:57 PM Amit Langote <amitlangote09@gmail.com> wrote: > >> On Mon, Oct 19, 2020 at 8:55 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > >>> It's probably true that there's no performance gain from initializing > >>> them more lazily. But the reasoning and logic around the initialization > >>> is complicated. After tracing through various path through the code, I'm > >>> convinced enough that it's correct, or at least these patches didn't > >>> break it, but I still think some sort of lazy initialization on first > >>> use would make it more readable. Or perhaps there's some other > >>> refactoring we could do. > >> > >> So the other patch I have mentioned is about lazy initialization of > >> the ResultRelInfo itself, not the individual fields, but maybe with > >> enough refactoring we can get the latter too. > > > > So, I tried implementing a lazy-initialization-on-first-access > > approach for both the ResultRelInfos themselves and some of the > > individual fields of ResultRelInfo that don't need to be set right > > away. You can see the end result in the attached 0003 patch. This > > slims down ExecInitModifyTable() significantly, both in terms of code > > footprint and the amount of work that it does. > > Have you done any performance testing? I'd like to know how much of a > difference this makes in practice. I have shown some numbers here: https://www.postgresql.org/message-id/CA+HiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA@mail.gmail.com To reiterate, if you apply the following patch: > Does this patch become moot if we do the "Overhaul UPDATE/DELETE > processing"? > (https://www.postgresql.org/message-id/CA%2BHiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ%40mail.gmail.com)? ...and run this benchmark with plan_cache_mode = force_generic_plan pgbench -i -s 10 --partitions={0, 10, 100, 1000} pgbench -T120 -f test.sql -M prepared test.sql: \set aid random(1, 1000000) update pgbench_accounts set abalance = abalance + 1 where aid = :aid; you may see roughly the following results: HEAD: 0 tps = 13045.485121 (excluding connections establishing) 10 tps = 9358.157433 (excluding connections establishing) 100 tps = 1878.274500 (excluding connections establishing) 1000 tps = 84.684695 (excluding connections establishing) Patched (overhaul update/delete processing): 0 tps = 12743.487196 (excluding connections establishing) 10 tps = 12644.240748 (excluding connections establishing) 100 tps = 4158.123345 (excluding connections establishing) 1000 tps = 391.248067 (excluding connections establishing) And if you apply the patch being discussed here, TPS shoots up a bit, especially for higher partition counts: Patched (lazy-ResultRelInfo-initialization) 0 tps = 13419.283168 (excluding connections establishing) 10 tps = 12588.016095 (excluding connections establishing) 100 tps = 8560.824225 (excluding connections establishing) 1000 tps = 1926.553901 (excluding connections establishing) To explain these numbers a bit, "overheaul update/delete processing" patch improves the performance of that benchmark by allowing the updates to use run-time pruning when executing generic plans, which they can't today. However without "lazy-ResultRelInfo-initialization" patch, ExecInitModifyTable() (or InitPlan() when I ran those benchmarks) can be seen to be spending time initializing all of those result relations, whereas only one of those will actually be used. As mentioned further in that email, it's really the locking of all relations by AcquireExecutorLocks() that occurs even before we enter the executor that's a much thornier bottleneck for this benchmark. But the ResultRelInfo initialization bottleneck sounded like it could get alleviated in a relatively straightforward manner. The patches that were developed for attacking the locking bottleneck would require further reflection on whether they are correct. (Note: I've just copy pasted the numbers I reported in that email. To reproduce, I'll have to rebase the "overhaul update/delete processing" patch on this one, which I haven't yet done.) > Another alternative is to continue to create the ResultRelInfos in > ExecInitModify(), but initialize the individual fields in them lazily. If you consider the above, maybe you can see how that will not really eliminate the bottleneck I'm aiming to fix here. -- Amit Langote EDB: http://www.enterprisedb.com
pgsql-hackers by date: