Re: Buffers from parallel workers not accumulated to upper nodes with gather merge - Mailing list pgsql-bugs
| From | Amit Kapila |
|---|---|
| Subject | Re: Buffers from parallel workers not accumulated to upper nodes with gather merge |
| Date | |
| Msg-id | CAA4eK1L28FjP4D7f7F3FMiq+x9wa5maJWG7vrBXUr0XFv20MMw@mail.gmail.com |
| In response to | Re: Buffers from parallel workers not accumulated to upper nodes with gather merge (Amit Kapila <amit.kapila16@gmail.com>) |
| List | pgsql-bugs |
On Mon, Jul 20, 2020 at 11:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Jul 18, 2020 at 7:32 PM Jehan-Guillaume de Rorthais
> <jgdr@dalibo.com> wrote:
> >
> > Hi,
> >
> > A bug report[1] on the pev2 project exposes another bug in EXPLAIN
> > ANALYZE buffer reporting, similar to the one fixed two years ago[2].
> >
> > In this report, buffers from the workers are not propagated to upper
> > nodes on top of a Gather Merge node. By contrast, buffers from workers
> > ARE propagated to upper nodes on top of a Gather node.
> >
> > Here is some SQL to reproduce:
> >
> > CREATE TABLE contacts (contactSourceName text, filler text)
> >   WITH (autovacuum_enabled = off);
> > INSERT INTO contacts
> >   SELECT rpad(chr(64+i)::text,10,'0'),
> >          rpad(j::text,500,'0')
> >   FROM generate_series(1,26) i
> >   CROSS JOIN generate_series(1,3200) j;
> >
> > SET work_mem TO '4MB';
> > SET random_page_cost TO 4;
> > SET jit TO off;
> > SET parallel_setup_cost TO 0;
> > SET parallel_tuple_cost TO 0;
> >
> > EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF, TIMING OFF)
> > SELECT contactSourceName, count(*)
> > FROM contacts
> > GROUP BY contactSourceName;
> >
> > Finalize GroupAggregate (actual rows=26 loops=1)
> >   Buffers: shared hit=2087 read=1463 written=32
> >   ->  Gather Merge (actual rows=70 loops=1)
> >         Workers Planned: 2
> >         Workers Launched: 2
> >         Buffers: shared hit=3910 read=2049 written=96
> >         ->  Sort (actual rows=23 loops=3)
> >               Buffers: shared hit=3910 read=2049 written=96
> >               Worker 0:
> >                 Buffers: shared hit=914 read=287 written=32
> >               Worker 1:
> >                 Buffers: shared hit=909 read=299 written=32
> > [...]
> >
> > Gather Merge exposes shared hit=3910.
> > Finalize GroupAggregate exposes shared hit=2087.
> > The 909 and 914 buffers from the workers have not been propagated.
> >
> > Comparing with a Gather node:
> >
> > EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF, TIMING OFF)
> > SELECT count(*) AS c FROM contacts;
> >
> > Finalize Aggregate (actual rows=1 loops=1)
> >   Output: count(*)
> >   Buffers: shared hit=3894 read=2049 written=96
> >   ->  Gather (actual rows=3 loops=1)
> >         Workers Planned: 2
> >         Workers Launched: 2
> >         Buffers: shared hit=3894 read=2049 written=96
> >         ->  Partial Aggregate (actual rows=1 loops=3)
> >               Buffers: shared hit=3894 read=2049 written=96
> >               Worker 0:
> >                 Buffers: shared hit=870 read=299 written=32
> >               Worker 1:
> >                 Buffers: shared hit=887 read=301 written=32
> > [...]
> >
> > Buffers are correctly propagated to the Finalize Aggregate node.
> >
> > I'm not perfectly clear on the code, but here is my analysis of the
> > issue and a fix proposal.
> >
> > The Gather node works correctly because it calls ExecShutdownGatherWorkers
> > as soon as the workers are exhausted, from gather_readnext. Because of
> > this, buffers from workers are already accounted for and propagated to
> > upper nodes before the recursive call of ExecShutdownNode on each node.
> > There's no similar call to ExecShutdownGatherMergeWorkers for Gather
> > Merge. Adding a call to ExecShutdownGatherMergeWorkers in
> > gather_merge_getnext when workers are exhausted seems to fix the issue,
> > but I feel like this is the wrong place to fix it.
> >
>
> Why do you think so? I think irrespective of whether we want to call
> ExecShutdownGatherMergeWorkers in gather_merge_getnext (when we know we
> are done, aka binaryheap_empty) to fix this particular issue, it is
> better to shut down the workers as soon as we are done, similar to what
> we do for the Gather node. It is good to release resources as soon as
> we can.
>
Apart from the above, we should also do what you are suggesting in the
patch, as that is also a problem, and your fix seems correct to me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
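For readers tracking this in the source tree, here is a minimal sketch of the placement discussed above, assuming the shape of gather_merge_getnext in src/backend/executor/nodeGatherMerge.c at the time of this thread. It illustrates the proposal under discussion, not a committed fix, and elides the heap-maintenance steps:

```c
/*
 * Sketch of the proposed change (not a committed patch): shut down the
 * Gather Merge workers inside gather_merge_getnext() once the binary
 * heap is empty, i.e. every participant is exhausted.  This mirrors the
 * ExecShutdownGatherWorkers() call that gather_readnext() in
 * nodeGather.c makes for the Gather node, so worker buffer usage is
 * accumulated before ExecShutdownNode() visits the upper plan nodes.
 */
static TupleTableSlot *
gather_merge_getnext(GatherMergeState *gm_state)
{
	int			i;

	/*
	 * ... pull the next tuple from the last-returned participant and
	 * rebalance the heap, as in the existing code ...
	 */

	if (binaryheap_empty(gm_state->gm_heap))
	{
		/* All the queues are exhausted, and so is the heap. */
		ExecShutdownGatherMergeWorkers(gm_state);	/* proposed addition */
		return NULL;
	}

	/* Return the next tuple from whichever participant has the leading one. */
	i = DatumGetInt32(binaryheap_first(gm_state->gm_heap));
	return gm_state->gm_slots[i];
}
```

Placing the shutdown here releases worker resources at the earliest possible point, which is the behavior argued for above.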