BUG #17335: Duplicate result rows in Gather node when "Suppress Append ... that have a single child" applied - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17335: Duplicate result rows in Gather node when "Suppress Append ... that have a single child" applied |
Date | |
Msg-id | 17335-4dc92e1aea3a78af@postgresql.org Whole thread Raw |
Responses |
Re: BUG #17335: Duplicate result rows in Gather node when "Suppress Append ... that have a single child" applied
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17335 Logged by: Yura Sokolov Email address: y.sokolov@postgrespro.ru PostgreSQL version: 14.1 Operating system: any Description: Our customer faced a query that returns duplicate rows if parallel execution is enabled (max_parallel_workers_per_gather > 0) after 11->12 upgrade. Investigation shows bug were introduced in 8edd0e79460b414b1d971895312e549e95e12e4f "Suppress Append and MergeAppend plan nodes that have a single child." . Bug is present in all versions since 12 beta, and it is reproduced in master branch as well. Reproduction. https://gist.github.com/funny-falcon/45fa9460507c015390db99a7efd18a75 Dump contains definition of two tables (with 3 and 2 fields), index and data dump large enough to trigger parallel plan execution. Also `analyze public.testi3, public.testi4` line is added. https://gist.github.com/funny-falcon/45fa9460507c015390db99a7efd18a75/raw/06a55e57bc6c460ea8800600b1b597c1ed216259/repro_dump.sql.zst Query: SELECT count(1) FROM ( SELECT fk, id FROM testi3 WHERE flagged UNION ALL SELECT fk, id from testi3 WHERE false ) as t LEFT OUTER JOIN testi4 USING (fk); When `set max_parallel_workers_per_gather = 0;` then it returns 44 which is correct. When `set max_parallel_workers_per_gather = 2;` then it returns 88 which is wrong. Sometimes (for example, when you try to debug it) it returns 44 even with max_parallel_workers_per_gather = 2. It seems that it happens when worker returns first row fast enough so "Gather" node doesn't start to execute plan by itself. It is confirmed by `set parallel_leader_participation = false`, which forces correct result. `set enable_parallel_append = false` also forces correct result. Original plan included Gather Merge node and max_parallel_workers_per_gather = 1 . Reproduction code uses Gather instead and max_parallel_workers_per_gather = 2 just because it were easier to get bad plan. Plan before change 8edd0e79460b414b1d971895312e549e95e12e4f (returns correct result) Finalize Aggregate (cost=25003.16..25003.17 rows=1 width=8) (actual time=435.473..435.473 rows=1 loops=1) -> Gather (cost=25003.05..25003.16 rows=1 width=8) (actual time=434.646..456.889 rows=2 loops=1) Workers Planned: 1 Workers Launched: 1 -> Partial Aggregate (cost=24003.05..24003.06 rows=1 width=8) (actual time=433.000..433.000 rows=1 loops=2) -> Parallel Hash Left Join (cost=21701.47..24002.86 rows=74 width=0) (actual time=312.576..432.981 rows=22 loops=2) Hash Cond: ("*SELECT* 1".fk = testi4.fk) -> Parallel Append (cost=0.42..9.18 rows=24 width=4) (actual time=0.021..0.112 rows=16 loops=2) -> Subquery Scan on "*SELECT* 1" (cost=0.42..9.11 rows=24 width=4) (actual time=0.041..0.216 rows=33 loops=1) -> Index Scan using testi3_ix on testi3 (cost=0.42..8.87 rows=24 width=8) (actual time=0.039..0.207 rows=33 loops=1) Index Cond: (flagged = true) -> Parallel Hash (cost=12084.91..12084.91 rows=586091 width=4) (actual time=305.473..305.473 rows=703309 loops=2) Buckets: 131072 Batches: 32 Memory Usage: 2784kB -> Parallel Seq Scan on testi4 (cost=0.00..12084.91 rows=586091 width=4) (actual time=0.032..138.958 rows=703309 loops=2) Plan after that change (returns wrong result) Finalize Aggregate (cost=25003.87..25003.88 rows=1 width=8) (actual time=652.062..652.062 rows=1 loops=1) -> Gather (cost=25003.75..25003.86 rows=1 width=8) (actual time=651.482..668.718 rows=2 loops=1) Workers Planned: 1 Workers Launched: 1 -> Partial Aggregate (cost=24003.75..24003.76 rows=1 width=8) (actual time=649.150..649.150 rows=1 loops=2) -> Parallel Hash Left Join (cost=21701.47..24003.49 rows=106 width=0) (actual time=461.751..649.111 rows=44 loops=2) Hash Cond: ("*SELECT* 1".fk = testi4.fk) -> Subquery Scan on "*SELECT* 1" (cost=0.42..9.51 rows=36 width=4) (actual time=0.050..0.229 rows=33 loops=2) -> Index Scan using testi3_ix on testi3 (cost=0.42..9.15 rows=36 width=8) (actual time=0.048..0.214 rows=33 loops=2) Index Cond: (flagged = true) -> Parallel Hash (cost=12084.91..12084.91 rows=586091 width=4) (actual time=444.931..444.931 rows=703309 loops=2) Buckets: 131072 Batches: 32 Memory Usage: 2784kB -> Parallel Seq Scan on testi4 (cost=0.00..12084.91 rows=586091 width=4) (actual time=0.045..202.913 rows=703309 loops=2) ------ regards, Yura
pgsql-bugs by date: