Thread: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18522 Logged by: Antti Lampinen Email address: antti@lampinen.eu PostgreSQL version: 16.3 Operating system: AWS RDS + Macos Description: The following two queries result in different query plans and different results, even though there is only a dummy condition change between them. The latter results are correct, there are two rows that match the conditions. The following was run both in AWS RDS and after restoring database from plain SQL dump to local Macos system. First query. EXPLAIN (ANALYZE, SETTINGS, VERBOSE ) SELECT id, market_id FROM "order" WHERE "order"."deleted_at" IS NULL AND "order"."archived_at" IS NULL AND NOT EXISTS (SELECT FROM "matchmaking_request" WHERE ("matchmaking_request"."order_id" = "order"."id" AND "matchmaking_request"."started_at" IS NOT NULL AND "matchmaking_request"."cancelled_at" IS NULL)) AND "order"."market_id" IN (1); QUERY PLAN Merge Right Anti Join (cost=415.85..527.39 rows=350 width=8) (actual time=6.015..6.018 rows=1 loops=1) Output: "order".id, "order".market_id Inner Unique: true Merge Cond: (matchmaking_request.order_id = "order".id) -> Index Scan using matchmaking_request_order_id_idx on public.matchmaking_request (cost=0.28..834.07 rows=5840 width=4) (actual time=0.032..2.089 rows=702 loops=1) Output: <retracted> Filter: ((matchmaking_request.started_at IS NOT NULL) AND (matchmaking_request.cancelled_at IS NULL)) Rows Removed by Filter: 301 -> Sort (cost=415.56..416.82 rows=503 width=8) (actual time=3.695..3.730 rows=493 loops=1) Output: "order".id, "order".market_id Sort Key: "order".id Sort Method: quicksort Memory: 40kB -> Index Scan using order_market_id_idx on public."order" (cost=0.28..392.99 rows=503 width=8) (actual time=0.035..3.555 rows=493 loops=1) Output: "order".id, "order".market_id Index Cond: ("order".market_id = 1) Filter: (("order".deleted_at IS NULL) AND ("order".archived_at IS NULL)) Rows Removed by Filter: 871 Settings: random_page_cost = '1' Planning Time: 0.643 ms Execution Time: 6.301 ms Second query. Note the dummy condition at the end. EXPLAIN (ANALYZE, SETTINGS, VERBOSE ) SELECT id, market_id FROM "order" WHERE "order"."deleted_at" IS NULL AND "order"."archived_at" IS NULL AND NOT EXISTS (SELECT FROM "matchmaking_request" WHERE ("matchmaking_request"."order_id" = "order"."id" AND "matchmaking_request"."started_at" IS NOT NULL AND "matchmaking_request"."cancelled_at" IS NULL)) AND "order"."market_id" IN (1, 1); -- dummy condition QUERY PLAN Merge Anti Join (cost=0.56..579.56 rows=503 width=8) (actual time=6.622..6.773 rows=2 loops=1) Output: "order".id, "order".market_id Merge Cond: ("order".id = matchmaking_request.order_id) -> Index Scan using order_pkey on public."order" (cost=0.28..467.77 rows=723 width=8) (actual time=0.078..4.435 rows=493 loops=1) Output: <retracted> Filter: (("order".deleted_at IS NULL) AND ("order".archived_at IS NULL) AND ("order".market_id = ANY ('{1,1}'::integer[]))) Rows Removed by Filter: 1935 -> Index Scan using matchmaking_request_order_id_idx on public.matchmaking_request (cost=0.28..834.07 rows=5840 width=4) (actual time=0.017..1.996 rows=702 loops=1) Output: <retracted> Filter: ((matchmaking_request.started_at IS NOT NULL) AND (matchmaking_request.cancelled_at IS NULL)) Rows Removed by Filter: 301 Settings: random_page_cost = '1' Planning Time: 0.629 ms Execution Time: 6.853 ms
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Richard Guo
Date:
On Tue, Jun 25, 2024 at 2:37 AM PG Bug reporting form <noreply@postgresql.org> wrote: > The following two queries result in different query plans and different > results, even though there is only a dummy condition change between them. > The > latter results are correct, there are two rows that match the conditions. Thank you for reporting. As far as I recall, this is the first field-reported bug about the right anti join! Could you please provide the schema and necessary data for the two tables to reproduce this bug? If there is a self-contained repro, that would be great. Thanks Richard
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Antti Lampinen
Date:
On Tue, Jun 25, 2024 at 5:07 AM Richard Guo <guofenglinux@gmail.com> wrote: > > On Tue, Jun 25, 2024 at 2:37 AM PG Bug reporting form > <noreply@postgresql.org> wrote: > > The following two queries result in different query plans and different > > results, even though there is only a dummy condition change between them. > > The > > latter results are correct, there are two rows that match the conditions. > > Could you please provide the schema and necessary data for the two > tables to reproduce this bug? If there is a self-contained repro, that > would be great. I managed to create a self-contained repro: https://gist.github.com/arlampin/0b86963694a936147383f3af3c83224c This gives me consistently different results based on superfluous condition change. See the two EXPLAIN queries in the sample. Thanks, Antti
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Richard Guo
Date:
On Tue, Jun 25, 2024 at 12:24 PM Antti Lampinen <antti@lampinen.eu> wrote: > On Tue, Jun 25, 2024 at 5:07 AM Richard Guo <guofenglinux@gmail.com> wrote: > > Could you please provide the schema and necessary data for the two > > tables to reproduce this bug? If there is a self-contained repro, that > > would be great. > I managed to create a self-contained repro: > https://gist.github.com/arlampin/0b86963694a936147383f3af3c83224c > > This gives me consistently different results based on superfluous condition > change. See the two EXPLAIN queries in the sample. Thank you so much for the repro script. I've found the root cause: for an inner_unique join we assume that the executor will stop scanning for matches after the first match. Therefore, we set skip_mark_restore to true to indicate that we can skip mark/restore overhead. However, merge right anti join does not get this memo and continues scanning the inner side for matches after the first match, totally ignoring the single_match flag, while still thinking that it can skip mark/restore. Will fix this later. Thanks Richard
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Richard Guo
Date:
On Tue, Jun 25, 2024 at 7:54 PM Richard Guo <guofenglinux@gmail.com> wrote: > Thank you so much for the repro script. I've found the root cause: > for an inner_unique join we assume that the executor will stop scanning > for matches after the first match. Therefore, we set skip_mark_restore > to true to indicate that we can skip mark/restore overhead. However, > merge right anti join does not get this memo and continues scanning the > inner side for matches after the first match, totally ignoring the > single_match flag, while still thinking that it can skip mark/restore. > > Will fix this later. I think we can fix this issue by ensuring that merge-right-anti-join also advances to next outer tuple after the first match in inner_unique cases. This can also help save cycles by avoiding unnecessary scanning of inner tuples after the first match. Although hash-right-anti-join does not suffer from this wrong results issue, I think we can apply the same change to it as well, to help save cycles for the same reason. Please see attached patch for the fix. This patch still lacks a test case. I'll try to see if I can come up with one. Thanks Richard
Attachment
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Richard Guo
Date:
On Tue, Jun 25, 2024 at 10:45 PM Richard Guo <guofenglinux@gmail.com> wrote: > On Tue, Jun 25, 2024 at 7:54 PM Richard Guo <guofenglinux@gmail.com> wrote: > > Thank you so much for the repro script. I've found the root cause: > > for an inner_unique join we assume that the executor will stop scanning > > for matches after the first match. Therefore, we set skip_mark_restore > > to true to indicate that we can skip mark/restore overhead. However, > > merge right anti join does not get this memo and continues scanning the > > inner side for matches after the first match, totally ignoring the > > single_match flag, while still thinking that it can skip mark/restore. > > > > Will fix this later. > > I think we can fix this issue by ensuring that merge-right-anti-join > also advances to next outer tuple after the first match in inner_unique > cases. This can also help save cycles by avoiding unnecessary scanning > of inner tuples after the first match. > > Although hash-right-anti-join does not suffer from this wrong results > issue, I think we can apply the same change to it as well, to help save > cycles for the same reason. > > Please see attached patch for the fix. This patch still lacks a test > case. I'll try to see if I can come up with one. I've added a test case in the v2 patch. I'll park it in the July commitfest. Thanks Richard
Attachment
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Richard Guo
Date:
On Tue, Jun 25, 2024 at 10:07 AM Richard Guo <guofenglinux@gmail.com> wrote: > Thank you for reporting. As far as I recall, this is the first > field-reported bug about the right anti join! Just for the record, this issue was reported again with BUG #18526 by Feliphe Pozzer in https://www.postgresql.org/message-id/flat/18526-bf89f27cb20d8a18%40postgresql.org Thanks Richard
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Richard Guo
Date:
On Wed, Jun 26, 2024 at 3:31 PM Richard Guo <guofenglinux@gmail.com> wrote: > I've added a test case in the v2 patch. I'll park it in the July > commitfest. Here is an updated patch, with some tweaks suggested by pgindent, and adjustments to the commit message suggested by Andrew (off-list). Barring objections, I'm planning to push it next week. Thanks Richard
Attachment
Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
From
Richard Guo
Date:
On Tue, Jul 2, 2024 at 6:22 PM Richard Guo <guofenglinux@gmail.com> wrote: > Here is an updated patch, with some tweaks suggested by pgindent, and > adjustments to the commit message suggested by Andrew (off-list). > > Barring objections, I'm planning to push it next week. Pushed. Thank you for the report. The fix will appear in August's minor releases. Thanks Richard