Re: BUG #15577: Query returns different results when executed multiple times - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #15577: Query returns different results when executed multiple times |
Date | |
Msg-id | 18883.1547071723@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #15577: Query returns different results when executedmultiple times (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: BUG #15577: Query returns different results when executedmultiple times
|
List | pgsql-bugs |
Thomas Munro <thomas.munro@enterprisedb.com> writes: > So it's not the reading of the tuple queue per se, but the resulting > visit to a higher part of the plan when the Gather node emits a tuple. > It can't be right that plan id 2 and plan id 0 are both using paramno > 0, can it? So the plan shape, shorn of uninteresting detail, is Nested Loop Left Join -> Gather -> Nested Loop -> Hash Join -> Parallel Seq Scan on table_c tc -> Hash -> Seq Scan on table_d td -> Index Scan ... on table_b tba Index Cond: (target_id = tc.id) -> Index Scan ... on table_a ta Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date)) The lower Nested Loop has a NestLoopParam node telling it to pass the value of tc.id down to the tba indexscan. The upper one also has a NestLoopParam node, telling it to pass the value of tc.id down to the ta indexscan. Since these are for the same Var, the planner believes it's safe to use the same PARAM_EXEC slot for both of them. And in a normal plan tree, it'd be right. I think what is happening is (1) Leader runs the lower nestloop for a bit and gets back a successful tc/tba join row. At this point its PARAM_EXEC slot is filled with the tc.id value from that tc row. (2) It hands this up to the upper nestloop, which assigns that same tc.id to the PARAM_EXEC slot ... no trouble so far ... and runs the ta indexscan with that. (3) Now we go back to the Gather, which I think must go off and look for rows coming from child workers before it resumes the leader's nestloop. It gets some row, with a different tc.id value, and hands that back up. (4) Upper nestloop assigns the tc.id value from the child's row to the PARAM_EXEC slot, and runs the ta indexscan with that. (5) Return to the Gather, which now allows the lower nestloop to proceed in the leader. (6) Lower nestloop thinks it's not done with the current outer (tc) row, so it invokes the tba indexscan to see if that can return any more rows. But since it hasn't advanced to a new outer row, it doesn't touch the NestLoopParam or PARAM_EXEC slot. (7) At this point we're resuming the tba indexscan with a different value for tc.id than it was started with. The results of that are undefined, IMO, but what's apparently happening is that it fails to match some rows that it otherwise would have matched. So the end result is some rows are missing from the output. One way we could deal with this, perhaps, is to decide that NestLoopParams above and below a Gather can't share PARAM_EXEC slots. I'm not sure how we'd mechanize that, other than not allowing NestLoopParams to share PARAM_EXEC slots *ever*, which might not really be that much of a cost. But TBH the whole thing frightens me quite a lot as to what other serial-processing assumptions are getting broken by plopping Gather into the middle of a plan tree. I wonder whether we'd not be best off disallowing this sort of plan shape, and/or forbidding the leader from also executing the parallelized subplan. regards, tom lane
pgsql-bugs by date: