Re: 9.2rc1 produces incorrect results - Mailing list pgsql-hackers

From Tom Lane
Subject Re: 9.2rc1 produces incorrect results
Date
Msg-id 23021.1346784754@sss.pgh.pa.us
Whole thread Raw
In response to Re: 9.2rc1 produces incorrect results  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: 9.2rc1 produces incorrect results
List pgsql-hackers
I wrote:
> Vik Reykja <vikreykja@gmail.com> writes:
>> Hello.  It took me a while to get a version of this that was independent of
>> my data, but here it is.  I don't understand what's going wrong but if you
>> change any part of this query (or at least any part I tried), the correct
>> result is returned.

> Huh.  9.1 gets the wrong answer too, so this isn't a (very) new bug;
> but curiously, 8.4 and 9.0 seem to get it right.  I think this is
> probably related somehow to Adam Mackler's recent report --- multiple
> scans of the same CTE seems to be a bit of a soft spot :-(

No, I'm mistaken: it's a planner bug.  The plan looks like this:
                                       QUERY PLAN                                         
-------------------------------------------------------------------------------------------Result  (cost=281.29..290.80
rows=20width=36)  CTE a    ->  Nested Loop  (cost=126.96..280.17 rows=19 width=4)          ->  Merge Right Join
(cost=126.96..220.17rows=19 width=4)                Merge Cond: (((pg_c.relname)::text) = ((t2.id)::text))
 Filter: (pg_c.oid IS NULL)                ->  Sort  (cost=61.86..63.72 rows=743 width=68)                      Sort
Key:((pg_c.relname)::text)                      ->  Seq Scan on pg_class pg_c  (cost=0.00..26.43 rows=743 width=68)
          ->  Sort  (cost=65.10..67.61 rows=1004 width=4)                      Sort Key: ((t2.id)::text)
     ->  Seq Scan on t2  (cost=0.00..15.04 rows=1004 width=4)          ->  Index Scan using t1_pkey on t1
(cost=0.00..3.14rows=1 width=4)                Index Cond: (id = t2.id***)  CTE b    ->  WindowAgg  (cost=0.78..1.12
rows=19width=4)          ->  Sort  (cost=0.78..0.83 rows=19 width=4)                Sort Key: a.id                ->
CTEScan on a  (cost=0.00..0.38 rows=19 width=4)  ->  Append  (cost=0.00..9.51 rows=20 width=36)        ->  CTE Scan on
a (cost=0.00..4.66 rows=10 width=4)              Filter: is_something              SubPlan 3                ->  CTE
Scanon b  (cost=0.00..0.43 rows=1 width=4)                      Filter: (id = a.id***)        ->  CTE Scan on a
(cost=0.00..4.66rows=10 width=4)              Filter: is_something              SubPlan 4                ->  CTE Scan
onb  (cost=0.00..0.43 rows=1 width=4)                      Filter: (id = a.id***)
 
(30 rows)

The planner is assigning a single PARAM_EXEC slot for the parameter
passed into the inner indexscan in "CTE a" and for the parameters passed
into the two SubPlans that scan "CTE b" (the items I marked with "***"
above).  This is safe enough so far as the two SubPlans are concerned,
because they don't execute concurrently --- but when SubPlan 3 is first
fired, it causes the remainder of CTE a to be computed, so that the
nestloop gets iterated some more times, and that overwrites the value of
"a.id" that already got passed down into SubPlan 3.

The reason this is so hard to replicate is that the PARAM_EXEC slot can
only get re-used for identical-looking Vars (same varno, varlevelsup,
vartype, etc) --- so even granted that you've got the right shape of
plan, minor "unrelated" changes in the query can stop the aliasing
from occurring.  Also, inner indexscans weren't using the PARAM_EXEC
mechanism until 9.1, so that's why the example doesn't fail before 9.1.

I've always been a bit suspicious of the theory espoused in
replace_outer_var that aliasing different Params is OK:
    * NOTE: in sufficiently complex querytrees, it is possible for the same    * varno/abslevel to refer to different
RTEsin different parts of the    * parsetree, so that different fields might end up sharing the same Param    * number.
As long as we check the vartype/typmod as well, I believe that    * this sort of aliasing will cause no trouble.  The
correctfield should    * get stored into the Param slot at execution in each part of the tree.
 

but I've never seen a provably wrong case before.  Most likely, this has
been broken since we introduced CTEs in 8.4: it's the decoupled timing
of execution of main query and CTE that's needed to allow execution of
different parts of the plan tree to overlap and thus create the risk.

(I get the impression that only recently have people been writing really
complex CTE queries, since we found another fundamental bug with them
just recently.)

I think probably the best fix is to rejigger things so that Params
assigned by different executions of SS_replace_correlation_vars and
createplan.c can't share PARAM_EXEC numbers.  This will result in
rather larger ecxt_param_exec_vals arrays at runtime, but the array
entries aren't very large, so I don't think it'll matter.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: pg_upgrade del/rmdir path fix
Next
From: Andrew Dunstan
Date:
Subject: pg_upgrade diffs on WIndows