Re: Bug in row_number() optimization - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Bug in row_number() optimization
Date
Msg-id CAMbWs49AjzdAtLFs-aNST+tGUhibj1-i5_AUO4e7fqgTutsfZg@mail.gmail.com
Whole thread Raw
In response to Bug in row_number() optimization  (Sergey Shinderuk <s.shinderuk@postgrespro.ru>)
Responses Re: Bug in row_number() optimization
List pgsql-hackers

On Wed, Nov 16, 2022 at 7:38 AM Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote:
The failing query is:
SELECT * FROM
   (SELECT *,
           count(salary) OVER (PARTITION BY depname || '') c1, -- w1
           row_number() OVER (PARTITION BY depname) rn, -- w2
           count(*) OVER (PARTITION BY depname) c2, -- w2
           count(*) OVER (PARTITION BY '' || depname) c3 -- w3
    FROM empsalary
) e WHERE rn <= 1 AND c1 <= 3;
As far as I understand, ExecWindowAgg for the intermediate WindowAgg
node switches into pass-through mode, stops evaluating row_number(), and
returns the previous value instead. But if int8 is passed by reference,
the previous value stored in econtext->ecxt_aggvalues becomes a dangling
pointer when the per-output-tuple memory context is reset.
 
Yeah, you're right.  In this example the window function row_number()
goes into pass-through mode after the second evaluation because its
run condition does not hold true any more.  The remaining run would just
return the result from the second evaluation, which is stored in
econtext->ecxt_aggvalues[wfuncno].

If int8 is configured as pass-by-ref, the precomputed value from the
second evaluation is actually located in a memory area from context
ecxt_per_tuple_memory, with its pointer stored in ecxt_aggvalues.  As
this memory context is reset once per tuple, we would be prone to wrong
results.

I tried with memory context ecxt_per_query_memory when evaluating
window function in the case where int8 is configured as pass-by-ref and
I can see the problem vanishes.  I'm using the changes as below

--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1027,8 +1027,14 @@ eval_windowfunction(WindowAggState *winstate, WindowStatePerFunc perfuncstate,
 {
        LOCAL_FCINFO(fcinfo, FUNC_MAX_ARGS);
        MemoryContext oldContext;
-
-       oldContext = MemoryContextSwitchTo(winstate->ss.ps.ps_ExprContext->ecxt_per_tuple_memory);
+       MemoryContext evalWfuncContext;
+
+#ifdef USE_FLOAT8_BYVAL
+       evalWfuncContext = winstate->ss.ps.ps_ExprContext->ecxt_per_tuple_memory;
+#else
+       evalWfuncContext = winstate->ss.ps.ps_ExprContext->ecxt_per_query_memory;
+#endif
+       oldContext = MemoryContextSwitchTo(evalWfuncContext);

Thanks
Richard
 

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: proposal: possibility to read dumped table's name from file
Next
From: "Drouvot, Bertrand"
Date:
Subject: Re: Introduce a new view for checkpointer related stats