On 2018/06/13 16:35, Amit Langote wrote:
> Fwiw, I see that the crash can also occur even when using a
> non-partitioned table in the query, as shown in the following example
> which reuses Rajkumar's test data and query:
>
> create table foo (a int, b int, c text);
> postgres=# insert into foo select i%20, i%30, to_char(i%12, 'FM0000') from
> generate_series(0, 36) i;
>
> select dense_rank(b) within group (order by a) from foo group by b order by 1;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> Following query in the regression test suite can also be made to crash by
> adding a group by clause:
>
> select dense_rank(3) within group (order by x) from (values
> (1),(1),(2),(2),(3),(3),(4)) v(x) group by (x);
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> Looking at the core dump of this, it seems the following commit may be
> relevant:
>
> commit bf6c614a2f2c58312b3be34a47e7fb7362e07bcb
> Author: Andres Freund <andres@anarazel.de>
> Date: Thu Feb 15 21:55:31 2018 -0800
>
> Do execGrouping.c via expression eval machinery, take two.
I looked into this and found the bug that causes the crash.
The above-mentioned commit contains this hunk:
@@ -1309,6 +1311,9 @@ hypothetical_dense_rank_final(PG_FUNCTION_ARGS)
         PG_RETURN_INT64(rank);

     osastate = (OSAPerGroupState *) PG_GETARG_POINTER(0);
+    econtext = osastate->qstate->econtext;
+    if (!econtext)
+        osastate->qstate->econtext = econtext = CreateStandaloneExprContext();
In CreateStandaloneExprContext(), we have this:

    econtext->ecxt_per_query_memory = CurrentMemoryContext;

    /*
     * Create working memory for expression evaluation in this context.
     */
    econtext->ecxt_per_tuple_memory =
        AllocSetContextCreate(CurrentMemoryContext,
                              "ExprContext",
                              ALLOCSET_DEFAULT_SIZES);
While debugging the crashing query, I noticed that CurrentMemoryContext at
this point is actually the per-tuple memory context of some expression
context in the calling code, which gets reset before we get here again.
The ExprContext cached in osastate->qstate->econtext therefore points to
freed memory on the next call. So it's wrong for
hypothetical_dense_rank_final to call CreateStandaloneExprContext()
without first switching to an actual per-query context.
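
To illustrate the lifetime problem outside the backend, here is a toy model
in plain C. It is not PostgreSQL code: ToyContext, ToyExprContext and the
toy_* functions are made-up names, and a reset counter stands in for real
memory being freed. It only shows why something cached across calls must be
created while a long-lived context is current. In the backend, the
corresponding step would presumably be a MemoryContextSwitchTo() to a
per-query context around the CreateStandaloneExprContext() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory context: it only counts its resets. */
typedef struct ToyContext
{
    const char *name;
    unsigned    resets;         /* bumped on every reset */
} ToyContext;

/* Toy stand-in for an ExprContext that is cached across calls. */
typedef struct ToyExprContext
{
    ToyContext *parent;         /* context that was current at creation */
    unsigned    parent_resets;  /* parent's reset count at creation */
} ToyExprContext;

/* Plays the role of CurrentMemoryContext. */
static ToyContext *current_context = NULL;

/* Make cxt current and return the previously current context. */
static ToyContext *toy_switch_to(ToyContext *cxt)
{
    ToyContext *old = current_context;

    current_context = cxt;
    return old;
}

/* A reset invalidates everything "allocated" in the context. */
static void toy_reset(ToyContext *cxt)
{
    cxt->resets++;
}

/* Like the real function, this uses whatever context is current. */
static void toy_create_expr_context(ToyExprContext *out)
{
    assert(current_context != NULL);
    out->parent = current_context;
    out->parent_resets = current_context->resets;
}

/* True if the parent has not been reset since creation. */
static bool toy_still_valid(const ToyExprContext *e)
{
    return e->parent->resets == e->parent_resets;
}

/*
 * Simulate one call of a final function followed by the caller resetting
 * its per-tuple context, and report whether the cached object survived.
 */
bool cached_survives_reset(bool switch_to_per_query)
{
    ToyContext     per_query = {"per-query", 0};
    ToyContext     per_tuple = {"per-tuple", 0};
    ToyExprContext cached;

    /* The final function is entered with the per-tuple context current. */
    current_context = &per_tuple;

    if (switch_to_per_query)
    {
        /* Fixed variant: create the cached object under per-query. */
        ToyContext *old = toy_switch_to(&per_query);

        toy_create_expr_context(&cached);
        toy_switch_to(old);
    }
    else
    {
        /* Buggy variant: the cached object lands in per-tuple memory. */
        toy_create_expr_context(&cached);
    }

    /* The caller resets per-tuple memory before the next group. */
    toy_reset(&per_tuple);

    return toy_still_valid(&cached);
}
```

Calling cached_survives_reset(false) models the current behaviour, where the
cached object is gone after the reset; cached_survives_reset(true) models
switching contexts first, where it stays valid.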
The attached patch seems to fix the crash.
Thanks,
Amit