Re: Memory leak in incremental sort re-scan - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Memory leak in incremental sort re-scan
Date
Msg-id 99e584a1-2f55-a56d-efb1-d38c57eb2beb@enterprisedb.com
Whole thread Raw
In response to Memory leak in incremental sort re-scan  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: Memory leak in incremental sort re-scan
List pgsql-hackers
Hi,

On 6/15/23 13:48, Laurenz Albe wrote:
> ExecIncrementalSort() calls tuplesort_begin_common(), which creates the "TupleSort main"
> and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls tuplesort_end(),
> which destroys them.
> But ExecReScanIncrementalSort() only resets the memory contexts.  Since the next call to
> ExecIncrementalSort() will create them again, we end up leaking these contexts for every
> re-scan.
> 
> Here is a reproducer with the regression test database:
> 
>   SET enable_sort = off;
>   SET enable_hashjoin = off;
>   SET enable_mergejoin = off;
>   SET enable_material = off;
> 
>   SELECT t.unique2, t2.r
>   FROM tenk1 AS t 
>      JOIN (SELECT unique1, 
>                   row_number() OVER (ORDER BY hundred, thousand) AS r 
>            FROM tenk1 
>            OFFSET 0) AS t2 
>         ON t.unique1 + 0 = t2.unique1
>   WHERE t.unique1 < 1000;
> 
> The execution plan:
> 
>  Nested Loop
>    Join Filter: ((t.unique1 + 0) = tenk1.unique1)
>    ->  Bitmap Heap Scan on tenk1 t
>          Recheck Cond: (unique1 < 1000)
>          ->  Bitmap Index Scan on tenk1_unique1
>                Index Cond: (unique1 < 1000)
>    ->  WindowAgg
>          ->  Incremental Sort
>                Sort Key: tenk1.hundred, tenk1.thousand
>                Presorted Key: tenk1.hundred
>                ->  Index Scan using tenk1_hundred on tenk1
> 
> 
> A memory context dump at the end of the execution looks like this:
> 
>       ExecutorState: 262144 total in 6 blocks; 74136 free (29 chunks); 188008 used
>         TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
>           TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
>             Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
>         TupleSort main: 32832 total in 2 blocks; 7256 free (0 chunks); 25576 used
>           TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
>             Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
>         TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
>           TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
>             Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
>       [many more]
>         1903 more child contexts containing 93452928 total in 7597 blocks; 44073240 free (0 chunks); 49379688 used
> 
> 
> The following patch fixes the problem for me:
> 
> --- a/src/backend/executor/nodeIncrementalSort.c
> +++ b/src/backend/executor/nodeIncrementalSort.c
> @@ -1145,21 +1145,16 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
>     node->execution_status = INCSORT_LOADFULLSORT;
>  
>     /*
> -    * If we've set up either of the sort states yet, we need to reset them.
> -    * We could end them and null out the pointers, but there's no reason to
> -    * repay the setup cost, and because ExecIncrementalSort guards presorted
> -    * column functions by checking to see if the full sort state has been
> -    * initialized yet, setting the sort states to null here might actually
> -    * cause a leak.
> +    * Release tuplesort resources.
>      */
>     if (node->fullsort_state != NULL)
>     {
> -       tuplesort_reset(node->fullsort_state);
> +       tuplesort_end(node->fullsort_state);
>         node->fullsort_state = NULL;
>     }
>     if (node->prefixsort_state != NULL)
>     {
> -       tuplesort_reset(node->prefixsort_state);
> +       tuplesort_end(node->prefixsort_state);
>         node->prefixsort_state = NULL;
>     }
>  
> 
> The original comment hints that this might mot be the correct thing to do...
> 

I think it's correct, but I need to look at the code more closely - it's
been a while. The code is a bit silly, as it resets the tuplesort and
then throws away all the pointers - so what could the _end() break?

AFAICS the comment says that we can't just do tuplesort_reset and keep
the pointers, because some other code depends on them being NULL.

In hindsight, that's a bit awkward - it'd probably be better to have a
separate flag, which would allow us to just reset the tuplesort.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: torikoshia
Date:
Subject: Re: RFC: Logging plan of the running query
Next
From: Tom Lane
Date:
Subject: Re: When IMMUTABLE is not.