Re: tuplesort_gettuple_common() and *should_free argument - Mailing list pgsql-hackers

From Andres Freund
Subject Re: tuplesort_gettuple_common() and *should_free argument
Date
Msg-id 20170404203202.m6rgxsegnxq54bgh@alap3.anarazel.de
Whole thread Raw
In response to Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: tuplesort_gettuple_common() and *should_free argument  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On 2017-03-13 18:14:07 -0700, Peter Geoghegan wrote:
> From 5351b5db257cb39832647d9117465c0217e6268b Mon Sep 17 00:00:00 2001
> From: Peter Geoghegan <pg@bowt.ie>
> Date: Thu, 13 Oct 2016 10:54:31 -0700
> Subject: [PATCH 1/2] Avoid copying within tuplesort_gettupleslot().

s/Avoid/Allow to avoid/

> Add a "copy" argument to make it optional to receive a copy of caller
> tuple that is safe to use following a subsequent manipulating of
> tuplesort's state.  This is a performance optimization.  Most existing
> tuplesort_gettupleslot() callers are made to opt out of copying.
> Existing callers that happen to rely on the validity of tuple memory
> beyond subsequent manipulations of the tuplesort request their own copy.
> 
> This brings tuplesort_gettupleslot() in line with
> tuplestore_gettupleslot().  In the future, a "copy" tuplesort_getdatum()
> argument may be added, that similarly allows callers to opt out of
> receiving their own copy of tuple.
> 
> In passing, clarify assumptions that callers of other tuplesort fetch
> routines may make about tuple memory validity, per gripe from Tom Lane.
> ---
>  src/backend/executor/nodeAgg.c         | 10 +++++++---
>  src/backend/executor/nodeSort.c        |  5 +++--
>  src/backend/utils/adt/orderedsetaggs.c |  5 +++--
>  src/backend/utils/sort/tuplesort.c     | 28 +++++++++++++++++-----------
>  src/include/utils/tuplesort.h          |  2 +-
>  5 files changed, 31 insertions(+), 19 deletions(-)
> 
> diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
> index aa08152..b650f71 100644
> --- a/src/backend/executor/nodeAgg.c
> +++ b/src/backend/executor/nodeAgg.c
> @@ -570,6 +570,10 @@ initialize_phase(AggState *aggstate, int newphase)
>   * Fetch a tuple from either the outer plan (for phase 0) or from the sorter
>   * populated by the previous phase.  Copy it to the sorter for the next phase
>   * if any.
> + *
> + * Callers cannot rely on memory for tuple in returned slot remaining valid
> + * past any subsequent manipulation of the sorter, such as another fetch of
> + * input from sorter.  (The sorter may recycle memory.)
>   */

It's not just the sorter, right? I'd just rephrase that the caller
cannot rely on tuple to stay valid beyond a single call.



>  static bool
>  tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
> @@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
>   * NULL value in leading attribute will set abbreviated value to zeroed
>   * representation, which caller may rely on in abbreviated inequality check.
>   *
> - * The slot receives a copied tuple (sometimes allocated in caller memory
> - * context) that will stay valid regardless of future manipulations of the
> - * tuplesort's state.
> + * If copy is TRUE, the slot receives a copied tuple that will stay valid
> + * regardless of future manipulations of the tuplesort's state.  Memory is
> + * owned by the caller.  If copy is FALSE, the slot will just receive a
> + * pointer to a tuple held within the tuplesort, which is more efficient,
> + * but only safe for callers that are prepared to have any subsequent
> + * manipulation of the tuplesort's state invalidate slot contents.
>   */

Hm. Do we need to note anything about how long caller memory needs to
live? I think we'd get into trouble if the caller does a memory context
reset, without also clearing the slot?


>  bool
> -tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
> +tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
>                         TupleTableSlot *slot, Datum *abbrev)
>  {
>      MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
> @@ -2113,8 +2117,10 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
>          if (state->sortKeys->abbrev_converter && abbrev)
>              *abbrev = stup.datum1;
>  
> -        stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
> -        ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
> +        if (copy)
> +            stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
> +
> +        ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
>          return true;


Other than these minimal adjustments, this looks good to go to me.

- Andres



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Logical decoding on standby
Next
From: Tomas Vondra
Date:
Subject: bug in SlabAlloc / VALGRIND_MAKE_MEM_DEFINED