Re: [HACKERS] Question about toasting code - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Question about toasting code
Date
Msg-id 23888.1494186493@sss.pgh.pa.us
Whole thread Raw
In response to [HACKERS] Question about toasting code  (Mat Arye <mat@timescale.com>)
Responses Re: [HACKERS] Question about toasting code  (Mat Arye <mat@timescaledb.com>)
List pgsql-hackers
Mat Arye <mat@timescale.com> writes:
> This is in arrayfuncs.c:5022 (postgre 9.6.2)

> /*
> * Ensure pass-by-ref stuff is copied into mcontext; and detoast it too if
> * it's varlena.  (You might think that detoasting is not needed here
> * because construct_md_array can detoast the array elements later.
> * However, we must not let construct_md_array modify the ArrayBuildState
> * because that would mean array_agg_finalfn damages its input, which is
> * verboten.  Also, this way frequently saves one copying step.)
> */

> I am a bit confused by the comment.

> Does PG_DETOAST_DATUM_COPY(dvalue) modify dvalue?

No.

What the comment is on about is that construct_md_array does this:
       /* make sure data is not toasted */       if (elmlen == -1)           elems[i] =
PointerGetDatum(PG_DETOAST_DATUM(elems[i]));

that is, it intentionally modifies the passed-in elems array in
the case that one of the elements is a toasted value.  For most
callers, the elems array is only transient storage anyway.  But
it's problematic for makeMdArrayResult, because it would mean
that the ArrayBuildState is changed --- and while it's not changed
in a semantically significant way, that's still a problem, because
the detoasted value would be allocated in a context that is possibly
shorter-lived than the ArrayBuildState is.  (In a hashed aggregation
situation, the ArrayBuildState is going to live in storage that is
persistent across the aggregation, but we are calling makeMdArrayResult
in the context where we want the result value to be created, which
is of only per-output-tuple lifespan.)

You could imagine fixing this by having construct_md_array do some
context switches internally, but that would complicate it.  And in
any case, fixing the problem where it is allows us to combine the
possible detoasting with copying the value into the ArrayBuildState's
context, so we'd probably keep doing that even if it was safe not to.

> The thing I am struggling with is with the serialize/deserialize functions.
> Can I detoast inside the serialize function if I use _COPY? Or is there a
> reason I have to detoast during the sfunc?

Should be able to detoast in serialize if you want.  Are you having
trouble with that?  (It might be inefficient to do it that way, if
you have to serialize the same value multiple times.  But I'm not
sure if that can happen.)
        regards, tom lane



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] Detecting schema changes during logical replication
Next
From: Mat Arye
Date:
Subject: Re: [HACKERS] Question about toasting code