Re: partitioning - changing a slot's descriptor is expensive - Mailing list pgsql-hackers

From Amit Langote
Subject Re: partitioning - changing a slot's descriptor is expensive
Date
Msg-id 9f307f38-e04d-2271-9500-b7424c5d86d8@lab.ntt.co.jp
Whole thread Raw
In response to partitioning - changing a slot's descriptor is expensive  (Andres Freund <andres@anarazel.de>)
Responses Re: partitioning - changing a slot's descriptor is expensive
List pgsql-hackers
Hi Andres,

On 2018/06/27 14:09, Andres Freund wrote:
> Hi,
> 
> (sorry if I CCed the wrong folks, the code has changed a fair bit)
> 
> I noticed that several places in the partitioning code look like:
> 
>     /*
>      * If the tuple has been routed, it's been converted to the
>      * partition's rowtype, which might differ from the root
>      * table's.  We must convert it back to the root table's
>      * rowtype so that val_desc shown error message matches the
>      * input tuple.
>      */
>     if (resultRelInfo->ri_PartitionRoot)
>     {
>         TableTuple tuple = ExecFetchSlotTuple(slot);
>         TupleConversionMap *map;
> 
>         rel = resultRelInfo->ri_PartitionRoot;
>         tupdesc = RelationGetDescr(rel);
>         /* a reverse map */
>         map = convert_tuples_by_name(orig_tupdesc, tupdesc,
>                                      gettext_noop("could not convert row type"));
>         if (map != NULL)
>         {
>             tuple = do_convert_tuple(tuple, map);
>             ExecSetSlotDescriptor(slot, tupdesc);
>             ExecStoreTuple(tuple, slot, InvalidBuffer, false);
>         }
>     }

This particular code block (and a couple of others like it) are executed
only if a tuple fails partition constraint check, so maybe not a very
frequently executed piece of code.

There is however similar code that runs in non-error paths, such as in
ExecPrepareTupleRouting(), that is executed *if* the leaf partition and
the root parent have differing attribute numbers.  So, one might say that
that code's there to handle the special cases which may not arise much in
practice, but it may still be a good idea to make improvements if we can
do so.

> Unfortunately calling ExecSetSlotDescriptor() is far from cheap, it has
> to reallocate the values/isnull arrays (and potentially do desc pinning
> etc). Nor is it always cheap to call ExecFetchSlotTuple(slot) - it might
> be a virtual slot. Calling heap_deform_tuple(), as done in
> do_convert_tuple(), might be superfluous work, because the input tuple
> might already be entirely deformed in the input slot.
> 
> I think it'd be good to rewrite the code so there's an input and an
> output slot that each will keep their slot descriptors set.
> 
> Besides performance as the code stands, this'll also be important for
> pluggable storage (as we'd otherwise get a *lot* of back/forth
> conversions between tuple formats).
> 
> Anybody interesting in writing a patch that redoes this?

Thanks for the pointers above, I will see if I can produce a patch and
will also be interested in hearing what others may have to say.

Thanks,
Amit



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Excessive PostmasterIsAlive calls slow down WAL redo
Next
From: Amit Kapila
Date:
Subject: Re: Thinko/typo in ExecSimpleRelationInsert