Re: partitioning - changing a slot's descriptor is expensive - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: partitioning - changing a slot's descriptor is expensive
Date
Msg-id CAFjFpRcxb=d9=wdDxterEssKeLd2uAWo9578hRntBptxLkrOYg@mail.gmail.com
Whole thread Raw
In response to partitioning - changing a slot's descriptor is expensive  (Andres Freund <andres@anarazel.de>)
Responses Re: partitioning - changing a slot's descriptor is expensive
List pgsql-hackers
On Wed, Jun 27, 2018 at 10:39 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> (sorry if I CCed the wrong folks, the code has changed a fair bit)
>
> I noticed that several places in the partitioning code look like:
>
>     /*
>      * If the tuple has been routed, it's been converted to the
>      * partition's rowtype, which might differ from the root
>      * table's.  We must convert it back to the root table's
>      * rowtype so that val_desc shown error message matches the
>      * input tuple.
>      */
>     if (resultRelInfo->ri_PartitionRoot)
>     {
>         TableTuple tuple = ExecFetchSlotTuple(slot);
>         TupleConversionMap *map;
>
>         rel = resultRelInfo->ri_PartitionRoot;
>         tupdesc = RelationGetDescr(rel);
>         /* a reverse map */
>         map = convert_tuples_by_name(orig_tupdesc, tupdesc,
>                                      gettext_noop("could not convert row type"));
>         if (map != NULL)
>         {
>             tuple = do_convert_tuple(tuple, map);
>             ExecSetSlotDescriptor(slot, tupdesc);
>             ExecStoreTuple(tuple, slot, InvalidBuffer, false);
>         }
>     }
>
>
> Unfortunately calling ExecSetSlotDescriptor() is far from cheap, it has
> to reallocate the values/isnull arrays (and potentially do desc pinning
> etc).

I bumped into this code yesterday while looking at ExecFetchSlotTuple
and ExecMaterializeSlot usages. I was wondering the same.

ExecSetSlotDescriptor() always frees tts_values and tts_isnull array
even if the tuple descriptor being set has same number of attributes
as previous one. Most of the times that will be the case. I think we
should optimize that case. If you agree, I will submit a patch for
that.

Amit Langote has clarified that this doesn't happen often since
partitioned tables will not require tuple conversion usually. But I
agree that in pathological case this is a performance eater and should
be fixed.

> Nor is it always cheap to call ExecFetchSlotTuple(slot) - it might
> be a virtual slot. Calling heap_deform_tuple(), as done in
> do_convert_tuple(), might be superfluous work, because the input tuple
> might already be entirely deformed in the input slot.
>
> I think it'd be good to rewrite the code so there's an input and an
> output slot that each will keep their slot descriptors set.

+1 for that.

But I am worried that the code will have to create thousand slots if
there are thousand partitions. I think we will need to see how much
effect that has.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: [HACKERS] Support for Secure Transport SSL library on macOS asOpenSSL alternative
Next
From: Amit Kapila
Date:
Subject: Re: Few cosmetic suggestions for commit 16828d5c (Fast Alter TableAdd Column...)