Home > mailing lists

Re: Slotification of partition tuple conversion - Mailing list pgsql-hackers

From	Amit Langote
Subject	Re: Slotification of partition tuple conversion
Date	September 4, 2018 03:49:00
Msg-id	ed66156b-54fe-b5d2-10bf-3c73a8052e7e@lab.ntt.co.jp Whole thread Raw
In response to	Re: Slotification of partition tuple conversion (Amit Khandekar <amitdkhan.pg@gmail.com>)
List	pgsql-hackers

Tree view

Hi Amit,

Thanks for the updated patch and sorry I couldn't reply sooner.

On 2018/08/21 16:18, Amit Khandekar wrote:
> On 21 August 2018 at 08:12, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Here are some comments on the patch:
> 
> Thanks for the review.
> 
>>
>> +ConvertTupleSlot
>>
>> Might be a good idea to call this ConvertSlotTuple?
> 
> I thought the name 'ConvertTupleSlot' emphasizes the fact that we are
> operating rather on a slot without having to touch the tuple wherever
> possible. Hence I chose the name. But I am open to suggestions if
> there are more votes against this.

Hmm, okay.

The verb here is "to convert", which still applies to a tuple, the only
difference is that the new interface accepts and returns TupleTableSlot,
instead of HeapTuple.

>> +                /*
>> +                 * Get the tuple in the per-tuple context. Also, we
>> cannot call
>> +                 * ExecMaterializeSlot(), otherwise the tuple will get freed
>> +                 * while storing the next tuple.
>> +                 */
>> +                oldcontext =
>> MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>> +                tuple = ExecCopySlotTuple(slot);
>> +                MemoryContextSwitchTo(oldcontext);
>>
>> The comment about why we cannot call ExecMaterializeSlot is a bit unclear.
>>  Maybe, the comment doesn't need to mention that?  Instead, expand a bit
>> more on why the context switch here or how it interacts with recently
>> tuple buffering (0d5f05cde01)?
> 
> Done :
> 
> -   * Get the tuple in the per-tuple context. Also, we cannot call
> -   * ExecMaterializeSlot(), otherwise the tuple will get freed
> -   * while storing the next tuple.
> +  * Get the tuple in the per-tuple context, so that it will be
> +  * freed after each batch insert.
> 
> Specifically, we could not call ExecMaterializeSlot() because it sets
> tts_shouldFree to true which we don't want for batch inserts.

Thanks.

By the way, I was a bit confused by an existing comment just above this
patch's change, which says:

          /*
           * We might need to convert from the parent rowtype to the
           * partition rowtype.  Don't free the already stored tuple as it
           * may still be required for a multi-insert batch.
           */

I wasn't sure what "Don't free the already stored tuple" here meant, until
I saw a hunk from 0d5f05cde ("Allow multi-inserts during COPY into a
partitioned table") that introduced it:

@@ -2691,12 +2861,14 @@ CopyFrom(CopyState cstate)

          /*
           * We might need to convert from the parent rowtype to the
-          * partition rowtype.
+          * partition rowtype.  Don't free the already stored tuple as it
+          * may still be required for a multi-insert batch.
           */
          tuple =
ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[leaf_part_index],
                                               tuple,
                                               proute->partition_tuple_slot,
-                                              &slot);
+                                              &slot,
+                                              false);

So the "already stored tuple" means a previous tuple that may have been
stored in 'proute->partition_tuple_slot', which shouldn't be freed in
ConvertPartitionTupleSlot (via ExecStoreTuple), because it might also have
been stored in the batch insert tuple array.

Now, because with ConvertTupleSlot, there is no worry about freeing an
existing tuple, because we never perform ExecStoreTuple on
proute->partition_tuple_slot (or one of the slots in it if we consider my
patch at [1] which converts it to an array).  All I see applied to those
slots is ExecStoreVirtualTuple() in ConvertTupleSlot(), and in some cases,
ExecCopySlotTuple() in the caller of ConvertTupleSlot().

So, I think that that sentence is obsolete with your patch.

>> Seeing that all the sites in the partitioning code that previously called
>> do_convert_tuple() now call ConvertTupleSlot, I wonder if we don't need a
>> full TupleConversionMap, just the attribute map in it.  We don't need the
>> input/output Datum and bool arrays in it, because we'd be using the ones
>> from input and output slots of ConvertTupleSlot.  So, can we replace all
>> instances of TupleConversionMap in the partitioning code and data
>> structures by AttributeNumber arrays.
> 
> Yeah, I earlier thought I could get rid of do_convert_tuple()
> altogether. But there are places where we currently deal with tuples
> rather than slots. And here, we could not replace do_convert_tuple()
> unless we slotify the surrounding code similar to how you did in the
> Allocate-dedicated-slots patch. E.g.  ExecEvalConvertRowtype() and
> AfterTriggerSaveEvent() deals with tuples so the do_convert_tuple()
> calls there couldn't be replaced.
> 
> So I think we can work towards what you have in mind in form of
> incremental patches.

I was saying, because we no longer use do_convert_tuple for
"partitioning-related" tuple conversions, we could get rid of
TupleConversionMaps in the partitioning-specific PartitionTupleRouting
structure.

Also, since ConvertTupleSlot itself just uses the attrMap field of
TupleConversionMap, I was wondering if its signature should contain
AttrNumber * instead of TupleConversionMap *?

Maybe if we get around to getting rid of do_convert_tuple altogether, that
would also mean getting rid of the TupleConversionMap struct, because its
various tuple data related arrays would be redundant, because
ConvertTupleSlot has input and output TupleTableSlots, which have space
for that information.

>> Also, how about putting ConvertTupleSlot in execPartition.c and exporting
>> it in execPartition.h, instead of tupconvert.c and tupconvert.h, resp.?
> 
> I think the goal of ConvertTupleSlot() is to eventually replace
> do_convert_tuple() as well, so it would look more of a general
> conversion rather than specific for partitions. Hence I think it's
> better to have it in tupconvert.c.

Okay, maybe that makes sense for any future developments.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/6505cc8c-a8e4-986e-82d3-6106877ecd3a%40lab.ntt.co.jp

pgsql-hackers by date:

From: Amit Langote
Date: 04 September 2018, 03:47:07
Subject: Re: pointless check in RelationBuildPartitionDesc

From: Bruce Momjian
Date: 04 September 2018, 03:53:42
Subject: Re: Caching query plan costs

Re: Slotification of partition tuple conversion - Mailing list pgsql-hackers

Previous

Next