Thread: Tuple conversion naming

Tuple conversion naming

From
Andres Freund
Date:
Hi,

The naming around partition related tuple conversions is imo worthy of
improvement.

For building tuple conversion maps we have:
- convert_tuples_by_name
- convert_tuples_by_name_map
- convert_tuples_by_position
- ExecSetupChildParentMapForLeaf
- TupConvMapForLeaf
- free_conversion_map

I've a bunch of problems with this:
1) convert_tuples_by_name etc sound like they actually perform tuple
   conversion, rather than just prepare for it
2) convert_tuples_by_name_map returns not TupleConversionMap, but an
   array. But convert_tuples_by_name does return TupleConversionMap.
3) The naming is pretty inconsistent. free_conversion_map
   vs. convert_tuples_by_name, but both deal with TupleConversionMap?

For executing them we have:
- do_convert_tuple
- ConvertPartitionTupleSlot

which is two randomly differing spellings of related functionality,
without the name indicating that they, for reasons, don't both use
TupleConversionMap.  I'm also not particularly happy about
"do_convert_tuple", which is a pretty generic name.


A lot of this probably made sense at some time, but we got to clean this
up. We'll grow more partitioning related code, and this IMO makes
reading the code harder - and given the change rate, that's something I
frequently have to do.


Greetings,

Andres Freund


Re: Tuple conversion naming

From
Amit Langote
Date:
Hi,

I agree that some clean up might be in order, but want to clarify a few
points.

On 2018/10/02 15:11, Andres Freund wrote:
> Hi,
> 
> The naming around partition related tuple conversions is imo worthy of
> improvement.

Note that tuple conversion functionality in tupconvert.c has existed even
before partitioning, although partitioning may have contributed to
significant expansion of its usage.

> For building tuple conversion maps we have:
> - convert_tuples_by_name
> - convert_tuples_by_name_map
> - convert_tuples_by_position
> - ExecSetupChildParentMapForLeaf
> - TupConvMapForLeaf
> - free_conversion_map

Of these, ExecSetupChildParentMapForLeaf and TupConvMapForLeaf go away
with the patch posted in the "Speeding up INSERTs and UPDATEs to
partitioned tables" thread:

https://www.postgresql.org/message-id/CAKJS1f-SnNXUS0_2wOn-7ZsuvrmQd_6_t-uG8pyUKfdnE9_y-A%40mail.gmail.com

To summarize of what the above patch does and why it removed those
functions:  those functions allocate the memory needed to hold
TupleConversionMap pointers for *all* partitions in a given partition
tree, but the patch refactored the tuple routing initialization code such
that such allocations (the aforementioned and quite a few others) and the
functions are redundant.

> I've a bunch of problems with this:
> 1) convert_tuples_by_name etc sound like they actually perform tuple
>    conversion, rather than just prepare for it
> 2) convert_tuples_by_name_map returns not TupleConversionMap, but an
>    array. But convert_tuples_by_name does return TupleConversionMap.
> 3) The naming is pretty inconsistent. free_conversion_map
>    vs. convert_tuples_by_name, but both deal with TupleConversionMap?

Yeah, I had similar thoughts in past.

> For executing them we have:
> - do_convert_tuple
> - ConvertPartitionTupleSlot
>
> which is two randomly differing spellings of related functionality,
> without the name indicating that they, for reasons, don't both use
> TupleConversionMap.

Actually, ConvertPartitionTupleSlot performs two tasks: 1. performs the
conversion of the tuple in the input slot using do_convert_tuple, and 2.
switches the TupleTableSlot to be used from this point on to a
partition-specific one.

> I'm also not particularly happy about
> "do_convert_tuple", which is a pretty generic name.
> 
> A lot of this probably made sense at some time, but we got to clean this
> up. We'll grow more partitioning related code, and this IMO makes
> reading the code harder - and given the change rate, that's something I
> frequently have to do.

On the "Slotification of partition tuple conversion" thread, I suggested
that we try to decouple partitioning-related tuple conversion from
tupconvert.c, which may significantly improve things imho.  See the last
paragraph of my message here:

https://www.postgresql.org/message-id/ed66156b-54fe-b5d2-10bf-3c73a8052e7e%40lab.ntt.co.jp

Basically, Amit Khandekar proposed that we add another function with the
same functionality as do_convert_tuple to tupconvert.c that accepts and
returns TupleTableSlot instead of HeapTuple.  Per discussion, it turned
out that we won't need a full TupleConversionMap for partitioning-related
conversions if we start using this new function, so we could land the new
conversion function in execPartition.c instead of tupconvert.c.  Perhaps,
we could just open-code do_convert_tuple()'s functionality in the existing
ConvertPartitionTupleSlot().

Of course, that is not to say we shouldn't do anything about the possibly
unhelpful names of the functions that tupconvert.c exports.

Thanks,
Amit



Re: Tuple conversion naming

From
Andres Freund
Date:
Hi,

On 2018-10-02 16:18:19 +0900, Amit Langote wrote:
> Hi,
> 
> I agree that some clean up might be in order, but want to clarify a few
> points.
> 
> On 2018/10/02 15:11, Andres Freund wrote:
> > Hi,
> > 
> > The naming around partition related tuple conversions is imo worthy of
> > improvement.
> 
> Note that tuple conversion functionality in tupconvert.c has existed even
> before partitioning, although partitioning may have contributed to
> significant expansion of its usage.

Fair enough, doesn't really change my point tho :)


> > For executing them we have:
> > - do_convert_tuple
> > - ConvertPartitionTupleSlot
> >
> > which is two randomly differing spellings of related functionality,
> > without the name indicating that they, for reasons, don't both use
> > TupleConversionMap.
> 
> Actually, ConvertPartitionTupleSlot performs two tasks: 1. performs the
> conversion of the tuple in the input slot using do_convert_tuple, and 2.
> switches the TupleTableSlot to be used from this point on to a
> partition-specific one.

Yea, I had this mostly written with the view of the code after the
"slottification of partition tuple covnersion" thread.


> > I'm also not particularly happy about
> > "do_convert_tuple", which is a pretty generic name.
> > 
> > A lot of this probably made sense at some time, but we got to clean this
> > up. We'll grow more partitioning related code, and this IMO makes
> > reading the code harder - and given the change rate, that's something I
> > frequently have to do.
> 
> On the "Slotification of partition tuple conversion" thread, I suggested
> that we try to decouple partitioning-related tuple conversion from
> tupconvert.c, which may significantly improve things imho.  See the last
> paragraph of my message here:
> 
> https://www.postgresql.org/message-id/ed66156b-54fe-b5d2-10bf-3c73a8052e7e%40lab.ntt.co.jp
> 
> Basically, Amit Khandekar proposed that we add another function with the
> same functionality as do_convert_tuple to tupconvert.c that accepts and
> returns TupleTableSlot instead of HeapTuple.  Per discussion, it turned
> out that we won't need a full TupleConversionMap for partitioning-related
> conversions if we start using this new function, so we could land the new
> conversion function in execPartition.c instead of tupconvert.c.  Perhaps,
> we could just open-code do_convert_tuple()'s functionality in the existing
> ConvertPartitionTupleSlot().
> 
> Of course, that is not to say we shouldn't do anything about the possibly
> unhelpful names of the functions that tupconvert.c exports.

I'm not quite sure I see the point of a full decoupling of
partitioning-related tuple conversion from tupconvert.c. Yes, the
routines should be more clearly named, but why should the code be
separate?   I'm kinda wondering if we shouldn't have the tuple
conversion functions just use the slot based functionality in the back,
and just store those in the TupConversionMap.

Greetings,

Andres Freund


Re: Tuple conversion naming

From
Amit Langote
Date:
On 2018/10/02 16:40, Andres Freund wrote:
>>> For executing them we have:
>>> - do_convert_tuple
>>> - ConvertPartitionTupleSlot
>>>
>>> which is two randomly differing spellings of related functionality,
>>> without the name indicating that they, for reasons, don't both use
>>> TupleConversionMap.
>>
>> Actually, ConvertPartitionTupleSlot performs two tasks: 1. performs the
>> conversion of the tuple in the input slot using do_convert_tuple, and 2.
>> switches the TupleTableSlot to be used from this point on to a
>> partition-specific one.
> 
> Yea, I had this mostly written with the view of the code after the
> "slottification of partition tuple covnersion" thread.

Ah, okay.  I'm looking at the new patch you posted.

>>> I'm also not particularly happy about
>>> "do_convert_tuple", which is a pretty generic name.
>>>
>>> A lot of this probably made sense at some time, but we got to clean this
>>> up. We'll grow more partitioning related code, and this IMO makes
>>> reading the code harder - and given the change rate, that's something I
>>> frequently have to do.
>>
>> On the "Slotification of partition tuple conversion" thread, I suggested
>> that we try to decouple partitioning-related tuple conversion from
>> tupconvert.c, which may significantly improve things imho.  See the last
>> paragraph of my message here:
>>
>> https://www.postgresql.org/message-id/ed66156b-54fe-b5d2-10bf-3c73a8052e7e%40lab.ntt.co.jp
>>
>> Basically, Amit Khandekar proposed that we add another function with the
>> same functionality as do_convert_tuple to tupconvert.c that accepts and
>> returns TupleTableSlot instead of HeapTuple.  Per discussion, it turned
>> out that we won't need a full TupleConversionMap for partitioning-related
>> conversions if we start using this new function, so we could land the new
>> conversion function in execPartition.c instead of tupconvert.c.  Perhaps,
>> we could just open-code do_convert_tuple()'s functionality in the existing
>> ConvertPartitionTupleSlot().
>>
>> Of course, that is not to say we shouldn't do anything about the possibly
>> unhelpful names of the functions that tupconvert.c exports.
> 
> I'm not quite sure I see the point of a full decoupling of
> partitioning-related tuple conversion from tupconvert.c. Yes, the
> routines should be more clearly named, but why should the code be
> separate?

Hmm, okay.  After looking at your patch on the other thread, I feel less
strongly about this.

> I'm kinda wondering if we shouldn't have the tuple
> conversion functions just use the slot based functionality in the back,
> and just store those in the TupConversionMap.

Sorry, I didn't understand this.

Thanks,
Amit



Re: Tuple conversion naming

From
Andres Freund
Date:
Hi,

On 2018-10-02 17:28:26 +0900, Amit Langote wrote:
> On 2018/10/02 16:40, Andres Freund wrote:
> > I'm kinda wondering if we shouldn't have the tuple
> > conversion functions just use the slot based functionality in the back,
> > and just store those in the TupConversionMap.
> 
> Sorry, I didn't understand this.

I was basically wondering if we should just allocate two slots in
TupConversionMap and then drive the tuple conversion via the slots.

Greetings,

Andres Freund