Home > mailing lists

Re: [HACKERS] UPDATE of partition key - Mailing list pgsql-hackers

From	Amit Khandekar
Subject	Re: [HACKERS] UPDATE of partition key
Date	January 15, 2018 13:41:13
Msg-id	CAJ3gD9etax=AU62XP9a68vtyRENnDUm+qu_RdbvQ_YWLL_sdLw@mail.gmail.com Whole thread
In response to	Re: [HACKERS] UPDATE of partition key (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses	Re: [HACKERS] UPDATE of partition key
List	pgsql-hackers

Tree view

On 14 January 2018 at 17:27, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> On 13 January 2018 at 02:56, Robert Haas <robertmhaas@gmail.com> wrote:
> > I guess I'm inclined to keep mt_per_sub_plan_maps for the case where
> > there are no partitions, but not use it when partitions are present.
> > What do you think about that?
>
> Even where partitions are present, in the usual case where there are no transition tables we won't require per-leaf
mapat all [1]. So I think we should keep mt_per_sub_plan_maps only for the case where per-leaf map is not allocated.
Andwe will not allocate mt_per_sub_plan_maps when mt_per_leaf_maps is needed. In other words, exactly one of the two
mapswill be allocated. 
>
> This is turning out to be close to what's already there in the last patch versions: use a single map array, and an
offsetsarray. The difference is : in the patch I am using the *same* variable for the two maps. Where as, now we are
talkingabout two different array variables for maps, but only allocating one of them. 
>
> Are you ok with this ? I think the thing you were against was to have a common *variable* for two purposes. But
above,I am saying we have two variables but assign a map array to only *one* of them and leave the other unused. 
>
> ---------
>
> Regarding the on-demand map allocation ....
> Where mt_per_sub_plan_maps is allocated, we won't have the on-demand allocation: all the maps will be allocated
initially.The reason is becaues the map_is_required array is only per-leaf. Or else, again, we need to keep another
map_is_requiredarray for per-subplan. May be we can support the on-demand stuff for subplan maps also, but only as a
separatechange after we are done with update-partition-key. 
>
>
> ---------
>
> Regarding mt_per_leaf_tupconv_required, I am thinking we can make it a bool, and name it :
mt_per_leaf_map_not_required.When it is true for a given index, it means, we have already called
convert_tuples_by_name()and it returned NULL; i.e. it means we are sure that map is not required. A false value means
weneed to call convert_tuples_by_name() if it is NULL, and then set mt_per_leaf_map_not_required to (map == NULL). 
>
> Instead of a bool array, , we can instead make it a Bitmapset. But I think access would become slower as compared to
array,particularly because it is going to be a heavily used function. 

I went ahead and did the above changes. I haven't yet merged these
changes in the main patch. Instead, I have attached it as an
incremental patch to be applied on the main v36 patch. The incremental
patch is not yet quite polished, and quite a bit of cosmetic changes
might be required, plus testing. But am posting it in case I have some
early feedback. Details :

The per-subplan map array variable is kept in ModifyTableState :
-       TupleConversionMap **mt_childparent_tupconv_maps;
-       /* Per plan/partition map for tuple conversion from child to root */
-       bool            mt_is_tupconv_perpart;  /* Is the above map
per-partition ? */
+       TupleConversionMap **mt_per_subplan_tupconv_maps;
+       /* Per plan map for tuple conversion from child to root */
 } ModifyTableState;

The per-leaf array variable and the not_required array is kept in
PartitionTupleRouting :
-       TupleConversionMap **partition_tupconv_maps;
+       TupleConversionMap **parent_child_tupconv_maps;
+       TupleConversionMap **child_parent_tupconv_maps;
+       bool       *child_parent_tupconv_map_not_reqd;
As you can see above, all the arrays are per-partition. So removed the
per-leaf tag in these arrays. Instead, renamed the existing
partition_tupconv_maps to parent_child_tupconv_maps, and the new
per-leaf array to child_parent_tupconv_maps

Have two separate functions ExecSetupChildParentMapForLeaf() and
ExecSetupChildParentMapForSubplan() since most of their code is
different. And now because of this, we can re-use
ExecSetupChildParentMapForLeaf() in both copy.c and nodeModifyTable.c.

Even inserts/copy will benefit from the on-demand map allocation. This
is because now there is a function TupConvMapForLeaf() that is called
in both copy.c and ExecInsert(). This is the function that does
on-demand allocation.

Attached the incremental patch conversion_map_changes.patch that has
the above changes. It is to be applied over the latest main patch
(update-partition-key_v36.patch).

Attachment

conversion_map_changes.patch

pgsql-hackers by date:

From: Amit Khandekar
Date: 15 January 2018, 13:38:26
Subject: Re: [HACKERS] UPDATE of partition key

From: Marco Nenciarini
Date: 15 January 2018, 13:50:37
Subject: Re: [PATCH] Logical decoding of TRUNCATE

Re: [HACKERS] UPDATE of partition key - Mailing list pgsql-hackers

Attachment

Previous

Next