Thread: [PATCH] use separate PartitionedRelOptions structure to store partitioned table options

Hi!

This message is follow up to the "Get rid of the StdRdOptions" patch thread:
https://www.postgresql.org/message-id/2620882.s52SJui4ql@x200m

I've split patch into even smaller parts and commitfest want each patch in 
separate thread. So it is new thread.

The idea of this patch is following: If you read the code, partitioned tables 
do not have any options (you will not find RELOPT_KIND_PARTITIONED in 
boolRelOpts, intRelOpts, realRelOpts, stringRelOpts and enumRelOpts in 
reloption.c), but it uses StdRdOptions to store them (these no options).

If partitioned table is to have it's own option set (even if it is empty now) 
it would be better to save it into separate structure, like it is done for 
Views, not adding them to StdRdOptions, like current code hints to do.

So in this patch I am creating a structure that would store partitioned table 
options (it is empty for now as there are no options for this relation kind), 
and all other code that would use this structure as soon as the first option 
comes.

I think it is bad idea to suggest option adder to ad it to StdRdOption, we 
already have a big mess there. Better if he add it to an new empty structure.

-- 
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)
Attachment
On Sun, Oct 06, 2019 at 03:47:46PM +0300, Nikolay Shaplov wrote:
> This message is follow up to the "Get rid of the StdRdOptions" patch thread:
> https://www.postgresql.org/message-id/2620882.s52SJui4ql@x200m
>
> I've split patch into even smaller parts and commitfest want each patch in
> separate thread. So it is new thread.

Splitting concepts into different threads may be fine, and usually
recommended.  Splitting a set of patches into multiple entries to ease
review and your goal to get a patch integrated and posted all these
into the same thread is usually recommended.  Now posting a full set
of patches across multiple threads, in way so as they have
dependencies with each other, is what I would call a confusing
situation.  That's hard to follow.

> The idea of this patch is following: If you read the code, partitioned tables
> do not have any options (you will not find RELOPT_KIND_PARTITIONED in
> boolRelOpts, intRelOpts, realRelOpts, stringRelOpts and enumRelOpts in
> reloption.c), but it uses StdRdOptions to store them (these no options).

I am not even sure that we actually need that.  What kind of reloption
you would think would suit into this subset?
--
Michael

Attachment
В письме от понедельник, 7 октября 2019 г. 14:57:14 MSK пользователь Michael
Paquier написал:
> On Sun, Oct 06, 2019 at 03:47:46PM +0300, Nikolay Shaplov wrote:
> > This message is follow up to the "Get rid of the StdRdOptions" patch
> > thread: https://www.postgresql.org/message-id/2620882.s52SJui4ql@x200m
> >
> > I've split patch into even smaller parts and commitfest want each patch in
> > separate thread. So it is new thread.
>
> Splitting concepts into different threads may be fine, and usually
> recommended.  Splitting a set of patches into multiple entries to ease
> review and your goal to get a patch integrated and posted all these
> into the same thread is usually recommended.  Now posting a full set
> of patches across multiple threads, in way so as they have
> dependencies with each other, is what I would call a confusing
> situation.  That's hard to follow.
I understand that. I've tried to add new patches to original thread, but
commitfest did not accept that for some reason. You can try to add patch from
this letter https://www.postgresql.org/message-id/2620882.s52SJui4ql@x200m
just to see how it works.

Since discussion actually did not started yet, it can be moved anywhere you
suggest, but please tell how exactly it should be done, because I do not
understand what is the better way.

> > The idea of this patch is following: If you read the code, partitioned
> > tables do not have any options (you will not find RELOPT_KIND_PARTITIONED
> > in boolRelOpts, intRelOpts, realRelOpts, stringRelOpts and enumRelOpts in
> > reloption.c), but it uses StdRdOptions to store them (these no options).
> I am not even sure that we actually need that.  What kind of reloption
> you would think would suit into this subset?

Actually I do not know. But the author of partitioned patch, added a stub for
partitioned tables to have some reloptions in future. But this stub is
designed to use StdRdOptions. Which is not correct, as I presume. So here I am
correcting the stub.

--
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)
Attachment
Hello,

On Mon, Oct 7, 2019 at 6:43 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> В письме от понедельник, 7 октября 2019 г. 14:57:14 MSK пользователь Michael
> Paquier написал:
> > On Sun, Oct 06, 2019 at 03:47:46PM +0300, Nikolay Shaplov wrote:
> > > The idea of this patch is following: If you read the code, partitioned
> > > tables do not have any options (you will not find RELOPT_KIND_PARTITIONED
> > > in boolRelOpts, intRelOpts, realRelOpts, stringRelOpts and enumRelOpts in
> > > reloption.c), but it uses StdRdOptions to store them (these no options).
> > I am not even sure that we actually need that.  What kind of reloption
> > you would think would suit into this subset?
>
> Actually I do not know. But the author of partitioned patch, added a stub for
> partitioned tables to have some reloptions in future. But this stub is
> designed to use StdRdOptions. Which is not correct, as I presume. So here I am
> correcting the stub.

I wrote the patch that introduced RELOPT_KIND_PARTITIONED.  Yes, it
was added as a stub relopt_kind to eventually apply to reloptions that
could be sensibly applied to partitioned tables.  We never got around
to working on making the existing reloptions relevant to partitioned
tables, nor did we add any new partitioned table-specific reloptions,
so it remained an unused relopt_kind.

IIUC, this patch invents PartitionedRelOptions as the binary
representation for future RELOPT_KIND_PARTITIONED parameters.  As long
as others are on board with using different *Options structs for
different object kinds, I see no problem with this idea.

Thanks,
Amit



В письме от вторник, 8 октября 2019 г. 16:00:49 MSK пользователь Amit Langote
написал:

> > > > The idea of this patch is following: If you read the code, partitioned
> > > > tables do not have any options (you will not find
> > > > RELOPT_KIND_PARTITIONED
> > > > in boolRelOpts, intRelOpts, realRelOpts, stringRelOpts and enumRelOpts
> > > > in
> > > > reloption.c), but it uses StdRdOptions to store them (these no
> > > > options).
> > >
> > > I am not even sure that we actually need that.  What kind of reloption
> > > you would think would suit into this subset?
> >
> > Actually I do not know. But the author of partitioned patch, added a stub
> > for partitioned tables to have some reloptions in future. But this stub
> > is designed to use StdRdOptions. Which is not correct, as I presume. So
> > here I am correcting the stub.
>
> I wrote the patch that introduced RELOPT_KIND_PARTITIONED.  Yes, it
> was added as a stub relopt_kind to eventually apply to reloptions that
> could be sensibly applied to partitioned tables.  We never got around
> to working on making the existing reloptions relevant to partitioned
> tables, nor did we add any new partitioned table-specific reloptions,
> so it remained an unused relopt_kind.
Thank you for clarifying thing.

> IIUC, this patch invents PartitionedRelOptions as the binary
> representation for future RELOPT_KIND_PARTITIONED parameters.  As long
> as others are on board with using different *Options structs for
> different object kinds, I see no problem with this idea.
Yes, this is correct. Some Access Methods already use it's own Options
structure. As far as I can guess StdRdOptions still exists only for historical
reasons, and became quite a mess because of adding all kind of stuff there.
Better to separate it.

BTW, as far as you are familiar with this part of the code, may be you will
join the review if this particular patch?

--
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)



Hello,

On Tue, Oct 8, 2019 at 7:50 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> В письме от вторник, 8 октября 2019 г. 16:00:49 MSK пользователь Amit Langote
> написал:
> > IIUC, this patch invents PartitionedRelOptions as the binary
> > representation for future RELOPT_KIND_PARTITIONED parameters.  As long
> > as others are on board with using different *Options structs for
> > different object kinds, I see no problem with this idea.
> Yes, this is correct. Some Access Methods already use it's own Options
> structure. As far as I can guess StdRdOptions still exists only for historical
> reasons, and became quite a mess because of adding all kind of stuff there.
> Better to separate it.
>
> BTW, as far as you are familiar with this part of the code, may be you will
> join the review if this particular patch?

Sure, I will try to check your patch.

Thanks,
Amit



Hello,

On Sun, Oct 6, 2019 at 9:48 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> This message is follow up to the "Get rid of the StdRdOptions" patch thread:
> https://www.postgresql.org/message-id/2620882.s52SJui4ql@x200m
>
> I've split patch into even smaller parts and commitfest want each patch in
> separate thread. So it is new thread.
>
> The idea of this patch is following: If you read the code, partitioned tables
> do not have any options (you will not find RELOPT_KIND_PARTITIONED in
> boolRelOpts, intRelOpts, realRelOpts, stringRelOpts and enumRelOpts in
> reloption.c), but it uses StdRdOptions to store them (these no options).
>
> If partitioned table is to have it's own option set (even if it is empty now)
> it would be better to save it into separate structure, like it is done for
> Views, not adding them to StdRdOptions, like current code hints to do.
>
> So in this patch I am creating a structure that would store partitioned table
> options (it is empty for now as there are no options for this relation kind),
> and all other code that would use this structure as soon as the first option
> comes.
>
> I think it is bad idea to suggest option adder to ad it to StdRdOption, we
> already have a big mess there. Better if he add it to an new empty structure.

I tend to agree that this improves readability of the reloptions code a bit.

Some comments on the patch:

How about naming the function partitioned_table_reloptions() instead
of partitioned_reloptions()?

+ * Option parser for partitioned relations

Since partitioned_reloptions only caters to partitioned "tables",
maybe use "tables" here instead of "relations".

+  /*
+   * Since there is no options for patitioned table for now, we just do
+   * validation to report incorrect option error and leave.
+   */

Fix typo and minor rewording:

"Since there are no options for partitioned tables..."

+    switch ((int)relkind)

Needs a space between (int) and relkind, but I don't think the (int)
cast is really necessary.  I don't see it in other places.

+ *     Binary representation of relation options for Partitioned relations.

"for partitioned tables".

Speaking of partitioned relations vs tables, I see that you didn't
touch partitioned indexes (RELKIND_PARTITIONED_INDEX relations).  Is
that because we leave option handling to the index AMs?

Thanks,
Amit



В Thu, 10 Oct 2019 15:50:05 +0900
Amit Langote <amitlangote09@gmail.com> пишет:

> > I think it is bad idea to suggest option adder to ad it to
> > StdRdOption, we already have a big mess there. Better if he add it
> > to an new empty structure.
>
> I tend to agree that this improves readability of the reloptions code
> a bit.
>
> Some comments on the patch:
>
> How about naming the function partitioned_table_reloptions() instead
> of partitioned_reloptions()?

I have my doubts about using word table here... In relational model
there are no such concept as "table", it uses concept "relation". And
in postgres there were no tables, there were only relations. Heap
relation, toast relation, all kind of index relations... I do not
understand when and why word "table" appeared when we speak about some
virtual relation that is made of several real heap relation. That is
why I am not using the word table here...

But since you are the author of partition table code, and this code is
already accepted in the core, you should know better. So I will change
it the way you say.

> +    switch ((int)relkind)
>
> Needs a space between (int) and relkind, but I don't think the (int)
> cast is really necessary.  I don't see it in other places.
Oh. Yeh. This is my mistake... I had some strange compilation
problems, and this is a remnant of my efforts to find the cause of
it, I've forgot to clean...
Thanks!

> Speaking of partitioned relations vs tables, I see that you didn't
> touch partitioned indexes (RELKIND_PARTITIONED_INDEX relations).  Is
> that because we leave option handling to the index AMs?

Because for partitioned indexes the code says "Use same options
original index does"

bytea *
extractRelOptions(HeapTuple tuple, TupleDesc tupdesc,
                  amoptions_function amoptions)
{
............
    switch (classForm->relkind)
    {
        case RELKIND_RELATION:
        case RELKIND_TOASTVALUE:
        case RELKIND_MATVIEW:
            options = heap_reloptions(classForm->relkind, datum, false);
            break;
        case RELKIND_PARTITIONED_TABLE:
            options = partitioned_table_reloptions(datum, false);
            break;
        case RELKIND_VIEW:
            options = view_reloptions(datum, false);
            break;
        case RELKIND_INDEX:
        case RELKIND_PARTITIONED_INDEX:
            options = index_reloptions(amoptions, datum, false);
            break;


Here you see the function accepts amoptions method that know how to
parse options for a particular index, and passes it to index_reloptions
functions. For both indexes and partitioned indexes it is taken from AM
"method" amoptions. So options uses exactly the same code and same
options both for indexes and for partitioned indexes.

I do not know if it is correct from global point of view, but from the
point of view of reloptions engine, it does not require any attention:
change index options and get partitioned_index options for free...

Actually I would expect some problems there, because sooner or later,
somebody would want to set custom fillfactor to partitioned table, or
set some custom autovacuum options for it. But I would prefer to think
about it when I am done with reloption engine rewriting... Working in
both direction will cause more trouble then get benefits...



Attachment
> > I think it is bad idea to suggest option adder to ad it to
> > StdRdOption, we already have a big mess there. Better if he add it
> > to an new empty structure.  
>
> I tend to agree that this improves readability of the reloptions code
> a bit.
>
> Some comments on the patch:
>
> How about naming the function partitioned_table_reloptions() instead
> of partitioned_reloptions()?  

I have my doubts about using word table here... In relational model
there are no such concept as "table", it uses concept "relation". And
in postgres there were no tables, there were only relations. Heap
relation, toast relation, all kind of index relations... I do not
understand when and why word "table" appeared when we speak about some
virtual relation that is made of several real heap relation. That is
why I am not using the word table here...

But since you are the author of partition table code, and this code is
already accepted in the core, you should know better. So I will change
it the way you say.
 
> +    switch ((int)relkind)
>
> Needs a space between (int) and relkind, but I don't think the (int)
> cast is really necessary.  I don't see it in other places.  
Oh. Yeh. This is my mistake... I had some strange compilation
problems, and this is a remnant of my efforts to find the cause of
it, I've forgot to clean...
Thanks!
   
> Speaking of partitioned relations vs tables, I see that you didn't
> touch partitioned indexes (RELKIND_PARTITIONED_INDEX relations).  Is
> that because we leave option handling to the index AMs?  

Because for partitioned indexes the code says "Use same options
original index does"

bytea
* extractRelOptions(HeapTuple tuple, TupleDesc
tupdesc, amoptions_function amoptions)                                 
{          
............                                     
    switch
(classForm->relkind)
{ case
RELKIND_RELATION: case
RELKIND_TOASTVALUE: case
RELKIND_MATVIEW: options = heap_reloptions(classForm->relkind, datum,
false);
break; case
RELKIND_PARTITIONED_TABLE: options =
partitioned_table_reloptions(datum, false);
break; case
RELKIND_VIEW: options = view_reloptions(datum,
false);
break; case
RELKIND_INDEX: case
RELKIND_PARTITIONED_INDEX: options = index_reloptions(amoptions, datum,
false); break;                  


Here you see the function accepts amoptions method that know how to
parse options for a particular index, and passes it to index_reloptions
functions. For both indexes and partitioned indexes it is taken from AM
"method" amoptions. So options uses exactly the same code and same
options both for indexes and for partitioned indexes.

I do not know if it is correct from global point of view, but from the
point of view of reloptions engine, it does not require any attention:
change index options and get partitioned_index options for free...

Actually I would expect some problems there, because sooner or later,
somebody would want to set custom fillfactor to partitioned table, or
set some custom autovacuum options for it. But I would prefer to think
about it when I am done with reloption engine rewriting... Working in
both direction will cause more trouble then get benefits...
Attachment
Hi Nikolay,

Sorry for the late reply.

On Fri, Oct 11, 2019 at 7:38 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> В Thu, 10 Oct 2019 15:50:05 +0900
> Amit Langote <amitlangote09@gmail.com> пишет:
> > > I think it is bad idea to suggest option adder to ad it to
> > > StdRdOption, we already have a big mess there. Better if he add it
> > > to an new empty structure.
> >
> > I tend to agree that this improves readability of the reloptions code
> > a bit.
> >
> > Some comments on the patch:
> >
> > How about naming the function partitioned_table_reloptions() instead
> > of partitioned_reloptions()?
>
> I have my doubts about using word table here... In relational model
> there are no such concept as "table", it uses concept "relation". And
> in postgres there were no tables, there were only relations. Heap
> relation, toast relation, all kind of index relations... I do not
> understand when and why word "table" appeared when we speak about some
> virtual relation that is made of several real heap relation. That is
> why I am not using the word table here...
>
> But since you are the author of partition table code, and this code is
> already accepted in the core, you should know better. So I will change
> it the way you say.

Sure, they're all relations in the abstract, but we've got to
distinguish different kinds in the code somehow.

Anyway, I just want the code to be consistent with what we've already
got, especially, considering that we might need similar function for
partitioned "indexes" in the future.

> > Speaking of partitioned relations vs tables, I see that you didn't
> > touch partitioned indexes (RELKIND_PARTITIONED_INDEX relations).  Is
> > that because we leave option handling to the index AMs?
>
> Because for partitioned indexes the code says "Use same options
> original index does"
>
> bytea *
> extractRelOptions(HeapTuple tuple, TupleDesc tupdesc,
>                   amoptions_function amoptions)
> {
> ............
>     switch (classForm->relkind)
>     {
>         case RELKIND_RELATION:
>         case RELKIND_TOASTVALUE:
>         case RELKIND_MATVIEW:
>             options = heap_reloptions(classForm->relkind, datum, false);
>             break;
>         case RELKIND_PARTITIONED_TABLE:
>             options = partitioned_table_reloptions(datum, false);
>             break;
>         case RELKIND_VIEW:
>             options = view_reloptions(datum, false);
>             break;
>         case RELKIND_INDEX:
>         case RELKIND_PARTITIONED_INDEX:
>             options = index_reloptions(amoptions, datum, false);
>             break;
>
>
> Here you see the function accepts amoptions method that know how to
> parse options for a particular index, and passes it to index_reloptions
> functions. For both indexes and partitioned indexes it is taken from AM
> "method" amoptions. So options uses exactly the same code and same
> options both for indexes and for partitioned indexes.
>
> I do not know if it is correct from global point of view, but from the
> point of view of reloptions engine, it does not require any attention:
> change index options and get partitioned_index options for free...
>
> Actually I would expect some problems there, because sooner or later,
> somebody would want to set custom fillfactor to partitioned table, or
> set some custom autovacuum options for it.

Yeah, I have never run into this code before, but this might need
revisiting, if only to be consistent with the table counterpart.

> But I would prefer to think
> about it when I am done with reloption engine rewriting... Working in
> both direction will cause more trouble then get benefits...

Sure, this seems like a topic for another thread.

I looked atthe v2 patch and noticed a typo:

+ *     Binary representation of relation options for rtitioned tables.

s/rtitioned/partitioned/g

Other than that, looks good.

Thanks,
Amit



В письме от среда, 23 октября 2019 г. 11:59:45 MSK пользователь Amit Langote
написал:

> Sorry for the late reply.
Same apologies from my side. Get decent time-slot for postgres dev only now.



> I looked atthe v2 patch and noticed a typo:
>
> + *     Binary representation of relation options for rtitioned tables.
>
> s/rtitioned/partitioned/g
>
> Other than that, looks good.
Here goes v3 patch with the typo fixed

--
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)
Attachment
On Mon, Nov 11, 2019 at 05:22:32PM +0300, Nikolay Shaplov wrote:
> Here goes v3 patch with the typo fixed

Still one here in v3 of the patch:
+   * Since there are no options for patitioned tables for now, we just do
+   * validation to report incorrect option error and leave.
It looks like you are having a hard time with partitioned tables.

+  if (validate)
+       parseRelOptions(reloptions, validate, RELOPT_KIND_PARTITIONED,
+                       &numoptions);
We have been through great length to have build_reloptions, so
wouldn't it be better to also have this code path do so?  Sure you
need to pass NULL for the parsing table..  But there is a point to
reduce the code paths using directly parseRelOptions() and the
follow-up, expected calls to allocate and to fill in the set of
reloptions.
--
Michael

Attachment
On Tue, Nov 12, 2019 at 01:50:03PM +0900, Michael Paquier wrote:
> We have been through great length to have build_reloptions, so
> wouldn't it be better to also have this code path do so?  Sure you
> need to pass NULL for the parsing table..  But there is a point to
> reduce the code paths using directly parseRelOptions() and the
> follow-up, expected calls to allocate and to fill in the set of
> reloptions.

So I have been looking at this one, and finished with the attached.
It looks much better to use build_reloptions() IMO, taking advantage
of the same sanity checks present for the other relkinds.
--
Michael

Attachment
В письме от среда, 13 ноября 2019 г. 16:30:29 MSK пользователь Michael Paquier
написал:
> On Tue, Nov 12, 2019 at 01:50:03PM +0900, Michael Paquier wrote:
> > We have been through great length to have build_reloptions, so
> > wouldn't it be better to also have this code path do so?  Sure you
> > need to pass NULL for the parsing table..  But there is a point to
> > reduce the code paths using directly parseRelOptions() and the
> > follow-up, expected calls to allocate and to fill in the set of
> > reloptions.
>
> So I have been looking at this one, and finished with the attached.
> It looks much better to use build_reloptions() IMO, taking advantage
> of the same sanity checks present for the other relkinds.
Thanks!

I did not read that thread yet, when I sent v3 patch.
build_reloptions is a good stuff and we should use it for sure.

I've looked at yours v4 version of the patch, it is exactly what we need here.
Can we commit it as it is?


--
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)



On Wed, Nov 13, 2019 at 05:02:24PM +0300, Nikolay Shaplov wrote:
> I did not read that thread yet, when I sent v3 patch.
> build_reloptions is a good stuff and we should use it for sure.
>
> I've looked at yours v4 version of the patch, it is exactly what we need here.
> Can we commit it as it is?

I have done an extra lookup, removing PartitionedRelOptions because we
have no need for it yet, and committed the split.
--
Michael

Attachment