Thread: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead

This is part or my bigger patch
https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4@x200m#2146419.veIEZdk4E4@x200mwe've decided to 
 
commit by smaller parts.

Now in postgres an StdRdOptions structure is used as binary represenations of 
reloptions for heap, toast, and some indexes. It has a number of fields, and 
only heap relation uses them all. Toast relations uses only autovacuum 
options, and indexes uses only fillfactor option.

So for example if you set custom fillfactor value for some index, then it will 
lead to allocating 84 bytes of memory (sizeof StdRdOptions on i386) and only 8 
bytes will be actually used (varlena header and fillfactor value). 74 bytes is 
not much, but allocating them for each index for no particular reason is bad 
idea, as I think.

Moreover when I wrote my big reloption refactoring patch, I came to "one 
reloption kind - one binary representation" philosophy. It allows to write 
code with more logic in it.

This patch replaces StdRdOptions with HeapRelOptions, ToastRelOptions, 
BTRelOptions, HashRelOptions, SpGistRelOptions and PartitionedRelOptions

one for each relation kind that were using StdRdOptions before.

The second thing I've done, I've renamed Relation* macroses from 
src/include/utils/rel.h, that were working with reloptions. I've renamed them 
into Heap*, Toast* and View* (depend on what relation options they were 
actually using)

I did it because there names were misleading. For example 
RelationHasCheckOption can be called only for View relation, and will give 
wrong result for other relation types. It just takes binary representation of 
reloptions, cast is to (ViewOptions *) and then returns some value from it. 
Naming it as ViewHasCheckOption would better reflect what it actually do, and 
strictly specify that it is applicable only to View relations.


Possible flaws:

I replaced 

saveFreeSpace = RelationGetTargetPageFreeSpace(state->rs_new_rel,
                                    HEAP_DEFAULT_FILLFACTOR);

with

if (IsToastRelation(state->rs_new_rel))
      saveFreeSpace = ToastGetTargetPageFreeSpace();
else
      saveFreeSpace = HeapGetTargetPageFreeSpace(state->rs_new_rel);

wherever I met it (and other cases like that), but I am not sure if in some 
cases that part of code is used for heap only or not. So may be this "ifs" is 
not needed and should be removed, and only Heap-case should be left. But I am 
not that much familiar with postgres internals to see it for sure... I need 
advice of more experienced developers here.

-- 
Do code for fun.
Attachment
Hi,

On 2018-02-22 19:48:46 +0300, Nikolay Shaplov wrote:
> This is part or my bigger patch
https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4@x200m#2146419.veIEZdk4E4@x200mwe've decided to 
 
> commit by smaller parts.

I've not read that thread. Is this supposed to be a first step towards
something larger?


> So for example if you set custom fillfactor value for some index, then it will 
> lead to allocating 84 bytes of memory (sizeof StdRdOptions on i386) and only 8 
> bytes will be actually used (varlena header and fillfactor value). 74 bytes is 
> not much, but allocating them for each index for no particular reason is bad 
> idea, as I think.

I'm not sure this is a particularly strong motivator though?  Does the
patch have a goal besides saving a few bytes?


> Moreover when I wrote my big reloption refactoring patch, I came to "one 
> reloption kind - one binary representation" philosophy. It allows to write 
> code with more logic in it.

I have no idea what this means?


Are you aiming this for v11 or v12?


Greetings,

Andres Freund


В письме от 1 марта 2018 16:15:32 пользователь Andres Freund написал:

> > This is part or my bigger patch
> > https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4@x200m#21464
> > 19.veIEZdk4E4@x200m we've decided to commit by smaller parts.
>
> I've not read that thread. Is this supposed to be a first step towards
> something larger?

I've started from the idea of opclass options. Same as here
https://commitfest.postgresql.org/17/1559/
And found out that current reloptions code is not flexible at all, and could
not be easily reused for other kind of options (that are not reloptions).

So I tried to change reloptions code to make universal, and then use it for
any option purposes.

In the patch https://commitfest.postgresql.org/14/992/ I created a new file
options.c, that works with options as with an abstraction. It has some option
definition structure, that has all information about how this options should
be parsed and transformed into binary representation, I called this structure
OptionsCatalog, but name can be changed. And it also have all kind of
functions that do all needed tranformation with options.

Then reloptions.c uses functions from options options.c for all options
transformations, and takes care about relation specificity of reloptions.

While doing all of these, I first of all had to adjust code that was written
around reloptions, so that there code would stay in tune with new reloptions
code. And second, I met places where current reloption related code were
imperfect, and I decided to change it too...

Since I get a really big patch as a result, it was decided to commit it in
parts.

This patch is the adjusting of reloption related code. It does not change a
great deal, but when you first commit it, the main reloptions refactoring
patch will no longer look like a big mess, it will became much more logical.

Enum options patch is more about making coder a little bit more perfect. It is
not really needed for the main patch, but it is more easy to introduce it
before, then after.
https://commitfest.postgresql.org/17/1489/

There is one more patch, that deals with toast.* options for tables with no
TOAST, but we can set it aside, it is independent enough, to deal with it
later...
https://commitfest.postgresql.org/17/1486/

The patch with reloptions tests I've written while working on this patch, were
already commited
https://commitfest.postgresql.org/15/1314/


> > So for example if you set custom fillfactor value for some index, then it
> > will lead to allocating 84 bytes of memory (sizeof StdRdOptions on i386)
> > and only 8 bytes will be actually used (varlena header and fillfactor
> > value). 74 bytes is not much, but allocating them for each index for no
> > particular reason is bad idea, as I think.
> I'm not sure this is a particularly strong motivator though?  Does the
> patch have a goal besides saving a few bytes?

My real motivation for this patch is to make code more uniform. So the patch
over it will be much clearer. And to make result code more clear and uniform
too.

I guess long time ago, the StdRdOptions were the way to make code uniform.
There were only one binary representation of options, and all relation were
using it. Quite uniform.

But now, some relation kinds uses it's own options binary representations.
Some still uses StdRdOptions, but only some elements from it. It is not
uniform enough for me.

If start using individual binary representation for each relation kind, then
code will be uniform and homogeneous again, and my next patch over it will be
more easy to read and understand.

> > Moreover when I wrote my big reloption refactoring patch, I came to "one
> > reloption kind - one binary representation" philosophy. It allows to write
> > code with more logic in it.
> I have no idea what this means?
I hope I managed to explain it above... May be I am not good enough with
explaining, or with English. Or with explaining in English. If it is not clear
enough, please tell me, what points are not clear, I'll try to explain more...


> Are you aiming this for v11 or v12?
I hope to commit StdRdOptions and Enum_Options patches before v11.
Hope it does not go against any rules, since there patches does not change any
big functionality, just do some refactoring and optimization, of the code that
already exists.

As for main reloptions refactoring patch, if there is a chance to get commited
before v11 release, it would be great. If not, I hope to post it to
commitfest, right after v11 were released, so it would be possible to commit
opclass options upon it before v12 release.


--
Do code for fun.
Attachment
Hi,

On 2018-03-02 20:22:21 +0300, Nikolay Shaplov wrote:
> Since I get a really big patch as a result, it was decided to commit it in 
> parts.

I get that, but I strongly suggest not creating 10 loosely related
threads, but keeping it as a patch series in one thread. It's really
hard to follow for people not continually paying otherwise.  Having to
search the list, collect together multiple threads, trying to read them
in some chronological order... That's not a reasonably efficient way to
interact, therefore the likelihood of timely review will be reduced.

Greetings,

Andres Freund


В письме от 2 марта 2018 11:27:49 пользователь Andres Freund написал:

> > Since I get a really big patch as a result, it was decided to commit it in
> > parts.
>
> I get that, but I strongly suggest not creating 10 loosely related
> threads, but keeping it as a patch series in one thread. It's really
> hard to follow for people not continually paying otherwise.

Oups. Sorry I thought I should do other way round and create a new thread for
new patch. And tried to post a link to other threads for connectivity wherever
I can.

Will do it as you say from now on.

But I am still confused what should I do if I am commiting two patch from the
patch series in simultaneously...

--
Do code for fun.
Attachment
Hi!

I've rebased the patch against recent master.

I've imported changes from 857f9c36 commit.

BTW this commit shows why do this patch is important: 857f9c36 adds new option 
for b-tree indexes. But thanks to the StdRdOptions this option will exist for 
no practical use in all heaps that has just any option set to non-default 
value, and in indexes that use StdRdOptions (and also has any option set)
And there will be more. StdRdOptions is long outdated solution and it needs to 
be replaced.


-- 
Do code for fun.
Attachment
On Fri, Sep 14, 2018 at 09:30:25PM +0300, Nikolay Shaplov wrote:
> BTW this commit shows why do this patch is important: 857f9c36 adds new option
> for b-tree indexes. But thanks to the StdRdOptions this option will exist for
> no practical use in all heaps that has just any option set to non-default
> value, and in indexes that use StdRdOptions (and also has any option set)
> And there will be more. StdRdOptions is long outdated solution and it needs to
> be replaced.

This patch does not apply anymore, so this is moved to next CF, waiting
for author.
--
Michael

Attachment
В письме от 2 октября 2018 13:46:13 пользователь Michael Paquier написал:
> On Fri, Sep 14, 2018 at 09:30:25PM +0300, Nikolay Shaplov wrote:
> > BTW this commit shows why do this patch is important: 857f9c36 adds new
> > option for b-tree indexes. But thanks to the StdRdOptions this option
> > will exist for no practical use in all heaps that has just any option set
> > to non-default value, and in indexes that use StdRdOptions (and also has
> > any option set) And there will be more. StdRdOptions is long outdated
> > solution and it needs to be replaced.
>
> This patch does not apply anymore, so this is moved to next CF, waiting
> for author.
Oup...It seems to me that I've fogot to rebase it when it is needed...
Did it now



--
Do code for fun.
Attachment
> On Mon, Nov 19, 2018 at 2:30 PM Nikolay Shaplov <dhyan@nataraj.su> wrote:
>
> В письме от 2 октября 2018 13:46:13 пользователь Michael Paquier написал:
> > On Fri, Sep 14, 2018 at 09:30:25PM +0300, Nikolay Shaplov wrote:
> > > BTW this commit shows why do this patch is important: 857f9c36 adds new
> > > option for b-tree indexes. But thanks to the StdRdOptions this option
> > > will exist for no practical use in all heaps that has just any option set
> > > to non-default value, and in indexes that use StdRdOptions (and also has
> > > any option set) And there will be more. StdRdOptions is long outdated
> > > solution and it needs to be replaced.
> >
> > This patch does not apply anymore, so this is moved to next CF, waiting
> > for author.
> Oup...It seems to me that I've fogot to rebase it when it is needed...
> Did it now

Looks like there are some problems with this patch on windows:

src/backend/access/common/reloptions.c(1469): error C2059: syntax error : '}'

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.22359

> On Fri, Mar 2, 2018 at 8:28 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2018-03-02 20:22:21 +0300, Nikolay Shaplov wrote:
> > Since I get a really big patch as a result, it was decided to commit it in
> > parts.
>
> I get that, but I strongly suggest not creating 10 loosely related
> threads, but keeping it as a patch series in one thread. It's really
> hard to follow for people not continually paying otherwise.

Totally agree. Probably this also makes it harder to see the big picture behind
this patch - which is in turn probably preventing it from getting some more
review. I hope it doesn't sounds ridiculous, taking into account your efforts
by splitting the patch, but maybe it makes sense to gather these pieces
together (as a separate commits, of course) in one thread?


В письме от пятница, 30 ноября 2018 г. 23:57:21 MSK пользователь Dmitry Dolgov
написал:

> Looks like there are some problems with this patch on windows:

> src/backend/access/common/reloptions.c(1469): error C2059: syntax error :
> '}'
>
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.22359
Phew... It took me a while to get proper windows building environment... But
finally I did it...
This was some MSVC specific error, that happens when you create empty array or
something like this.  I've rewritten that function to remove this array at
all. Now MSVC successfully builds it.

I also did some codestyle improvements, and rebased the patch against the
current master.

The new patch is in attachment.

> > I get that, but I strongly suggest not creating 10 loosely related
> > threads, but keeping it as a patch series in one thread. It's really
> > hard to follow for people not continually paying otherwise.
>
> Totally agree. Probably this also makes it harder to see the big picture
> behind this patch - which is in turn probably preventing it from getting
> some more review. I hope it doesn't sounds ridiculous, taking into account
> your efforts by splitting the patch, but maybe it makes sense to gather
> these pieces together (as a separate commits, of course) in one thread?

The big picture is following. I started from the task: Add possibility to set
up opclass parameters. (Nikita Glukhov now doing it)

I found an reloptions code, that does almost same thing, but it is not flexible
at all, and I can't reuse it for opclass parameters as it is now.

So I came to decision to rewrite reloptions code, so it can be used for
reloptions opclass options and any other kind of options we may need in
future.

While rewriting the code, I found some places in the code that goes from what
seems to be a very long time, and also need refreshing.

This is one of the things.

It is not 100% necessary. Postgres will work with it as it is for ten years or
more. But since I've touched this part of the code, I want to make this code
more consistent, and more neat.

That's what I am doing. That is what all this patches about.

I'd be happy if we move this task at last.
Attachment
One thing I would like to revise here is to avoid unnecessary API change
-- for example the RelationHasCascadedCheckOption macro does not really
need to be renamed because it only applies to views, so there's no
possible conflict with other relation types.  We can keep the original
name and add a StaticAssert that the given relation is indeed a view.
This should keep churn somewhat low.  Of course, this doesn't work for
some options where you need a different one for different relation
kinds, such as fillfactor, but that's unavoidable.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В письме от среда, 2 января 2019 г. 0:05:10 MSK пользователь Alvaro Herrera
написал:
> One thing I would like to revise here is to avoid unnecessary API change
> -- for example the RelationHasCascadedCheckOption macro does not really
> need to be renamed because it only applies to views, so there's no
> possible conflict with other relation types.  We can keep the original
> name and add a StaticAssert that the given relation is indeed a view.
> This should keep churn somewhat low.  Of course, this doesn't work for
> some options where you need a different one for different relation
> kinds, such as fillfactor, but that's unavoidable.

My intention was to make API names more consistent. If you are addressing View
specific option, it is good to address it via View[Something] macros or
function. Relations[Something] seems to be a bad name, since we are dealing
not with any relation in general, but with a view relation.

This is internal API, right? If we change it everywhere, then it is changed
and nothing will be broken?

May be it may lead to problems I am unable to see, but I still unable to see
them so I can't make any judgment about it.

If you insist (you have much more experience than I) I would do as you advise,
but still I do not understand.


On 2019-Jan-02, Nikolay Shaplov wrote:

> This is internal API, right? If we change it everywhere, then it is changed 
> and nothing will be broken?

No, it's exported for extensions to use.  If we change it unnecessarily,
extension authors will hate me (not you) for breaking the compile and
requiring an #if VERSION patch.

These may not be in common usage, but I don't need any additional
reasons for people to hate me.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В письме от четверг, 3 января 2019 г. 16:10:20 MSK пользователь Alvaro Herrera
написал:
> On 2019-Jan-02, Nikolay Shaplov wrote:
> > This is internal API, right? If we change it everywhere, then it is
> > changed and nothing will be broken?
>
> No, it's exported for extensions to use.  If we change it unnecessarily,
> extension authors will hate me (not you) for breaking the compile and
> requiring an #if VERSION patch.

Ok, that's a good reason...

Can we think about backward compatibility aliases?

#define ViewHasCheckOption(relation)                       \
     ((relation)->rd_options &&                        \
         ((ViewOptions *) (relation)->rd_options)->check_option_offset != 0)

/* Alias for backward compatibility */
#define RelationHasCheckOption(relation) ViewHasCheckOption(relation)

And keep them for as log as needed to avoid #if VERSION in thirdparty code?

Or that is not the case?



On 2019-Jan-03, Nikolay Shaplov wrote:

> Can we think about backward compatibility aliases? 
> 
> #define ViewHasCheckOption(relation)                       \
>      ((relation)->rd_options &&                        \
>          ((ViewOptions *) (relation)->rd_options)->check_option_offset != 0)
> 
> /* Alias for backward compatibility */
> #define RelationHasCheckOption(relation) ViewHasCheckOption(relation)
> 
> And keep them for as log as needed to avoid #if VERSION in thirdparty code?

Well, if you do this, at some point you need to tell the extension
author to change the code.  Or they can just keep the code unchanged
forever.  I don't think it's worth the bother.

I would have liked to get a StaticAssert in the definition, but I don't
think it's possible.  A standard Assert() should be possible, though.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В письме от четверг, 3 января 2019 г. 17:15:08 MSK пользователь Alvaro Herrera
написал:

> > Can we think about backward compatibility aliases?
.....
> > And keep them for as log as needed to avoid #if VERSION in thirdparty
> > code?
> Well, if you do this, at some point you need to tell the extension
> author to change the code.  Or they can just keep the code unchanged
> forever.  I don't think it's worth the bother.
Ok, you are the Boss ;-)

I've reverted all the macros names back to Relation* except those related to
fillfactor.

> I would have liked to get a StaticAssert in the definition, but I don't
> think it's possible.  A standard Assert() should be possible, though.

This is a really good idea. I felt uncomfortable with this (ViewOptions *)
type convertation without checking that it is really View. With Assert I fell
much more safe.

I've added AssertMacro to all options related macroses.

And so, adding asserts discovered a bug with parallel_workers options. That
are defined as Heap only option, but in get_relation_info in src/backend/
optimizer/util/plancat.c  a RelationGetParallelWorkers macros is also called
for toasts and other kinds of relations.
I've started a new thread dedicated to this issue, since it needs to be fixed
regardless to this patch.
And for now  I added here a hotfix for this issue that is good enough for now.


I also reworked some comments. BTW do you know what is this comments spoke
about:

 * RelationGetFillFactor() and RelationGetTargetPageFreeSpace() can only
 * be applied to relations that use this format or a superset for
 * private options data.

It is speaking about some old times when there can be some other type of
options? 'cause I do not understand what it is speaking about.
I've removed it, I hope I did not remove anything important.
Attachment
You introduced new macros IsHeapRelation and IsViewRelation, but I don't
want to introduce such API.  Such things have been heavily contested and
I don't to have one more thing to worry about for this patch, so please
just put the relkind directly in the code.

On 2019-Jan-07, Nikolay Shaplov wrote:

> I also reworked some comments. BTW do you know what is this comments spoke 
> about:
> 
>  * RelationGetFillFactor() and RelationGetTargetPageFreeSpace() can only        
>  * be applied to relations that use this format or a superset for               
>  * private options data.
> 
> It is speaking about some old times when there can be some other type of 
> options? 'cause I do not understand what it is speaking about. 
> I've removed it, I hope I did not remove anything important.

As I understand it's talking about the varlenas being StdRdOptions and
not anything else.  I think extensibility could cause some relkinds to
use different options.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В письме от четверг, 17 января 2019 г. 20:33:06 MSK пользователь Alvaro
Herrera написал:

> You introduced new macros IsHeapRelation and IsViewRelation, but I don't
> want to introduce such API.  Such things have been heavily contested and
> I don't to have one more thing to worry about for this patch, so please
> just put the relkind directly in the code.
Sorry.
I've been trying to avoid repeating

   (AssertMacro(relation->rd_rel->relkind == RELKIND_RELATION ||       \
                relation->rd_rel->relkind == RELKIND_MATVIEW),         \
all the time.

Fixed.

Also I make more relaxed parallel_workers behavior in get_relation_info as we
discussed in another thread.
Attachment
Hi Nikolay,

This patch does not apply.  Please refer to http://commitfest.cputube.org/ and update it.
How about separating your patch by feature or units that can be phased commit.
For example, adding assert macro at first, refactoring StdRdOptions by the next, etc.

So I change status to "Waiting for Author".

Regards,
Aya Iwata

Hello.

At Mon, 18 Mar 2019 03:03:04 +0000, "Iwata, Aya" <iwata.aya@jp.fujitsu.com> wrote in
<71E660EB361DF14299875B198D4CE5423DF05777@g01jpexmbkw25>
> This patch does not apply.  Please refer to http://commitfest.cputube.org/ and update it.
> How about separating your patch by feature or units that can be phased commit.
> For example, adding assert macro at first, refactoring StdRdOptions by the next, etc.
> 
> So I change status to "Waiting for Author".

That seems to be a good oppotunity. I have some comments.

rel.h:
-#define RelationGetToastTupleTarget(relation, defaulttarg) \
-    ((relation)->rd_options ? \
-     ((StdRdOptions *) (relation)->rd_options)->toast_tuple_target : (defaulttarg))
+#define RelationGetToastTupleTarget(relation, defaulttarg)                 \
+    (AssertMacro(relation->rd_rel->relkind == RELKIND_RELATION ||        \
+                 relation->rd_rel->relkind == RELKIND_MATVIEW),            \
+     ((relation)->rd_options ?                                             \
+      ((HeapRelOptions *) (relation)->rd_options)->toast_tuple_target : \
+            (defaulttarg)))

Index AMs parse reloptions by their handler
functions. HeapRelOptions in this patch are parsed in
relation_reloptions calling heap_reloptions. Maybe at least it
should be called as table_options. But I'm not sure what is the
desirable shape of relation_reloptions for the moment.

hio.c:

-    saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
-                                                   HEAP_DEFAULT_FILLFACTOR);
+    if (IsToastRelation(relation))
+        saveFreeSpace = ToastGetTargetPageFreeSpace();
+    else
+        saveFreeSpace = HeapGetTargetPageFreeSpace(relation);

This locution appears four times in the patch and that's the all
where the two macros appear. And it might not be good that
RelationGetBufferForTuple identifies a heap directly since I
suppose that currently tost is a part of heap. Thus it'd be
better that HeapGetTargetPageFreeSpace handles the toast case.
Similary, it doesn't looks good that RelationGetBufferForTuple
consults HeapGetTargretPageFreeSpace() directly. Perhaps it
should be called via TableGetTargetPageFreeSpace(). I'm not sure
what is the right shape of the relationship among a relation and
a table and other kinds of relation. extract_autovac_opts
penetrates through the modularity boundary of toast/heap in a
similar way.


plancat.c:
+    if (relation->rd_rel->relkind == RELKIND_RELATION ||
+        relation->rd_rel->relkind == RELKIND_MATVIEW)
+        rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
+    else
+        rel->rel_parallel_workers = -1;

rel.h:
#define RelationGetParallelWorkers(relation, defaultpw)                 \
    (AssertMacro(relation->rd_rel->relkind == RELKIND_RELATION ||        \
                 relation->rd_rel->relkind == RELKIND_MATVIEW),            \
     ((relation)->rd_options ?                                             \
      ((HeapRelOptions *) (relation)->rd_options)->parallel_workers :     \
            (defaultpw)))

These function/macros are doing the same check twice at a call.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Hi,

> hio.c:
> 
> -    saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
> -
> HEAP_DEFAULT_FILLFACTOR);
> +    if (IsToastRelation(relation))
> +        saveFreeSpace = ToastGetTargetPageFreeSpace();
> +    else
> +        saveFreeSpace = HeapGetTargetPageFreeSpace(relation);
> 
> This locution appears four times in the patch and that's the all where the
> two macros appear. And it might not be good that RelationGetBufferForTuple
> identifies a heap directly since I suppose that currently tost is a part of
> heap. Thus it'd be better that HeapGetTargetPageFreeSpace handles the toast
> case.
> Similary, it doesn't looks good that RelationGetBufferForTuple consults
> HeapGetTargretPageFreeSpace() directly. Perhaps it should be called via
> TableGetTargetPageFreeSpace(). I'm not sure what is the right shape of the
> relationship among a relation and a table and other kinds of relation.
> extract_autovac_opts penetrates through the modularity boundary of
> toast/heap in a similar way.
I think so, too.
And In current codes, RelationGetTargetPageFreeSpace users(ex. heap_page_prune_opt) don't have to think whatever target
istoast or heap, but in your change, they need to use them properly.
 

You told us "big picture" about opclass around the beginning of this thread.
In my understanding, the purpose of this refactoring is to make reloptions more flexible to add opclass. 
I understand this change may be needed for opclass, however I think that this is not the best way to refactor
reloption.

How about discussing more about this specification including opclass, and finding the best way to refactor reloption?
I have not read the previous thread tightly, so you may have already started preparing.

Regards,
Aya Iwata



В письме от понедельник, 18 марта 2019 г. 3:03:04 MSK пользователь Iwata, Aya
написал:
> Hi Nikolay,
Hi!
Sorry for long delay. Postgres is not my primary work, so sometimes it takes a
while to get to it.

> This patch does not apply.
Oh... Sorry... here goes new version

> Please refer to http://commitfest.cputube.org/
> and update it.
Oh! a nice service... Wish it can stream status changes as RSS feeds... But I
can wirte a script on top of it. It would be more easy...

> How about separating your patch by feature or units that
> can be phased commit. For example, adding assert macro at first,
> refactoring StdRdOptions by the next, etc.
No I think we should not. All Macros changes are driven by removing
StdRdOptions. (Actually AssertMarco added there, but I do not think it is a
great addition that require a singe patch)
If it is really necessary I would file AssertMacro addition as a single patch,
but I would abstain from doing it if possible...

Attachment
В письме от понедельник, 18 марта 2019 г. 17:00:24 MSK пользователь Kyotaro
HORIGUCHI написал:

> > So I change status to "Waiting for Author".
> That seems to be a good oppotunity. I have some comments.
>
> rel.h:
> -#define RelationGetToastTupleTarget(relation, defaulttarg) \
> -    ((relation)->rd_options ? \
> -     ((StdRdOptions *) (relation)->rd_options)->toast_tuple_target :
> (defaulttarg)) +#define RelationGetToastTupleTarget(relation, defaulttarg)
>                \ +    (AssertMacro(relation->rd_rel->relkind ==
> RELKIND_RELATION ||        \ +                 relation->rd_rel->relkind ==
> RELKIND_MATVIEW),            \ +     ((relation)->rd_options ?
>                                \ +      ((HeapRelOptions *)
> (relation)->rd_options)->toast_tuple_target : \ +
> (defaulttarg)))
>
> Index AMs parse reloptions by their handler
> functions. HeapRelOptions in this patch are parsed in
> relation_reloptions calling heap_reloptions. Maybe at least it
> should be called as table_options. But I'm not sure what is the
> desirable shape of relation_reloptions for the moment.

If we create some TableOptions, or table_options  we will create a bigger
mess, then we have now. Table == relation, and index is also relation.
So better to follow the rule: we have heap relation, toast relation, and all
variety  of index relations. Each kind of relation have it's own options set,
each kind of relation has own marcoses to access it's options. This will make
the situation more structured.

> -    saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
> -
> HEAP_DEFAULT_FILLFACTOR); +    if (IsToastRelation(relation))
> +        saveFreeSpace = ToastGetTargetPageFreeSpace();
> +    else
> +        saveFreeSpace = HeapGetTargetPageFreeSpace(relation);
>
> This locution appears four times in the patch and that's the all
> where the two macros appear. And it might not be good that
> RelationGetBufferForTuple identifies a heap directly since I
> suppose that currently tost is a part of heap. Thus it'd be
> better that HeapGetTargetPageFreeSpace handles the toast case.
> Similary, it doesn't looks good that RelationGetBufferForTuple
> consults HeapGetTargretPageFreeSpace() directly. Perhaps it
> should be called via TableGetTargetPageFreeSpace(). I'm not sure
> what is the right shape of the relationship among a relation and
> a table and other kinds of relation. extract_autovac_opts
> penetrates through the modularity boundary of toast/heap in a
> similar way.
So I started from the idea, I said above: each relation kind has it's own
macroses to access it's option.
If some options are often used combined, it might be good to create some
helper-macros, that will do this combination. But Alvaro is against adding any
new macroses (I understand why), and keep old one as intact as possible. So
here I can suggest only one thing: keep to the rule: "Each reloption kind has
it's own option access macroses" and let the developers of heap-core make
their live more comfortable with a helpers function the way they need it.

So I would leave this decision to Alvaro... He knows better...

> plancat.c:
> +    if (relation->rd_rel->relkind == RELKIND_RELATION ||
> +        relation->rd_rel->relkind == RELKIND_MATVIEW)
> +        rel->rel_parallel_workers = RelationGetParallelWorkers(relation,
> -1); +    else
> +        rel->rel_parallel_workers = -1;
>
> rel.h:
> #define RelationGetParallelWorkers(relation, defaultpw)                 \
>     (AssertMacro(relation->rd_rel->relkind == RELKIND_RELATION ||        \
>                  relation->rd_rel->relkind == RELKIND_MATVIEW),            \
> ((relation)->rd_options ?                                             \
> ((HeapRelOptions *) (relation)->rd_options)->parallel_workers :     \
> (defaultpw)))
>
> These function/macros are doing the same check twice at a call.
No. The first check is actual check, that does in runtime in production, and it
decides if we need to fetch some options or just use default value.

The second check is AssertMacro check. If you look into AssertMacro code you
will see that in production builds it do not perform any real check, just an
empty function.
AssertMacro is needed for developer builds, where it would crash if check is
not passed. It is needed to make sure that function is always used in proper
context. You should build postgres with --enable-cassert options to get it to
work. So there is no double checks here...






В письме от среда, 20 марта 2019 г. 6:15:38 MSK пользователь Iwata, Aya
написал:

> You told us "big picture" about opclass around the beginning of this thread.
> In my understanding, the purpose of this refactoring is to make reloptions
> more flexible to add opclass. I understand this change may be needed for
> opclass, however I think that this is not the best way to refactor
> reloption.
>
> How about discussing more about this specification including opclass, and
> finding the best way to refactor reloption? I have not read the previous
> thread tightly, so you may have already started preparing.

The idea is the following: now there are options that are build in (and
actually nailed to) the postgres core. And there are options that can be
created in Access  Methods in extensions. They share some code, but not all of
it. And it is bad. There should be one set of option-related code for both
in-core relations and indexes, and for indexes from extensions. If this code
is well written it can be used for opclass options as well.

So we need to unnail options code from reloptions.c, so no options are nailed
to it.
So you need to move options definitions (at least for indexes) inside access
method code. But we can't do it just that, because some indexes uses
StdRdOptions structure for options, it is big, and indexes uses only fillfactor
from there...
What should we do? Create an individual Options structure for each index.
So we did it.
And now we have StdRdOptions that is used only for Heap and Toast. And Toast
also does not use all of the variables of the structure.
Why everywhere we have it's own option structure, but for Heap and Toast it is
joined, and in not a very effective way?
To successfully apply a patch I plan to commit I need a single option
structure for each relation kind. Otherwise I will have to write some hack
code around it.
I do not want to do so. So it is better to get rid of StdRdOptions completely.

This is the only purpose of this patch. Get rid of StdRdOptions and give each
relation kind it's own option set.

StdRdOptions is ancient stuff. I guess it were added when there was fillfactor
only. Now life is more complex. Each relation kind has it's own set of
options. Let's not mix them together!

PS. If you wand to have some impression of what refactioring I am planning at
the end, have a look at the patch https://commitfest.postgresql.org/15/992/
It is old, but you can get an idea.




On Mon, Apr 1, 2019 at 4:36 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> В письме от понедельник, 18 марта 2019 г. 3:03:04 MSK пользователь Iwata, Aya
> написал:
> > This patch does not apply.
> Oh... Sorry... here goes new version

Hi Nikolay,

Could we please have a new rebase?

Thanks,

-- 
Thomas Munro
https://enterprisedb.com

It seems strange to have relation_reloptions which abstracts away the
need to know which function to call for each relkind, and separately
have a bunch of places that call the specific relkind.  Why not just
call the specific function directly?  It doesn't seem that we're gaining
any abstraction ... maybe it'd be better to just remove
relation_reloptions entirely.  Or maybe we need to make it do a better
job so that other places don't have to call the specific functions at
all.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Hi Nikolay,

I have had a crack at re-basing the current patch against 12b2, but I didn’t trivially succeed.

It’s probably more my bodged fixing of the rejections than a fundamental problem. But I now get compile fails in — and seems like only — vacuum.c.

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -g -O2 -I../../../src/include  -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk    -c -o vacuum.o vacuum.c
vacuum.c:1761:20: error: expected expression
                        ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
                                        ^
vacuum.c:1761:6: error: use of undeclared identifier 'StdRdOptions'
                        ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
                          ^
vacuum.c:1771:20: error: expected expression
                        ((StdRdOptions *) onerel->rd_options)->vacuum_truncate)
                                        ^
vacuum.c:1771:6: error: use of undeclared identifier 'StdRdOptions'
                        ((StdRdOptions *) onerel->rd_options)->vacuum_truncate)
                          ^
4 errors generated.

I see that your patch removed that particular type, so I guess that feature in vacuum.c has been added in the meantime.

Would you have a more recent patch?

For what it is worth, I had a play at getting it to work, and only made things much worse!

denty.

On 1 Jul 2019, at 12:52, Thomas Munro <thomas.munro@gmail.com> wrote:

On Mon, Apr 1, 2019 at 4:36 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
В письме от понедельник, 18 марта 2019 г. 3:03:04 MSK пользователь Iwata, Aya
написал:
This patch does not apply.
Oh... Sorry... here goes new version

Hi Nikolay,

Could we please have a new rebase?

Thanks,

--
Thomas Munro
https://enterprisedb.com

On Thu, Jul 04, 2019 at 07:44:42PM +0100, Dent John wrote:
> I see that your patch removed that particular type, so I guess that
> feature in vacuum.c has been added in the meantime.
>
> Would you have a more recent patch?

I have switched the patch as waiting on author.
--
Michael

Attachment
В письме от понедельник, 1 июля 2019 г. 23:52:13 MSK пользователь Thomas Munro
написал:

> > > This patch does not apply.
> >
> > Oh... Sorry... here goes new version
>
>
> Hi Nikolay,
>
> Could we please have a new rebase?

Sorry, a new reloptions have been introduced, and I need some time to
understand how do they work and to fit them into new option structures. And I
do not have much dev-time right now. If we are lucky, and changes will be
obvious, will do it today or tomorrow.
В письме от четверг, 4 июля 2019 г. 19:44:42 MSK пользователь Dent John
написал:
> Hi Nikolay,
>
> I have had a crack at re-basing the current patch against 12b2, but I didn’t
> trivially succeed.
>
> It’s probably more my bodged fixing of the rejections than a fundamental
> problem. But I now get compile fails in — and seems like only — vacuum.c.

I am really sorry for such big delay. Two new relation options were added, it
needed careful checking while importing them back to the patch.
I did not find proper timeslot for doing this quick enough.
Hope I am not terribly late.
Here goes an updated version.




Attachment
Hi Nikolay,

Thanks for the revised patch. It applies now no problem, and seems to work fine.

For me, I still find the relopts area quite odd. I wonder if your patch doesn’t go far enough?

For example, take log_autovacuum_min_duration. It’s described intRelOpts, which implicitly defines its type (an int),
andexplicitly sets its min/max, default, and lock level, as well as it’s kind (HEAP | TOAST). But then it’s described
againin the relopt_parse_elt[] in toast_reloptions() and heap_reloptions(), which implicitly defines it to apply to
HEAP| TOAST, and fact of it being an int. (It’s, of course, same for all reloptions.) 

The relopt_parse_elt[] is hugely entirely duplicative. I wonder if it is not best simply to consolidate — either push
allinfo into the static {bool,int,real,string}RelOpts[] structures, or push all into the relopt_parse_elt[]. 

I notice the thread earlier talks of some of the APIs being public interfaces, which may be used by EXTENSIONs. I’m not
sureI buy that in its fullest sense. For sure, an EXTENSION may invoke the APIs. But no EXTENSION can define new/alter
existingoptions, because the {bool,int,real,string}RelOpts[] are not currently runtime-extendable. I guess we probably
shouldpreserve the API’s functionality, and allow fillRelOptions(), if provided with a tab, for it to restrict filling
toonly those supplied in the tab. But in general core code, my opinion is that fillRelOptions() could be provided with
aNULL tab, and for it to scavenge all needed information from the static {bool,int,real,string}RelOpts[] structures. 

That links to what I initially found most confusing: AUTOVACUUM_RELOPTIONS. I understand it’s there because there are a
bunchof shared reloptions. Pre-patch, default_reloptions() meant there was no need for the macro, but your patch drops
default_reloptions().default_reloptions() is horrible, but I feel the macro approach is worse. :-) 

Sorry for the delay providing the feedback. It took me a few sittings to grok what was going on, and to work out what I
thoughabout it. 

denty.

> On 12 Jul 2019, at 15:13, Nikolay Shaplov <dhyan@nataraj.su> wrote:
>
> В письме от четверг, 4 июля 2019 г. 19:44:42 MSK пользователь Dent John
> написал:
>> Hi Nikolay,
>>
>> I have had a crack at re-basing the current patch against 12b2, but I didn’t
>> trivially succeed.
>>
>> It’s probably more my bodged fixing of the rejections than a fundamental
>> problem. But I now get compile fails in — and seems like only — vacuum.c.
>
> I am really sorry for such big delay. Two new relation options were added, it
> needed careful checking while importing them back to the patch.
> I did not find proper timeslot for doing this quick enough.
> Hope I am not terribly late.
> Here goes an updated version.
>
>
>
> <get-rid-of-StrRdOptions_5.diff>




Hi Nikolay,

thanks for sending a new version of the patch. I've done a basic review
today, so let me share some comments about the patch.

Firstly, there's an important question why should we actually do this.

At the beginning of this thread you mentioned memory usage - e.g. for
indexes the reduced struct is just 8B (instead of 82B). I doubt it's
worth doing for this reason - it's a tiny amount of memory (you'd need
~16k indexes to save 1MB). So memory consumption does not seem like a
pressing issue (e.g. we're probably wasting way more memory thanks to
AllocSet using the 2^N bins to allocate chunks). I see Andres already
voiced a similar opinion last year ....

You then mentioned that

  > My real motivation for this patch is to make code more uniform.
  > So the patch over it will be much clearer. And to make result
  > code more clear and uniform too.

which seems like a much better reason. The thing is - I'm not quite
convinced this patch makes the code more uniform/clearer. While that's
somewhat subjective opinion, there are cases where the code/API gets
obviously better - and this does not seem like one of those cases :-(

The one remaining possible benefit of this patch is making the code
easier to reuse / extend from other patches. In fact, that's why I'm
looking at this patch, because in the "opclass parameters" thread it was
repeatedly mentioned this patch would be useful.

So I think it'd be good to coordinate with Nikita Glukhov, and rebase
the opclass parameters patch on top of this one to verify/demonstrate
how it benefits that patch.



Now, some comments about the patch itself:


1) I see there's a bunch of functions parsing reloptions with about this
pattern:

    bytea *
    btoptions(Datum reloptions, bool validate)
    {
        static const relopt_parse_elt tab[] = { ... }
    
        options = parseRelOptions(
    
        /* if none set, we're done */
        if (numoptions == 0)
            return NULL;
    
        rdopts = allocateReloptStruct(...)
    
        fillRelOptions(rdopts, ...);
    
        pfree(options);
    
        return (bytea *) rdopts;
    }

so I wonder if the patch might define a wrapper doing all of this,
instead of copying it on a number of places.


2) The patch makes various new places aware about how reloptions for
different relkinds are parsed differently. IMHO that's a bad thing,
because it essentially violates layering / levels of abstraction.

For example, there's this change in heap_multi_insert():

-    saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
-                                                   HEAP_DEFAULT_FILLFACTOR);
+    if (IsToastRelation(relation))
+        saveFreeSpace = ToastGetTargetPageFreeSpace();
+    else
+        saveFreeSpace = HeapGetTargetPageFreeSpace(relation);

so a code which was entirely oblivious to TOAST vs. non-TOAST relations
suddenly cares about this difference. And there's no good reason for
that, because this if might be added to the original macro, eliminating
this change entirely. And this exact same change in on various other
places.

The same thing applies to this change in get_relation_info:

-    /* Retrieve the parallel_workers reloption, or -1 if not set. */
-    rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
+    /*
+     * Retrieve the parallel_workers for heap and mat.view relations.
+     * Use -1 if not set, or if we are dealing with other relation kinds
+     */
+    if (relation->rd_rel->relkind == RELKIND_RELATION ||
+        relation->rd_rel->relkind == RELKIND_MATVIEW)
+        rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
+    else
+        rel->rel_parallel_workers = -1;

and this vacuum_rel chunk is not particularly readable either:

    if (onerel->rd_options == NULL ||
-       ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+       (!IsToastRelation(onerel) &&
+        ((HeapRelOptions *) onerel->rd_options)->vacuum_index_cleanup) ||
+       (IsToastRelation(onerel) &&
+        ((ToastRelOptions *) onerel->rd_options)->vacuum_index_cleanup))

(To be fair, this already was looking directly at StdRdOptions, but
adding a new macro to get vacuum_index_cleanup would be a good idea
anyway).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



On Sat, Jul 13, 2019 at 2:14 AM Nikolay Shaplov <dhyan@nataraj.su> wrote:
> Here goes an updated version.

On Sat, Jul 20, 2019 at 8:21 PM Dent John <denty@qqdd.eu> wrote:
> [review]

On Sun, Jul 28, 2019 at 5:38 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> [review]

Hi Nikolay,

Looks like some good actionable feedback.  I've moved this patch to
September, and set it to "Waiting on Author".

-- 
Thomas Munro
https://enterprisedb.com



On Thu, Aug 01, 2019 at 09:39:53PM +1200, Thomas Munro wrote:
> Looks like some good actionable feedback.  I've moved this patch to
> September, and set it to "Waiting on Author".

The patch is in this state for two months now, so I have switched it
to "returned with feedback".  The latest patch does not apply, and it
would require an update for the new test module dummy_index_am.

As I have touched recently in the area of the code, I got a look at it
and the core of the idea is that it would be cleaner to move the
control of reloptions from reloptions.c to each AM, as well as have an
extra later for toast.  This has additional costs in multiple ways:
- The parsing and filling of reloptions gets more duplicated, though
this could be solved with a wrapper routine (something HEAD already
does when filling in rd_options).
- Logic which was out of toast previously is not anymore.

Some of these have been mentioned by Tomas upthread, and I share
similar points.  I also doubt that the removal of
RelationGetTargetPageFreeSpace() is a good thing, and this complicates
more the checks around options and the handling of rd_options which
may become optionally NULL, making a bit more brittle the whole
structure.  We may be able to get useful pieces for it, though it is
not clear what would be the benefits.  It may help as well to group
that within the thread dedicated to the rework of the reloption APIs
as a preliminary, refactoring step.
--
Michael

Attachment

Re: [PATCH] Do not use StdRdOptions in Access Methods

From
Nikolay Shaplov
Date:
В Fri, 27 Sep 2019 17:24:49 +0900
Michael Paquier <michael@paquier.xyz> пишет:

> > Looks like some good actionable feedback.  I've moved this patch to
> > September, and set it to "Waiting on Author".
>
> The patch is in this state for two months now, so I have switched it
> to "returned with feedback".  The latest patch does not apply, and it
> would require an update for the new test module dummy_index_am.

I've been thinking about this patch and came to a conclusion that it
would be better to split it to even smaller parts, so they can be
easily reviewed, one by one. May be even leaving most complex
Heap/Toast part for later.

This is first part of this patch. Here we give each Access Method it's
own option binary representation instead of StdRdOptions.

I think this change is obvious. Because, first, Access Methods do not
use most of the values defined in StdRdOptions.

Second this patch make Options structure code unified for all core
Access Methods. Before some AM used StdRdOptions, some AM used it's own
structure, now all AM uses own structures and code is written in the
same style, so it would be more easy to update it in future.

John Dent, would you join reviewing this part of the patch? I hope it
will be more easy now...


--
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)
Attachment
В письме от пятница, 27 сентября 2019 г. 17:24:49 MSK пользователь Michael
Paquier написал:

> The patch is in this state for two months now, so I have switched it
> to "returned with feedback".

So I've split this patch into even smaller parts, so it would be more easy to
review.

Do not use StdRdOptions in Access Methods
https://www.postgresql.org/message-id/4127670.gFlpRb6XCm@x200m

Use empty PartitionedRelOption structure for storing partitioned table options
instead of StdRdOption
https://www.postgresql.org/message-id/1627387.Qykg9O6zpu@x200m

and

Some useful asserts in ViewOptions Macroses
https://www.postgresql.org/message-id/3634983.eHpMQ1mJnI@x200m

as for removing StdRdOptions for Heap and Toast, I would suggest for commit it
later. It is not critical for my reloptions refactoring plans. I can do it
having one structure for two relation kinds. So we can split it into two
later, or do not split at all...

--
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)
Attachment