Re: [PATCH] Finally split StdRdOptions into HeapOptions and ToastOptions - Mailing list pgsql-hackers

From Nikolay Shaplov
Subject Re: [PATCH] Finally split StdRdOptions into HeapOptions and ToastOptions
Date
Msg-id 2923309.24gN2MKNTD@x200m
Whole thread Raw
In response to Re: [PATCH] Finally split StdRdOptions into HeapOptions andToastOptions  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
В письме от суббота, 7 марта 2020 г. 10:03:40 MSK пользователь Michael Paquier
написал:
> On Wed, Mar 04, 2020 at 10:58:31PM +0300, Nikolay Shaplov wrote:
> > But the truth is that my goal is to move all code that defines all option
> > names, min/max values etc, move it inside am code. To move data from
> > boolRelOpts, intRelOpts, realRelOpts, enumRelOpts, enumRelOpts from
> > reloptions.c into the code that implement AMs that uses these options.
> >
> > I did it for indexes in patch I've offered several years ago. Now we have
> > also relaion AM.
> >
> > But I would prefer to fix index AM relioptions first, and then copy that
> > solution for relations.
>
> How do you think that this part should be changed then, if this needs
> any changes?  It seems to me that we have a rather correct layer for
> index AMs by requiring each one to define the available option set
> using indoptions through their handler, with option fetching macros
> located within each AM.

My idea is like this:

Now information about what reloption AM has and how they should be parsed are
stored in two places. One is relioptions.c and boolRelOpts, intRelOpts,
realRelOpts, enumRelOpts, enumRelOpts arrays, another is
static const relopt_parse_elt tab[] inside  amoptions_function amoptions;
function of each AM.

My suggestion is to merge all this data into one structure. Like
option_definition


/* generic struct to hold shared data */
typedef struct option_definition_basic
{
   const char *name;           /* must be first (used as list termination
                                * marker) */
   const char *desc;
   LOCKMODE    lockmode;
   option_definition_flags flags;
   option_type type;
   int         struct_offset;  /* offset of the value in Bytea representation
*/
}  option_definition_basic;

typedef struct option_definition_bool
{
   option_definition_basic base;
   bool        default_val;
}  option_definition_bool;

typedef struct option_definition_int
{
   option_definition_basic base;
   int         default_val;
   int         min;
   int         max;
}  option_definition_int;

typedef struct option_definition_real
{
   option_definition_basic base;
   double      default_val;
   double      min;
   double      max;
}  option_definition_real;

typedef struct option_definition_enum
{
   option_definition_basic base;
   const char **allowed_values;/* Null terminated array of allowed values for
                                * the option */
   int         default_val;    /* Number of item of allowed_values array */
}  option_definition_enum;

This example from my old code, I guess I should add a union here, this will
make code more readable... But the idea is the same, we have one structure
that describes how this option should be parsed.

Then we gather all options definitions for one object (for example for index)
into a structure called OptionsDefSet

typedef struct OptionDefSet
{
   option_definition_basic **definitions;
   int         num;            /* Number of catalog items in use */
   int         num_allocated;  /* Number of catalog items allocated */
   bool        forbid_realloc; /* If number of items of the catalog were
                                * strictly set to certain value do no allow
                                * adding more idems */
   Size        struct_size;    /* Size of a structure for options in binary
                                * representation */
   postprocess_bytea_options_function postprocess_fun; /* This function is
                                                        * called after options
                                                        * were converted in
                                                        * Bytea represenation.
                                                       * Can be used for extra
                                                     * validation and so on */
   char       *namespace;      /* Catalog is used for options from this
                                * namespase */
}  OptionDefSet;


This DefSet will have all we need to parse options for certain object.

This all will be stored in the AM code.

Then we replace amoptions_function amoptions; function of the AM with
amoptions_def_set_function amoptions_def_set; function. This function will
return OptionDefSet to whoever called it.

So whenever we need an option in index_reloptions from reloption.c we get this
DefSet, and then call a function that parses options using this data.


Why do we need this.

1. To have all options related data in one place. Now when a developer wants
to add an option, he needs to find all places in the code this option should
be added. It is not good thing. It brings troubles and errors.

2. Now we have two different options storages for in-core AM options and for
contib AM options. They are boolRelOpts, intRelOpts, realRelOpts, enumRelOpts,
enumRelOpts arrays for in-core AMm's and static relopt_gen **custom_options
for contrib AM. It if better ho have same tool for both. My idea will allow to
have same code both in contrib and in-core AM.

3. Then we will be able to have reloptions-like options just anywhere, just
define an OptionDefSet, and put code that provides option values from SQL and
pg_catalog. I came to the idea of rewriting option code when I have been
working on the task of adding opclass parameters, Nikita Glukhov works on it
now. It is very simulator to options, but now you can not use reloptions code
for that. When we will have OptionDefSet I've described, it can be used for
opclass parameters too.

The example of how it can be done can be found at https://
commitfest.postgresql.org/15/992/
The code is quite outdated, and has a lot of extra code that already commited
or is not really needed. But you can see the idea.

I will try to provide new version of it soon, with no extra code, that can be
applied to current master, but it would be good to apply it to the master
branch where there is no StdRdOptions. This will make patch more readable and
understandable.

So, can I please have it in the code :-)

> > Because if I first copy AM solution from indexes to relation, then I will
> > have to fix it in two places.
> >
> > So I would prefer to keep reloptions for relations in relations.c, only
> > split them into HeapOptions and ToastOptions, then change AM for indexes
> > moving option definition into AM's and then clone the solution for
> > relations.
> Then, for table AMs, it seems to me that you are right for long-term
> perspective to have the toast-related options in reloptions.c, or
> perhaps directly located within more toast-related file (?) as table
> AMs interact with toast using heapam_relation_needs_toast_table and
> such callbacks.  So for heap, moving the option handling to roughly
> heapam_handler.c is a natural move, though this requires a redesign of
> the existing structure to use option handling closer to what
> indoptions does, but for tables.
We also have view reloptions and attribute options, as well as toast options.
We can keep them in reloption.c or find better place for them. It is not a big
problem I think. And as for heap options, yes, as heap now has AM, they should
be moved inside AM. I can do it when we are finished with index options.





pgsql-hackers by date:

Previous
From: Evgeniy Shishkin
Date:
Subject: Re: effective_io_concurrency's steampunk spindle maths
Next
From: Fabien COELHO
Date:
Subject: Re: pg_ls_tmpdir to show directories and shared filesets