Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM - Mailing list pgsql-hackers

From Nikolay Shaplov
Subject Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM
Date
Msg-id 1703774.tbaA9tY9qs@x200m
Whole thread Raw
In response to [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM  (Nikolay Shaplov <dhyan@nataraj.su>)
Responses Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM
List pgsql-hackers
В письме от 6 февраля 2017 23:30:03 пользователь Nikolay Shaplov написал:

I've rebased the patch, so it can be applied to current master. See
attachment.

Alvaro in private letter told that he will review the patch some time later.
So I am waiting.

Also added this patch to commitfest:
https://commitfest.postgresql.org/13/992/


> (Please see Attribution notice at the bottom of the letter)
>
> Hi! Some time ago I've send a proposal, about removing all am-related code
> from src/backend/access/common/reloptions.c and move it into somewhere
> inside am code. It was in
> https://www.postgresql.org/message-id/4636505.Bu9AKW1Kzc%40nataraj-amd64
> thread.
>
> Now I have a patch that is ready
>
> Before explaining what have been done, I need to define some concepts, so it
> would be easy to speak about them in the further explanation.
>
> Reloptions values in postgres actually exists in four representations:
> DefList*, TextArray[], Bytea and so called Values.
>
> - DefList representation is a list of DefElem, it is the representation in
> which reloptions comes from syntax analyzer, when relation is created or
> altered, and also this representation that is needed for pg_dump to dump a
> create statement that creates this relation.
>
> - TextArray[] representation is a way, how reloptions are stored in
> pg_catalog. It is TEXT[] attribute in the pg_catalog table and from the
> poinf o view of postgres code it is Datum array of TEXT Datums, when read
> into the memory
>
> -  Bytea (or I will also call it Binary) representation, is a representation
> with which relation code really works. It has C-structure inside in which
> fixed-length options are sored, some space is reserved for storing varlen
> values after the structure, and certainly it has BYTEA header in front of
> all of it. It is cached, it is fast to access. It is a good stuff.
>
> - Values (or in some cases RawValues) -- internal representation of option
> parser. Technically it is List of option_value structures (old name for it
> is relopt_value). This representation is used for parsing and validating
> options. Actually all the convertations between representations listed
> above are dove through Values representation. Values are called RawValues
> when Values List were already created, but it was not parsed yet, and all
> option values are in raw state: represented as a text.
>
>
> DefList and TextArray representations are converted in both ways(always
> through Values representation): we need DefList -> TextArray to create or
> change entry in the pg_catalog, and need TextArray -> DefList for pg_dump
>
> Bytea representation is converted from TextArray representation (also
> through Values representation). When relation code, tries to get cached
> Bytea representation for certain relation, and none is actually cached, then
> cache code calls extractRelOptions function that fetches entry from
> pg_catalog for that relation, gets reloptions  TextArray[], converts it
> into Bytea and cache it.
>
>
> Each reloption has it's own definition. This definitions tells how this
> option should be parsed, validated and converted. Before my patch this
> information were stored in relopt_gen and relopt_parse_elt structures. In
> my patch all this information were moved into option_definition_basic
> structure (that is actually old relopt_gen + relopt_parse_elt)
>
> The set of options definition I would call a options catalog. Before my
> patch there was one global options catalog (actually it was divided into
> parts by data types and has another part for dynamically added options, but
> nevertheless, it was global). After my patch there would be a separate
> options catalog for each relation type. For AM-relation this catalog would
> be available via AM-handler. for other relation types it would be still
> kept in reloption.c
>
>
>
> Now I am ready to tell what actually have been done in this patch:
>
> 1. I moved options definitions from global lists from reloptions.c
> [type name]RelOpts to lists that are kept inside AM code. One list per AM.
> heap options, toast options and options for all other relation types that
> are not accessible through  AccessMethod, all were kept inside
> reloptions.c, but now it is one list for each relation type, not a global
> one
>
> 2. Each AccessMethod had amoptions function that was responsible for
> converting reloptions from TextArray[] representation into Bytea
> representaions. This function used functions from reloptions.c and also had
> an array of relopt_parse_elt that defines how parsed reloption data should
> be packed inside Bytea chunk. I've moved data from relopt_parse_elt array
> into option definitions that are stored in the catalog (since catalog are
> now defined inside the AccessMethod we have access to a structure of Bytea
> representation at the place where we define option catalog)
>
> 3. Speaking of amoptions function, I've completely removed it, since we do
> not need it. For developers of Access Methods who still want to do some
> custom action with just parsed data, I've added postprocess_fun to option
> catalog. If this function is defined, it is called by convertation function
> right after Bytea representation were created. In postprocess_fun developer
> can do some custom validation, or change just parsed data. This feature is
> used in bloom index.
>
> 4. Instead of amoptions function I've added amrelopt_catalog function to the
> Access Method. This function returns option definition catalog for an
> Access Method. The catalog is used for processing options.
>
> 5. As before, relation cache calls extractRelOptions when it needs options
> that were not previously cached.
> Before my patch, it created Bytea representations for non-AM relations, and
> called amoptions AM function to get Bytea representation for AM relation.
> In by patch, it now gets option catalog, (using local functions for non-AM
> relations, and  using amrelopt_catalog function for AM-relations), and then
> use this catalog to convert options from TextArray into Bytea
> representation.
>
> 6. I've added some more featues:
>
> - Now you can set OPTION_DEFINITION_FLAG_FORBID_ALTER flag in definition of
> the option, and then postgres will refuse to change option value using
> ALTER command. In many indexes, some options are used only for index
> creation. After this it's value is saved in MetaPage, and used from there.
> Nevertheless user is allowed to change option value, though it affects
> nothing, one need to recreate index to really change such value. Now using
> this flag we can prevent user from getting an illusion that he successfully
> changed option value.
>
> - After applying my patch if you try to set toast. option for table that
> does not have toast, you will get an error. Before postgres will accept
> toast option for a table without a toast, but this value will be lost. This
> is bad behavior, so now postgres will throw an error in this case.
>
> - I've noticed that all string relation options in postgres are technically
> enum options. So I've added enum option type. This will save some bytes of
> the memory and some tacts of the CPU. I also left string option type
> (through it is not used now). A. Korotkov said that it might be needed
> later for storing patterns, xml/json paths or something like that.
>
> - Added some more tests that will trigger more options code. Though I am
> dreaming about more advanced test system, that will allow to check that
> certain value is received somewhere deep in the code.
>
> 7. And now the most controversial change. I've got rid of  StdRdOptions
> structure, in which options for heap, toast, nbtree, hash and spgist were
> stored in Bytea representation. In indexes only fillfactor value were
> actually used from the whole big structure. So I've created a separate
> structure for each relation type. HeapOptions and ToastOptions are very
> similar to each other. So common part of creating these catalogs were moved
> to
> add_autovacuum_options function that are called while creating each catalog.
>
> Now to the controversial part: in src/include/utils/rel.h had a lot of
> Relation* macroses that used StdRdOptions structure. I've changed them into
> View* Heap* and Toast* analogues, and changed the code to use these macroses
> instead of Relation* one. But this part of code I least sure of. I'd like
> to ask experienced developers to double check it.
>
> 8. I've moved all options-abstract code into options.c and options.h file,
> and left in reloptions.c and reloptions.h only the code that concerns
> relation options. Actually the main idea of this patch was to get this
> abstract code in order to use it for custom attribute options later. All
> the rest just a side effects :-)
>
> So, before adding this to commitfest I want somebody to give a general look
> at it, if the code is OK.
>
> Alvaro Herrera, you once said that you can review the patch...
>
> The patch is available in the attachment.  It can be applied to current
> master
>
> You can also see latest version on my github
> https://github.com/dhyannataraj/postgres/tree/reloption_fix
> at the reloption_fix branch.
>
>
> ATTRIBUTION NOTICE: I wrote this patch, when I was an  employee in Postgres
> Professional company. So this patch should be attributed as patch from
> Postgres Pro (or something like that). For some reasons I did not manage to
> commit this patch while I left that job. But I got a permission to commit it
> even if I left the company.
> So my contribution to this patch as a independent developer was final code
> cleanup, and writing some more comments. All the rest was a work of Postgres
> Pro employee Nikolay Shaplov.

--
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] Documentation improvements for partitioning
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)