Thread: [PROPOSAL][PROTOTYPE] Individual options for each index column: Opclass options
[PROPOSAL][PROTOTYPE] Individual options for each index column: Opclass options
From
Nikolay Shaplov
Date:
Hi! There was an interesting idea. Now it is possible to add in runtime options for relation, when you are creating a new extension with new access method. But if you add a custom operator class, you have no tool to tune it's behavior. But in some cases it would be nice to have such tool. For example in intarray extension for gist__intbig_ops there is an option SIGLENINT that defines the length of the signature that will be used while creating index. Now it is #defined in the code, but theoretically one can tune index size/performance by manually setting SIGLENINT value that is more suitable for his data. Moreover the authors of bloom extension have to create workaround for customizing each column behavior: they sets number of bits per column by setting reloptions col1 — col16. This is not handy. I would seven say a little bit ugly. So adding options for opclass seems to be really good idea. As far as in postgres index is a relation, index column is an attribute in this relation and each column might have only one opclass, this brings us to conclusion that when we as speaking about options of an opclass in the index column, then we can actually speak about an options of an attribute. If we look into pg_attribute table, we will see, that there is already attoptions attribute there. So we can use it. Just add a way to add them in runtime and teach opclass to use it. The general idea is the following: Each access method should have amattoptions function, that works similar to amoptions function (parses and validate text[] of attoptions into internal data structure) but it takes an additional argument -- the number of an attribute we are processing. amattoptions do this job by calling a specific support function of an opclass, that knows how to parse it (and do it in a similar way as amoptions function of an access method) the result of amattoptions should be cached as attoptions are cached now or similar way. (In current version of prototype I do it wrong, but should change it later) All the support functions of the access method should accept parsed attoptions in their argument lists, and use it if necessary. In the prototype I've implemented if for gist opclass, and added attoptions as a last argument of all support functions. So it is even backward compatible as all funtions I have not touched just ignores this last argument. The implementation of an access method should get attoptions from cache, if they are there, or from am->amattoptions if not (in the prototype it is done using get_index_attribute_options function) and pass it to the support functions. In the prototype I've added a member to GISTSTATE structure for storing parsed attoptions, and fill in initGISTstate and used it all over the code. In other implementations solutions might be different. So... The prototype I wrote is in attachment. It is really dirty. Some things are not done yet. Some things should be changed. But never the less work, and it shows what I am going to do better than all explanations. To see how it works you should do the following: # create extension intarray ; # create table test (i int[]); # create table test2 (i int[]); # create index ON test USING GIST (i USING gist__intbig_ops WITH OPTIONS (sig_len_int=22) ); # create index ON test2 USING GIST (i USING gist__intbig_ops WITH OPTIONS (sig_len_int=120) ); # select attoptions from pg_attribute where attrelid = 'test_i_idx'::regclass OR attrelid = 'test2_i_idx'::regclass; attoptions ------------------- {sig_len_int=120} {sig_len_int=22} (2 rows) and if you uncomment development warning output from the code you will see that these values are really used through the code. Speaking of the syntax, I've added WITH OPTIONS keywords for specifying attoptions, not just WITH, that most of us would expect. I did it because index_elem, the node that is used for definition of an index column is used not only in CREATE INDEX expression, but also in CREATE TABLE CONSTRAINT EXCLUDE. Because CREATE TABLE CONSTRAINT EXCLUDE actually creates an index, and one can specify there full specification of index element there. But right after this you should write "WITH operator" clause. So if I will try to use WITH keyword for attoptions, a syntax parser will be confused, as it cat not distinguish one WITH keyword from another. So decided to use WITH OTPTIONS keywords for attoptions. There might be some tricks that will allow to use WITH in both places but I think "WITH OPTIONS" might make SQL code more readable, because an expression with two "WITH" in a row will be hard to read for human too ;-) So we come to the last part of this message, that is most important for me for the moment: One of the issues that I've solved while writing this patch, was the following: attoptions and reloptions are should use almost the same code for options parsing. For reloptions this code is written, it is good, but you can't reuse it for attoptions as it is binded to relopt_kind and quite centralized. So first step to implement attoptions for indexes, will be rewriting reoptions code to get rid of relopt_kind, make all access methods has its own option descriptor catalogs and use reference to this catalog instead of relopt_kind. This should be a separate patch and I think I will start another thread for it. I will write another letter for relopt_kind remove issue... -- Nikolay Shaplov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Attachment
Re: [PROPOSAL][PROTOTYPE] Individual options for each index column: Opclass options
From
Robert Haas
Date:
On Tue, May 24, 2016 at 10:06 AM, Nikolay Shaplov <n.shaplov@postgrespro.ru> wrote: > So adding options for opclass seems to be really good idea. > To see how it works you should do the following: > > # create extension intarray ; > # create table test (i int[]); > # create table test2 (i int[]); > # create index ON test USING GIST (i USING gist__intbig_ops WITH OPTIONS > (sig_len_int=22) ); > # create index ON test2 USING GIST (i USING gist__intbig_ops WITH OPTIONS > (sig_len_int=120) ); I think supporting syntax of this type would be a good idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company