Re: [PATCH] Opclass parameters - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [PATCH] Opclass parameters
Date
Msg-id 20190912001634.oeteg5edzke47om4@development
Whole thread Raw
In response to Re: [PATCH] Opclass parameters  (Nikita Glukhov <n.gluhov@postgrespro.ru>)
Responses Re: [PATCH] Opclass parameters
Re: [PATCH] Opclass parameters
List pgsql-hackers
On Wed, Sep 11, 2019 at 01:44:28AM +0300, Nikita Glukhov wrote:
>On 11.09.2019 1:03, Tomas Vondra wrote:
>
>>On Tue, Sep 10, 2019 at 04:30:41AM +0300, Nikita Glukhov wrote:
>>
>>>2. New AM method amattoptions().
>>>
>>>  amattoptions() is used to specify per-column AM-specific options.
>>>  The example is signature length for bloom indexes (patch #3).
>>>
>>
>>I'm somewhat confused how am I supposed to use this, considering the 
>>patch
>>set only defines this for the contrib/bloom index AM. So let's say I want
>>to create a custom BRIN opclass with per-attribute options (like the two
>>BRIN opclasses I work on in the other thread). Clearly, I can't tweak the
>>IndexAmRoutine from the extension. ISTM the patch series should 
>>modify all
>>existing index AMs to have a valid amattoptions() implementation, calling
>>the new amproc if defined.
>>
>>Or what is the correct way to define custom opclass for existing index AM
>>(e.g. BRIN) with attribute options?
>>
>Per-attribute opclass options are implemented independently from per-attribute
>AM options.  amattoptions() is optional and needs to be defined only if AM has
>per-attribute options.

OK, thanks for the explanation - so the per-attribute opclass options will
work even when the AM does not have amattoptions() defined. Is there any
practical reason why not to just define everything as opclass options and
get rid of the amattoptions() entirely?

> amproc #0 is called regardless of whether amattoptions
>is defined or not.  That was the main reason why uniform procnum 0 was
>picked.
>

I still think using procnum 0 and passing the data through fn_expr are not
the right solution. Firstly, traditionally the amprocs are either required
or optional, with required procs having low procnums and optional starting
at 11 or so. The 0 breaks this, because it's optional but it contradicts
the procnum rule. Also, what happens if we need to add another optional
amproc defined for all AMs? Surely we won't use -1.

IMHO we should keep AM-specific procnum and pass it somehow to the AM
machinery.

FWIW there seems to be a bug in identify_opfamily_groups() which does
this: 

    /* Ignore strategy numbers outside supported range */
    if (oprform->amopstrategy > 0 && oprform->amopstrategy < 64)
        thisgroup->operatorset |= ((uint64) 1) << oprform->amopstrategy;

but then identify_opfamily_groups() computes allfuncs without any such
restriction, i.e. it includes procnum 0. And then it fails on this check

    if (thisgroup->functionset != allfuncs) {...}

None of the built-in brin opclasses defines the new amproc, so the code
does not hit this issue. I only noticed this with the opclasses added in
the other thread.

As for the fn_expr, I still think this seems like a misuse of a field
which was intended for something else. I wonder if it might be breaking
some exising code - either in code or in some extension. It seems quite
possible.

It just seems like we're inventing a new way to novel way to pass data
into a function while we already have parameters for that purpose. Adding
parameters may require care so as not to break existing opclasses, but it
seems like the right approach.

>You should simply define function like that and use it as amproc #0:
>
>Datum
>brin_bloom_options(PG_FUNCTION_ARGS)
>{
>    local_relopts *relopts = (local_relopts *) PG_GETARG_POINTER(0);
>    BloomOptions *blopts = NULL;
>
>    extend_local_reloptions(relopts, blopts, sizeof(*blopts));
>
>    add_local_real_reloption(relopts, "n_distinct_per_range", "desc",
>                             -0.1, -1.0, INT_MAX, &blopts->nDistinctPerRange);
>
>    add_local_real_reloption(relopts, "false_positive_rate", "desc",
>                             0.01, 0.001, 1.0, &blopts->falsePositiveRate);
>
>    PG_RETURN_VOID();
>}
>

OK, this did the trick. Thanks. I don't have a clear opinion on the API,
but it certainly looks like an improvement.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: amcheck verification for GiST
Next
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.