Thread: Re: Index AM API cleanup

Re: Index AM API cleanup

From
Peter Eisentraut
Date:
I want to call out one particular aspect that is central to this patch
series that needs more broader discussion and agreement.

The problem is that btree indexes (and to a lesser extent hash
indexes) are hard-coded in various places.  This includes various
places in the optimizer (see patch v17-0018) that might be harder to
understand, but also more easy-to-understand places such as:

- Only hash and btree are supported for replica identity.  (see patch 
v17-0015)

- Only btree is supported for speculative insertion.  (see patch v17-0016)

- Only btree is supported for foreign keys.  (see patch v17-0017)

- Only btree can participate in merge join.  (see patch v17-0020)

The problem in cases such as these, and some of them actually contain
code comments about this, is that we somehow need to know what the
equality operator (and in some cases ordering operator) for a type is.
And the way this is currently done is that the btree and hash strategy
numbers are hardcoded, and other index AMs are just not supported
because there is no way to find that out from those.

We faced a similar issue for the temporal primary keys feature.  One
thing this feature had to overcome is that

- Only btree is supported for primary keys.

For that feature, we wanted to add support for gist indexes, and we
had to again find a way to get the equals operator (and also the
overlaps operator).

The solution we ended up with there is to add a support function to
gist that translates the strategy numbers used by the operator class
to "well-known" strategy numbers that we can understand (commit
7406ab623fe).  It later turned out that this problem actually might be
a subset of the general problem being discussed here, so we can also
change that solution if we find a more general solution here.

So the question is, how do we communicate these fundamental semantics
of operators?

The solution we discussed for temporal primary keys is that we just
mandate that certain strategy numbers have fixed meanings.  This is
how btree and hash work anyway, but this was never required for gist.
And there are many gist opclass implementations already out there that
don't use compatible strategy numbers, so that solution would not have
worked (at least without some upgrading pain).  Hence the idea of
mapping the actual strategy numbers to a fixed set of well-known ones.

For the general case across all index AMs, we could similarly decree
that all index AMs use certain strategy numbers for fixed purposes.
But again, this already wouldn't work for gist and perhaps others
already out there.  So here again we need some kind of mapping to
numbers with a fixed purpose.  Or perhaps new index AMs can start out
with the "correct" numbers and only some existing ones would need the
mapping.

And then what do you map them to?

One idea would be to map them to the existing btree numbers.  This
could work, but I always found it kind of confusing that you'd then
have multiple sets of strategy numbers in play, one native to the
index AM or opclass, and one sort of global one.

Another idea is to map them to another thing altogether, a new enum.
This is what this patch series does, essentially.  But it turns out
(so argues the patch series), that this enum already exists, namely

typedef enum RowCompareType
{
     /* Values of this enum are chosen to match btree strategy numbers */
     ROWCOMPARE_LT = 1,          /* BTLessStrategyNumber */
     ROWCOMPARE_LE = 2,          /* BTLessEqualStrategyNumber */
     ROWCOMPARE_EQ = 3,          /* BTEqualStrategyNumber */
     ROWCOMPARE_GE = 4,          /* BTGreaterEqualStrategyNumber */
     ROWCOMPARE_GT = 5,          /* BTGreaterStrategyNumber */
     ROWCOMPARE_NE = 6,          /* no such btree strategy */
} RowCompareType;

which in spite of the name, yes, sounds like exactly that kind of
thing.  And it's also used in many places for this kind of thing
already.

The patch series adds index AM callbacks

     amtranslatestrategy
     amtranslaterctype

to convert between strategy number and RowCompareType, and also adds

     ROWCOMPARE_NONE
     ROWCOMPARE_INVALID

to handle cases where there is no mapping.  I suppose for the temporal
PK use, we would need to add something like ROWCOMPARE_OVERLAPS.

So this is the idea.  To take a step back, I can see the following
options:

1. Require all index AMs to use btree-compatible strategy numbers.
    (Previously rejected, too much upgrading mess.)

2. Provide mapping between index AM strategy numbers and btree strategy
    numbers.  (This doesn't have a space for non-btree operations like
    overlaps.)

3. Provide mapping between index AM strategy numbers and the existing
    RT*StrategyNumber numbers.  (We can expand the set as we want.)
    (This is what the existing mapping function for gist does.)

4. Provide mapping between index AM strategy numbers and some
    completely new set of numbers/symbols.

5. Provide mapping between index AM strategy numbers and
    RowCompareType (with some small extensions).  This is what this
    patch does.

Other ideas?  Thoughts?




Re: Index AM API cleanup

From
Mark Dilger
Date:

> On Oct 30, 2024, at 12:54 PM, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> So this is the idea.  To take a step back, I can see the following
> options:
>
> 1. Require all index AMs to use btree-compatible strategy numbers.
>   (Previously rejected, too much upgrading mess.)
>
> 2. Provide mapping between index AM strategy numbers and btree strategy
>   numbers.  (This doesn't have a space for non-btree operations like
>   overlaps.)

I agree that neither of these options are good, for the reasons you mention.

> 3. Provide mapping between index AM strategy numbers and the existing
>   RT*StrategyNumber numbers.  (We can expand the set as we want.)
>   (This is what the existing mapping function for gist does.)

The point of such a mapping is that core code outside any index AM can know what an AM is claiming it can do when it
advertisesthat it implements one of these strategy numbers.  We don't need any new entries in that mapping until some
corefeature depends on the semantics of the new entry.  Right now, all six of the btree comparators (including
not-equal)have semantics that are used outside AMs by functions like match_clause_to_ordering_op().  If any index AM
authorcomes along and creates an index AM which purports to provide these six semantics but actually does something
semanticallyinconsistent with what the backend thinks these mean, that index AM is totally at fault when, for example,
ORDERBY returns the wrong results. 

On the other hand, if we add RTOverlapStrategyNumber to the common framework of strategy numbers, without anything
outsidean index AM depending on that, how is an index AM author to know exactly how an "overlaps" operator is supposed
tobehave?  No doubt, brin, gist, spgist, and friends all have their own understanding of what RTOverlapStrategyNumber
means,but how is a new index AM supposed to know if it has analogized that concept correctly to its own operator?  And
ifseveral major versions later, you come along to create some feature, let's say a logical replication feature
dependingon "overlaps" semantics, how are you to know whether all the index AMs in the wild which advertise they
providean "overlaps" operator will work correctly with your new feature?  When logical replication breaks, who is at
fault? Perversely, knowing that RTOverlapStrategyNumber is already in the list, you would likely implement the new
logicalreplication feature on some new strategy number, perhaps naming it RTLogicalOverlapStrategyNumber, to avoid such
conflicts.

The RT*StrategyNumber list is much too long, containing many such problematic entries.  We should not, in my opinion,
addthese to the list prior to some new feature which depends on them, such as a planner or executor optimization. 

> 4. Provide mapping between index AM strategy numbers and some
>   completely new set of numbers/symbols.

This is fine, if the new set is sufficiently restricted.  However, as mentioned below, the set of sufficiently
restrictedvalues is identical to what we currently define as a RowCompareType.  It creates needless code churn to throw
thataway and replace it with a new name. 

> 5. Provide mapping between index AM strategy numbers and
>   RowCompareType (with some small extensions).  This is what this
>   patch does.

As the patch author, obviously this is the one I chose.  The "small extensions" are just to handle "no such value" type
cases.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Index AM API cleanup

From
Kirill Reshke
Date:
On Thu, 31 Oct 2024 at 15:02, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>
>
> > On Oct 30, 2024, at 12:54 PM, Peter Eisentraut <peter@eisentraut.org> wrote:
> >
> > So this is the idea.  To take a step back, I can see the following
> > options:
> >
> > 1. Require all index AMs to use btree-compatible strategy numbers.
> >   (Previously rejected, too much upgrading mess.)
> >
> > 2. Provide mapping between index AM strategy numbers and btree strategy
> >   numbers.  (This doesn't have a space for non-btree operations like
> >   overlaps.)
>
> I agree that neither of these options are good, for the reasons you mention.
>
> > 3. Provide mapping between index AM strategy numbers and the existing
> >   RT*StrategyNumber numbers.  (We can expand the set as we want.)
> >   (This is what the existing mapping function for gist does.)
>
> The point of such a mapping is that core code outside any index AM can know what an AM is claiming it can do when it
advertisesthat it implements one of these strategy numbers.  We don't need any new entries in that mapping until some
corefeature depends on the semantics of the new entry.  Right now, all six of the btree comparators (including
not-equal)have semantics that are used outside AMs by functions like match_clause_to_ordering_op().  If any index AM
authorcomes along and creates an index AM which purports to provide these six semantics but actually does something
semanticallyinconsistent with what the backend thinks these mean, that index AM is totally at fault when, for example,
ORDERBY returns the wrong results. 
>
> On the other hand, if we add RTOverlapStrategyNumber to the common framework of strategy numbers, without anything
outsidean index AM depending on that, how is an index AM author to know exactly how an "overlaps" operator is supposed
tobehave?  No doubt, brin, gist, spgist, and friends all have their own understanding of what RTOverlapStrategyNumber
means,but how is a new index AM supposed to know if it has analogized that concept correctly to its own operator?  And
ifseveral major versions later, you come along to create some feature, let's say a logical replication feature
dependingon "overlaps" semantics, how are you to know whether all the index AMs in the wild which advertise they
providean "overlaps" operator will work correctly with your new feature?  When logical replication breaks, who is at
fault? Perversely, knowing that RTOverlapStrategyNumber is already in the list, you would likely implement the new
logicalreplication feature on some new strategy number, perhaps naming it RTLogicalOverlapStrategyNumber, to avoid such
conflicts.
>
> The RT*StrategyNumber list is much too long, containing many such problematic entries.  We should not, in my opinion,
addthese to the list prior to some new feature which depends on them, such as a planner or executor optimization. 
>
> > 4. Provide mapping between index AM strategy numbers and some
> >   completely new set of numbers/symbols.
>
> This is fine, if the new set is sufficiently restricted.  However, as mentioned below, the set of sufficiently
restrictedvalues is identical to what we currently define as a RowCompareType.  It creates needless code churn to throw
thataway and replace it with a new name. 
>
> > 5. Provide mapping between index AM strategy numbers and
> >   RowCompareType (with some small extensions).  This is what this
> >   patch does.
>
> As the patch author, obviously this is the one I chose.  The "small extensions" are just to handle "no such value"
typecases. 
>
>
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Hi! Can we please have a rebased version of this patch series?

--
Best regards,
Kirill Reshke



Re: Index AM API cleanup

From
Mark Dilger
Date:

> On Nov 16, 2024, at 9:10 AM, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> Hi! Can we please have a rebased version of this patch series?

Sorry for the delay, and thanks for your interest.  I will try to get around to rebasing in the next few days.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Index AM API cleanup

From
Peter Eisentraut
Date:
On 26.11.24 15:27, Andrew Dunstan wrote:
> On 2024-11-19 Tu 6:26 PM, Mark Dilger wrote:
>>> On Nov 16, 2024, at 9:10 AM, Kirill Reshke <reshkekirill@gmail.com> 
>>> wrote:
>>>
>>> Hi! Can we please have a rebased version of this patch series?
>> Sorry for the delay, and thanks for your interest.  I will try to get 
>> around to rebasing in the next few days.
> 
> beat you to it :-)

Thanks for that.

I'm content to move forward with the approach of mapping to 
RowCompareType, as we had already discussed.

I think, however, that we should rename RowCompareType.  Otherwise, it's 
just going to be confusing forevermore.  I suggest to rename it simply 
to CompareType.

What I really don't like is that there is ROWCOMPARE_INVALID *and* 
ROWCOMPARE_NONE.  No one is going to understand that and keep it 
straight.  I also don't really see a reason for having both.  I tried 
removing one of them, and the only thing that failed was the new code in 
AlterOpFamilyAdd() that tries to use strategy_get_rctype() to detect 
valid operator numbers.  But that's new code that is not essential to 
the purpose of this patch series.  So I strongly suggest that we just 
have ROWCOMPARE_INVALID (to mirror InvalidStrategy) with value 0.  If 
there is a reason to have a _NONE that is separate from that, we need to 
think of a different interface.

We should make sure that gist is properly integrated into this 
framework, give that we already have strategy number mapping logic 
there.  The current patch series does not provide mapping functions for 
gist.  We should add those.  They should make use of 
GistTranslateStratnum() (and we might need to add an inverse function). 
The problem is that gist and GistTranslateStratnum() also need the 
opclass, because the strategy numbers are not fixed for gist.  So the 
amtranslatestrategy and amtranslaterctype callbacks probably need to 
expanded to cover that.

It might be that spgist has a similar requirement, I'm not sure.  The 
code in spgutils.c raises some doubts about what it's doing.  I'm 
curious why this patch set provides an implementation for spgist but not 
gist.  Is there anything interesting about spgist for this purpose?

I'm going to try to code up the gist support on top of this patch set to 
make sure that it will fit well.  I'll report back.




Re: Index AM API cleanup

From
Mark Dilger
Date:

> On Nov 27, 2024, at 4:57 AM, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 26.11.24 15:27, Andrew Dunstan wrote:
>> On 2024-11-19 Tu 6:26 PM, Mark Dilger wrote:
>>>> On Nov 16, 2024, at 9:10 AM, Kirill Reshke <reshkekirill@gmail.com> wrote:
>>>>
>>>> Hi! Can we please have a rebased version of this patch series?
>>> Sorry for the delay, and thanks for your interest.  I will try to get around to rebasing in the next few days.
>> beat you to it :-)
>
> Thanks for that.
>
> I'm content to move forward with the approach of mapping to RowCompareType, as we had already discussed.
>
> I think, however, that we should rename RowCompareType.  Otherwise, it's just going to be confusing forevermore.  I
suggestto rename it simply to CompareType. 

I kept the name RowCompareType to avoid code churn, but go ahead and rename it if you prefer.

> What I really don't like is that there is ROWCOMPARE_INVALID *and* ROWCOMPARE_NONE.  No one is going to understand
thatand keep it straight.  I also don't really see a reason for having both.  I tried removing one of them, and the
onlything that failed was the new code in AlterOpFamilyAdd() that tries to use strategy_get_rctype() to detect valid
operatornumbers.  But that's new code that is not essential to the purpose of this patch series.  So I strongly suggest
thatwe just have ROWCOMPARE_INVALID (to mirror InvalidStrategy) with value 0.  If there is a reason to have a _NONE
thatis separate from that, we need to think of a different interface. 

Yeah, those two can be combined if you like.  I found the distinction between them useful during patch development, but
Iagree people will have a hard time understanding the difference.  For the record, the difference is between "this
indexdoesn't provide the requested functionality" vs. "you passed in an invalid parameter". 

> We should make sure that gist is properly integrated into this framework, give that we already have strategy number
mappinglogic there.  The current patch series does not provide mapping functions for gist.  We should add those.  They
shouldmake use of GistTranslateStratnum() (and we might need to add an inverse function). The problem is that gist and
GistTranslateStratnum()also need the opclass, because the strategy numbers are not fixed for gist.  So the
amtranslatestrategyand amtranslaterctype callbacks probably need to expanded to cover that. 
>
> It might be that spgist has a similar requirement, I'm not sure.  The code in spgutils.c raises some doubts about
whatit's doing.  I'm curious why this patch set provides an implementation for spgist but not gist.  Is there anything
interestingabout spgist for this purpose? 

That was experimental, as the code comment indicates:

+               /*
+                * Assume these have the semantics of =, !=, <, <=, >, >= as their
+                * names imply.
+                *
+                * TODO: Audit the behavior of these RTree strategies as used by spgist
+                *               to determine if this assumption is appropriate.
+                */

In hindsight, this patch might have been less confusing if I had simply not provided the callbacks for spgist.

> I'm going to try to code up the gist support on top of this patch set to make sure that it will fit well.  I'll
reportback. 

The design of the patch allows indexes to simply not participate in the mapping.  There's no requirement that an index
providethese callbacks.  So one approach is to just not touch gist and spgist and leave that for another day.  However,
ifyou want to verify that the interface is sufficient to meet gist's needs, then go ahead and try something out. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Index AM API cleanup

From
Paul Jungwirth
Date:
On 8/26/24 08:10, Mark Dilger wrote:
 > Paul, it seems what you are doing in v39-0001-Add-stratnum-GiST-support-function.patch is similar 
to what I am doing in v17-0012-Convert-strategies-to-and-from-row-compare-types.patch.

Thank you inviting me to share some thoughts here! The goals of this patch make a lot of sense to 
me. I'm very interested in making the non-btree AMs more useful, both to users and to Postgres core 
itself. Here is a breakdown of what I think so far:

- Strategy Numbers
- Can you create unique GiST indexes or not?
- Updating opclasses
- Logical replication
- Empty ranges

Most of this is pretty general and not a line-by-line review of the patches. I'd like to do that 
too, but this email is already long and late.


# Strategy Numbers

I think this is the latest re strategy numbers:

On 12/4/24 06:49, Peter Eisentraut wrote:
 > On 27.11.24 13:57, Peter Eisentraut wrote:
 >> I think, however, that we should rename RowCompareType.  Otherwise, it's just going to be
 >> confusing forevermore.  I suggest to rename it simply to CompareType.
 >
 >> I'm going to try to code up the gist support on top of this patch set to make sure that it will
 >> fit well.  I'll report back.
 >
 > Here is a patch set in that direction.  It renames RowCompareType to CompareType and updates the
 > surrounding commentary a bit.  And then I'm changing the gist strategy mapping to use the
 > CompareType values instead of the RT* strategy numbers.  Seeing this now, I like this a lot better
 > than what we have now, because it makes it clearer in the API and the code what is a real strategy
 > number and what's a different kind of thing.  (This isn't entirely the above-mentioned integration
 > of the gist support into your patch set yet, but it's a meaningful part of it.)

I see that this supports COMPARE_OVERLAPS and COMPARE_CONTAINED_BY, which provides most of what 
foreign keys require. Great!

But I also need the intersect operator (*). There is no stratnum for that. Even if I added a 
stratnum, it is not really about an *index*. With my current approach, that made me uncomfortable 
using pg_amop, since I'd need to invent a new amoppurpose. Intersect isn't used for search or 
ordering, just for foreign keys (and later for temporal update/delete). Similarly, This new enum is 
COMPARE_*, and intersect isn't a comparison.

Is there anything we can do about that? Is pg_amop really only for indexes, or can I add a new 
amoppurpose and use it? And does this enum need to be only for indexes, or can it be for other 
things too? Maybe instead of COMPARE_* it can be OPERATION_*.

Today I just hardcode the built-in intersect operator for ranges and multiranges, but I would rather 
have a solution that lets people define foreign keys on arbitrary types. Temporal primary keys 
support arbitrary types (well other than the empty-range issue, which is something I think we can 
resolve), as do temporal UPDATE and DELETE, so getting foreign keys to support them would complete 
the picture. To me this goal is more faithful to Postgres's heritage as an "object-relational" 
database, a philosophy that has made it so extensible.

As I mentioned, UPDATE/DELETE FOR PORTION OF also needs intersect. But it only needs a proc, not an 
operator. For that I added another GiST support proc, intersect, that is just the same proc that 
implements the operator. But if foreign keys had a way to get an operator for any type, then FOR 
PORTION OF could use that too, because an operator implies a proc (but not vice versa).
Then I'd need one less support function.

Or here is another approach: I was thinking that instead of a `stratnum` support proc, used to get 
strategy numbers (whether by well-known stratnum or by COMPARE_*), we could have a support proc that 
*returns operator oids* (again, whether by well-known strategy number or by COMPARE_*). Instead of 
`stratnum` we could call it something like `get_operator`. I can't think of any other use for 
strategy numbers than looking up operators. The existing callsites of GistTranslateStratnum all 
immediately call get_opfamily_member (and Peter's patches don't change that). What if we had a 
higher-level wrapper of get_opfamily_member that took the same parameters, but used this new support 
function (if defined) when called for non-btree AMs?

Then (1) foreign keys could use it get the intersect operator (2) I could drop the FOR PORTION OF 
support proc. (The stratnum support proc only exists in v18devel, so it's still easy to change.)

One problem is: how do extension authors know the oids of their operators? I guess they could look 
them up in pg_operator (via pg_depend?) But we should acknowledge that it's harder for them to get 
oids than for core. Or maybe this isn't such a big deal after all: pg_operator is unique on 
(oprname, oprleft, oprright, oprnamespace).

Are there other problems? It does seem like a risk to have a second "source of truth", especially 
one that is implemented as code rather than data. What if the code changes? But still I like this 
approach. Non-btree indexes can't really use pg_amop anyway, because amopstrategy doesn't mean 
anything. I'm happy to make a commit replacing the stratnum support proc with get_operator as I've 
described it.

But then what is its input? If it's a stratnum, there is no problem: I'll add an 
RTIntersectStrategyNumber. If it's a COMPARE_*, then we still want a non-index-specific name for 
this enum.

Either approach to getting an intersect operator seems fine to me: using pg_amop with a third 
amoppurpose, or replacing stratnum with get_operator. But both seem to expand an index-specific 
abstraction to include something new. What do you think?


# Can you create unique GiST indexes or not?

Postgres lets you ask about the capabilities of different AMs. For instance:

        postgres=# select amname, pg_indexam_has_property(oid, 'can_unique') from pg_am where amtype = 'i';
         amname | pg_indexam_has_property
        --------+-------------------------
         btree  | t
         hash   | f
         gist   | f
         gin    | f
         spgist | f
         brin   | f
        (6 rows)

So if GiST can't support unique indexes, why can you do this?:

        postgres=# create extension btree_gist;
        CREATE EXTENSION
        postgres=# create table t (id int, valid_at daterange, unique (id, valid_at without overlaps));
        CREATE TABLE
        postgres=# \d t
                                             Table "public.t"
            Column  |   Type    | Collation | Nullable | Default
        ----------+-----------+-----------+----------+---------
         id       | integer   |           |          |
         valid_at | daterange |           |          |
        Indexes:
                "t_id_valid_at_key" UNIQUE (id, valid_at WITHOUT OVERLAPS)

And also:

        postgres=# select indisunique from pg_index where indexrelid = 't_id_valid_at_key'::regclass;
         indisunique
        -------------
         t

But:

        postgres=# create unique index idx_t on t using gist (id, valid_at);
        ERROR:  access method "gist" does not support unique indexes

It seems like we have an inconsistency, as Matthew van de Meent brought up here (cc'd): 
https://www.postgresql.org/message-id/CAEze2WiD%2BU1BuJDLGL%3DFXxa8hDxNALVE6Jij0cNXjp10Q%3DnZHw%40mail.gmail.com

The reason is that GiST *might* support unique indexes, but it depends on the opclass. (And note 
GiST has supported effectively-unique indexes via exclusion constraints for a long time.)

 From another perspective, the error message is correct: you can't create a unique GiST index using 
CREATE UNIQUE INDEX. You can only create a UNIQUE/PRIMARY KEY *constraint*, which creates a unique 
GiST index to implement it. But that is not a very satisfying answer.

But since we are improving stratnums, we could allow the above command, even with what is already 
committed today: just see if we can find an equals operator, and create the index. I'm interested in 
working on that myself, but the other temporal patches are a higher priority.

In the meantime perhaps we can improve the error message to sound less contradictory. How about 
"access method 'gist' does not support unique indexes without a WITHOUT OVERLAPS constraint"?

And anyway, creating simply-unique GiST indexes is not very valuable, when you could just create a 
btree index instead. The main reason is legalistic, so we aren't guilty of contradicting our error 
message. It doesn't satisfy Matthew's use-case. He wants to avoid long exclusive locks by doing 
CREATE INDEX CONCURRENTLY then ADD CONSTRAINT USING INDEX. But then you'd need to CREATE UNIQUE 
INDEX with overlaps for the last element, not equals. There is no syntax for that. Perhaps we could 
borrow the syntax for constraints: CREATE UNIQUE INDEX idx ON t USING gist (id, valid_at WITHOUT 
OVERLAPS). It's not part of the standard though, and it might have parse conflicts with COLLATE or 
an opclass name.

But that *still* doesn't get there, because (1) the index has to enforce (temporal) uniqueness 
before you add the constraint (2) it needs to allow CONCURRENTLY. Exclusion constraints don't 
enforce their checks while updating the index, but with a separate search (IIRC), and the details 
would probably have to move from pg_constraint.conexclop to pg_index.indexclop (which doesn't exist 
today). To pre-create *any* exclusion constraint index, we'd need a more general syntax than WITHOUT 
OVERLAPS. Something like (id WITH =, valid_at WITH &&).

And then that has to work with CONCURRENTLY. You *can* create a GiST index CONCURRENTLY today, but 
not one that enforces anything. There was work already to allow CONCURRENTLY to REINDEX exclusion 
constraint indexes, but I assume it's tricky, because it has stalled twice:
see 

https://www.postgresql.org/message-id/flat/CAB7nPqS%2BWYN021oQHd9GPe_5dSVcVXMvEBW_E2AV9OOEwggMHw%40mail.gmail.com#e1a372074cfdf37bf9e5b4e29ddf7b2d

from 2012 and 

https://www.postgresql.org/message-id/flat/60052986-956b-4478-45ed-8bd119e9b9cf%402ndquadrant.com#74948a1044c56c5e817a5050f554ddee

from 2019. So there are some substantial gaps to fill, and I think the first step for now is just 
updating the error message.

What else can we do here? We have pg_indexam_has_property, pg_index_has_property, and 
pg_index_column_has_property. Do we want a pg_opclass_has_property?

Or here is a more modest suggestion: we could permit pg_index_has_property to answer can_unique 
questions. Today it returns NULL for that property. Perhaps it should (by default) answer whatever 
the AM can do. Then we could update gistproperty (which today only answers column-level inquiries), 
and return indisunique. To be honest that feels like an oversight that should have been included in 
the temporal primary key patch. I'll sign up to do that if people think it makes sense.

Another thought: should pg_indexam_has_property(783, 'can_unique') return NULL, on the theory that 
NULL means "unknown", and as of pg 18 GiST uniqueness is no longer "false" but now "it depends"? 
Maybe that is too cute. If we did that, probably it should be paired with a way to get a definitive 
answer for the scenario you care about (without creating an index first). pg_opclass_has_property 
could do that. Or maybe we add a second pg_indexam_has_property that also takes an array of opclass 
oids?


# Logical replication

I had to teach logical replication how to handle temporal index constraints. It wants a unique 
identifier for REPLICA IDENTITY. Even with WITHOUT OVERLAPS, the combination of index keys do 
identify a unique record. But again, logical replication needs to know which operators implement 
equals for non-btree indexes. I think the discussion in this thread has already covered this 
sufficiently. Peter's COMPARE_* change doesn't introduce any problems from what I can see.


# Updating opclasses

Today you can alter an opfamily, but you can't alter an opclass. So to add stratnum support 
functions, the btree_gist upgrade script does this:

        ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
            FUNCTION 12 (int4, int4) gist_stratnum_btree (int2) ;

Past upgrade scripts have done similar things.

But that doesn't achieve quite the same thing as putting the function in CREATE OPERATOR CLASS. Do 
we want a way to attach a new support function to an *opclass*? There are some problems to solve 
with staleness, the relcache for example I think, but nothing seems insurmountable. Maybe it risks 
incoherent indexes, if you change how they are computed partway through, but if so it was probably 
true already with ALTER OPERATOR FAMILY.


# Empty ranges

This is probably not relevant to this discussion, but just in case: in the v17 cycle we had to 
revert temporal PKs because empty ranges allowed duplicate records (because 'empty' && 'empty' is 
false, so there was no conflict). The solution was to reject empty ranges in a temporal PK or UNIQUE 
constraint, where they really don't make sense. Is there something generalized the index AM API 
could offer for this kind of thing?

Yours,

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com




Re: Index AM API cleanup

From
Mark Dilger
Date:

> On Jan 24, 2025, at 11:18 PM, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I've been working on integrating Mark's "Index AM API cleanup" patch set with the existing gist strategy number
mappingfrom Paul's application time patch set.  Here is what I've come up with. 

Thank you.

> The previously committed patch (v19.1) already changed the gist strategy number mapping to use the (Row)CompareType,
asin Mark's proposal. 
>
> In this patch set, I'm attaching the existing standalone gist translate function as the index AM API function for the
gistindex AM.  And then all existing callers are changed to call through the index AM API functions provided by Mark's
patchset. 
>
> Patches 0001, 0002, 0003 are some preparatory renaming and refactoring patches.

These all look sensible.  I like that they reduce code duplication. No objection.

> Patches 0004 and 0005 are patch v19-0008 from Mark's (via Andrew) v19 patch set, split into two patches, and with
somefunction renaming from my side. 

0004 needs the copyright notice updated to 2025.

> Patch 0006 then pulls it all together.  The key change is that we also need to pass the opclass to the index AM API
functions,so that access methods like gist can use it.  Actually, I changed that to pass opfamily and opcintype
instead. I think this matches better with the rest of the "Index AM API cleanup" patch set, because it's more common to
havethe opfamily and type handy than the opclass.  (And internally, the gist support function is attached to the
opfamilyanyway, so it's actually simpler that way.) 
>
> I think this fits together quite well now.  Several places where gist was hardcoded are now fully (or mostly)
independentof gist.  Also, the somewhat hackish get_equal_strategy_number() in the logical replication code disappears
completelyand is replaced by a proper index AM API function.  (This also takes care of patch v19-0011 from Mark's patch
set.)

I find the hardcoded GIST_AM_OID in GetOperatorFromCompareType() antithetical to the purpose of this patch-set.
Surely,we can avoid hardcoding this.  The patch as you have it works, I admit, but as soon as an Index AM author tries
todo the equivalent of what GiST is doing, they'll hit a wall.  Making them submit a patch and wait until the next
majorrelease of Postgres ships isn't a solution, either; index AM authors may work separately from the core cadence,
andin any event, often want their new indexes to work with postgres going back several versions. 

How about we fix this now?  I threw this together in a hurry, and might have screwed it up, so please check carefully.
Ifyou like this, we should go at least one more round of making sure this has thorough regression testing, but just to
getyour feedback, this can be applied atop your patch-set: 

From 9710af54a8d468a1c7a06464d27d2f56dd9ee490 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Wed, 29 Jan 2025 19:08:27 -0800
Subject: [PATCH v19.3] Avoid hardcoding GIST_AM_OID

---
 src/backend/catalog/pg_constraint.c | 5 ++++-
 src/backend/commands/indexcmds.c    | 5 ++---
 src/backend/commands/tablecmds.c    | 4 +++-
 src/backend/utils/adt/ri_triggers.c | 3 ++-
 src/include/catalog/pg_constraint.h | 3 ++-
 src/include/commands/defrem.h       | 2 +-
 6 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c
index ac80652baf..492de1e3c1 100644
--- a/src/backend/catalog/pg_constraint.c
+++ b/src/backend/catalog/pg_constraint.c
@@ -1622,7 +1622,8 @@ DeconstructFkConstraintRow(HeapTuple tuple, int *numfks,
  * to just what was updated/deleted.
  */
 void
-FindFKPeriodOpers(Oid opclass,
+FindFKPeriodOpers(Oid amid,
+                  Oid opclass,
                   Oid *containedbyoperoid,
                   Oid *aggedcontainedbyoperoid,
                   Oid *intersectoperoid)
@@ -1654,6 +1655,7 @@ FindFKPeriodOpers(Oid opclass,
                                InvalidOid,
                                COMPARE_CONTAINED_BY,
                                containedbyoperoid,
+                               amid,
                                &strat);

     /*
@@ -1665,6 +1667,7 @@ FindFKPeriodOpers(Oid opclass,
                                ANYMULTIRANGEOID,
                                COMPARE_CONTAINED_BY,
                                aggedcontainedbyoperoid,
+                               amid,
                                &strat);

     switch (opcintype)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 637ff52b50..e617125a5b 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2168,7 +2168,7 @@ ComputeIndexAttrs(IndexInfo *indexInfo,
                 cmptype = COMPARE_OVERLAP;
             else
                 cmptype = COMPARE_EQ;
-            GetOperatorFromCompareType(opclassOids[attn], InvalidOid, cmptype, &opid, &strat);
+            GetOperatorFromCompareType(opclassOids[attn], InvalidOid, cmptype, &opid, accessMethodId, &strat);
             indexInfo->ii_ExclusionOps[attn] = opid;
             indexInfo->ii_ExclusionProcs[attn] = get_opcode(opid);
             indexInfo->ii_ExclusionStrats[attn] = strat;
@@ -2420,9 +2420,8 @@ GetDefaultOpClass(Oid type_id, Oid am_id)
  */
 void
 GetOperatorFromCompareType(Oid opclass, Oid rhstype, CompareType cmptype,
-                           Oid *opid, StrategyNumber *strat)
+                           Oid *opid, Oid amid, StrategyNumber *strat)
 {
-    Oid            amid = GIST_AM_OID;    /* For now we only need GiST support. */
     Oid            opfamily;
     Oid            opcintype;

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 18f64db6e3..6c76c40fdc 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -10236,8 +10236,10 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
         Oid            periodoperoid;
         Oid            aggedperiodoperoid;
         Oid            intersectoperoid;
+        Oid            amoid;

-        FindFKPeriodOpers(opclasses[numpks - 1], &periodoperoid, &aggedperiodoperoid,
+        amoid = get_rel_relam(indexOid);
+        FindFKPeriodOpers(amoid, opclasses[numpks - 1], &periodoperoid, &aggedperiodoperoid,
                           &intersectoperoid);
     }

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3d9985b17c..68b1c2d4b3 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2337,8 +2337,9 @@ ri_LoadConstraintInfo(Oid constraintOid)
     if (riinfo->hasperiod)
     {
         Oid            opclass = get_index_column_opclass(conForm->conindid, riinfo->nkeys);
+        Oid            amid = get_rel_relam(conForm->conindid);

-        FindFKPeriodOpers(opclass,
+        FindFKPeriodOpers(amid, opclass,
                           &riinfo->period_contained_by_oper,
                           &riinfo->agged_period_contained_by_oper,
                           &riinfo->period_intersect_oper);
diff --git a/src/include/catalog/pg_constraint.h b/src/include/catalog/pg_constraint.h
index 6da164e7e4..7ba6f077cd 100644
--- a/src/include/catalog/pg_constraint.h
+++ b/src/include/catalog/pg_constraint.h
@@ -288,7 +288,8 @@ extern void DeconstructFkConstraintRow(HeapTuple tuple, int *numfks,
                                        AttrNumber *conkey, AttrNumber *confkey,
                                        Oid *pf_eq_oprs, Oid *pp_eq_oprs, Oid *ff_eq_oprs,
                                        int *num_fk_del_set_cols, AttrNumber *fk_del_set_cols);
-extern void FindFKPeriodOpers(Oid opclass,
+extern void FindFKPeriodOpers(Oid accessMethodId,
+                              Oid opclass,
                               Oid *containedbyoperoid,
                               Oid *aggedcontainedbyoperoid,
                               Oid *intersectoperoid);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 6d9348bac8..6785770737 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -51,7 +51,7 @@ extern Oid    GetDefaultOpClass(Oid type_id, Oid am_id);
 extern Oid    ResolveOpClass(const List *opclass, Oid attrType,
                            const char *accessMethodName, Oid accessMethodId);
 extern void GetOperatorFromCompareType(Oid opclass, Oid rhstype, CompareType cmptype,
-                                       Oid *opid, StrategyNumber *strat);
+                                       Oid *opid, Oid amid, StrategyNumber *strat);

 /* commands/functioncmds.c */
 extern ObjectAddress CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt);
--
2.39.3 (Apple Git-145)

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company