Thread: Handling of opckeytype / CREATE OPERATOR CLASS (bug?)

Handling of opckeytype / CREATE OPERATOR CLASS (bug?)

From
Tomas Vondra
Date:
Hi,

while working on the new BRIN opclasses in [1], I stumbled on something
I think is actually a bug in how CREATE OPERATOR CLASS handles the
storage type. If you look at built-in opclasses, there's a bunch of
cases where (opcintype == opckeytype), for example the BRIN opclasses
for inet data type:

test=# select oid, opcname, opcfamily, opcintype, opckeytype from
pg_opclass where opcintype = 869 order by oid;

  oid  |        opcname        | opcfamily | opcintype | opckeytype
-------+-----------------------+-----------+-----------+------------
 10105 | inet_minmax_ops       |      4075 |       869 |        869
 10106 | inet_inclusion_ops    |      4102 |       869 |        869

The fact that opckeytype is set is important, because this allows the
opclass to handle data with cidr data type - there's no opclass for
cidr, but we can do this:

create table t (a cidr);
insert into t values ('127.0.0.1');
insert into t values ('127.0.0.2');
insert into t values ('127.0.0.3');
create index on t using brin (a inet_minmax_ops);

This seems to work fine. Now, if you manually update the opckeytype to
0, you'll get this:

test=# update pg_opclass set opckeytype = 0 where oid = 10105;
UPDATE 1
test=# create index on t using brin (a inet_minmax_ops);
ERROR:  missing operator 1(650,650) in opfamily 4075


Unfortunately, it turns out it's impossible to re-create this opclass
using CREATE OPERATOR CLASS, because the opclasscmds.c does this:

    /* Just drop the spec if same as column datatype */
    if (storageoid == typeoid && false)
        storageoid = InvalidOid;

So the storageoid is reset to 0. This only works for the built-in
opclasses because those are injected directly into the catalogs.

Either the CREATE OPERATOR CLASS is not considering something before
resetting the storageoid, or maybe it should be reset for all opclasses
(including the built-in ones) and the code that's using it should
restore it in those cases (AFAICS opckeytype=0 means it's the same as
opcintkey).

Attached is a PoC patch doing the first thing - this does the trick for
me, but I'm not 100% sure it's the right fix.


[1]
https://www.postgresql.org/message-id/54b47e66-bd8a-d44a-2090-fd4b2f49abe6%40enterprisedb.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Handling of opckeytype / CREATE OPERATOR CLASS (bug?)

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> while working on the new BRIN opclasses in [1], I stumbled on something
> I think is actually a bug in how CREATE OPERATOR CLASS handles the
> storage type.

Hm.  Both catalogs.sgml and pg_opclass.h say specifically that
opckeytype should be zero if it's to be the same as the input
column type.  I don't think just dropping the enforcement of that
is the right answer.

I don't recall for sure what-all might depend on that.  I suspect
that it's mostly for the benefit of polymorphic opclasses, so
maybe the right thing is to say that the opckeytype can be
polymorphic if opcintype is, and then we resolve it as per
the usual polymorphism rules.

In any case, it's fairly suspicious that the only opclasses
violating the existing rule are johnny-come-lately BRIN opclasses.

            regards, tom lane



Re: Handling of opckeytype / CREATE OPERATOR CLASS (bug?)

From
Tomas Vondra
Date:

On 3/23/21 3:13 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> while working on the new BRIN opclasses in [1], I stumbled on something
>> I think is actually a bug in how CREATE OPERATOR CLASS handles the
>> storage type.
> 
> Hm.  Both catalogs.sgml and pg_opclass.h say specifically that
> opckeytype should be zero if it's to be the same as the input
> column type.  I don't think just dropping the enforcement of that
> is the right answer.
> 

Yeah, that's possible. I was mostly just demonstrating the difference in
behavior. Maybe the right fix is to fix the catalog contents and then
tweak the AM code, or something.

> I don't recall for sure what-all might depend on that.  I suspect
> that it's mostly for the benefit of polymorphic opclasses, so
> maybe the right thing is to say that the opckeytype can be
> polymorphic if opcintype is, and then we resolve it as per
> the usual polymorphism rules.
> 

I did an experiment - fixed all the opclasses violating the rule by
removing the opckeytype, and ran make checke. The only cases causing
issues were cidr and int4range. Not that it proves anything.

> In any case, it's fairly suspicious that the only opclasses
> violating the existing rule are johnny-come-lately BRIN opclasses.
> 

Right, that seems suspicious.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Handling of opckeytype / CREATE OPERATOR CLASS (bug?)

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> On 3/23/21 3:13 AM, Tom Lane wrote:
>> Hm.  Both catalogs.sgml and pg_opclass.h say specifically that
>> opckeytype should be zero if it's to be the same as the input
>> column type.  I don't think just dropping the enforcement of that
>> is the right answer.

> Yeah, that's possible. I was mostly just demonstrating the difference in
> behavior. Maybe the right fix is to fix the catalog contents and then
> tweak the AM code, or something.

Digging in our git history, the rule about zero opckeytype dates to
2001 (f933766ba), which precedes our invention of polymorphic types
in 2003 (somewhere around 730840c9b).  So I'm pretty sure that that
was a poor man's substitute for polymorphic opclasses, which we
failed to clean up entirely after we got real polymorphic opclasses.

Now, I'd be in favor of cleaning that up and just using standard
polymorphism rules throughout.  But (without having studied the code)
it looks like the immediate issue is that something in the BRIN code is
unfamiliar with the rule for zero opckeytype.  It might be a noticeably
smaller patch to fix that than to get rid of the convention about zero.

            regards, tom lane



Re: Handling of opckeytype / CREATE OPERATOR CLASS (bug?)

From
Tomas Vondra
Date:

On 3/23/21 6:15 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> On 3/23/21 3:13 AM, Tom Lane wrote:
>>> Hm.  Both catalogs.sgml and pg_opclass.h say specifically that
>>> opckeytype should be zero if it's to be the same as the input
>>> column type.  I don't think just dropping the enforcement of that
>>> is the right answer.
> 
>> Yeah, that's possible. I was mostly just demonstrating the difference in
>> behavior. Maybe the right fix is to fix the catalog contents and then
>> tweak the AM code, or something.
> 
> Digging in our git history, the rule about zero opckeytype dates to
> 2001 (f933766ba), which precedes our invention of polymorphic types
> in 2003 (somewhere around 730840c9b).  So I'm pretty sure that that
> was a poor man's substitute for polymorphic opclasses, which we
> failed to clean up entirely after we got real polymorphic opclasses.
> 
> Now, I'd be in favor of cleaning that up and just using standard
> polymorphism rules throughout.  But (without having studied the code)
> it looks like the immediate issue is that something in the BRIN code is
> unfamiliar with the rule for zero opckeytype.  It might be a noticeably
> smaller patch to fix that than to get rid of the convention about zero.
> 

That's possible. I'm not familiar with how we deal with polymorphic
opclasses etc. but I tried to look for places dealing with opckeytype,
so that I can compare BRIN vs. the other AMs, but the only references
seem to be in amvalidate() functions.

So either the difference is not very obvious, or maybe the other AMs
don't trigger this for some reason. For example btree has a separate
opclass for cidr, so it does not have to use "inet_ops" as polymorphic.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Handling of opckeytype / CREATE OPERATOR CLASS (bug?)

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> On 3/23/21 6:15 AM, Tom Lane wrote:
>> Digging in our git history, the rule about zero opckeytype dates to
>> 2001 (f933766ba), which precedes our invention of polymorphic types
>> in 2003 (somewhere around 730840c9b).  So I'm pretty sure that that
>> was a poor man's substitute for polymorphic opclasses, which we
>> failed to clean up entirely after we got real polymorphic opclasses.

> That's possible. I'm not familiar with how we deal with polymorphic
> opclasses etc. but I tried to look for places dealing with opckeytype,
> so that I can compare BRIN vs. the other AMs, but the only references
> seem to be in amvalidate() functions.
> So either the difference is not very obvious, or maybe the other AMs
> don't trigger this for some reason. For example btree has a separate
> opclass for cidr, so it does not have to use "inet_ops" as polymorphic.

I think the difference is that brin is trying to look up opclass members
based on the recorded type of the index's column (not the underlying
table column).  I don't recall that anyplace else does that.  btree
for instance does some lookups based on opcintype, but I don't think
it looks at the index column type anywhere.

After poking at it a bit more, the convention for zero does allow us
to do some things that regular polymorphism won't.  As an example:

test=# create table vc (id varchar(9) primary key);
CREATE TABLE
test=# \d+ vc_pkey
                           Index "public.vc_pkey"
 Column |         Type         | Key? | Definition | Storage  | Stats target 
--------+----------------------+------+------------+----------+--------------
 id     | character varying(9) | yes  | id         | extended | 
primary key, btree, for table "public.vc"

If btree text_ops had opckeytype = 'text' then this index column
would show as just "text", which while not fatal seems like a loss
of information.

So I'm coming around to the idea that opckeytype = opcintype and
opckeytype = 0 are valid but distinct situations, and CREATE OPCLASS
indeed ought not smash one to the other.  But we'd better poke around
at the documentation, pg_dump, etc and make sure everything plays
nice with that.

            regards, tom lane