Thread: obj_unique_identifier(oid)
Hi all! I was a bit frustrated there was no function to generate a unique identifier for any oid. Instead of complaining, I decided to automate the process as far as possible. :) The result is a simple perl function to automatically generate a function for each regclass able to generate a unique text identifier. The function obj_unique_identifier(oid) will return a unique name for _any_ oid. I have looked at the unique constraints for each system_catalog to make sure all identifiers are unique. Source code: perl script to generate .sql file: https://github.com/gluefinance/pov/blob/master/sql/schema/pov/functions/obj_unique_identifier.pl output from perl script: https://github.com/gluefinance/pov/blob/master/sql/schema/pov/functions/obj_unique_identifier.sql I would highly appreicate feedback on the structure of the identifier. It must be composed in a way which will guarantee uniqueness. Example: glue=# select obj_unique_identifier(refobjid) from pg_depend order by random() limit 10; obj_unique_identifier ------------------------------------------------------------------------------------pg_proc.pg_catalog.iso8859_1_to_utf8(integer, integer,cstring, internal, integer)pg_operator.pg_catalog.float8.pg_catalog.float8.pg_catalog.-pg_operator.pg_catalog.money.pg_catalog.int4.pg_catalog.*pg_amproc.gin.pg_catalog.array_ops.pg_catalog._time.pg_catalog._time.4pg_operator.pg_catalog.int2.pg_catalog.int4.pg_catalog.-pg_class.pg_catalog.pg_statio_sys_sequencespg_amproc.gin.pg_catalog.array_ops.pg_catalog._bool.pg_catalog._bool.1pg_class.pg_catalog.pg_stat_all_indexespg_class.pg_catalog.pg_typepg_proc.pg_catalog.pg_stat_get_function_time(oid) (10 rows) -- Best regards, Joel Jacobson Glue Finance
Joel Jacobson <joel@gluefinance.com> writes: > The function obj_unique_identifier(oid) will return a unique name for _any_ oid. Surely this is broken by design? You can *not* assume that the same OID isn't in use for different things in different system catalogs. They're only guaranteed unique within a catalog. That's the main reason why pg_depend has to include the classid. regards, tom lane
Sent from my iPhone On 7 jan 2011, at 20:46, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Joel Jacobson <joel@gluefinance.com> writes: >> The function obj_unique_identifier(oid) will return a unique name for _any_ oid. > > Surely this is broken by design? You can *not* assume that the same OID > isn't in use for different things in different system catalogs. They're > only guaranteed unique within a catalog. That's the main reason why > pg_depend has to include the classid. > > regards, tom lane Correct. That is why the regclass name (classid) is included in the unique name.
The function should take both classid and oid as input. I'll fix. Sent from my iPhone On 7 jan 2011, at 20:59, Joel Jacobson <joel@gluefinance.com> wrote: > Sent from my iPhone > > On 7 jan 2011, at 20:46, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Joel Jacobson <joel@gluefinance.com> writes: >>> The function obj_unique_identifier(oid) will return a unique name for _any_ oid. >> >> Surely this is broken by design? You can *not* assume that the same OID >> isn't in use for different things in different system catalogs. They're >> only guaranteed unique within a catalog. That's the main reason why >> pg_depend has to include the classid. >> >> regards, tom lane > > Correct. That is why the regclass name (classid) is included in the unique name.
On Jan 7, 2011, at 1:46 PM, Tom Lane wrote: > Joel Jacobson <joel@gluefinance.com> writes: >> The function obj_unique_identifier(oid) will return a unique name for _any_ oid. > > Surely this is broken by design? You can *not* assume that the same OID > isn't in use for different things in different system catalogs. They're > only guaranteed unique within a catalog. That's the main reason why > pg_depend has to include the classid. BTW, if you're looking at making pg_depnd easier to use, see http://archives.postgresql.org/message-id/1290000774-sup-2218@alvh.no-ip.org -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Sent from my iPhone On 7 jan 2011, at 20:46, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Joel Jacobson <joel@gluefinance.com> writes: >> The function obj_unique_identifier(oid) will return a unique name for _any_ oid. > > Surely this is broken by design? You can *not* assume that the same OID > isn't in use for different things in different system catalogs. They're > only guaranteed unique within a catalog. That's the main reason why > pg_depend has to include the classid. > > regards, tom lane Correct. That is why the regclass name (classid) is included in the unique name.
The function should take both classid and oid as input. I'll fix. Sent from my iPhone On 7 jan 2011, at 20:59, Joel Jacobson <joel@gluefinance.com> wrote: > Sent from my iPhone > > On 7 jan 2011, at 20:46, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Joel Jacobson <joel@gluefinance.com> writes: >>> The function obj_unique_identifier(oid) will return a unique name for _any_ oid. >> >> Surely this is broken by design? You can *not* assume that the same OID >> isn't in use for different things in different system catalogs. They're >> only guaranteed unique within a catalog. That's the main reason why >> pg_depend has to include the classid. >> >> regards, tom lane > > Correct. That is why the regclass name (classid) is included in the unique name.
2011/1/7 Jim Nasby <jim@nasby.net>: > BTW, if you're looking at making pg_depnd easier to use, see http://archives.postgresql.org/message-id/1290000774-sup-2218@alvh.no-ip.org I guess there are more than one ways to do it, C, sql, plperl, plpgsql. :) I guess at least one of the methods should be provided in the vanilla distro. :) -- Best regards, Joel Jacobson Glue Finance
On Sat, Jan 8, 2011 at 1:59 AM, Joel Jacobson <joel@gluefinance.com> wrote: > 2011/1/7 Jim Nasby <jim@nasby.net>: >> BTW, if you're looking at making pg_depnd easier to use, see http://archives.postgresql.org/message-id/1290000774-sup-2218@alvh.no-ip.org > > I guess there are more than one ways to do it, C, sql, plperl, plpgsql. :) > I guess at least one of the methods should be provided in the vanilla distro. :) I guess the point is that if this gets committed as a core function written in C, we don't need any other implementations. But I don't recall ever seeing a commit for that one go by... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I guess the point is that if this gets committed as a core function > written in C, we don't need any other implementations. But I don't > recall ever seeing a commit for that one go by... http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=6cc2deb86e9183262493a6537700ee305fb3e096 Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
2011/1/8 Dimitri Fontaine <dimitri@2ndquadrant.fr>: > http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=6cc2deb86e9183262493a6537700ee305fb3e096 Nice! Has the patch been accepted and will be made available in future versions of pg? Also, why return NULL for pinned objects? They can also be described using a unique identifier. (+ /* for "pinned" items in pg_depend, return null */) It is useful to describe such objects to be able to diff different versions of pg, i.e. comparing which pinned objects exists, doing so can tell you the odds for an application depending on certain pinned objects being compatible with a specific version of the database. -- Best regards, Joel Jacobson Glue Finance
On Sat, Jan 8, 2011 at 14:05, Joel Jacobson <joel@gluefinance.com> wrote: > 2011/1/8 Dimitri Fontaine <dimitri@2ndquadrant.fr>: >> http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=6cc2deb86e9183262493a6537700ee305fb3e096 > > Nice! Has the patch been accepted and will be made available in future > versions of pg? Yes. Once things are committed to the main repository, they are only backed out if someone finds a major issue with them that is not fixable (ina reasonable timeframe). That almost never happens. We don't keep unapproved patches or development branches in the main repository - those are all in the personal repositories of the developers. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
2011/1/8 Magnus Hagander <magnus@hagander.net>: > Yes. Once things are committed to the main repository, they are only > backed out if someone finds a major issue with them that is not > fixable (ina reasonable timeframe). That almost never happens. We > don't keep unapproved patches or development branches in the main > repository - those are all in the personal repositories of the > developers. Thanks for clarifying. I found a bug in the pg_catalog.pg_describe_object function. The query below should not return any rows, because if it does, then there are oids with non-unique descriptions. While the description is good enough for a human to interpret, it cannot be used in an application as a unique identifier unless it is really unique. WITH all_objects AS ( SELECT classid, objid, objsubid FROM pg_depend UNION SELECT refclassid, refobjid, refobjsubid FROMpg_depend ) SELECT pg_catalog.pg_describe_object(classid,objid,objsubid) FROM all_objects GROUP BY pg_catalog.pg_describe_object(classid,objid,objsubid) HAVING COUNT(*) > 1 pg_describe_object ----------------------------------------------------------------------------------------------------------------------------------------function 2ginarrayextract(anyarray,internal) of operator family array_ops for access method ginfunction 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method ginfunction 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method ginfunction 1 network_cmp(inet,inet) of operator family array_ops for access method ginfunction 1 bttextcmp(text,text) of operator family array_ops for access method gin (5 rows) There are 94 objects such objects: classid | objid | objsubid | obj_unique_identifier | pg_describe_object ---------+-------+----------+--------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------- 2603 | 10606 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._text.pg_catalog._text.1 | function 1 bttextcmp(text,text) of operatorfamily array_ops for access method gin 2603 | 10610 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._varchar.pg_catalog._varchar.1 | function 1 bttextcmp(text,text) of operatorfamily array_ops for access method gin 2603 | 10650 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._inet.pg_catalog._inet.1 | function 1 network_cmp(inet,inet) ofoperator family array_ops for access method gin 2603 | 10654 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._cidr.pg_catalog._cidr.1 | function 1 network_cmp(inet,inet) ofoperator family array_ops for access method gin 2603 | 10631 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bytea.pg_catalog._bytea.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10671 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._macaddr.pg_catalog._macaddr.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10667 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._interval.pg_catalog._interval.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10675 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._name.pg_catalog._name.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10719 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._tinterval.pg_catalog._tinterval.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10607 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._text.pg_catalog._text.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10611 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._varchar.pg_catalog._varchar.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10655 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._cidr.pg_catalog._cidr.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10707 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timestamp.pg_catalog._timestamp.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10711 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._money.pg_catalog._money.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10663 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int8.pg_catalog._int8.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10635 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._char.pg_catalog._char.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10703 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._varbit.pg_catalog._varbit.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10627 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bpchar.pg_catalog._bpchar.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10695 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timestamptz.pg_catalog._timestamptz.2 | function 2 ginarrayextract(anyarray,internal) of operator family array_ops for access method gin 2603 | 10603 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int4.pg_catalog._int4.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10683 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._oid.pg_catalog._oid.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10715 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._reltime.pg_catalog._reltime.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10699 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timetz.pg_catalog._timetz.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10615 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._abstime.pg_catalog._abstime.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10623 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bool.pg_catalog._bool.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10639 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._date.pg_catalog._date.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10691 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._time.pg_catalog._time.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10687 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._oidvector.pg_catalog._oidvector.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10659 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int2.pg_catalog._int2.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10647 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._float8.pg_catalog._float8.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10643 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._float4.pg_catalog._float4.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10651 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._inet.pg_catalog._inet.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10679 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._numeric.pg_catalog._numeric.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10619 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bit.pg_catalog._bit.2 | function 2 ginarrayextract(anyarray,internal)of operator family array_ops for access method gin 2603 | 10660 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int2.pg_catalog._int2.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10696 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timestamptz.pg_catalog._timestamptz.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10648 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._float8.pg_catalog._float8.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10604 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int4.pg_catalog._int4.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10712 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._money.pg_catalog._money.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10664 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int8.pg_catalog._int8.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10652 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._inet.pg_catalog._inet.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10608 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._text.pg_catalog._text.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10636 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._char.pg_catalog._char.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10644 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._float4.pg_catalog._float4.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10612 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._varchar.pg_catalog._varchar.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10672 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._macaddr.pg_catalog._macaddr.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10620 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bit.pg_catalog._bit.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10624 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bool.pg_catalog._bool.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10704 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._varbit.pg_catalog._varbit.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10616 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._abstime.pg_catalog._abstime.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10656 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._cidr.pg_catalog._cidr.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10680 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._numeric.pg_catalog._numeric.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10716 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._reltime.pg_catalog._reltime.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10668 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._interval.pg_catalog._interval.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10720 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._tinterval.pg_catalog._tinterval.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10692 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._time.pg_catalog._time.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10676 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._name.pg_catalog._name.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10700 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timetz.pg_catalog._timetz.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10628 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bpchar.pg_catalog._bpchar.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10684 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._oid.pg_catalog._oid.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10640 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._date.pg_catalog._date.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10632 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bytea.pg_catalog._bytea.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10708 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timestamp.pg_catalog._timestamp.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10688 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._oidvector.pg_catalog._oidvector.3 | function 3 ginqueryarrayextract(anyarray,internal,smallint,internal,internal) of operator family array_ops for access method gin 2603 | 10609 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._text.pg_catalog._text.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10657 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._cidr.pg_catalog._cidr.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10717 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._reltime.pg_catalog._reltime.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10649 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._float8.pg_catalog._float8.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10713 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._money.pg_catalog._money.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10693 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._time.pg_catalog._time.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10669 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._interval.pg_catalog._interval.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10629 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bpchar.pg_catalog._bpchar.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10709 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timestamp.pg_catalog._timestamp.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10617 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._abstime.pg_catalog._abstime.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10665 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int8.pg_catalog._int8.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10641 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._date.pg_catalog._date.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10605 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int4.pg_catalog._int4.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10689 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._oidvector.pg_catalog._oidvector.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10721 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._tinterval.pg_catalog._tinterval.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10625 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bool.pg_catalog._bool.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10681 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._numeric.pg_catalog._numeric.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10621 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bit.pg_catalog._bit.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10701 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timetz.pg_catalog._timetz.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10697 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._timestamptz.pg_catalog._timestamptz.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10673 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._macaddr.pg_catalog._macaddr.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10645 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._float4.pg_catalog._float4.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10661 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._int2.pg_catalog._int2.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10613 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._varchar.pg_catalog._varchar.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10653 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._inet.pg_catalog._inet.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10633 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._bytea.pg_catalog._bytea.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10705 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._varbit.pg_catalog._varbit.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10685 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._oid.pg_catalog._oid.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10677 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._name.pg_catalog._name.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin 2603 | 10637 | 0 | pg_amproc.gin.pg_catalog.array_ops.pg_catalog._char.pg_catalog._char.4 | function 4 ginarrayconsistent(internal,smallint,anyarray,integer,internal,internal) of operator family array_ops for access method gin (94 rows)
On Sat, Jan 8, 2011 at 12:41 PM, Joel Jacobson <joel@gluefinance.com> wrote: > The query below should not return any rows, because if it does, then > there are oids with non-unique descriptions. I don't think your analysis is correct. Each entry in pg_depend represents the fact that one object depends on another object, and an object could easily depend on more than one other object, or be depended upon by more than one other object, or depend on one object and be depended on by another. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2011/1/8 Robert Haas <robertmhaas@gmail.com>: > I don't think your analysis is correct. Each entry in pg_depend > represents the fact that one object depends on another object, and an > object could easily depend on more than one other object, or be > depended upon by more than one other object, or depend on one object > and be depended on by another. What does that have to do with this? Two different oids represents two different objects, right? Two different objects should have two different descriptions, right? Otherwise I cannot see how one can argue the description being unique. The pg_describe_object returns unique descriptions for all object types, except for the 5 types I unexpectedly found.
On Sat, 2011-01-08 at 22:21 +0100, Joel Jacobson wrote: > 2011/1/8 Robert Haas <robertmhaas@gmail.com>: > > I don't think your analysis is correct. Each entry in pg_depend > > represents the fact that one object depends on another object, and an > > object could easily depend on more than one other object, or be > > depended upon by more than one other object, or depend on one object > > and be depended on by another. > > What does that have to do with this? > > Two different oids represents two different objects, right? > Two different objects should have two different descriptions, right? > Otherwise I cannot see how one can argue the description being unique. > > The pg_describe_object returns unique descriptions for all object > types, except for the 5 types I unexpectedly found. I can confirm it has nothing to do with pg_depend, and that it seems to be a bug with that descriptions do not seem to care about different amproclefttype and amprocrighttype. SELECT array_agg(oid), array_agg(amproclefttype) FROM pg_amproc GROUP BY pg_catalog.pg_describe_object(2603,oid,0) HAVING count(*) > 1; One example row produced by that query. array_agg | array_agg ---------------+-------------{10608,10612} | {1009,1015} (1 row) Regards, Andreas Karlsson
On Sat, Jan 8, 2011 at 4:21 PM, Joel Jacobson <joel@gluefinance.com> wrote: > 2011/1/8 Robert Haas <robertmhaas@gmail.com>: >> I don't think your analysis is correct. Each entry in pg_depend >> represents the fact that one object depends on another object, and an >> object could easily depend on more than one other object, or be >> depended upon by more than one other object, or depend on one object >> and be depended on by another. > > What does that have to do with this? Oops. I misread your query. I thought the duplicates were because you were feeding pg_describe_object the same classoid, objoid, objsubid pair more than once, but I see now that's not the case (UNION != UNION ALL). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Here is a patch, but I am not sure I am not sure if I like my idea for format. What do you think? SELECT pg_describe_object('pg_amproc'::regclass,oid,0) FROM pg_amproc WHERE oid IN (10608,10612); pg_describe_object ---------------------------------------------------------------------------------------------------------------------------------- function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (text[],text[]) function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (character varying[],character varying[]) (2 rows) Andreas
Attachment
2011/1/9 Andreas Karlsson <andreas@proxel.se>: > Here is a patch, but I am not sure I am not sure if I like my idea for > format. What do you think? > > SELECT pg_describe_object('pg_amproc'::regclass,oid,0) > FROM pg_amproc WHERE oid IN (10608,10612); > pg_describe_object > ---------------------------------------------------------------------------------------------------------------------------------- > function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (text[],text[]) > function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (character varying[],charactervarying[]) > (2 rows) Looks great! Many thanks for fixing the bug! > > Andreas > > -- Best regards, Joel Jacobson Glue Finance E: jj@gluefinance.com T: +46 70 360 38 01 Postal address: Glue Finance AB Box 549 114 11 Stockholm Sweden Visiting address: Glue Finance AB Birger Jarlsgatan 14 114 34 Stockholm Sweden
2011/1/9 Robert Haas <robertmhaas@gmail.com>: > Oops. I misread your query. I thought the duplicates were because > you were feeding pg_describe_object the same classoid, objoid, > objsubid pair more than once, but I see now that's not the case (UNION > != UNION ALL). Ah, I see, yes, the query should actually be UNION, it would produce the same result, but perhaps it would be a bit faster. -- Best regards, Joel Jacobson Glue Finance
On Sat, Jan 8, 2011 at 8:02 PM, Joel Jacobson <joel@gluefinance.com> wrote: > 2011/1/9 Robert Haas <robertmhaas@gmail.com>: >> Oops. I misread your query. I thought the duplicates were because >> you were feeding pg_describe_object the same classoid, objoid, >> objsubid pair more than once, but I see now that's not the case (UNION >> != UNION ALL). > > Ah, I see, yes, the query should actually be UNION, it would produce > the same result, but perhaps it would be a bit faster. You did use UNION - I think if you used UNION ALL you'd get spurious results. But maybe I'm still confused. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Bug in pg_describe_object (was: Re: [HACKERS] obj_unique_identifier(oid))
From
Andreas Karlsson
Date:
Here is the bug-fix patch again with a description of the context so I can add it to the commit fest. Joel Jacobson discovered a bug in the function pg_describe_object where it does not produce unique identifiers for some entries in pg_amproc. This patch fixes the bug where when two entries in pg_amproc only differ in amproclefttype or amprocrighttype the same description will be produced by pg_describe_object, by simply adding the two fields (amproclefttype, amprocrighttype) to the description. == Before patch SELECT pg_describe_object('pg_amproc'::regclass,oid,0) FROM pg_amproc WHERE oid IN (10608,10612); pg_describe_object ------------------------------------------------------------------------------------ function 1 bttextcmp(text,text) of operator family array_ops for access method gin function 1 bttextcmp(text,text) of operator family array_ops for access method gin (2 rows) == After patch SELECT pg_describe_object('pg_amproc'::regclass,oid,0) FROM pg_amproc WHERE oid IN (10608,10612); pg_describe_object ---------------------------------------------------------------------------------------------------------------------------------- function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (text[],text[]) function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (character varying[],character varying[]) (2 rows) Regards, Andreas
Attachment
Re: Bug in pg_describe_object (was: Re: [HACKERS] obj_unique_identifier(oid))
From
Joel Jacobson
Date:
2011/1/10 Andreas Karlsson <andreas@proxel.se>: > Here is the bug-fix patch again with a description of the context so I > can add it to the commit fest. Many thanks for fixing the bug! I also implemented the pg_describe_object in pure SQL, for those of us who have not yet switched to PostgreSQL 9 in the production. Very helpful function indeed! https://github.com/gluefinance/pov/blob/master/sql/schema/pov/functions/pg_describe_object.sql -- Best regards, Joel Jacobson Glue Finance
Andreas Karlsson <andreas@proxel.se> writes: > Here is the bug-fix patch again with a description of the context so I > can add it to the commit fest. > Joel Jacobson discovered a bug in the function pg_describe_object where > it does not produce unique identifiers for some entries in pg_amproc. There was never any intention that that code produce a guaranteed-unique identifier; it's only meant to be a humanly useful identifer, and this patch seems to me to mostly add noise. regards, tom lane
Re: Bug in pg_describe_object (was: Re: [HACKERS] obj_unique_identifier(oid))
From
Joel Jacobson
Date:
2011/1/10 Tom Lane <tgl@sss.pgh.pa.us>: > There was never any intention that that code produce a guaranteed-unique > identifier; it's only meant to be a humanly useful identifer, and this > patch seems to me to mostly add noise. For all objects, except for these pg_amproc regclass, the function does already generate unique strings. They are guaranteed to be unique thanks to every component of the unique constraints in alll pg_* tables are included in the unique text identifier. It makes a lot more sense to fix the function to return a unique string also for pg_amproc, than to introduce a entirely new function which returns a unique string identifier. It would hardly break anything and I think you exaggerate the noise factor. I can think of numerous reasons why it is absolutely necessary to provide a function generating unique identifiers for objects: a) To allow comparing all objects in two different databases, by comparing objects with the same identifier. This cannot be done using the oids, since they naturally differ between databases. b) To draw nice human readable digraphs in the .dot format , instead of drawing relations digraphs of classid.objid.subobjid. c) OIDs are probably misused in a lot of applications, due to misunderstandings of what they are and not are, I for one didn't know they are not necessarily unique, but only within their regclass. It would be better to encourage users to use a text string if they need to refer to a unique objects in their application, than to force them to use OIDs (or in combination with the regclass, almost as bad), in lack of something better. While you could build your own query to generate a unique string, based on all the columns defining the unique constraint for each class, doing so is very cumbersome and requires a lot of postgres-guru-knowledge. I think it would be a big improvement and increase the number of possible use cases of the existing pg_describe_object function if the documentation would say "the returned text is guaranteed to be unique for each object". -- Best regards, Joel Jacobson Glue Finance E: jj@gluefinance.com T: +46 70 360 38 01 Postal address: Glue Finance AB Box 549 114 11 Stockholm Sweden Visiting address: Glue Finance AB Birger Jarlsgatan 14 114 34 Stockholm Sweden
> There was never any intention that that code produce a guaranteed-unique > identifier; it's only meant to be a humanly useful identifer, and this > patch seems to me to mostly add noise. Would making the identifier unique do any *harm*? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: >> There was never any intention that that code produce a guaranteed-unique >> identifier; it's only meant to be a humanly useful identifer, and this >> patch seems to me to mostly add noise. > Would making the identifier unique do any *harm*? It would make dependency error messages significantly longer and less readable. Quite aside from the point at hand here, we elide schema names in many cases (and it looks like there are some code paths where getObjectDescription never bothers to print them at all). Another issue that might make it interesting to try to use the output for purposes other than human-readable descriptions is that we localize all the phrases involved. My point is that this isn't a bug fix, it's more like moving the goalposts on what getObjectDescription is supposed to do. And I'm not even very sure where they're being moved to. I haven't seen a specification for an intended use of pg_describe_object for which its existing behavior would be unsatisfactory. regards, tom lane
On Mon, Jan 10, 2011 at 7:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It would make dependency error messages significantly longer and less > readable. Quite aside from the point at hand here, we elide schema > names in many cases (and it looks like there are some code paths where > getObjectDescription never bothers to print them at all). Another issue > that might make it interesting to try to use the output for purposes > other than human-readable descriptions is that we localize all the > phrases involved. > > My point is that this isn't a bug fix, it's more like moving the > goalposts on what getObjectDescription is supposed to do. And I'm not > even very sure where they're being moved to. I haven't seen a > specification for an intended use of pg_describe_object for which its > existing behavior would be unsatisfactory. I think that adding the types to the description string is a pretty sensible thing to do. Yeah, it makes the error messages longer, but it also tells you which objects you're actually operating on, a non-negligible advantage. It's fairly confusing that pg_amproc has a four part key, two members of which reference objects which in turn have compound names. But leaving out two out of the four parts in the key is not an improvement. People aren't going to hit dependencies on pg_amproc entries every day, but when they do they presumably want to uniquely identify the objects in question. Now, I agree that this is probably not quite adequate to the purpose to which the OP proposed to put it, but that's really another question. One gripe I do have is that we should put the operator types in the same place ALTER OPERATOR FAMILY puts them - immediately after the support number, and without the word "for" - rather than all the way at the end. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jan 10, 2011 at 7:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> My point is that this isn't a bug fix, it's more like moving the >> goalposts on what getObjectDescription is supposed to do. > I think that adding the types to the description string is a pretty > sensible thing to do. Not really. AFAIR, there are two cases that exist in practice, depending on which AM you're talking about: 1. The recorded types match the input types of the operator/function (btree & hash). 2. The recorded types are always the same as the opclass's input type (gist & gin). In neither case does printing those types really add much information. That's why it's not there now. regards, tom lane
2011/1/11 Tom Lane <tgl@sss.pgh.pa.us>: > It would make dependency error messages significantly longer and less > readable. Quite aside from the point at hand here, we elide schema > names in many cases (and it looks like there are some code paths where > getObjectDescription never bothers to print them at all). Another issue > that might make it interesting to try to use the output for purposes > other than human-readable descriptions is that we localize all the > phrases involved. > > My point is that this isn't a bug fix, it's more like moving the > goalposts on what getObjectDescription is supposed to do. And I'm not > even very sure where they're being moved to. I haven't seen a > specification for an intended use of pg_describe_object for which its > existing behavior would be unsatisfactory. Thanks for some good arguments. I now agree with you it would be a bit counter productive to change the existing pg_describe_object. Due to the localization of the phrases and the lack of mandatory namespace inclusion, you lose the comparison ability anyway. I instead propose we introduce a new function named pg_get_object_unique_identifier( classid oid, objid oid, objsubid integer ) returns text. The name would make sense since we already have a pg_get_function_identity_arguments( func_oid ), for a similar purpose but solely for functions. -- Best regards, Joel Jacobson Glue Finance
On Mon, Jan 10, 2011 at 8:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Jan 10, 2011 at 7:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> My point is that this isn't a bug fix, it's more like moving the >>> goalposts on what getObjectDescription is supposed to do. > >> I think that adding the types to the description string is a pretty >> sensible thing to do. > > Not really. AFAIR, there are two cases that exist in practice, > depending on which AM you're talking about: > > 1. The recorded types match the input types of the operator/function > (btree & hash). > 2. The recorded types are always the same as the opclass's input type > (gist & gin). > > In neither case does printing those types really add much information. > That's why it's not there now. I don't get it. If two different items that exist in the system out of the box have the same description, it seems clear that relevant piece of disambiguating information exists nowhere in the description string. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2011/1/11 Robert Haas <robertmhaas@gmail.com>: > I don't get it. If two different items that exist in the system out > of the box have the same description, it seems clear that relevant > piece of disambiguating information exists nowhere in the description > string. I guess it is a question of prioritization. If backwards compatibility is to be guaranteed, even for functions returning text intended to be read by humans, then the function cannot be modified, without violating that golden rule, if such a rule exists within the PostgreSQL development project? If it's not a golden rule, then it's a totally different story and there is no excuse why it should return the same descriptions for the same objects. Any other reasoning is just silly. -- Best regards, Joel Jacobson Glue Finance
On Jan 11, 2011, at 8:25 AM, Joel Jacobson <joel@gluefinance.com> wrote: > 2011/1/11 Robert Haas <robertmhaas@gmail.com>: >> I don't get it. If two different items that exist in the system out >> of the box have the same description, it seems clear that relevant >> piece of disambiguating information exists nowhere in the description >> string. > > I guess it is a question of prioritization. > If backwards compatibility is to be guaranteed, even for functions > returning text intended to be read by humans, then the function cannot > be modified, without violating that golden rule, if such a rule exists > within the PostgreSQL development project? > > If it's not a golden rule, then it's a totally different story and > there is no excuse why it should return the same descriptions for the > same objects. > Any other reasoning is just silly. Well, we shouldn't change them randomly or arbitrarily, but improving them is another thing altogether. I think the contentionthat any user or application anywhere is depending on the exact textual representation of a pg_amproc entry isexceedingly dubious. And I think the current messages are flat-out confusing. ...Robert
Joel Jacobson <joel@gluefinance.com> writes: > I instead propose we introduce a new function named > pg_get_object_unique_identifier( classid oid, objid oid, objsubid > integer ) returns text. Seems like concatenating the OIDs would accomplish that. (If you think not, well, you still haven't explained what problem you're trying to solve...) > The name would make sense since we already have a > pg_get_function_identity_arguments( func_oid ), for a similar purpose > but solely for functions. No, that does not exist for the purpose you claim it does. And it's far from obvious what the extension of that to all object classes would look like. regards, tom lane
Excerpts from Robert Haas's message of mar ene 11 10:52:12 -0300 2011: > Well, we shouldn't change them randomly or arbitrarily, but improving them is another thing altogether. I think the contentionthat any user or application anywhere is depending on the exact textual representation of a pg_amproc entry isexceedingly dubious. And I think the current messages are flat-out confusing. Tom also mentioned that the descriptions are translated, so an application built to parse anything out of the strings is already broken. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2011/1/11 Tom Lane <tgl@sss.pgh.pa.us>: > Seems like concatenating the OIDs would accomplish that. (If you > think not, well, you still haven't explained what problem you're trying > to solve...) The can be different in two different databases sharing the same original schema, but of two different versions. In this case it is better to compare textual strings describing the objects than to compare based on oids. -- Best regards, Joel Jacobson Glue Finance
On Jan11, 2011, at 16:12 , Tom Lane wrote: > Joel Jacobson <joel@gluefinance.com> writes: >> I instead propose we introduce a new function named >> pg_get_object_unique_identifier( classid oid, objid oid, objsubid >> integer ) returns text. > > Seems like concatenating the OIDs would accomplish that. (If you > think not, well, you still haven't explained what problem you're trying > to solve...) I think the OP wants a unique identifier which changes neither on dump & reload nor on major version upgrades. I can see the value in that if you e.g. want to compare the structures of two different postgres databases. It seems impossible to guarantee the identifier to not change between major versions, though - if the structure of that catalog change, so will the identifier. @OP: Wouldn't it be sufficient to provide such a thing for structure objects that people are actually going to modify on a regular basis? I suggest restricting this to tables (including SQL/MED foreign tables) views sequences types servers (SQL/MED) tsearch configurations tsearch dictionaries for which <schema>.<name> suffices, triggers constraints columns for which <schema>.<table>.<name> should work, and functions operators which need to include the argument types. The reg* types already solve this for tables, views, sequences (regclass) tsearch configurations (regconfig) tsearch dictionaries (regdictionary) types (regtype) functions (regprocedure) operators (regoperator) which leaves servers triggers constraints columns best regards, Florian Pflug
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jan 10, 2011 at 8:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Not really. �AFAIR, there are two cases that exist in practice, >> depending on which AM you're talking about: >> >> 1. The recorded types match the input types of the operator/function >> � (btree & hash). >> 2. The recorded types are always the same as the opclass's input type >> � (gist & gin). >> >> In neither case does printing those types really add much information. >> That's why it's not there now. > I don't get it. If two different items that exist in the system out > of the box have the same description, it seems clear that relevant > piece of disambiguating information exists nowhere in the description > string. The "relevant piece of disambiguating information" is the function name+parameters in the first case, or the opclass name in the second. regards, tom lane
Florian Pflug <fgp@phlo.org> writes: > @OP: Wouldn't it be sufficient to provide such a thing for structure > objects that people are actually going to modify on a regular basis? Yeah, I still don't see the need to argue over whether the elements of an operator class are uniquely identifiable or not. regards, tom lane
Joel Jacobson <joel@gluefinance.com> writes: > 2011/1/11 Tom Lane <tgl@sss.pgh.pa.us>: >> Seems like concatenating the OIDs would accomplish that. �(If you >> think not, well, you still haven't explained what problem you're trying >> to solve...) > The can be different in two different databases sharing the same > original schema, but of two different versions. > In this case it is better to compare textual strings describing the > objects than to compare based on oids. If that's what you're after, getObjectDescription is entirely unsuitable, because of the fact that its results are dependent on search path and language settings. regards, tom lane
On Tue, 2011-01-11 at 11:43 -0500, Tom Lane wrote: > If that's what you're after, getObjectDescription is entirely > unsuitable, because of the fact that its results are dependent > on search path and language settings. > > regards, tom lane Agreed, and as long as the additional information added to the description by my patch is not useful for any other purpose I see no reason for applying it. So would anyone be confused by a description of pg_amproc not including the types? I personally have no idea since I have not had to work with indexes enough to say. Andreas
Andreas Karlsson <andreas@proxel.se> writes: > So would anyone be confused by a description of pg_amproc not including > the types? It really shouldn't be useful to include those. Attend what it says in the fine manual for CREATE OPERATOR CLASS: In a FUNCTION clause, the operand data type(s) the function isintended to support, if different from the input data type(s)ofthe function (for B-tree and hash indexes) or the class's datatype (for GIN and GiST indexes). These defaults arealwayscorrect, so there is no point in specifying op_type in aFUNCTION clause in CREATE OPERATOR CLASS, but the optionisprovided for consistency with the comparable syntax in ALTEROPERATOR FAMILY. The reason the ALTER OPERATOR FAMILY DROP syntax has to include operand types is that it lacks the full name/types of the referenced function. Since getObjectDescription *does* provide those, it doesn't serve any real purpose to repeat the information. regards, tom lane
On Tue, 2011-01-11 at 14:01 -0500, Tom Lane wrote: > It really shouldn't be useful to include those. Attend what it says in > the fine manual for CREATE OPERATOR CLASS: > > In a FUNCTION clause, the operand data type(s) the function is > intended to support, if different from the input data type(s) of > the function (for B-tree and hash indexes) or the class's data > type (for GIN and GiST indexes). These defaults are always > correct, so there is no point in specifying op_type in a > FUNCTION clause in CREATE OPERATOR CLASS, but the option is > provided for consistency with the comparable syntax in ALTER > OPERATOR FAMILY. > > The reason the ALTER OPERATOR FAMILY DROP syntax has to include operand > types is that it lacks the full name/types of the referenced function. > Since getObjectDescription *does* provide those, it doesn't serve any > real purpose to repeat the information. > > regards, tom lane Hm, that is not what I see when reading the source. There can exist several entries in pg_amproc for one operator family with the same short_number and function (both name and types). The only difference is in lefttype and righttype. For example these two in array_ops. SELECT *, amproc::oid FROM pg_amproc WHERE oid IN (10608,10612);amprocfamily | amproclefttype | amprocrighttype | amprocnum| amproc | amproc --------------+----------------+-----------------+-----------+-----------+-------- 2745 | 1009 | 1009 | 1 | bttextcmp | 360 2745 | 1015 | 1015 | 1 | bttextcmp | 360 (2 rows) The reason you must specify the types in ALTER OPERATOR FAMILY DROP is that otherwise it would not know which row of these two to drop. And that information is not included in the current string returned by getObjectDescription. As I interpret it the pg_amproc entries belonging to the array_ops family really belong to one of the opclasses (_int2_ops, _text_ops, ...) and the lefttype and righttype are used to look up the amproc entries based on the opcintype of the opclass of the index class in IndexSupportInitialize. The comment in pg_amproc.h aslo seems to confirm my view.[1] So instead of function 1 bttextcmp(text,text) of operator family array_ops for access method gin or in a improved version of my stab at a patch function 1 (text[],text[]) bttextcmp(text,text) of operator family array_ops for access method gin it really is function 1 bttextcmp(text,text) of operator class _text_ops for access method gin The "imporved version" might be simpler to implement and more future proof though if we start using pg_amproc for other lookups. -- [1] From pg_amproc.h about lefttype and righttype. > The primary key for this table is <amprocfamily, amproclefttype, > amprocrighttype, amprocnum>. The "default" support functions for a > particular opclass within the family are those with amproclefttype = > amprocrighttype = opclass's opcintype. These are the ones loaded into the > relcache for an index and typically used for internal index operations. > Other support functions are typically used to handle cross-type indexable > operators with oprleft/oprright matching the entry's amproclefttype and > amprocrighttype. The exact behavior depends on the index AM, however, and > some don't pay attention to non-default functions at all. Regards, Andreas
Andreas Karlsson <andreas@proxel.se> writes: > On Tue, 2011-01-11 at 14:01 -0500, Tom Lane wrote: >> It really shouldn't be useful to include those. Attend what it says in >> the fine manual for CREATE OPERATOR CLASS: > Hm, that is not what I see when reading the source. > There can exist several entries in pg_amproc for one operator family > with the same short_number and function (both name and types). We're cheating in a small number of places by using a binary-compatible hash function to implement hashing for a datatype other than the one it's declared to work on. I don't think that the existence of that hack means that getObjectDescription should bloat the descriptions of every amproc entry with generally-useless information. regards, tom lane
On Tue, Jan 11, 2011 at 7:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andreas Karlsson <andreas@proxel.se> writes: >> On Tue, 2011-01-11 at 14:01 -0500, Tom Lane wrote: >>> It really shouldn't be useful to include those. Attend what it says in >>> the fine manual for CREATE OPERATOR CLASS: > >> Hm, that is not what I see when reading the source. > >> There can exist several entries in pg_amproc for one operator family >> with the same short_number and function (both name and types). > > We're cheating in a small number of places by using a binary-compatible > hash function to implement hashing for a datatype other than the one > it's declared to work on. I don't think that the existence of that hack > means that getObjectDescription should bloat the descriptions of every > amproc entry with generally-useless information. I don't see how you can claim that it's remotely sane for different objects to have the same description. The whole point is that someone is going to say "DROP something" and the system is going to say "no, there's an object that depends on it", and the description it gives won't uniquely identify which one. If the information is needed to distinguish which object is implicated, it's not useless. The fact that someone with an expert-level knowledge of PostgreSQL and a divining rod may be able to determine which object they should be worried about given only half of its primary key fields is not a good reason to omit the other half. In fact, there isn't any such reason, and you're the only one arguing otherwise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Here is a very simple change of the patch to make the output look more like the syntax of ALTER OPERATOR FAMILY to improve consistency. Before patch: function 1 bttextcmp(text,text) of operator family array_ops for access method gin With the first version: function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (text[],text[]) With this version: function 1 (text[],text[]) bttextcmp(text,text) of operator family array_ops for access method gin Andreas
Attachment
Andreas Karlsson <andreas@proxel.se> writes: > Here is a very simple change of the patch to make the output look more > like the syntax of ALTER OPERATOR FAMILY to improve consistency. IMO, what this patch needs is to not output the types unless they are actually different from the default (which can be inferred from the AM type and the function arguments). That would fix my concern about it emitting information that is 99.44% useless. regards, tom lane
On Wed, Jan 12, 2011 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andreas Karlsson <andreas@proxel.se> writes: >> Here is a very simple change of the patch to make the output look more >> like the syntax of ALTER OPERATOR FAMILY to improve consistency. > > IMO, what this patch needs is to not output the types unless they are > actually different from the default (which can be inferred from the AM > type and the function arguments). That would fix my concern about it > emitting information that is 99.44% useless. I guess we could do that, but I don't understand how you're supposed to infer them, which means probably a lot of other people won't either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Jan 12, 2011 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> IMO, what this patch needs is to not output the types unless they are >> actually different from the default (which can be inferred from the AM >> type and the function arguments). That would fix my concern about it >> emitting information that is 99.44% useless. > I guess we could do that, but I don't understand how you're supposed > to infer them, which means probably a lot of other people won't > either. Read the CREATE OPERATOR CLASS source code. (It likely would be best to refactor that a bit so it would expose some way to obtain the implied defaults --- I don't think that's done explicitly now, and it's certainly not exported from opclasscmds.c.) regards, tom lane
On Thu, Jan 13, 2011 at 1:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Jan 12, 2011 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> IMO, what this patch needs is to not output the types unless they are >>> actually different from the default (which can be inferred from the AM >>> type and the function arguments). That would fix my concern about it >>> emitting information that is 99.44% useless. > >> I guess we could do that, but I don't understand how you're supposed >> to infer them, which means probably a lot of other people won't >> either. > > Read the CREATE OPERATOR CLASS source code. (It likely would be best to > refactor that a bit so it would expose some way to obtain the implied > defaults --- I don't think that's done explicitly now, and it's > certainly not exported from opclasscmds.c.) I didn't say I *couldn't* take the time to understand this; I said I think it'd be more clear to more people if we just printed the types all the time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Jan 13, 2011 at 1:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Read the CREATE OPERATOR CLASS source code. �(It likely would be best to >> refactor that a bit so it would expose some way to obtain the implied >> defaults --- I don't think that's done explicitly now, and it's >> certainly not exported from opclasscmds.c.) > I didn't say I *couldn't* take the time to understand this; I said I > think it'd be more clear to more people if we just printed the types > all the time. No, it would just be useless noise that would confuse most people, especially since the default behavior varies across AMs. Please observe that not one single one of the contrib modules has any occasion to specify non-default amprocleft/right --- that should give you a fix on how useful it is to worry about what they are. If it weren't for binary-compatibility kluges we wouldn't need the ability to specify these at all. But I can read the handwriting on the wall: if I want this done right, I'm going to have to do it myself. regards, tom lane
2011/1/15 Tom Lane <tgl@sss.pgh.pa.us>: > But I can read the handwriting on the wall: if I want this done right, > I'm going to have to do it myself. > > regards, tom lane Excellently put! I will with pride steal that phrase and use it whenever I run into the same situation myself. Quite often a proposed "good enough solution" introduces unnecessary complexity and awkwardness, while at the same time the "proper solution" requires an unproportionally amount of resources to accomplish. The model most people adapt is a mix of both. Spend time on details when it is absolutely necessary and accept "good enough solutions" until someone (not necessarily yourself, like you suggested) gets motivated enough to implement the "proper solution". I don't think and I hope it's not just one guy in the universe who can get it "done right". So, instead of fighting like war maniacs on this god-damn-two-lines-of-code-patch, could Tom please describe to the patch submitter what you mean with get it "done right", and perhaps the patch submitter will be motivated enough to invest a few hundred/thousands hours of his time to solve the problem the way hinted by Tom? -- Best regards, Joel Jacobson Glue Finance E: jj@gluefinance.com T: +46 70 360 38 01 Postal address: Glue Finance AB Box 549 114 11 Stockholm Sweden Visiting address: Glue Finance AB Birger Jarlsgatan 14 114 34 Stockholm Sweden
On Sat, 2011-01-15 at 10:36 -0500, Tom Lane wrote: > But I can read the handwriting on the wall: if I want this done right, > I'm going to have to do it myself. > > regards, tom lane Do I understand you correctly if I interpret what you would like to see is the same format used now in these cases? 1) When for btree and hash when lefttype and righttype are the same as the two function parameters. 2) When for GiST and GIN there is only one member of the operator family and the lefttype and righttype are the same as the type of the operator class that is a member of the operator family. That are the default rules in opclasscmds.c as I understood them If so I think I could try to take a stab at this and see once done if it looks like worth the additional code complexity. Regards, Andreas
Andreas Karlsson <andreas@proxel.se> writes: > On Sat, 2011-01-15 at 10:36 -0500, Tom Lane wrote: >> But I can read the handwriting on the wall: if I want this done right, >> I'm going to have to do it myself. > Do I understand you correctly if I interpret what you would like to see > is the same format used now in these cases? Attached is a patch that does what I would consider an acceptable job of printing these datatypes only when they're interesting. I still think that this is largely a waste of code, but if people want it, this is what to do. Testing this in the regression database, it fires on (a) the entries where a binary-compatible hash function is used, and (b) all the entries associated with the GIN operator family array_ops. The latter happens because we've more or less arbitrarily crammed a bunch of opclasses into the same opfamily. One other point here is that I find messages like this a mite unreadable: function 1 (oidvector[], oidvector[]) btoidvectorcmp(oidvector,oidvector) of operator family array_ops for access methodgin If we were to go with this, I'd be strongly tempted to rearrange all four of the messages involved to put the operator or function name at the end, eg function 1 (oidvector[], oidvector[]) of operator family array_ops for access method gin: btoidvectorcmp(oidvector,oidvector) Comments? regards, tom lane diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index ec8eb74954a9cc0ec3623dc42dfed462dc1a3533..d02b58dcc2933a278959727f55d93e1e9101c1f1 100644 *** a/src/backend/catalog/dependency.c --- b/src/backend/catalog/dependency.c *************** getObjectDescription(const ObjectAddress *** 2337,2351 **** initStringInfo(&opfam); getOpFamilyDescription(&opfam, amopForm->amopfamily); ! /* ! * translator: %d is the operator strategy (a number), the ! * first %s is the textual form of the operator, and the ! * second %s is the description of the operator family. ! */ ! appendStringInfo(&buffer, _("operator %d %s of %s"), ! amopForm->amopstrategy, ! format_operator(amopForm->amopopr), ! opfam.data); pfree(opfam.data); systable_endscan(amscan); --- 2337,2372 ---- initStringInfo(&opfam); getOpFamilyDescription(&opfam, amopForm->amopfamily); ! if (opfamily_op_has_default_types(amopForm->amopfamily, ! amopForm->amoplefttype, ! amopForm->amoprighttype, ! amopForm->amopopr)) ! { ! /* ! * translator: %d is the operator strategy (a number), the ! * first %s is the textual form of the operator, and the ! * second %s is the description of the operator family. ! */ ! appendStringInfo(&buffer, _("operator %d %s of %s"), ! amopForm->amopstrategy, ! format_operator(amopForm->amopopr), ! opfam.data); ! } ! else ! { ! /* ! * translator: %d is the operator strategy (a number), the ! * first two %s's are data type names, the third %s is the ! * textual form of the operator, and the last %s is the ! * description of the operator family. ! */ ! appendStringInfo(&buffer, _("operator %d (%s, %s) %s of %s"), ! amopForm->amopstrategy, ! format_type_be(amopForm->amoplefttype), ! format_type_be(amopForm->amoprighttype), ! format_operator(amopForm->amopopr), ! opfam.data); ! } pfree(opfam.data); systable_endscan(amscan); *************** getObjectDescription(const ObjectAddress *** 2384,2398 **** initStringInfo(&opfam); getOpFamilyDescription(&opfam, amprocForm->amprocfamily); ! /* ! * translator: %d is the function number, the first %s is the ! * textual form of the function with arguments, and the second ! * %s is the description of the operator family. ! */ ! appendStringInfo(&buffer, _("function %d %s of %s"), ! amprocForm->amprocnum, ! format_procedure(amprocForm->amproc), ! opfam.data); pfree(opfam.data); systable_endscan(amscan); --- 2405,2441 ---- initStringInfo(&opfam); getOpFamilyDescription(&opfam, amprocForm->amprocfamily); ! if (opfamily_proc_has_default_types(amprocForm->amprocfamily, ! amprocForm->amproclefttype, ! amprocForm->amprocrighttype, ! amprocForm->amproc)) ! { ! /* ! * translator: %d is the function number, the first %s is ! * the textual form of the function with arguments, and ! * the second %s is the description of the operator ! * family. ! */ ! appendStringInfo(&buffer, _("function %d %s of %s"), ! amprocForm->amprocnum, ! format_procedure(amprocForm->amproc), ! opfam.data); ! } ! else ! { ! /* ! * translator: %d is the function number, the first two ! * %s's are data type names, the third %s is the textual ! * form of the function with arguments, and the last %s is ! * the description of the operator family. ! */ ! appendStringInfo(&buffer, _("function %d (%s, %s) %s of %s"), ! amprocForm->amprocnum, ! format_type_be(amprocForm->amproclefttype), ! format_type_be(amprocForm->amprocrighttype), ! format_procedure(amprocForm->amproc), ! opfam.data); ! } pfree(opfam.data); systable_endscan(amscan); diff --git a/src/backend/commands/opclasscmds.c b/src/backend/commands/opclasscmds.c index 662b9420387f90bf3c9b6e656ed7b90583319d98..48b641f35a1ab901c092feedc53b25abf4d329e3 100644 *** a/src/backend/commands/opclasscmds.c --- b/src/backend/commands/opclasscmds.c *************** static void AlterOpFamilyDrop(List *opfa *** 69,74 **** --- 69,75 ---- static void processTypesSpec(List *args, Oid *lefttype, Oid *righttype); static void assignOperTypes(OpFamilyMember *member, Oid amoid, Oid typeoid); static void assignProcTypes(OpFamilyMember *member, Oid amoid, Oid typeoid); + static Oid get_opfamily_input_type(Oid opfamily, Oid amoid); static void addFamilyMember(List **list, OpFamilyMember *member, bool isProc); static void storeOperators(List *opfamilyname, Oid amoid, Oid opfamilyoid, Oid opclassoid, *************** assignProcTypes(OpFamilyMember *member, *** 1209,1214 **** --- 1210,1374 ---- } /* + * Test whether a pg_amop entry has the default lefttype/righttype that + * would be deduced from its opfamily and operator. + * + * This is here because it must match the logic in assignOperTypes. + */ + bool + opfamily_op_has_default_types(Oid opfamily, Oid lefttype, Oid righttype, + Oid operatorId) + { + bool result = false; + Operator optup; + Form_pg_operator opform; + + /* Fetch the operator definition */ + optup = SearchSysCache1(OPEROID, ObjectIdGetDatum(operatorId)); + if (optup == NULL) + elog(ERROR, "cache lookup failed for operator %u", operatorId); + opform = (Form_pg_operator) GETSTRUCT(optup); + + /* + * Default types are the operator's input types + */ + if (opform->oprkind == 'b' && + lefttype == opform->oprleft && + righttype == opform->oprright) + result = true; + + ReleaseSysCache(optup); + + return result; + } + + /* + * Test whether a pg_amproc entry has the default lefttype/righttype that + * would be deduced from its opfamily and procedure. + * + * This is here because it must match the logic in assignProcTypes. + */ + bool + opfamily_proc_has_default_types(Oid opfamily, Oid lefttype, Oid righttype, + Oid procedureId) + { + bool result = false; + Oid amoid; + Oid opcintype; + HeapTuple opfTup; + Form_pg_opfamily opfForm; + HeapTuple proctup; + Form_pg_proc procform; + + /* We need to know the opfamily's AM to know which rules apply */ + opfTup = SearchSysCache1(OPFAMILYOID, ObjectIdGetDatum(opfamily)); + if (!HeapTupleIsValid(opfTup)) + elog(ERROR, "cache lookup failed for opfamily %u", opfamily); + opfForm = (Form_pg_opfamily) GETSTRUCT(opfTup); + + amoid = opfForm->opfmethod; + + ReleaseSysCache(opfTup); + + /* Fetch the procedure definition */ + proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(procedureId)); + if (proctup == NULL) + elog(ERROR, "cache lookup failed for function %u", procedureId); + procform = (Form_pg_proc) GETSTRUCT(proctup); + + if (amoid == BTREE_AM_OID) + { + /* Default types are the comparison procedure's input types */ + if (procform->pronargs == 2 && + lefttype == procform->proargtypes.values[0] && + righttype == procform->proargtypes.values[1]) + result = true; + } + else if (amoid == HASH_AM_OID) + { + /* Default types are the hash procedure's (single) input type */ + if (procform->pronargs == 1 && + lefttype == procform->proargtypes.values[0] && + righttype == procform->proargtypes.values[0]) + result = true; + } + else + { + /* + * For GiST and GIN, the default (and actually the only useful case) + * is for lefttype/righttype to be the opcintype of the associated + * opclass. So first we have to look that up. The support function's + * signature may not contain any instance of the opcintype. + */ + opcintype = get_opfamily_input_type(opfamily, amoid); + if (OidIsValid(opcintype) && + lefttype == opcintype && + righttype == opcintype) + result = true; + } + + ReleaseSysCache(proctup); + + return result; + } + + /* + * If the specified opfamily contains exactly one opclass, return that + * opclass's opcintype. Otherwise, return InvalidOid. + * + * GiST and GIN indexes have no particular use for opfamilies, since there + * is no expectation of common behavior across different opclasses. So + * generally this can be expected to succeed on a GiST or GIN opfamily. + */ + static Oid + get_opfamily_input_type(Oid opfamily, Oid amoid) + { + Oid result = InvalidOid; + int nmatch = 0; + Relation rel; + ScanKeyData skey[1]; + SysScanDesc scan; + HeapTuple tup; + + /* + * We must scan through all the opclasses available for the access method, + * since there's no index that would let us look at just those for the + * opfamily. + */ + rel = heap_open(OperatorClassRelationId, AccessShareLock); + + ScanKeyInit(&skey[0], + Anum_pg_opclass_opcmethod, + BTEqualStrategyNumber, F_OIDEQ, + ObjectIdGetDatum(amoid)); + + scan = systable_beginscan(rel, OpclassAmNameNspIndexId, true, + SnapshotNow, 1, skey); + + while (HeapTupleIsValid(tup = systable_getnext(scan))) + { + Form_pg_opclass opclass = (Form_pg_opclass) GETSTRUCT(tup); + + if (opclass->opcfamily == opfamily) + { + if (++nmatch > 1) + { + /* more than one opclass in family, so fail */ + result = InvalidOid; + break; + } + result = opclass->opcintype; + } + } + + systable_endscan(scan); + + heap_close(rel, AccessShareLock); + + return result; + } + + /* * Add a new family member to the appropriate list, after checking for * duplicated strategy or proc number. */ diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h index 01f271bff43f3c3d8560cb9169e85afe9545d41b..fb48530c32a9f93aea106bf9ce2147b99b8d89e4 100644 *** a/src/include/commands/defrem.h --- b/src/include/commands/defrem.h *************** extern void AlterOpFamilyNamespace(List *** 106,111 **** --- 106,115 ---- extern Oid get_am_oid(const char *amname, bool missing_ok); extern Oid get_opclass_oid(Oid amID, List *opclassname, bool missing_ok); extern Oid get_opfamily_oid(Oid amID, List *opfamilyname, bool missing_ok); + extern bool opfamily_op_has_default_types(Oid opfamily, Oid lefttype, Oid righttype, + Oid operatorId); + extern bool opfamily_proc_has_default_types(Oid opfamily, Oid lefttype, Oid righttype, + Oid procedureId); /* commands/tsearchcmds.c */ extern void DefineTSParser(List *names, List *parameters);
2011/1/16 Tom Lane <tgl@sss.pgh.pa.us>: > Comments? I think it's great you undertook the challenge of solving this problem the "proper way". I think your desire to achieve perfection in every little detail is admirable. Your patch is according to me, not far from perfect, but could be improved in a faw ways: a) pg_describe_object should always include the schema in the name, even for object in public and pg_catalog. Otherwise it's not explicitly stated whether an object relies in pg_catalog or public. If you would create a function in the public schema in a 8.3 database, with a name which does not exist in pg_catalog, when converting to version 9 in the future, which provides a few more functions in the pg_catalog schema than 8.3, there is a risk your function would conflict with the new function with the same name in version 9. The pg_catalog function would then be selected by default, unless explicitly calling it with the fully qualified name. This might or might not be what you want. Let's say you are unfortunate and unaware it's bad to name functions pg_* in the public schema, since it's possible, it could possibly lead to unexpected results. Now, let's go back to the discussion on what pg_describe_object(oid,oid,int) should return. I strongly believe it's wiser to include the schema in the description, even though it's not important nor interesting in most cases. It's good to explicitly inform the end-user (who's understanding of SQL or PostgreSQL might be limited) there is a distinct difference between public.myfunc and pg_catalog.myfunc, and it's better to always include the schema, than to "auto-detect" if there are two functions with the same name, because it's highly confusing the description depends on your "SET search_path TO" setting. Conclusions in summary: *) It would be a lot more user-friendly if pg_describe_object always returned the same text string, independent of your "search_path" setting. *) It would be a lot more "programmer-friendly" if pg_describe_object returned the same text string, independent of your "search_path" setting and at the same time always included all the unique columns of the object (including the schema would fix it in >99% of the cases, and your patch would fix the remaining <1% of the cases). *) Alternatively, it could possibly be better to introduce a new function, only returning a text string composed using only the unique columns (and some constants to do the formatting of the nice text string, resolved using the unique columns), if it's not feasible to append the schema name to the description, considering people might (unlikely, but not impossibly) programtically rely on pg_describe_object already and parse the text string, which would break a lot of things. Just some thoughts... -- Best regards, Joel Jacobson Glue Finance
Joel Jacobson <joel@gluefinance.com> writes: > a) pg_describe_object should always include the schema in the name, > even for object in public and pg_catalog. I knew you were going to demand that next, as soon as you figured out that it was an obstacle for using pg_describe_object output as a globally unique identifier. But I'm going to reply, once again, that pg_describe_object is not meant to guarantee that and we're not going to make it so. regards, tom lane
On Sun, 2011-01-16 at 14:28 -0500, Tom Lane wrote: > One other point here is that I find messages like this a mite > unreadable: > > function 1 (oidvector[], oidvector[]) btoidvectorcmp(oidvector,oidvector) of operator family array_ops for access methodgin > > If we were to go with this, I'd be strongly tempted to rearrange all > four of the messages involved to put the operator or function name > at the end, eg > > function 1 (oidvector[], oidvector[]) of operator family array_ops for access method gin: btoidvectorcmp(oidvector,oidvector) Yes, I agree with you that the second is much more readable with out without the lefttype and righttype. function 1 of operator family array_ops for access method gin: btoidvectorcmp(oidvector,oidvector) is more readable in my opinion than, function 1 btoidvectorcmp(oidvector,oidvector) of operator family array_ops for access method gin Regards, Andreas
Tom Lane <tgl@sss.pgh.pa.us> writes: > Joel Jacobson <joel@gluefinance.com> writes: >> a) pg_describe_object should always include the schema in the name, >> even for object in public and pg_catalog. > > I knew you were going to demand that next, as soon as you figured out > that it was an obstacle for using pg_describe_object output as a > globally unique identifier. But I'm going to reply, once again, that > pg_describe_object is not meant to guarantee that and we're not going to > make it so. And it's easy to pretend it's already coded this way if you're motivated enough. Just find a schema name that's not already existing: dim=# set search_path to public, utils; SET dim=# select pg_describe_object('pg_proc'::regclass, 16602, 0); pg_describe_object ---------------------------------------------------------------function unaccent_lexize(internal,internal,internal,internal) (1 row) dim=# begin; create schema dummy; set local search_path to 'dummy'; select pg_describe_object('pg_proc'::regclass, 16602,0); rollback; BEGIN CREATE SCHEMA SET pg_describe_object ---------------------------------------------------------------------function utils.unaccent_lexize(internal,internal,internal,internal) (1 row) ROLLBACK Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
2011/1/17 Tom Lane <tgl@sss.pgh.pa.us>: > Joel Jacobson <joel@gluefinance.com> writes: >> a) pg_describe_object should always include the schema in the name, >> even for object in public and pg_catalog. > > I knew you were going to demand that next, as soon as you figured out > that it was an obstacle for using pg_describe_object output as a > globally unique identifier. But I'm going to reply, once again, that > pg_describe_object is not meant to guarantee that and we're not going to > make it so. I knew you were going to say that, but I'm going to ask, once again, is it possible to introduce another function, possibly returning a differently formatted string, perhaps in JSON, or any other structured format, suitable to be parsed by an application and which keys are sorted in a way which guarantees uniqueness? Perhaps it could be named pg_get_object_identifier or something like that. It's a huge limitation when designing the type of application I'm designing, when you need to invent your own solution to the problem, to avoid the dependency on oids, which are not possible to use across different databases. Perhaps I'm the only one working with a project where a unique identifier for each object is an absolute requirement, but it sounds unlikely, quite a lot of developers must have thought about these things before. -- Best regards, Joel Jacobson Glue Finance
On Sun, Jan 16, 2011 at 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andreas Karlsson <andreas@proxel.se> writes: >> On Sat, 2011-01-15 at 10:36 -0500, Tom Lane wrote: >>> But I can read the handwriting on the wall: if I want this done right, >>> I'm going to have to do it myself. > >> Do I understand you correctly if I interpret what you would like to see >> is the same format used now in these cases? > > Attached is a patch that does what I would consider an acceptable job of > printing these datatypes only when they're interesting. I still think > that this is largely a waste of code, but if people want it, this is > what to do. Testing this in the regression database, it fires on > (a) the entries where a binary-compatible hash function is used, and > (b) all the entries associated with the GIN operator family array_ops. > The latter happens because we've more or less arbitrarily crammed a > bunch of opclasses into the same opfamily. > > One other point here is that I find messages like this a mite > unreadable: > > function 1 (oidvector[], oidvector[]) btoidvectorcmp(oidvector,oidvector) of operator family array_ops for access methodgin > > If we were to go with this, I'd be strongly tempted to rearrange all > four of the messages involved to put the operator or function name > at the end, eg > > function 1 (oidvector[], oidvector[]) of operator family array_ops for access method gin: btoidvectorcmp(oidvector,oidvector) > > Comments? I kind of wonder if it wouldn't be even better to just *delete* that from the thing altogether and write: function 1 (oidvector[], oidvector[]) of operator family array_ops for access method gin We're trying to represent the pg_amproc entry here, and including a bunch of details of the pg_proc entry to which it happens to point seems almost better to be confusing the issue. The key for pg_amproc is the operator family, the left and right argument types, and the proc number, so it seems like those are the things we ought to be printing. But if you want to keep it, by all means let's put it at the end. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Jan 16, 2011 at 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If we were to go with this, I'd be strongly tempted to rearrange all >> four of the messages involved to put the operator or function name >> at the end, eg >> >> function 1 (oidvector[], oidvector[]) of operator family array_ops for access method gin: btoidvectorcmp(oidvector,oidvector) > I kind of wonder if it wouldn't be even better to just *delete* that > from the thing altogether and write: > function 1 (oidvector[], oidvector[]) of operator family array_ops for > access method gin > We're trying to represent the pg_amproc entry here, and including a > bunch of details of the pg_proc entry to which it happens to point > seems almost better to be confusing the issue. Yeah, that occurred to me too. However, the CREATE OPERATOR CLASS syntax doesn't really draw a distinction between the referenced function/operator and its reference in the opclass, and I'm not sure users do either. So I don't want to give up the details of the function or operator. But sticking them at the end after a colon might make it clearer that the func/operator is referenced by the amproc or amop entry, but is not the same thing. regards, tom lane
On Sat, Jan 22, 2011 at 9:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, Jan 16, 2011 at 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> If we were to go with this, I'd be strongly tempted to rearrange all >>> four of the messages involved to put the operator or function name >>> at the end, eg >>> >>> function 1 (oidvector[], oidvector[]) of operator family array_ops for access method gin: btoidvectorcmp(oidvector,oidvector) > >> I kind of wonder if it wouldn't be even better to just *delete* that >> from the thing altogether and write: > >> function 1 (oidvector[], oidvector[]) of operator family array_ops for >> access method gin > >> We're trying to represent the pg_amproc entry here, and including a >> bunch of details of the pg_proc entry to which it happens to point >> seems almost better to be confusing the issue. > > Yeah, that occurred to me too. However, the CREATE OPERATOR CLASS > syntax doesn't really draw a distinction between the referenced > function/operator and its reference in the opclass, and I'm not sure > users do either. Well, I think from a user perspective the operator class machinery is approximately clear as mud, but the CREATE OPERATOR CLASS syntax surely seems to draw a distinction. It seems pretty clear we're mapping a function support number, with optional types, onto a pre-existing function. > So I don't want to give up the details of the function > or operator. But sticking them at the end after a colon might make it > clearer that the func/operator is referenced by the amproc or amop > entry, but is not the same thing. That seems like an improvement over the status quo, so do you want to go ahead and do that? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Jan 22, 2011 at 9:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So I don't want to give up the details of the function >> or operator. �But sticking them at the end after a colon might make it >> clearer that the func/operator is referenced by the amproc or amop >> entry, but is not the same thing. > That seems like an improvement over the status quo, so do you want to > go ahead and do that? Are you voting to commit the patch I presented plus that change, or do something different? regards, tom lane
On Sat, Jan 22, 2011 at 10:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sat, Jan 22, 2011 at 9:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> So I don't want to give up the details of the function >>> or operator. But sticking them at the end after a colon might make it >>> clearer that the func/operator is referenced by the amproc or amop >>> entry, but is not the same thing. > >> That seems like an improvement over the status quo, so do you want to >> go ahead and do that? > > Are you voting to commit the patch I presented plus that change, Yeah. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
I wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> We're trying to represent the pg_amproc entry here, and including a >> bunch of details of the pg_proc entry to which it happens to point >> seems almost better to be confusing the issue. > Yeah, that occurred to me too. However, the CREATE OPERATOR CLASS > syntax doesn't really draw a distinction between the referenced > function/operator and its reference in the opclass, and I'm not sure > users do either. So I don't want to give up the details of the function > or operator. But sticking them at the end after a colon might make it > clearer that the func/operator is referenced by the amproc or amop > entry, but is not the same thing. And yet ... and yet ... if you adopt the position that what we're going to print is "amproc item: referenced procedure", then it's not entirely clear why the amproc item description shouldn't be complete. The argument that it's redundant with the procedure description gets a lot weaker as soon as you look at them as two separate items. Ditto amop. And having to add a lot of otherwise-useless code to suppress the redundancy surely isn't very attractive. So I guess I'm coming around to the idea that we want something not too much bigger than Andreas' original patch, but applying to both amop and amproc, and putting the operator/function description at the end. Comments? regards, tom lane
On Sun, Jan 23, 2011 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > And yet ... and yet ... if you adopt the position that what we're going > to print is "amproc item: referenced procedure", then it's not entirely > clear why the amproc item description shouldn't be complete. The > argument that it's redundant with the procedure description gets a lot > weaker as soon as you look at them as two separate items. Ditto amop. > And having to add a lot of otherwise-useless code to suppress the > redundancy surely isn't very attractive. I couldn't agree more. Sorry if I didn't explain that concern clearly enough upthread. > So I guess I'm coming around to the idea that we want something not too > much bigger than Andreas' original patch, but applying to both amop and > amproc, and putting the operator/function description at the end. That's fine with me. I think the principal argument for failing to remove it entirely is that we've traditionally had it there, but IMHO moving in the direction of treating them as separate objects is much more clear and an altogether better approach. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Jan 23, 2011 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So I guess I'm coming around to the idea that we want something not too >> much bigger than Andreas' original patch, but applying to both amop and >> amproc, and putting the operator/function description at the end. > That's fine with me. OK, committed that way. > I think the principal argument for failing to > remove it entirely is that we've traditionally had it there, but IMHO > moving in the direction of treating them as separate objects is much > more clear and an altogether better approach. I think there's a usability argument in addition to just plain "we always did it that way". But anyway, this patch has now officially been discussed to death. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > But anyway, this patch has now officially > been discussed to death. +1 :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
2011/1/23 Dimitri Fontaine <dimitri@2ndquadrant.fr>: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> But anyway, this patch has now officially >> been discussed to death. > > +1 :) +∞ :) In the aftermath, I realized I was almost about to feel a bit ashamed about the fact my original forum post probably gave birth to the most long lived discussion in the history of PostgreSQL. Having realized this, I realized a secondly even more important fact, namely the importance of details, making the whole difference between a sloppy software project and a highly successful project with the ambition of achieving perfection in every little detail. I'm proud we can conclude ProstgreSQL is apparently a project of the second category. -- Best regards, Joel Jacobson Glue Finance
On Sun, Jan 23, 2011 at 4:24 PM, Joel Jacobson <joel@gluefinance.com> wrote: > In the aftermath, I realized I was almost about to feel a bit ashamed > about the fact my original forum post probably gave birth to the most > long lived discussion in the history of PostgreSQL. I think you'd need another order of magnitude to achieve that exalted position. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company