Re: machine-parseable object descriptions - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: machine-parseable object descriptions
Date
Msg-id 20130320034625.GB3747@alvh.no-ip.org
Whole thread Raw
In response to Re: machine-parseable object descriptions  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: machine-parseable object descriptions
Re: machine-parseable object descriptions
Re: machine-parseable object descriptions
List pgsql-hackers
Dimitri Fontaine wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > I could also live with keeping the schema column as proposed, if people
> > think it has a use, but letting it be redundant with a schema name
> > included in the identity string.  But it seems like a bad idea to try to
> > shear schema off of identity.
>
> +1
>
> Use case for keeping the extra column: replication to a federated
> database where you want to tweak the target schema.

If I understood our IM discussion correctly, you were saying that for
federated database replication you wanted a separate "name" column, from
which you could extract a table name easily; not that you wanted a
separate schema column.  Anyway the schema column is also present.  We
can easily remove columns before commit, if we decide we don't want
them.

In the attached patch, we have these three columns: a "name" column,
which is the object's bare name; a "schema" column, which is the schema;
and an "identity" column, which is the whole thing, with all the schema
qualifications that apply.  There's also the type, of course.

I added the name column because it seems to me that it is genuinely
useful for some use cases.  However, there are two problems: first, the
original implementation had a slight bug in the case of column objects
(it displayed the name of the relation, not the name of the column), and
two I was afraid it'd be easily misused.  One way to attack both things
at once would to be make it NULL except in the cases where it's a truly
unique identifier (tables, schemas, etc).  To avoid this being a
standalone "whitelist" of catalogs (which would get outdated quickly, I
fear), I propose to add a new boolean option in ObjectProperty, which
specifies whether the name can be used as an identifier.  I have
implemented it that way in the attached patch.


The new identity column is amazingly verbose on things like pg_amproc entries:
10650 | 1 (pg_catalog.point, pg_catalog.point) of pg_catalog.point_ops for gist:
pg_catalog.gist_point_consistent(pg_catalog.internal,pg_catalog.point,integer,pg_catalog.oid,pg_catalog.internal)10651|
2(pg_catalog.point, pg_catalog.point) of pg_catalog.point_ops for gist:
pg_catalog.gist_box_union(pg_catalog.internal,pg_catalog.internal)10652| 3 (pg_catalog.point, pg_catalog.point) of
pg_catalog.point_opsfor gist: pg_catalog.gist_point_compress(pg_catalog.internal)10653 | 4 (pg_catalog.point,
pg_catalog.point)of pg_catalog.point_ops for gist: pg_catalog.gist_box_decompress(pg_catalog.internal)10654 | 5
(pg_catalog.point,pg_catalog.point) of pg_catalog.point_ops for gist:
pg_catalog.gist_box_penalty(pg_catalog.internal,pg_catalog.internal,pg_catalog.internal)10655| 6 (pg_catalog.point,
pg_catalog.point)of pg_catalog.point_ops for gist:
pg_catalog.gist_box_picksplit(pg_catalog.internal,pg_catalog.internal)10656| 7 (pg_catalog.point, pg_catalog.point) of
pg_catalog.point_opsfor gist: pg_catalog.gist_box_same(pg_catalog.box,pg_catalog.box,pg_catalog.internal)10657 | 8
(pg_catalog.point,pg_catalog.point) of pg_catalog.point_ops for gist:
pg_catalog.gist_point_distance(pg_catalog.internal,pg_catalog.point,integer,pg_catalog.oid)

Also, note how types with standard-specified names ("integer") are not
qualified (which is the right thing, AFAICT).   Here's another interesting
example:

alvherre=# create type public.integer as enum ('uno', 'dos');
CREATE TYPE

alvherre=# select * from pg_identify_object('pg_type'::regclass, 'integer'::regtype, 0);type |   schema   | name |
identity 
------+------------+------+----------type | pg_catalog | int4 | integer
(1 fila)

alvherre=# select * from pg_identify_object('pg_type'::regclass, 'public.integer'::regtype, 0);type | schema |   name
|     identity      
------+--------+-----------+------------------type | public | "integer" | public."integer"
(1 fila)

If you create a public.int4 type, there's no confusion either, so it's
all consistent.

Here's another bit of sample output, from pg_depend contents (at the
left there's the referencing object, at the right the referenced
object):

alvherre=# select deptype, refd.*, ref.* from pg_depend, lateral (select * from pg_identify_object(classid, objid,
objsubid)) refd, lateral (select * from pg_identify_object(refclassid, refobjid, refobjsubid)) ref where classid <> 0
andrefd.schema <> 'pg_catalog' and ref.schema <> 'information_schema' and refd.schema <> 'pg_toast';deptype |
type       |    schema    |     name     |                           identity                           |     type
|   schema    |    name     |                 identity                   

---------+-------------------+--------------+--------------+--------------------------------------------------------------+--------------+--------------+-------------+-------------------------------------------a
     | domain constraint | public       |              | "my constr" on public.mydom                                  |
type        | public       | mydom       | public.mydomi       | type              | "the schema" | "the table"  | "the
schema"."thetable"                                     | table        | "the schema" | "the table" | "the schema"."the
table"i      | type              | "the schema" | "_the table" | "the schema"."the table"[]
     | type         | "the schema" | "the table" | "the schema"."the table"i       | type              | public       |
qx          | public.qx                                                    | table        | public       | qx
|public.qxi       | type              | public       | _qx          | public.qx[]
          | type         | public       | qx          | public.qxa       | table constraint  | "the schema" |
  | "the table_another column_check" on "the schema"."the table" | table column | "the schema" | "the table" | "the
schema"."thetable"."another column"n       | table constraint  | "the schema" |              | "the table_another
column_check"on "the schema"."the table" | table column | "the schema" | "the table" | "the schema"."the
table"."anothercolumn"i       | type              | public       | _integer     | public."integer"[]
                      | type         | public       | "integer"   | public."integer"i       | type              |
public      | _int4        | public.int4[]                                                | type         | public
|int4        | public.int4 
(9 filas)


alvherre=# \d "the schema"."the table"       Tabla «the schema.the table»   Columna     |  Tipo   | Modificadores
----------------+---------+---------------the column     | integer | another column | integer |
Restricciones CHECK:   "the table_another column_check" CHECK ("another column" > 0)

All in all, I'm happy with this and I'm considering committing it as
soon as we agree on the set of columns.  I'm mildly on the side of
removing the separate schema column and keeping name, so we'd have
type/name/identity.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Greg Smith
Date:
Subject: Re: Enabling Checksums
Next
From: Alvaro Herrera
Date:
Subject: Re: machine-parseable object descriptions