Thread: BUG #5926: information schema dtd_identifier for element_types, columns, parameters views inconsistent

The following bug has been logged online:

Bug reference:      5926
Logged by:          Ingmar Brouns
Email address:      swingi@gmail.com
PostgreSQL version: 9.0.3
Operating system:   Ubuntu 9.0.4
Description:        information schema dtd_identifier for element_types,
columns, parameters views inconsistent
Details:

Hi,

I am writing a function that needs to retrieve information with respect to
the types of parameters of functions. I use the information schema for that.
The parameters view documentation states:

data_type: Data type of the parameter, if it is a built-in type, or ARRAY if
it is some array (in that case, see the view element_types)

So for arrays I will have to join with information_schema.element_types

http://www.postgresql.org/docs/9.0/static/infoschema-element-types.html

At the top op that documentation is some example code, it joins on
dtd_identifier, the code does not work:

create table public.test_table(a varchar array, b integer, c integer
array);

SELECT c.column_name, c.data_type, e.data_type AS element_type
FROM information_schema.columns c
LEFT JOIN information_schema.element_types e
ON ((c.table_catalog, c.table_schema, c.table_name, 'TABLE',
c.dtd_identifier)
   = (e.object_catalog, e.object_schema, e.object_name, e.object_type,
e.dtd_identifier))
WHERE c.table_schema = 'public' AND c.table_name = 'test_table'
ORDER BY c.ordinal_position;

 column_name | data_type | element_type
-------------+-----------+--------------
 a           | ARRAY     |
 b           | integer   |
 c           | ARRAY     |
(3 rows)

The same holds when joining with the parameters view. The reason seems to be
that the dtd_identifier of the element_types view has prepended 'a's whereas
the dtd_identifiers of the columns and parameter views do not:

select column_name,dtd_identifier
from information_schema.columns c
where c.table_schema = 'public'
and   c.table_name   = 'test_table';

 column_name | dtd_identifier
-------------+----------------
 a           | 1
 b           | 2
 c           | 3
(3 rows)

select dtd_identifier
from information_schema.element_types e
where e.object_schema = 'public'
and e.object_name = 'test_table';

 dtd_identifier
----------------
 a1
 a3
(2 rows)

The element_types view has a column 'collection_type_identifier', this
column is not present in the documentation. It is defined exactly as the
dtd_identifier, only then without the prepended 'a':

('a'::text || x.objdtdid::text)::information_schema.sql_identifier AS
dtd_identifier
             x.objdtdid       ::information_schema.sql_identifier AS
collection_type_identifier

When I modify the example code to join on this column instead, I get the
expected results:

SELECT c.column_name, c.data_type, e.data_type AS element_type
FROM information_schema.columns c LEFT JOIN information_schema.element_types
e
    ON ((c.table_catalog, c.table_schema, c.table_name, 'TABLE',
c.dtd_identifier)
      = (e.object_catalog, e.object_schema, e.object_name, e.object_type,
e.collection_type_identifier))
WHERE c.table_schema = 'public' AND c.table_name = 'test_table'
ORDER BY c.ordinal_position;

 column_name | data_type |   element_type
-------------+-----------+-------------------
 a           | ARRAY     | character varying
 b           | integer   |
 c           | ARRAY     | integer
(3 rows)



Kind regards,

Ingmar Brouns
On 11.03.2011 14:18, Ingmar Brouns wrote:
> The following bug has been logged online:
>
> Bug reference:      5926
> Logged by:          Ingmar Brouns
> Email address:      swingi@gmail.com
> PostgreSQL version: 9.0.3
> Operating system:   Ubuntu 9.0.4
> Description:        information schema dtd_identifier for element_types,
> columns, parameters views inconsistent
> Details:
>
> Hi,
>
> I am writing a function that needs to retrieve information with respect to
> the types of parameters of functions. I use the information schema for that.
> The parameters view documentation states:
>
> data_type: Data type of the parameter, if it is a built-in type, or ARRAY if
> it is some array (in that case, see the view element_types)
>
> So for arrays I will have to join with information_schema.element_types
>
> http://www.postgresql.org/docs/9.0/static/infoschema-element-types.html
>
> At the top op that documentation is some example code, it joins on
> dtd_identifier, the code does not work:
>
> create table public.test_table(a varchar array, b integer, c integer
> array);
>
> SELECT c.column_name, c.data_type, e.data_type AS element_type
> FROM information_schema.columns c
> LEFT JOIN information_schema.element_types e
> ON ((c.table_catalog, c.table_schema, c.table_name, 'TABLE',
> c.dtd_identifier)
>     = (e.object_catalog, e.object_schema, e.object_name, e.object_type,
> e.dtd_identifier))
> WHERE c.table_schema = 'public' AND c.table_name = 'test_table'
> ORDER BY c.ordinal_position;
>
>   column_name | data_type | element_type
> -------------+-----------+--------------
>   a           | ARRAY     |
>   b           | integer   |
>   c           | ARRAY     |
> (3 rows)
>
> The same holds when joining with the parameters view. The reason seems to be
> that the dtd_identifier of the element_types view has prepended 'a's whereas
> the dtd_identifiers of the columns and parameter views do not:
>
> select column_name,dtd_identifier
> from information_schema.columns c
> where c.table_schema = 'public'
> and   c.table_name   = 'test_table';
>
>   column_name | dtd_identifier
> -------------+----------------
>   a           | 1
>   b           | 2
>   c           | 3
> (3 rows)
>
> select dtd_identifier
> from information_schema.element_types e
> where e.object_schema = 'public'
> and e.object_name = 'test_table';
>
>   dtd_identifier
> ----------------
>   a1
>   a3
> (2 rows)
>
> The element_types view has a column 'collection_type_identifier', this
> column is not present in the documentation. It is defined exactly as the
> dtd_identifier, only then without the prepended 'a':
>
> ('a'::text || x.objdtdid::text)::information_schema.sql_identifier AS
> dtd_identifier
>               x.objdtdid       ::information_schema.sql_identifier AS
> collection_type_identifier
>
> When I modify the example code to join on this column instead, I get the
> expected results:
>
> SELECT c.column_name, c.data_type, e.data_type AS element_type
> FROM information_schema.columns c LEFT JOIN information_schema.element_types
> e
>      ON ((c.table_catalog, c.table_schema, c.table_name, 'TABLE',
> c.dtd_identifier)
>        = (e.object_catalog, e.object_schema, e.object_name, e.object_type,
> e.collection_type_identifier))
> WHERE c.table_schema = 'public' AND c.table_name = 'test_table'
> ORDER BY c.ordinal_position;
>
>   column_name | data_type |   element_type
> -------------+-----------+-------------------
>   a           | ARRAY     | character varying
>   b           | integer   |
>   c           | ARRAY     | integer
> (3 rows)
>
>

Many thanks for figuring this out.

I think we should fix the documentation here.

Best Regards,

Susanne Ebrecht

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com


On Wed, May 25, 2011 at 7:29 AM, Susanne Ebrecht
<susanne@2ndquadrant.com> wrote:
> I think we should fix the documentation here.

Patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

> On 11.03.2011 14:18, Ingmar Brouns wrote:
>> ... The reason seems to be
>> that the dtd_identifier of the element_types view has prepended 'a's whereas
>> the dtd_identifiers of the columns and parameter views do not:
>>
>> select column_name,dtd_identifier
>> from information_schema.columns c
>> where c.table_schema = 'public'
>> and   c.table_name   = 'test_table';
>>
>> column_name | dtd_identifier
>> -------------+----------------
>> a           | 1
>> b           | 2
>> c           | 3
>> (3 rows)
>>
>> select dtd_identifier
>> from information_schema.element_types e
>> where e.object_schema = 'public'
>> and e.object_name = 'test_table';
>>
>> dtd_identifier
>> ----------------
>> a1
>> a3
>> (2 rows)
>>
>> The element_types view has a column 'collection_type_identifier', this
>> column is not present in the documentation. It is defined exactly as the
>> dtd_identifier, only then without the prepended 'a':

The omission of collection_type_identifier from the docs is clearly a
doc bug.  However, it looks to me like you've identified an error in the
view definition, not only a doc bug.  I think the values of the
dtd_identifier and collection_type_identifier columns are swapped, ie,
we ought to be prepending 'a' to the collection_type_identifier not the
dtd_identifier.  As far as I can tell from the spec, dtd_identifier
ought to be the identifier of the element type, while
collection_type_identifier should be a made-up identifier for the array
type.  That would make the sample query given in the docs correct.

If my analysis is correct, we really ought to try to fix this in time
for beta2, since there's no way to fix it without a forced initdb.

Comments?

            regards, tom lane
On ons, 2011-06-08 at 11:09 -0400, Tom Lane wrote:
> The omission of collection_type_identifier from the docs is clearly a
> doc bug.  However, it looks to me like you've identified an error in the
> view definition, not only a doc bug.  I think the values of the
> dtd_identifier and collection_type_identifier columns are swapped, ie,
> we ought to be prepending 'a' to the collection_type_identifier not the
> dtd_identifier.  As far as I can tell from the spec, dtd_identifier
> ought to be the identifier of the element type, while
> collection_type_identifier should be a made-up identifier for the array
> type.  That would make the sample query given in the docs correct.

Yes, we need to switch those two columns around and change the
documentation.

> If my analysis is correct, we really ought to try to fix this in time
> for beta2, since there's no way to fix it without a forced initdb.

I can take care of this later today.
On ons, 2011-06-08 at 20:38 +0300, Peter Eisentraut wrote:
> On ons, 2011-06-08 at 11:09 -0400, Tom Lane wrote:
> > The omission of collection_type_identifier from the docs is clearly a
> > doc bug.  However, it looks to me like you've identified an error in the
> > view definition, not only a doc bug.  I think the values of the
> > dtd_identifier and collection_type_identifier columns are swapped, ie,
> > we ought to be prepending 'a' to the collection_type_identifier not the
> > dtd_identifier.  As far as I can tell from the spec, dtd_identifier
> > ought to be the identifier of the element type, while
> > collection_type_identifier should be a made-up identifier for the array
> > type.  That would make the sample query given in the docs correct.
>
> Yes, we need to switch those two columns around and change the
> documentation.
>
> > If my analysis is correct, we really ought to try to fix this in time
> > for beta2, since there's no way to fix it without a forced initdb.
>
> I can take care of this later today.

On fifth reading, I think the implementation of the information schema
is correct, but the documentation is wrong.

It was broken in commit 8e1ccad5:

commit 8e1ccad51901e83916dae297cd9afa450957a36c
Author: Bruce Momjian <bruce@momjian.us>
Date:   Tue Feb 20 18:47:25 2007 +0000

    Update information_schema documentation to match system tables.
    Backpatch to 8.2.X.

diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 67ce709..8d0b8e4 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/information_schema.sgml,v 1.31 2007/02/01 00:28:17 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/information_schema.sgml,v 1.32 2007/02/20 18:47:25 momjian Exp $ -->

 <chapter id="information-schema">
  <title>The Information Schema</title>
@@ -1876,7 +1876,7 @@
 SELECT c.column_name, c.data_type, e.data_type AS element_type
 FROM information_schema.columns c LEFT JOIN information_schema.element_types e
      ON ((c.table_catalog, c.table_schema, c.table_name, 'TABLE', c.dtd_identifier)
-       = (e.object_catalog, e.object_schema, e.object_name, e.object_type, e.array_type_identifier))
+       = (e.object_catalog, e.object_schema, e.object_name, e.object_type, e.dtd_identifier))
 WHERE c.table_schema = '...' AND c.table_name = '...'
 ORDER BY c.ordinal_position;
 </programlisting>
@@ -1936,13 +1936,11 @@ ORDER BY c.ordinal_position;
      </row>

      <row>
-      <entry><literal>array_type_identifier</literal></entry>
+      <entry><literal>dtd_identifier</literal></entry>
       <entry><type>sql_identifier</type></entry>
       <entry>
        The identifier of the data type descriptor of the array being
-       described.  Use this to join with the
-       <literal>dtd_identifier</literal> columns of other information
-       schema views.
+       described
       </entry>
      </row>

@@ -2097,13 +2095,6 @@ ORDER BY c.ordinal_position;
       <entry>Always null, because arrays always have unlimited maximum cardinality in
<productname>PostgreSQL</></entry>
      </row>

-     <row>
-      <entry><literal>dtd_identifier</literal></entry>
-      <entry><type>sql_identifier</type></entry>
-      <entry>
-       An identifier of the data type descriptor of the element.  This
-       is currently not useful.
-      </entry>
      </row>
     </tbody>
    </tgroup>


This needs to be reverted and array_type_identifier (the SQL:1999 name)
updated to collection_type_identifier.
On ons, 2011-06-08 at 11:09 -0400, Tom Lane wrote:
> As far as I can tell from the spec, dtd_identifier
> ought to be the identifier of the element type, while
> collection_type_identifier should be a made-up identifier for the
> array type.  That would make the sample query given in the docs
> correct.

But it's collection_type_identifier that you join against, say, columns,
so you can resolve the element type of column types.  So
collection_type_identifier must match the dtd_identifier schema of
columns etc., whereas element_types.dtd_identifier is not joinable
against anything.