Thread: issue about information_schema REFERENTIAL_CONSTRAINTS

issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
Hello,

I haven't found a bug management system about postgresql, so here is a
mail. Maybe this issue was already reported, sorry if it is the case.
I have seen anything about the information_schema in pg todo list.

This is tested on postgresql 8.4.4.

The REFERENTIAL_CONSTRAINTS table in the information_schema references a
constaint through its database/schema/name, but this information is not
unique, so it may identify several constraints, thus the information
derived may not be consistent.

CREATE TABLE destination(id SERIAL PRIMARY KEY);

CREATE TABLE source1
  (id SERIAL PRIMARY KEY,
   fk INTEGER CONSTRAINT to_destination REFERENCES destination);

CREATE TABLE source2
  (id SERIAL PRIMARY KEY,
   fk INTEGER CONSTRAINT to_destination REFERENCES destination);

SELECT * FROM information_schema.REFERENTIAL_CONSTRAINTS;
-- contains two identical lines

Other tables about constraints may have the same issue.

The direct result is that this table leads to false result on joins, thus
is pretty useless. Usually I have plenty of "$1" constraints.

Suggestion: constraint names could be systematically prefixed with their
corresponding table so that they are indeed unique, possibly with some
escaping: '"<double-quoted-table-name>"."constraint_name"'. I'm not sure
about the possible consequences of changing the constraint names, but as
the information is a set of views on top of pg_catalog, there may be none.

--
Fabien.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
> The REFERENTIAL_CONSTRAINTS table in the information_schema references a
> constaint through its database/schema/name, but this information is not
> unique, so it may identify several constraints, thus the information
> derived may not be consistent.

Postgres does not enforce that constraints have unique names within a
schema.  The SQL spec does say that they should be unique per-schema,
and the information_schema views are designed on that assumption.

If you use spec-compliant names for your constraints, you won't have a
problem.  If you don't, well, the information_schema views will be of
limited use to you.

> Suggestion: constraint names could be systematically prefixed with their
> corresponding table so that they are indeed unique,

We are not going to try to enforce uniqueness.  This has been debated
before, and most people like the current behavior just fine, or at least
better than the alternatives.

            regards, tom lane

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
Dear Tom,

>> The REFERENTIAL_CONSTRAINTS table in the information_schema references a
>> constaint through its database/schema/name, but this information is not
>> unique, so it may identify several constraints, thus the information
>> derived may not be consistent.
>
> Postgres does not enforce that constraints have unique names within a
> schema.  The SQL spec does say that they should be unique per-schema,
> and the information_schema views are designed on that assumption.

Hence a contradiction.

> If you use spec-compliant names for your constraints, you won't have a
> problem.  If you don't, well, the information_schema views will be of
> limited use to you.

I'm writing a schema analyzer which gives false results. I do not write
the constraints, I'm analyzing existing schemas. I cannot change it.

>> Suggestion: constraint names could be systematically prefixed with their
>> corresponding table so that they are indeed unique,
>
> We are not going to try to enforce uniqueness.

I'm not asking for uniqueness in "pg_catalog", esp as that would break
existing applications.

I'm suggesting uniqueness in the "information_schema", which can be
provided independently by some tweaking in the view construction, I think,
for instance by adding the oid of the constraint or maybe the table_name.

> This has been debated before, and most people like the current behavior
> just fine, or at least better than the alternatives.

I do not know "most people". I guess "most people" just do not use the
"information_schema", so they really do not care!

For the "few people" who do use the information_schema, I can assure you
that having a false information is a severe drawback, and it is called a
"bug".

So at least please fill in this as a "bug" somewhere, even if you do not
want to fix it.

--
Fabien.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Peter Eisentraut
Date:
On ons, 2010-09-01 at 16:22 +0200, Fabien COELHO wrote:
> I'm suggesting uniqueness in the "information_schema", which can be
> provided independently by some tweaking in the view construction, I
> think, for instance by adding the oid of the constraint or maybe the
> table_name.

The view is defined by the SQL standard.  We cannot change it.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
Dear Peter,

>> I'm suggesting uniqueness in the "information_schema", which can be
>> provided independently by some tweaking in the view construction, I
>> think, for instance by adding the oid of the constraint or maybe the
>> table_name.
>
> The view is defined by the SQL standard.

No. The result of the view (the definition of the expected attributes) is
defined in the standard. But it is really a view on top of "pg_catalog".

You are right that some views of the information_schema are defined in the
standard, but they deal with restrictions of other relations, say the
privileges for the current user...

> We cannot change it.

Yes we can!  It, it is 100% postgresql:

  \d information_schema.referential_constraints
   View "information_schema.referential_constraints"
   ...
    FROM pg_namespace ncon
    JOIN pg_constraint con ON ncon.oid = con.connamespace
    JOIN pg_class c ON con.conrelid = c.oid

--
Fabien Coelho - CRI, Maths & Systèmes, MINES ParisTech

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Peter Eisentraut
Date:
On tor, 2010-09-02 at 07:54 +0200, Fabien COELHO wrote:
> > We cannot change it.
>
> Yes we can!  It, it is 100% postgresql:
>
>   \d information_schema.referential_constraints
>    View "information_schema.referential_constraints"
>    ...
>     FROM pg_namespace ncon
>     JOIN pg_constraint con ON ncon.oid = con.connamespace
>     JOIN pg_class c ON con.conrelid = c.oid

Well, for the sake of argument, how would you propose to change it?

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
Hallo,

>>   \d information_schema.referential_constraints
>>    View "information_schema.referential_constraints"
>>    ...
>>     FROM pg_namespace ncon
>>     JOIN pg_constraint con ON ncon.oid = con.connamespace
>>     JOIN pg_class c ON con.conrelid = c.oid
>
> Well, for the sake of argument, how would you propose to change it?

For the sake of argument:

For the information_schema to be useful, expected keys & foreign keys must
work properly so that the constraints can be joined meaningfully
(otherwise, what is the point?). So any change would have to be be
consistent over all definitions.

The standard expects the triplet <catalog,schema,constraint> to be unique,
but in Pg, <catalog,schema,table,constraint> is unique (I think). Too late
to change that, obviously. So that let building a unique constraint name
just for the information_schema, which is fine with me.

(1) use the OID as the "constraint name" everywhere, it would work, it
would not look so good for display, but it is simple and fast.

(2) otherwise something built on top of <table,constraint>. To be on the
safe side, I would build a string (sql_identifier?) with something like:

     quote_ident(<table name>) || '.' || quote_ident(<constraint name>)

And the display would be reasonnable, like : "TableName"."$1" instead of a
big bunch of $1.

The affected information_schema views would be, as far as I can see:
  - check_constraint_routine_usage
  - check_constraints
  - constraint_column_usage
  - constraint_table_usage
  - domain_constraints
  - key_column_usage
  - referential_constraints
  - table_constraints
With one or two affected columns each.

Note that there may be the same issue with 'triggers' which are also
identified by a <catalog,schema,trigger> triplet. Maybe others.

I can send a patch on the information_schema definition in the coming days
if someone feel that this may deserve a "test". Otherwise, ISTM that it is
a "bug" to be recorded somewhere.

Have a nice day,

--
Fabien.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
> (1) use the OID as the "constraint name" everywhere, it would work, it
> would not look so good for display, but it is simple and fast.

> (2) otherwise something built on top of <table,constraint>. To be on the
> safe side, I would build a string (sql_identifier?) with something like:

>      quote_ident(<table name>) || '.' || quote_ident(<constraint name>)

Note that (2) fails for long names; you have to do something to
compress to NAMEDATALEN.

The big problem with either of these is that it's no longer easily
possible to extract the actual constraint name from the view.

In any case, I am fairly sure that not having the constraint_name column
show the actual constraint name is a violation of the spirit of the SQL
spec, whether or not you can claim that it meets the letter.

            regards, tom lane

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
Dear Tom,

Still for the sake of argument:

> Note that (2) fails for long names; you have to do something to
> compress to NAMEDATALEN.

Indeed.

What if the type is changed to TEXT? It is just a view after all.
How important is it to stick to "sql_identifier"?


> The big problem with either of these is that it's no longer easily
> possible to extract the actual constraint name from the view.

Sure. A function is provided to do so, say
information_schema_constraint_name_to_pg_constraint_name.

> In any case, I am fairly sure that not having the constraint_name column
> show the actual constraint name is a violation of the spirit of the SQL
> spec, whether or not you can claim that it meets the letter.

Well, one must choose between to evil:

  (1) the constraint_name is changed in the view to be unique as expected
      by the spec, and the data can be joined meaningfully, and some reliable
      information can be derived.

  (2) the constraint_name looks nice but is not unique, and
      the information in the view is ambiguous and cannot be relied upon,
      so one is back to square "postgresql supports the information_schema,
      but there is no point to query it and expecting the results to
      reflect the contents of the catalogs".

If you want to stick to both the letter and the spirit of the spec, that
would mean enforcing unique constraint names in pg and break every
applications. Not good.

ISTM that the "spirit" of the information schema is more to be useful (1)
than to look beautiful (2).

Another technical proposal, a little more subtle and with possible
underlying issues I cannot foresee: have the constraint_name be a "pair of
sql_identifiers".

--
Fabien.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> Note that (2) fails for long names; you have to do something to
>> compress to NAMEDATALEN.

> Indeed.

> What if the type is changed to TEXT? It is just a view after all.
> How important is it to stick to "sql_identifier"?

It's a view defined by the SQL standard, and one of the properties
defined by the standard is the type of that column.

> Well, one must choose between to evil:

Yeah, exactly.  I think that the current tradeoff is just fine.
If you want SQL-standard behavior, pick SQL-standard constraint names,
and there you are.

            regards, tom lane

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Yeah, exactly.  I think that the current tradeoff is just fine.
> If you want SQL-standard behavior, pick SQL-standard constraint
> names, and there you are.

I see that as the crux if it -- the current implementation *allows*
standard-conforming behavior, even though it doesn't *enforce*
conforming naming.  The proposed alternative does not allow
standard-conforming behavior.  If you're going to use something
which is PostgreSQL-specific, you may as well write your own views
or use the "native" tables and views directly.

-Kevin

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
>> Well, one must choose between to evil:
>
> Yeah, exactly.  I think that the current tradeoff is just fine.

Hmmm. I think exactly the contrary. There is no point in having a non
reliable feature.

ISTM that very few people use the information schema, and if the query
results is not reliable, it will stay this way. If you have zero user, no
one will complain, which prooves that everything is fine:-) QED.

By the way, do you use the information schema?


> If you want SQL-standard behavior, pick SQL-standard constraint names,
> and there you are.

I tried to explained that I'm analyzing other people's schemas. I cannot
ask all other people on the planet to rewrite their schemas, I pick them
as they are.


Could you register this "bug" somewhere please?

Thanks for your time.

--
Fabien.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Alvaro Herrera
Date:
Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010:

> I tried to explained that I'm analyzing other people's schemas. I cannot
> ask all other people on the planet to rewrite their schemas, I pick them
> as they are.

Maybe you shouldn't be using the information_schema in the first place.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010:
>> I tried to explained that I'm analyzing other people's schemas. I cannot
>> ask all other people on the planet to rewrite their schemas, I pick them
>> as they are.

> Maybe you shouldn't be using the information_schema in the first place.

It would probably be reasonable to put something into chapter 34 of the
docs pointing out this limitation.  I'm not sure exactly where though.
Should we invent a "compatibility issues" section in that chapter,
analogous to the ones we have for individual SQL commands?  Are there
other issues worth documenting for the information_schema views?

            regards, tom lane

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
> Maybe you shouldn't be using the information_schema in the first place.

Sure, I could write non standard code for every database instead of trying
to write a portable code which work on all of them directly:-)

I think that trying to do the portable way, under the assumption that the
standard implementation would be okay, was a reasonnable choice, even if
proved wrong afterwards.

--
Fabien.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Fabien COELHO
Date:
> If you're going to use something which is PostgreSQL-specific, you may
> as well write your own views or use the "native" tables and views
> directly.

I wish I could write portable code, if possible:-)

I'm basically writing views on top of the information_schema under the
assumption that what is expected to be a key is a key. The information
schema is *useless* otherwise as wrong tuples are built on join, and
derived information is not reliable.

I guess I must the only actual user of the information_schema, and it will
soon be back to zero user, which will be fine from the developers point of
view.

--
Fabien.

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Peter Eisentraut
Date:
On fre, 2010-09-03 at 13:53 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010:
> >> I tried to explained that I'm analyzing other people's schemas. I cannot
> >> ask all other people on the planet to rewrite their schemas, I pick them
> >> as they are.
>
> > Maybe you shouldn't be using the information_schema in the first place.
>
> It would probably be reasonable to put something into chapter 34 of the
> docs pointing out this limitation.  I'm not sure exactly where though.
> Should we invent a "compatibility issues" section in that chapter,
> analogous to the ones we have for individual SQL commands?  Are there
> other issues worth documenting for the information_schema views?

A very similar issue is already documented here:

http://developer.postgresql.org/pgdocs/postgres/infoschema-triggers.html

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Robert Haas
Date:
On Tue, Aug 31, 2010 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We are not going to try to enforce uniqueness. =A0This has been debated
> before, and most people like the current behavior just fine, or at least
> better than the alternatives.

Really?  I thought the issue was that no one had figured out how to do
it, or that no one had written the patch, not that anyone thought the
current behavior was particularly desirable.  What happens if you say
ALTER TABLE .. DROP CONSTRAINT or COMMENT ON CONSTRAINT?  You just
pick one at random?  That's really what most people want?

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Aug 31, 2010 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We are not going to try to enforce uniqueness.  This has been debated
>> before, and most people like the current behavior just fine, or at least
>> better than the alternatives.

> Really?  I thought the issue was that no one had figured out how to do
> it, or that no one had written the patch, not that anyone thought the
> current behavior was particularly desirable.  What happens if you say
> ALTER TABLE .. DROP CONSTRAINT or COMMENT ON CONSTRAINT?  You just
> pick one at random?

No, because those syntaxes constrain the choice to one single
constraint.  Perhaps if the SQL committee had designed 'em,
there'd be an issue; but they are Postgres-isms.

            regards, tom lane

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Robert Haas
Date:
On Sun, Sep 12, 2010 at 12:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Aug 31, 2010 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> We are not going to try to enforce uniqueness. =A0This has been debated
>>> before, and most people like the current behavior just fine, or at least
>>> better than the alternatives.
>
>> Really? =A0I thought the issue was that no one had figured out how to do
>> it, or that no one had written the patch, not that anyone thought the
>> current behavior was particularly desirable. =A0What happens if you say
>> ALTER TABLE .. DROP CONSTRAINT or COMMENT ON CONSTRAINT? =A0You just
>> pick one at random?
>
> No, because those syntaxes constrain the choice to one single
> constraint. =A0Perhaps if the SQL committee had designed 'em,
> there'd be an issue; but they are Postgres-isms.

Hrm.  I was thinking of this old thread, but maybe that's not the same issu=
e.

http://archives.postgresql.org/pgsql-hackers/2008-10/msg00256.php

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: issue about information_schema REFERENTIAL_CONSTRAINTS

From
Bruce Momjian
Date:
Fabien COELHO wrote:
>
> > If you're going to use something which is PostgreSQL-specific, you may
> > as well write your own views or use the "native" tables and views
> > directly.
>
> I wish I could write portable code, if possible:-)
>
> I'm basically writing views on top of the information_schema under the
> assumption that what is expected to be a key is a key. The information
> schema is *useless* otherwise as wrong tuples are built on join, and
> derived information is not reliable.
>
> I guess I must the only actual user of the information_schema, and it will
> soon be back to zero user, which will be fine from the developers point of
> view.

Based on this report and later discussion, I have applied the attached
documentation patch to warn users about the Postgres behavior of
information_schema.referential_constraints.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 52407a7..91c2dd4 100644
*** a/doc/src/sgml/information_schema.sgml
--- b/doc/src/sgml/information_schema.sgml
*************** ORDER BY c.ordinal_position;
*** 3211,3216 ****
--- 3211,3229 ----
      </tbody>
     </tgroup>
    </table>
+
+   <note>
+    <para>
+     The SQL standard requires constraint names to be unique within a
+     schema;  <productname>PostgreSQL</productname>, however, does not
+     enforce this restriction.  If duplicate-named constraints are
+     stored in the same <productname>PostgreSQL</productname> schema, a
+     standard-compliant query that expects to return one row might
+     return several, one for each matching constraint stored in the
+     specified schema.
+    </para>
+   </note>
+
   </sect1>

   <sect1 id="infoschema-role-column-grants">