Thread: issue about information_schema REFERENTIAL_CONSTRAINTS
Hello, I haven't found a bug management system about postgresql, so here is a mail. Maybe this issue was already reported, sorry if it is the case. I have seen anything about the information_schema in pg todo list. This is tested on postgresql 8.4.4. The REFERENTIAL_CONSTRAINTS table in the information_schema references a constaint through its database/schema/name, but this information is not unique, so it may identify several constraints, thus the information derived may not be consistent. CREATE TABLE destination(id SERIAL PRIMARY KEY); CREATE TABLE source1 (id SERIAL PRIMARY KEY, fk INTEGER CONSTRAINT to_destination REFERENCES destination); CREATE TABLE source2 (id SERIAL PRIMARY KEY, fk INTEGER CONSTRAINT to_destination REFERENCES destination); SELECT * FROM information_schema.REFERENTIAL_CONSTRAINTS; -- contains two identical lines Other tables about constraints may have the same issue. The direct result is that this table leads to false result on joins, thus is pretty useless. Usually I have plenty of "$1" constraints. Suggestion: constraint names could be systematically prefixed with their corresponding table so that they are indeed unique, possibly with some escaping: '"<double-quoted-table-name>"."constraint_name"'. I'm not sure about the possible consequences of changing the constraint names, but as the information is a set of views on top of pg_catalog, there may be none. -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: > The REFERENTIAL_CONSTRAINTS table in the information_schema references a > constaint through its database/schema/name, but this information is not > unique, so it may identify several constraints, thus the information > derived may not be consistent. Postgres does not enforce that constraints have unique names within a schema. The SQL spec does say that they should be unique per-schema, and the information_schema views are designed on that assumption. If you use spec-compliant names for your constraints, you won't have a problem. If you don't, well, the information_schema views will be of limited use to you. > Suggestion: constraint names could be systematically prefixed with their > corresponding table so that they are indeed unique, We are not going to try to enforce uniqueness. This has been debated before, and most people like the current behavior just fine, or at least better than the alternatives. regards, tom lane
Dear Tom, >> The REFERENTIAL_CONSTRAINTS table in the information_schema references a >> constaint through its database/schema/name, but this information is not >> unique, so it may identify several constraints, thus the information >> derived may not be consistent. > > Postgres does not enforce that constraints have unique names within a > schema. The SQL spec does say that they should be unique per-schema, > and the information_schema views are designed on that assumption. Hence a contradiction. > If you use spec-compliant names for your constraints, you won't have a > problem. If you don't, well, the information_schema views will be of > limited use to you. I'm writing a schema analyzer which gives false results. I do not write the constraints, I'm analyzing existing schemas. I cannot change it. >> Suggestion: constraint names could be systematically prefixed with their >> corresponding table so that they are indeed unique, > > We are not going to try to enforce uniqueness. I'm not asking for uniqueness in "pg_catalog", esp as that would break existing applications. I'm suggesting uniqueness in the "information_schema", which can be provided independently by some tweaking in the view construction, I think, for instance by adding the oid of the constraint or maybe the table_name. > This has been debated before, and most people like the current behavior > just fine, or at least better than the alternatives. I do not know "most people". I guess "most people" just do not use the "information_schema", so they really do not care! For the "few people" who do use the information_schema, I can assure you that having a false information is a severe drawback, and it is called a "bug". So at least please fill in this as a "bug" somewhere, even if you do not want to fix it. -- Fabien.
On ons, 2010-09-01 at 16:22 +0200, Fabien COELHO wrote: > I'm suggesting uniqueness in the "information_schema", which can be > provided independently by some tweaking in the view construction, I > think, for instance by adding the oid of the constraint or maybe the > table_name. The view is defined by the SQL standard. We cannot change it.
Dear Peter, >> I'm suggesting uniqueness in the "information_schema", which can be >> provided independently by some tweaking in the view construction, I >> think, for instance by adding the oid of the constraint or maybe the >> table_name. > > The view is defined by the SQL standard. No. The result of the view (the definition of the expected attributes) is defined in the standard. But it is really a view on top of "pg_catalog". You are right that some views of the information_schema are defined in the standard, but they deal with restrictions of other relations, say the privileges for the current user... > We cannot change it. Yes we can! It, it is 100% postgresql: \d information_schema.referential_constraints View "information_schema.referential_constraints" ... FROM pg_namespace ncon JOIN pg_constraint con ON ncon.oid = con.connamespace JOIN pg_class c ON con.conrelid = c.oid -- Fabien Coelho - CRI, Maths & Systèmes, MINES ParisTech
On tor, 2010-09-02 at 07:54 +0200, Fabien COELHO wrote: > > We cannot change it. > > Yes we can! It, it is 100% postgresql: > > \d information_schema.referential_constraints > View "information_schema.referential_constraints" > ... > FROM pg_namespace ncon > JOIN pg_constraint con ON ncon.oid = con.connamespace > JOIN pg_class c ON con.conrelid = c.oid Well, for the sake of argument, how would you propose to change it?
Hallo, >> \d information_schema.referential_constraints >> View "information_schema.referential_constraints" >> ... >> FROM pg_namespace ncon >> JOIN pg_constraint con ON ncon.oid = con.connamespace >> JOIN pg_class c ON con.conrelid = c.oid > > Well, for the sake of argument, how would you propose to change it? For the sake of argument: For the information_schema to be useful, expected keys & foreign keys must work properly so that the constraints can be joined meaningfully (otherwise, what is the point?). So any change would have to be be consistent over all definitions. The standard expects the triplet <catalog,schema,constraint> to be unique, but in Pg, <catalog,schema,table,constraint> is unique (I think). Too late to change that, obviously. So that let building a unique constraint name just for the information_schema, which is fine with me. (1) use the OID as the "constraint name" everywhere, it would work, it would not look so good for display, but it is simple and fast. (2) otherwise something built on top of <table,constraint>. To be on the safe side, I would build a string (sql_identifier?) with something like: quote_ident(<table name>) || '.' || quote_ident(<constraint name>) And the display would be reasonnable, like : "TableName"."$1" instead of a big bunch of $1. The affected information_schema views would be, as far as I can see: - check_constraint_routine_usage - check_constraints - constraint_column_usage - constraint_table_usage - domain_constraints - key_column_usage - referential_constraints - table_constraints With one or two affected columns each. Note that there may be the same issue with 'triggers' which are also identified by a <catalog,schema,trigger> triplet. Maybe others. I can send a patch on the information_schema definition in the coming days if someone feel that this may deserve a "test". Otherwise, ISTM that it is a "bug" to be recorded somewhere. Have a nice day, -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: > (1) use the OID as the "constraint name" everywhere, it would work, it > would not look so good for display, but it is simple and fast. > (2) otherwise something built on top of <table,constraint>. To be on the > safe side, I would build a string (sql_identifier?) with something like: > quote_ident(<table name>) || '.' || quote_ident(<constraint name>) Note that (2) fails for long names; you have to do something to compress to NAMEDATALEN. The big problem with either of these is that it's no longer easily possible to extract the actual constraint name from the view. In any case, I am fairly sure that not having the constraint_name column show the actual constraint name is a violation of the spirit of the SQL spec, whether or not you can claim that it meets the letter. regards, tom lane
Dear Tom, Still for the sake of argument: > Note that (2) fails for long names; you have to do something to > compress to NAMEDATALEN. Indeed. What if the type is changed to TEXT? It is just a view after all. How important is it to stick to "sql_identifier"? > The big problem with either of these is that it's no longer easily > possible to extract the actual constraint name from the view. Sure. A function is provided to do so, say information_schema_constraint_name_to_pg_constraint_name. > In any case, I am fairly sure that not having the constraint_name column > show the actual constraint name is a violation of the spirit of the SQL > spec, whether or not you can claim that it meets the letter. Well, one must choose between to evil: (1) the constraint_name is changed in the view to be unique as expected by the spec, and the data can be joined meaningfully, and some reliable information can be derived. (2) the constraint_name looks nice but is not unique, and the information in the view is ambiguous and cannot be relied upon, so one is back to square "postgresql supports the information_schema, but there is no point to query it and expecting the results to reflect the contents of the catalogs". If you want to stick to both the letter and the spirit of the spec, that would mean enforcing unique constraint names in pg and break every applications. Not good. ISTM that the "spirit" of the information schema is more to be useful (1) than to look beautiful (2). Another technical proposal, a little more subtle and with possible underlying issues I cannot foresee: have the constraint_name be a "pair of sql_identifiers". -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> Note that (2) fails for long names; you have to do something to >> compress to NAMEDATALEN. > Indeed. > What if the type is changed to TEXT? It is just a view after all. > How important is it to stick to "sql_identifier"? It's a view defined by the SQL standard, and one of the properties defined by the standard is the type of that column. > Well, one must choose between to evil: Yeah, exactly. I think that the current tradeoff is just fine. If you want SQL-standard behavior, pick SQL-standard constraint names, and there you are. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah, exactly. I think that the current tradeoff is just fine. > If you want SQL-standard behavior, pick SQL-standard constraint > names, and there you are. I see that as the crux if it -- the current implementation *allows* standard-conforming behavior, even though it doesn't *enforce* conforming naming. The proposed alternative does not allow standard-conforming behavior. If you're going to use something which is PostgreSQL-specific, you may as well write your own views or use the "native" tables and views directly. -Kevin
>> Well, one must choose between to evil: > > Yeah, exactly. I think that the current tradeoff is just fine. Hmmm. I think exactly the contrary. There is no point in having a non reliable feature. ISTM that very few people use the information schema, and if the query results is not reliable, it will stay this way. If you have zero user, no one will complain, which prooves that everything is fine:-) QED. By the way, do you use the information schema? > If you want SQL-standard behavior, pick SQL-standard constraint names, > and there you are. I tried to explained that I'm analyzing other people's schemas. I cannot ask all other people on the planet to rewrite their schemas, I pick them as they are. Could you register this "bug" somewhere please? Thanks for your time. -- Fabien.
Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010: > I tried to explained that I'm analyzing other people's schemas. I cannot > ask all other people on the planet to rewrite their schemas, I pick them > as they are. Maybe you shouldn't be using the information_schema in the first place. -- Ãlvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010: >> I tried to explained that I'm analyzing other people's schemas. I cannot >> ask all other people on the planet to rewrite their schemas, I pick them >> as they are. > Maybe you shouldn't be using the information_schema in the first place. It would probably be reasonable to put something into chapter 34 of the docs pointing out this limitation. I'm not sure exactly where though. Should we invent a "compatibility issues" section in that chapter, analogous to the ones we have for individual SQL commands? Are there other issues worth documenting for the information_schema views? regards, tom lane
> Maybe you shouldn't be using the information_schema in the first place. Sure, I could write non standard code for every database instead of trying to write a portable code which work on all of them directly:-) I think that trying to do the portable way, under the assumption that the standard implementation would be okay, was a reasonnable choice, even if proved wrong afterwards. -- Fabien.
> If you're going to use something which is PostgreSQL-specific, you may > as well write your own views or use the "native" tables and views > directly. I wish I could write portable code, if possible:-) I'm basically writing views on top of the information_schema under the assumption that what is expected to be a key is a key. The information schema is *useless* otherwise as wrong tuples are built on join, and derived information is not reliable. I guess I must the only actual user of the information_schema, and it will soon be back to zero user, which will be fine from the developers point of view. -- Fabien.
On fre, 2010-09-03 at 13:53 -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010: > >> I tried to explained that I'm analyzing other people's schemas. I cannot > >> ask all other people on the planet to rewrite their schemas, I pick them > >> as they are. > > > Maybe you shouldn't be using the information_schema in the first place. > > It would probably be reasonable to put something into chapter 34 of the > docs pointing out this limitation. I'm not sure exactly where though. > Should we invent a "compatibility issues" section in that chapter, > analogous to the ones we have for individual SQL commands? Are there > other issues worth documenting for the information_schema views? A very similar issue is already documented here: http://developer.postgresql.org/pgdocs/postgres/infoschema-triggers.html
On Tue, Aug 31, 2010 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > We are not going to try to enforce uniqueness. =A0This has been debated > before, and most people like the current behavior just fine, or at least > better than the alternatives. Really? I thought the issue was that no one had figured out how to do it, or that no one had written the patch, not that anyone thought the current behavior was particularly desirable. What happens if you say ALTER TABLE .. DROP CONSTRAINT or COMMENT ON CONSTRAINT? You just pick one at random? That's really what most people want? --=20 Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Aug 31, 2010 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> We are not going to try to enforce uniqueness. This has been debated >> before, and most people like the current behavior just fine, or at least >> better than the alternatives. > Really? I thought the issue was that no one had figured out how to do > it, or that no one had written the patch, not that anyone thought the > current behavior was particularly desirable. What happens if you say > ALTER TABLE .. DROP CONSTRAINT or COMMENT ON CONSTRAINT? You just > pick one at random? No, because those syntaxes constrain the choice to one single constraint. Perhaps if the SQL committee had designed 'em, there'd be an issue; but they are Postgres-isms. regards, tom lane
On Sun, Sep 12, 2010 at 12:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Aug 31, 2010 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> We are not going to try to enforce uniqueness. =A0This has been debated >>> before, and most people like the current behavior just fine, or at least >>> better than the alternatives. > >> Really? =A0I thought the issue was that no one had figured out how to do >> it, or that no one had written the patch, not that anyone thought the >> current behavior was particularly desirable. =A0What happens if you say >> ALTER TABLE .. DROP CONSTRAINT or COMMENT ON CONSTRAINT? =A0You just >> pick one at random? > > No, because those syntaxes constrain the choice to one single > constraint. =A0Perhaps if the SQL committee had designed 'em, > there'd be an issue; but they are Postgres-isms. Hrm. I was thinking of this old thread, but maybe that's not the same issu= e. http://archives.postgresql.org/pgsql-hackers/2008-10/msg00256.php --=20 Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Fabien COELHO wrote: > > > If you're going to use something which is PostgreSQL-specific, you may > > as well write your own views or use the "native" tables and views > > directly. > > I wish I could write portable code, if possible:-) > > I'm basically writing views on top of the information_schema under the > assumption that what is expected to be a key is a key. The information > schema is *useless* otherwise as wrong tuples are built on join, and > derived information is not reliable. > > I guess I must the only actual user of the information_schema, and it will > soon be back to zero user, which will be fine from the developers point of > view. Based on this report and later discussion, I have applied the attached documentation patch to warn users about the Postgres behavior of information_schema.referential_constraints. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml index 52407a7..91c2dd4 100644 *** a/doc/src/sgml/information_schema.sgml --- b/doc/src/sgml/information_schema.sgml *************** ORDER BY c.ordinal_position; *** 3211,3216 **** --- 3211,3229 ---- </tbody> </tgroup> </table> + + <note> + <para> + The SQL standard requires constraint names to be unique within a + schema; <productname>PostgreSQL</productname>, however, does not + enforce this restriction. If duplicate-named constraints are + stored in the same <productname>PostgreSQL</productname> schema, a + standard-compliant query that expects to return one row might + return several, one for each matching constraint stored in the + specified schema. + </para> + </note> + </sect1> <sect1 id="infoschema-role-column-grants">