Re: Distributed keys - Mailing list pgsql-sql
From | Chris Travers |
---|---|
Subject | Re: Distributed keys |
Date | |
Msg-id | 008c01c3ca91$9b3b2880$1d00053d@winxp Whole thread Raw |
In response to | Distributed keys (Michael Glaesemann <grzm@myrealbox.com>) |
List | pgsql-sql |
It seems to me that if the inheritance/fireign key behavior was changed so that foreign key constraints could exist within the entire inheritance tree, this problem would be solved. According to previous posts, the behavior will probably change at some point but does not appear to be a priority at the moment. If it were possible to allow FK constraints to work against the inheritance tree rather than a single table therein you could have managers, teachers, and subs as tables inherited tables from employees and the problem would be solved. Currently a workaround I can see is: Hide the actual tables in a shadow schema, and inherit as above. Have each table be represented as a view in the public schema joining the table to another table storing the employee unique identifiers. Place unique constraints on the unique identifiers table. Create rules for inserting, updating, and deleting the records. Have the Employee view search the entire inheritance tree. However, this is assuming that the data you are storing for the employees differs substantially depending on position. If this is not the case, you would do better by having a single employee table and include a field indicating whether the employee is a manager, teacher, or sub. Best Wishes, Chris Travers ----- Original Message ----- From: "Michael Glaesemann" <grzm@myrealbox.com> To: <pgsql-sql@postgresql.org> Sent: Wednesday, December 24, 2003 4:42 AM Subject: [SQL] Distributed keys Hello all! An area of the relational database model that is not included in the SQL standard are distributed keys and distributed foreign keys. A quick example (Case 1): employees (id, name); schools (name, location); teachers (employee_id, school_name); subs (employee_id); managers (employee_id, school_name); with constraints unique employees(id) teachers(employee_id) references employees(id) teachers(school_name) references schools(name) subs(employee_id) references employees(id) managers(employee_id) references employees(id) The idea is that employees must be in one (and only one) of either teachers, subs, or managers. Currently, this might be represented in something like (Case 2) employees (id, name, employee_type, school_name); schools (name, location); employee_types (type); with constraints employees(employee_type) references employee_types(type) employees(school_name) references schools(name) where employee_types includes "teacher", "sub", and "manager" Or it might be represented with a number of rules or triggers that perform all of the necessary checking. employees(school_name) can't have a not null constraint because if the employee is a sub, they aren't associated with a school. Using the terms "distributed key" and "foreign distributed key", in the first case employee_id is a "distributed key" in that it must occur in only one of the tables teachers, subs, or managers. Distributed keys are similar in concept to primary keys—they must be unique. This guarantees an employee_id in teachers is not found in subs or managers, an employee_id in subs is not found in managers or teachers, and an employee_id in managers is not found in subs or teachers. employees(id) is a foreign distributed key in teachers, subs, and managers (as employee_id). Foreign distributed keys are similar in concept to foreign keys in that employees(id) must be referenced by a single tuple in one of teachers, subs, or managers. Another use would be in this situation (something I'm working on right now): I want to link comments by employees by employee_id, but comments from non-employees by name (as they don't have an id). comments(id, comment); comments_nonemployees(comment_id, name); comments_employees(comment_id, employee_id); with constraints comments_nonemployees(comment_id) references comments(id) comments_employees(comment_id) references comments(id) and comments(id) must be listed in either comments_nonemployees(comment_id) or comments_employees(comment_id) I haven't looked very far into how to implement distributed keys and foreign distributed keys in PostgreSQL beyond briefly looking at the pg_constraint system table, thinking a distributed key would be something making employee_id unique in teachers(employee_id) UNION subs(employee_id) UNION managers(employee_id). A distributed key is distributed over a number of tables, rather than a single one, so there'd have to be a list of relid-attnum pairs, rather than a single relid-attnum pair, such as conrelid and conkey in pg_constraint. Here's a brief sketch of the idea: pg_distributed distname name the name of the distributed key constraint distrelid oid the relid of one of the tables involved in the distributed keys distkey int2[] a list of the attnum of the columns of the table with oid distrelid involved in the distributed key distforkey bool true if foreign distributed key distfrelid oid if a foreign distributed key, the relid of the the referenced table, else 0 distfkey int2[] if a foreign distributed key, a list of the attnum of the columns of the table with oid distfrelid referenced by the foreign distributed key, else 0 In pg_distributed, distname, distfrelid, distfkey would be the same in every tuple involved in the distributed key, while distrelid and distkey would vary. Basically a one-to-many relation. These are just some thoughts, and the first time I've looked at the PostgreSQL internals, so I wouldn't be suprised if this approach is wrongheaded. I know PostgreSQL endeavors to adhere closely to the SQL standard, and I think this is important as SQL *is* a standard. One of the things that makes PostgreSQL great in my opinion is that it goes beyond the standard in some areas where users and developers have found useful, such as the PostgreSQL rule system. I think distributed key support would extend the usefulness of PostgreSQL without hindering SQL conformance. I'm interested in hearing what others have to say, especially along the lines of implementation. I haven't seen much discussion of distributed keys on the lists (other than Josh Berkus), so perhaps there isn't much interest. Would there be any foreseeable opposition if I (and/or others) worked on this? Regards, Michael Glaesemann grzm myrealbox com ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)