Fwd: [SQL] Distributed keys - Mailing list pgsql-hackers
From | Michael Glaesemann |
---|---|
Subject | Fwd: [SQL] Distributed keys |
Date | |
Msg-id | C0D1F8A2-3643-11D8-97F0-000A95C88220@myrealbox.com Whole thread Raw |
List | pgsql-hackers |
Begin forwarded message: > From: Michael Glaesemann <grzm@myrealbox.com> > Date: December 23, 2003 3:42:24 PM CST > To: "'pgsql-sql@postgresql.org'" <pgsql-sql@postgresql.org> > Subject: [SQL] Distributed keys > > Hello all! > > An area of the relational database model that is not included in the > SQL standard are distributed keys and distributed foreign keys. A > quick example (Case 1): > > employees (id, name); > schools (name, location); > teachers (employee_id, school_name); > subs (employee_id); > managers (employee_id, school_name); > > with constraints > unique employees(id) > teachers(employee_id) references employees(id) > teachers(school_name) references schools(name) > subs(employee_id) references employees(id) > managers(employee_id) references employees(id) > > The idea is that employees must be in one (and only one) of either > teachers, subs, or managers. Currently, this might be represented in > something like (Case 2) > > employees (id, name, employee_type, school_name); > schools (name, location); > employee_types (type); > > with constraints > employees(employee_type) references employee_types(type) > employees(school_name) references schools(name) > > where employee_types includes "teacher", "sub", and "manager" > > Or it might be represented with a number of rules or triggers that > perform all of the necessary checking. > > employees(school_name) can't have a not null constraint because if the > employee is a sub, they aren't associated with a school. > > Using the terms "distributed key" and "foreign distributed key", in > the first case employee_id is a "distributed key" in that it must > occur in only one of the tables teachers, subs, or managers. > Distributed keys are similar in concept to primary keys—they must be > unique. This guarantees an employee_id in teachers is not found in > subs or managers, an employee_id in subs is not found in managers or > teachers, and an employee_id in managers is not found in subs or > teachers. > > employees(id) is a foreign distributed key in teachers, subs, and > managers (as employee_id). Foreign distributed keys are similar in > concept to foreign keys in that employees(id) must be referenced by a > single tuple in one of teachers, subs, or managers. > > Another use would be in this situation (something I'm working on right > now): I want to link comments by employees by employee_id, but > comments from non-employees by name (as they don't have an id). > > comments(id, comment); > comments_nonemployees(comment_id, name); > comments_employees(comment_id, employee_id); > > with constraints > comments_nonemployees(comment_id) references comments(id) > comments_employees(comment_id) references comments(id) > and comments(id) must be listed in either > comments_nonemployees(comment_id) or comments_employees(comment_id) > > I haven't looked very far into how to implement distributed keys and > foreign distributed keys in PostgreSQL beyond briefly looking at the > pg_constraint system table, thinking a distributed key would be > something making employee_id unique in teachers(employee_id) UNION > subs(employee_id) UNION managers(employee_id). A distributed key is > distributed over a number of tables, rather than a single one, so > there'd have to be a list of relid-attnum pairs, rather than a single > relid-attnum pair, such as conrelid and conkey in pg_constraint. > Here's a brief sketch of the idea: > > pg_distributed > distname name the name of the distributed key constraint > distrelid oid the relid of one of the tables involved in the > distributed keys > distkey int2[] a list of the attnum of the columns of the table with > oid distrelid involved in the distributed key > distforkey bool true if foreign distributed key > distfrelid oid if a foreign distributed key, the relid of the the > referenced table, else 0 > distfkey int2[] if a foreign distributed key, a list of the attnum of > the columns of the table with oid distfrelid referenced by the foreign > distributed key, else 0 > > In pg_distributed, distname, distfrelid, distfkey would be the same in > every tuple involved in the distributed key, while distrelid and > distkey would vary. Basically a one-to-many relation. These are just > some thoughts, and the first time I've looked at the PostgreSQL > internals, so I wouldn't be suprised if this approach is wrongheaded. > > I know PostgreSQL endeavors to adhere closely to the SQL standard, and > I think this is important as SQL *is* a standard. One of the things > that makes PostgreSQL great in my opinion is that it goes beyond the > standard in some areas where users and developers have found useful, > such as the PostgreSQL rule system. I think distributed key support > would extend the usefulness of PostgreSQL without hindering SQL > conformance. > > I'm interested in hearing what others have to say, especially along > the lines of implementation. I haven't seen much discussion of > distributed keys on the lists (other than Josh Berkus), so perhaps > there isn't much interest. Would there be any foreseeable opposition > if I (and/or others) worked on this? > > Regards, > > Michael Glaesemann > grzm myrealbox com > > > ---------------------------(end of > broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
pgsql-hackers by date: