Thread: Fwd: [SQL] Distributed keys

Fwd: [SQL] Distributed keys

From
Michael Glaesemann
Date:

Begin forwarded message:

> From: Michael Glaesemann <grzm@myrealbox.com>
> Date: December 23, 2003 3:42:24 PM CST
> To: "'pgsql-sql@postgresql.org'" <pgsql-sql@postgresql.org>
> Subject: [SQL] Distributed keys
>
> Hello all!
>
> An area of the relational database model that is not included in the
> SQL standard are distributed keys and distributed foreign keys. A
> quick example (Case 1):
>
> employees (id, name);
> schools (name, location);
> teachers (employee_id, school_name);
> subs (employee_id);
> managers (employee_id, school_name);
>
> with constraints
> unique employees(id)
> teachers(employee_id) references employees(id)
> teachers(school_name) references schools(name)
> subs(employee_id) references employees(id)
> managers(employee_id) references employees(id)
>
> The idea is that employees must be in one (and only one) of either
> teachers, subs, or managers. Currently, this might be represented in
> something like (Case 2)
>
> employees (id, name, employee_type, school_name);
> schools (name, location);
> employee_types (type);
>
> with constraints
> employees(employee_type) references employee_types(type)
> employees(school_name) references schools(name)
>
> where employee_types includes "teacher", "sub", and "manager"
>
> Or it might be represented with a number of rules or triggers that
> perform all of the necessary checking.
>
> employees(school_name) can't have a not null constraint because if the
> employee is a sub, they aren't associated with a school.
>
> Using the terms "distributed key" and "foreign distributed key", in
> the first case employee_id is a "distributed key" in that it must
> occur in only one of the tables teachers, subs, or managers.
> Distributed keys are similar in concept to primary keys—they must be
> unique. This guarantees an employee_id in teachers is not found in
> subs or managers, an employee_id in subs is not found in managers or
> teachers, and an employee_id in managers is not found in subs or
> teachers.
>
> employees(id) is a foreign distributed key in teachers, subs, and
> managers (as employee_id). Foreign distributed keys are similar in
> concept to foreign keys in that employees(id) must be referenced by a
> single tuple in one of teachers, subs, or managers.
>
> Another use would be in this situation (something I'm working on right
> now): I want to link comments by employees by employee_id, but
> comments from non-employees by name (as they don't have an id).
>
> comments(id, comment);
> comments_nonemployees(comment_id, name);
> comments_employees(comment_id, employee_id);
>
> with constraints
> comments_nonemployees(comment_id) references comments(id)
> comments_employees(comment_id) references comments(id)
> and comments(id) must be listed in either
> comments_nonemployees(comment_id) or comments_employees(comment_id)
>
> I haven't looked very far into how to implement distributed keys and
> foreign distributed keys in PostgreSQL beyond briefly looking at the
> pg_constraint system table, thinking a distributed key would be
> something making employee_id unique in teachers(employee_id) UNION
> subs(employee_id) UNION managers(employee_id). A distributed key is
> distributed over a number of tables, rather than a single one, so
> there'd have to be a list of relid-attnum pairs, rather than a single
> relid-attnum pair, such as conrelid and conkey in pg_constraint.
> Here's a brief sketch of the idea:
>
> pg_distributed
>     distname    name    the name of the distributed key constraint
>     distrelid    oid    the relid of one of the tables involved in the
> distributed keys
>     distkey    int2[]    a list of the attnum of the columns of the table with
> oid distrelid involved in the distributed key
>     distforkey    bool    true if foreign distributed key
>     distfrelid    oid    if a foreign distributed key, the relid of the the
> referenced table, else 0
>     distfkey    int2[]    if a foreign distributed key, a list of the attnum of
> the columns of the table with oid distfrelid referenced by the foreign
> distributed key, else 0
>
> In pg_distributed, distname, distfrelid, distfkey would be the same in
> every tuple involved in the distributed key, while distrelid and
> distkey would vary. Basically a one-to-many relation. These are just
> some thoughts, and the first time I've looked at the PostgreSQL
> internals, so I wouldn't be suprised if this approach is wrongheaded.
>
> I know PostgreSQL endeavors to adhere closely to the SQL standard, and
> I think this is important as SQL *is* a standard. One of the things
> that makes PostgreSQL great in my opinion is that it goes beyond the
> standard in some areas where users and developers have found useful,
> such as the PostgreSQL rule system. I think distributed key support
> would extend the usefulness of PostgreSQL without hindering SQL
> conformance.
>
> I'm interested in hearing what others have to say, especially along
> the lines of implementation. I haven't seen much discussion of
> distributed keys on the lists (other than Josh Berkus), so perhaps
> there isn't much interest. Would there be any foreseeable opposition
> if I (and/or others) worked on this?
>
> Regards,
>
> Michael Glaesemann
> grzm myrealbox com
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>