Thread: Schema design question
I originally was planning on asking about Postgresql inheritance, but after reading the docs again I do not think that it applies to my design. So, now I'm asking a rather basic schema design question. I have two related objects where one inherits column values from another. No, that's incorrect. The "child" receives default values from the "parent" when the child is created. A more concrete example: tables that represent classes taught at a school. The parent object is a general course description, and the child object is a specific instance of a course -- a "class" -- which is a course taught at a given time and location. A course can be taught multiple times, obviously. A course (and thus a class) can have multiple instructors -- a many-to-many relationship. So I have a link table for that. A class normally uses the course's default instructors, but may be different for specific classes instance. How would you layout the tables for somethings like this? I can think (out loud) of three ways to set this up: 1) Separate tables for "course" and "class" and when a class is created simply copy column data from the course to the class. Pro: Selects are simple Con: Column duplication in the two tables -- two tables look a lot alike Need to have duplicate link tables (one pointing to each table) 2) Create a third "common_values" table that both "course" and "class" tables reference. Then when creating a class from a course clone the common values row to a new row that the class can reference. Pro: No duplication of columns in multiple tables. Only need one linking table for instructors (but still need to create new links when creating the new row) Con: Need to always do joins on selects (not really a problem) 3) Create a single table with a flag to indicate if the row is a "course" or a "class". Pro: Simple selects and no column duplication between tables Con: Columns for a course might be ok as NULL, but would be required for a specific class. Again, a "course" and "class" are very similar. But, once a class is created from a course it really is its own entity. For example, if the course description changes in the future I don't want it to change on previous classes. There also needs to be a link between the two. For example, you might want to show a list of courses, and then see what classes are scheduled for a given course, so a class should reference its parent course. Thanks very much, -- Bill Moseley moseley@hank.org
On Wed, 2005-08-17 at 10:49 -0700, Bill Moseley wrote: > The parent object is a general course description, and the > child object is a specific instance of a course > ... > tables that represent classes taught at a > school. The parent object is a general course ... the > child object is ... a "class" -- which > is a course taught at a given time and location. A course can be > taught multiple times ... A course (and thus a class) can have > multiple instructors > > How would you layout the tables for somethings like this? create table course (id serial primary key, description varchar); create table teacher (id serial primary key, name varchar); create table course_teacher (course_id integer not null, teacher_id integer not null); alter table course_teacher add primary key (course_id, teacher_id); alter table course_teacher add foreign key (course_id) references course deferrable initially deferred; create index course_teacher_teacher_ix on course_teacher (teacher_id); alter table course_teacher add foreign key (teacher_id) references teacher deferrable initially deferred; create table class (id serial primary key, course_id integer not null, teacher_id integer not null, starts_on date, location varchar); create index class_course_ix on class (course_id); alter table class add foreign key (course_id) references course deferrable initially deferred; create index class_teacher_ix on class (teacher_id); alter table class add foreign key (teacher_id) references teacher deferrable initially deferred; > A class > normally uses the course's default instructors, but may be different > for specific classes instance. When a class is created the user first specifies course_id. At that point the app can look at course_teacher and offer the list of default teachers. In case a non-default teacher is desired the app also offers a lookup into teacher to see all available teachers. The teacher_id column of class is thus populated. Set the "start_on" date and the "location," and you're done. > I can think (out loud) of three ways to set this up: I'm sure there are many ways to get there. To me, the way I've described is the most-direct way to represent the relationships you've described.
Bill Moseley schrieb: >On Wed, Aug 17, 2005 at 10:05:39PM +0200, Sebastian Hennebrueder wrote: > > >>> Con: Column duplication in the two tables -- two tables look a lot alike >>> Need to have duplicate link tables (one pointing to each table) >>> >>> >>> >>> >>They are not duplicated. As you say later in your explanation the course >>settings may change. So the data may be the same right at the beginning >>but can differ by the time. => It is not the same data!! >> >> > >I meant that I would have two tables that look like they hold very >similar data. That's not really a problem -- just bugs me to see >duplication. Plus, it could introduce errors if I ever modified, say, >a columns type in one table and not the matching column in the other >table. > >Any opinions on which table layout you would use? > >Thanks, > > > It is still not the same data. When my name is Sebastian Hennebrueder and your name is Sebastian Hennebrueder, than we are not the same person. Class and Course is not the same, they only have accidentally the same data right at the beginning. And as they have a reference to each other you should put them in two separate tables, so that you can create a foreign key relation to enforce the relation. I do not now a good online tutorial for database normalisation but just try Google or ask here. -- Best Regards / Viele Grüße Sebastian Hennebrueder ---- http://www.laliluna.de Tutorials for JSP, JavaServer Faces, Struts, Hibernate and EJB Get support, education and consulting for these technologies - uncomplicated and cheap.
On Wed, Aug 17, 2005 at 07:41:20PM +0000, Matt Miller wrote: Thanks for responding, Matt: > create table course (id serial primary key, > description varchar); > create table teacher (id serial primary key, > name varchar); > create table course_teacher (course_id integer not null, > teacher_id integer not null); [...] > create table class (id serial primary key, > course_id integer not null, > teacher_id integer not null, > starts_on date, > location varchar); There may be more than one teacher in the class so instead I'd need another "class_teacher" link table. I guess what "bugged" me about this type of layout is that a course and class share so many columns. Duplication just looks wrong -- and I worry about changing a column type on one table and forgetting to change it on the other table. Also have to remember to copy all columns every time a specific class is created. On the other hand, if I used a single table to represent both types of entities, then selects are always going to have something like WHERE type = 'course' added onto the WHERE. That's extra processing for no good reason. > I'm sure there are many ways to get there. To me, the way I've > described is the most-direct way to represent the relationships you've > described. And thanks very much for you help. -- Bill Moseley moseley@hank.org
On Wed, 2005-08-17 at 13:40 -0700, Bill Moseley wrote: > a course > and class share so many columns. ...and > I worry about changing a column type on one table and forgetting to > change it on the other table. Postgres types might help here. You could probably create a type that contains the common columns, and then embed that type where you need it.
On Wed, 2005-08-17 at 13:40 -0700, Bill Moseley wrote: > I guess what "bugged" me about this type of layout is that a course > and class share so many columns. Duplication just looks wrong -- and > I worry about changing a column type on one table and forgetting to > change it on the other table. If your design leads you to many shared attributes between class and course, then ask yourself if the class-specific versions must always be equal to the course-defaulted versions. If yes, then consider factoring out the common columns into a third table and using foreign keys to reference that table from the other two. However, if the course merely suggests defaults for the class, and if any of these defaults can be overridden by any class, then I would definitely duplicate the columns in class. I would do this even if a large percentage of classes will never override the defaults. A similar situation arises if the child table is supposed to remember what the values of the parent were at the time the child was inserted. If the parent can change values over time but the child should be unaffected then, as before, I would copy the info into each child record.