Thread: triggers vs b-tree
Hello I have a design question: I have a table representing Families, and a table representing Persons. The table Family have a row family_id as primary key. The table Person have a row person_id as primary key and contain also a row family_id. As you can understand, the row family_id in a table ficha_person is not unique, I mean is the same for all the family person's. So my question is: If most of the time I need to find all the persons for one asked family what is the best way to do that? I think about two methods: 1- Making a b-tree index in ficha_person with the rows family_id and person_id. 2 - Adding an array in the table ficha_family containing the persons of this family. And creating a Trigger that update this array for each person insert/delete in the table ficha_family. So ..., what do you think? There are a better solution or what of the above solutions is better ?? Thank you in advance, Gustavo. Tables: CREATE SEQUENCE ficha_person_id_seq; CREATE TABLE ficha_person ( person_id integer DEFAULT nextval('ficha_person_id_seq') CONSTRAINT the_pers on_id PRIMARY KEY, family_id integer ,--CONSTRAINT the_family_id ..... ) WITH OIDS; CREATE SEQUENCE ficha_family_id_seq; CREATE TABLE ficha_family ( family_id integer DEFAULT nextval('ficha_family_id_seq') CONSTRAINT the_fami ly_id PRIMARY KEY, person_id integer[], --- Optionally, instead of using b-tree index. ..... ) WITH OIDS;
I believe you should design it in a slightly different way: >-----Original Message----- >From: pgsql-general-owner@postgresql.org >[mailto:pgsql-general-owner@postgresql.org] On Behalf Of >gustavo halperin >Sent: donderdag 25 januari 2007 21:34 >To: pgsql-general@postgresql.org >Subject: [GENERAL] triggers vs b-tree > >Hello I have a design question: > > I have a table representing Families, and a table >representing Persons. > > So my question is: If most of the time I need to find all the >persons for one asked family what is the best way to do that? > I think about two methods: > 1- Making a b-tree index in ficha_person with the >rows family_id and person_id. > 2 - Adding an array in the table ficha_family >containing the persons of this family. And creating a Trigger >that update this array for each person insert/delete in the >table ficha_family. It seems you are quite new (or unfamiliar) to databases. Are you familiar with "Database Normalization"? Read this: http://databases.about.com/od/specificproducts/a/normalization.htm (Handles all you need) http://www.phlonx.com/resources/nf3/ (Gives better examples up to 3th normal form) http://en.wikipedia.org/wiki/Database_normalization (Wikipedia does not seem to be too good, but it works) After reading this carefully you should know what option you should take. The answer is right in there, but there is a lot more useful information too. - Joris
> Hello I have a design question: > > I have a table representing Families, and a table representing Persons. > The table Family have a row family_id as primary key. > The table Person have a row person_id as primary key and contain also a > row family_id. > As you can understand, the row family_id in a table ficha_person is not > unique, I mean is the same for all the family person's. > > So my question is: If most of the time I need to find all the persons > for one asked family what is the best way to do that? > I think about two methods: > 1- Making a b-tree index in ficha_person with the rows > family_id and person_id. > 2 - Adding an array in the table ficha_family containing the > persons of this family. And creating a Trigger that update this array > for each person insert/delete in the table ficha_family. > > So ..., what do you think? There are a better solution or what of the > above solutions is better ?? > > Thank you in advance, > Gustavo. > The trigger/array seems to me as a premature optimization - if you are not sure the index is 'too slow' (and there's no other way to speed it up) don't do it. You should always have a foreign key in Person(family_id) referencing the Family(family_id) as you need to reinforce data integrity between these two tables, and the 'rule of a thumb' is to have indexes on foreign keys in large tables. The reason is pretty simple - the PostgreSQL does a simple query when checking the foreign key. So if the Person table is 'small' (less than for example 1000 rows) and it will not grow too much, there's no need to use an index (as it won't be used for small tables) and the queries to get all the family members will be very fast too. On the other side, if the Person table is 'large' (say more than 10.000 rows), then there should be an index on Person(family_id). Then it depends on your requirements - the most important things to consider are these: 1) Will the application be mostly used to select or update? The trigger adds (small) overhead to modifications, but if you do mostly selects this may not be a problem. 2) Do you need only IDs of the family members, or do you need all the data from Person table? The trigger/array solution gives you only IDs and you'll have to fetch the data in a separate query (most probably). The array simply complicates the queries. Anyway, I would try to stick with the foreign key / index solution as long as possible. If you are not happy with the speed do some benchmarks with the trigger / array solution and compare them to the foreign key / index. Try to do some other optimizations too (for example cluster the Person table along the family_id column - that usually means a serious performance boost). Tomas