Thread: triggers vs b-tree

triggers vs b-tree

From

gustavo halperin

Date:

26 January 2007, 00:34:48

Hello I have a design question:

  I have a table representing Families, and a table representing Persons.
 The table Family have a row family_id as primary key.
 The table Person have a row person_id as primary key and contain also a
row family_id.
 As you can understand, the row family_id in a table ficha_person is not
unique, I mean is the same for all the family person's.

 So my question is: If most of the time I need to find all the persons
for one asked family what is the best way to do that?
 I think about two methods:
         1-  Making a b-tree index in ficha_person with the rows
family_id and person_id.
         2 - Adding an array in the table ficha_family containing the
persons of this family. And creating a Trigger that update this array
for each  person insert/delete in the table ficha_family.

 So ..., what do you think? There are a better solution or what of the
above solutions is better ??

  Thank you in advance,
           Gustavo.

Tables:

CREATE SEQUENCE ficha_person_id_seq;
CREATE TABLE ficha_person (
  person_id    integer         DEFAULT nextval('ficha_person_id_seq')
CONSTRAINT the_pers
on_id PRIMARY KEY,
  family_id    integer         ,--CONSTRAINT the_family_id
 .....
 ) WITH OIDS;


 CREATE SEQUENCE ficha_family_id_seq;
 CREATE TABLE ficha_family (
  family_id    integer         DEFAULT nextval('ficha_family_id_seq')
CONSTRAINT the_fami
ly_id PRIMARY KEY,
  person_id   integer[],  --- Optionally, instead of using b-tree index.
 .....
 ) WITH OIDS;

Re: triggers vs b-tree

From

"Joris Dobbelsteen"

Date:

26 January 2007, 09:57:25

I believe you should design it in a slightly different way:

>-----Original Message-----
>From: pgsql-general-owner@postgresql.org
>[mailto:pgsql-general-owner@postgresql.org] On Behalf Of
>gustavo halperin
>Sent: donderdag 25 januari 2007 21:34
>To: pgsql-general@postgresql.org
>Subject: [GENERAL] triggers vs b-tree
>
>Hello I have a design question:
>
>  I have a table representing Families, and a table
>representing Persons.
>
> So my question is: If most of the time I need to find all the
>persons for one asked family what is the best way to do that?
> I think about two methods:
>         1-  Making a b-tree index in ficha_person with the
>rows family_id and person_id.
>         2 - Adding an array in the table ficha_family
>containing the persons of this family. And creating a Trigger
>that update this array for each  person insert/delete in the
>table ficha_family.

It seems you are quite new (or unfamiliar) to databases.
Are you familiar with "Database Normalization"?

Read this:
http://databases.about.com/od/specificproducts/a/normalization.htm
(Handles all you need)
http://www.phlonx.com/resources/nf3/
(Gives better examples up to 3th normal form)
http://en.wikipedia.org/wiki/Database_normalization
(Wikipedia does not seem to be too good, but it works)

After reading this carefully you should know what option you should
take. The answer is right in there, but there is a lot more useful
information too.

- Joris

Re: triggers vs b-tree

From

Tomas Vondra

Date:

26 January 2007, 15:25:58

> Hello I have a design question:
>
>  I have a table representing Families, and a table representing Persons.
> The table Family have a row family_id as primary key.
> The table Person have a row person_id as primary key and contain also a
> row family_id.
> As you can understand, the row family_id in a table ficha_person is not
> unique, I mean is the same for all the family person's.
>
> So my question is: If most of the time I need to find all the persons
> for one asked family what is the best way to do that?
> I think about two methods:
>         1-  Making a b-tree index in ficha_person with the rows
> family_id and person_id.
>         2 - Adding an array in the table ficha_family containing the
> persons of this family. And creating a Trigger that update this array
> for each  person insert/delete in the table ficha_family.
>
> So ..., what do you think? There are a better solution or what of the
> above solutions is better ??
>
>  Thank you in advance,
>           Gustavo.
>

The trigger/array seems to me as a premature optimization - if you are
not sure the index is 'too slow' (and there's no other way to speed it
up) don't do it.

You should always have a foreign key in Person(family_id) referencing
the Family(family_id) as you need to reinforce data integrity between
these two tables, and the 'rule of a thumb' is to have indexes on
foreign keys in large tables. The reason is pretty simple - the
PostgreSQL does a simple query when checking the foreign key.

So if the Person table is 'small' (less than for example 1000 rows) and
it will not grow too much, there's no need to use an index (as it won't
be used for small tables) and the queries to get all the family members
will be very fast too.

On the other side, if the Person table is 'large' (say more than 10.000
rows), then there should be an index on Person(family_id). Then it
depends on your requirements - the most important things to consider are
these:

1) Will the application be mostly used to select or update?

   The trigger adds (small) overhead to modifications, but if you do
   mostly selects this may not be a problem.

2) Do you need only IDs of the family members, or do you need all the
   data from Person table?

   The trigger/array solution gives you only IDs and you'll have to
   fetch the data in a separate query (most probably). The array simply
   complicates the queries.

Anyway, I would try to stick with the foreign key / index solution as
long as possible. If you are not happy with the speed do some benchmarks
with the trigger / array solution and compare them to the foreign key /
index. Try to do some other optimizations too (for example cluster the
Person table along the family_id column - that usually means a serious
performance boost).

Tomas