Re: Foreign Keys Constraints, perforamance analysis - Mailing list pgsql-general

From Stephan Szabo
Subject Re: Foreign Keys Constraints, perforamance analysis
Date
Msg-id 001301c0fcff$ec2d36a0$02de010a@myst.com
Whole thread Raw
In response to Foreign Keys Constraints, perforamance analysis  (Daniel Åkerud <zilch@home.se>)
List pgsql-general
> > There's a BIG difference between
> > delete from person where name='foo' compared to
> > delete from person where name='foo'; delete from married where ... ;
> delete
> > from child where ...;
> > and
> > delete from person; compared to
> > delete from person; delete from married; delete from child;
>
> I can see that,
> In the first case there are a hell lot of overhead sending the queries.
>
> > In the first case, the system sees either 1 statement that expands into
3
> > statements effectively versus 3 statements.  Not too different.
>
> ok...
>
> > In the second case the system sees 1 statement + 1 statement per row
> versus
> > 3 statements.
>
> I can't see what you mean here... "+ 1 statement per row"... there is only
> one row?

Well, if the parent table has 100 rows, the delete from table turns into 101
statements pretty much.
It basically turns into:
delete from parent;
delete from married where <keys from first person row>
delete from married where <keys from second person row>
...
etc...
In your example with N*2 rows in person and N rows in married_fkc and
N*2 rows in child_fkc, with the bulk delete in person is going to go N*2
deletes from married_fkc and N deletes in child_fkc.

> What is the difference between these two (only comparing the tables with
> foreign keys constraits now):
>
> DELETE FROM PERSON;
>
> and
>
> DELETE FROM PERSON where id = 1;
> DELETE FROM PERSON where id = 2;
>
> The only thing I can see (which I assume is what I do wrong here), is that
> there is a lot of overhead sending the queries. If we ignore the overhead
in
> our conversation, what is the difference?

It's not the cost on the delete from person, it's the cost on the delete
from married,
and it's not the sending overhead, it's the processing overhead to find the
particular
row you want to delete over and over again. It costs more to find a row by
index
value than to just get the next row in the table (generally).  So, in the
first case
you can get all the rows sequentially.  In the second you have to pay the
find
a row by index cost for every row.  [Either delete 100 rows sequentially or
delete 1 row by index 100 times.  The latter is more expensive.]

The difference for purposes of "real world" testing is that you generally
don't do bulk
deletes like that very often and isn't at all what's being optimized for,
and the cost of (taking only the first level):
delete from person where id=1;
delete from married_fkc where person1ID=1;
delete from married_fkc where person2ID=1;
(or even where person1ID=1 or person2ID=1)
and the fk version as:
delete from person where id=1;
*should* be reasonably similar.  If it's not, that's a definate performance
problem.



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


pgsql-general by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Harddisk performance degrading over time?
Next
From: Paul
Date:
Subject: Re[6]: Postgres is too slow?