Thread: FOREIGN KEYS vs PERFORMANCE

FOREIGN KEYS vs PERFORMANCE

From
"Rodrigo Sakai"
Date:
  Hi,
 
  I think this is an old question, but I want to know if it really is well worth to not create some foreign keys an deal with the referential integrity at application-level?????
  Specifically, the system we are developing is a server/cliente architecture that the server is the database and the fat client is an application developed in DELPHI!!!
 
  Thanks in advance!!

Re: FOREIGN KEYS vs PERFORMANCE

From
Scott Marlowe
Date:
On Tue, 2006-04-11 at 14:13, Rodrigo Sakai wrote:
>   Hi,
>
>   I think this is an old question, but I want to know if it really is
> well worth to not create some foreign keys an deal with the
> referential integrity at application-level?????
>   Specifically, the system we are developing is a server/cliente
> architecture that the server is the database and the fat client is an
> application developed in DELPHI!!!
>

If ref integrity is important, you'll have to do it either in the app or
the database.

Almost always, it's faster to let the database do it, as there's less
traffic across the wire required to maintain ref integrity, plus, the
guys who wrote the database have spent years making sure race conditions
won't scram your data.

For simple, straight forward FK->PK relationships, you will likely NOT
be able to beat the database in terms of either reliability or
performance with your own code.

Re: FOREIGN KEYS vs PERFORMANCE

From
Michael Glaesemann
Date:
On Apr 12, 2006, at 4:13 , Rodrigo Sakai wrote:

>   I think this is an old question, but I want to know if it really
> is well worth to not create some foreign keys an deal with the
> referential integrity at application-level?????

If I had to choose between one or the other, I'd leave all
referential integrity in the database and deal with the errors thrown
when referential integrity is violated in the application. PostgreSQL
is designed to handle these kinds of issues. Anything you code in
your application is more likely to contain bugs or miss corner cases
that would allow referential integrity to be violated. PostgreSQL has
been pounded on for years by a great many users and developers,
making the likelihood of bugs still remaining much smaller.

Of course, you can add some referential integrity checks in your
application code, but those should be in addition to your database-
level checks.

Michael Glaesemann
grzm myrealbox com




Re: FOREIGN KEYS vs PERFORMANCE

From
"Jim C. Nasby"
Date:
On Wed, Apr 12, 2006 at 08:06:17AM +0900, Michael Glaesemann wrote:
>
> On Apr 12, 2006, at 4:13 , Rodrigo Sakai wrote:
>
> >  I think this is an old question, but I want to know if it really
> >is well worth to not create some foreign keys an deal with the
> >referential integrity at application-level?????
>
> If I had to choose between one or the other, I'd leave all
> referential integrity in the database and deal with the errors thrown
> when referential integrity is violated in the application. PostgreSQL
> is designed to handle these kinds of issues. Anything you code in
> your application is more likely to contain bugs or miss corner cases
> that would allow referential integrity to be violated. PostgreSQL has
> been pounded on for years by a great many users and developers,
> making the likelihood of bugs still remaining much smaller.

It's also pretty unlikely that you can make RI in the application
perform better than in the database.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: FOREIGN KEYS vs PERFORMANCE

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> On Wed, Apr 12, 2006 at 08:06:17AM +0900, Michael Glaesemann wrote:
>> ... Anything you code in
>> your application is more likely to contain bugs or miss corner cases
>> that would allow referential integrity to be violated. PostgreSQL has
>> been pounded on for years by a great many users and developers,
>> making the likelihood of bugs still remaining much smaller.

> It's also pretty unlikely that you can make RI in the application
> perform better than in the database.

I think the traditional assumption among the "you should do RI in the
application" crowd is that the application has higher-level knowledge
that lets it understand when it can skip doing an RI check entirely.
Skipping an RI check is always faster than doing it --- so that's right,
it's faster.  As long as you don't make any mistakes.

The question you have to ask yourself is whether you are really that
smart ... not just today, but every single time.  To quote Clint
Eastwood: "Do you feel lucky punk?  Well, do you?"

            regards, tom lane

Re: FOREIGN KEYS vs PERFORMANCE

From
PFC
Date:
>>   I think this is an old question, but I want to know if it really is
>> well worth to not create some foreign keys an deal with the referential
>> integrity at application-level?????

    Trust me : do it in the application and you'll enter a world of hurt. I'm
doing it with some mysql apps, and it's a nightmare ; doing cascaded
delete's by hand, etc, you always forget something, you have to modify a
million places in your code everytime you add a new table, your ORM
bloats, you get to write cleanup cron scripts which take forever to run,
your website crashes etc.


Re: FOREIGN KEYS vs PERFORMANCE

From
Markus Schaber
Date:
Hi, Michael,
Hi, Rodrigo,

Michael Glaesemann wrote:

> If I had to choose between one or the other, I'd leave all  referential
> integrity in the database and deal with the errors thrown  when
> referential integrity is violated in the application. PostgreSQL  is
> designed to handle these kinds of issues. Anything you code in  your
> application is more likely to contain bugs or miss corner cases  that
> would allow referential integrity to be violated. PostgreSQL has  been
> pounded on for years by a great many users and developers,  making the
> likelihood of bugs still remaining much smaller.

I strictly agree with Michael here.

> Of course, you can add some referential integrity checks in your
> application code, but those should be in addition to your database-
> level checks.

Agree. It does make sense to have reference checks in the UI or
application level for the sake of better error handling, but the
database should be the mandatory judge.

There's another advantage of database based checking: Should there ever
be the need of a different application working on the same database (e.
G. an "expert level UI", or some connector that connects / synchronizes
to another software, or a data import tool), database based constraints
cannot be broken opposed to application based ones.

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: FOREIGN KEYS vs PERFORMANCE

From
"Craig A. James"
Date:
>>>   I think this is an old question, but I want to know if it really
>>> is  well worth to not create some foreign keys an deal with the
>>> referential  integrity at application-level?????
>
>
>     Trust me : do it in the application and you'll enter a world of
> hurt. I'm  doing it with some mysql apps, and it's a nightmare ; doing
> cascaded  delete's by hand, etc, you always forget something, you have
> to modify a  million places in your code everytime you add a new table,
> your ORM  bloats, you get to write cleanup cron scripts which take
> forever to run,  your website crashes etc.

All good advice, but... there are no absolutes in this world.  Application-enforced referential integrity makes sense
if(and probably ONLY if): 

1. You have only one application that modifies the data.  (Otherwise, you have to duplicate the rules across many
applications,leading to a code-maintenance nightmare). 

2. If your application crashes and leaves a mess, it's not a catastrophe, and you have a good way to clean it up.  For
example,a bank shouldn't do this, but it might be OK for a computer-aided-design application, or the backend of a news
website. 

3. You have application-specific knowledge about when you can skip referential integrity and thereby greatly improve
performance. For example, you may have batch operations where large numbers of rows are temporarily inconsistent. 

If your application doesn't meet ALL of these criteria, you probably should use the database for referential integrity.

Craig

Re: FOREIGN KEYS vs PERFORMANCE

From
"Rodrigo Sakai"
Date:
  Thanks for all help!! But my problem is with performance, I agree with all
of you, the RI must be maintained by the database, because a bunch of
reasons that everyone knows!
  But, I'm dealing with a very huge database that servers more than 200
clientes at the same time, and because of it, each manipulation (delete,
insert, update, select) on the database have a poor performance. So, if we
deal with RI in each client station, we take this work off the database!
  The application is an ERP developed with DELPHI + (postgresql or oracle or
sql server)!!

  Thanks again!!

----- Original Message -----
From: "Markus Schaber" <schabi@logix-tt.com>
To: <pgsql-performance@postgresql.org>
Cc: "Rodrigo Sakai" <rodrigo.sakai@zanthus.com.br>
Sent: Wednesday, April 12, 2006 10:18 AM
Subject: Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE


> Hi, Michael,
> Hi, Rodrigo,
>
> Michael Glaesemann wrote:
>
> > If I had to choose between one or the other, I'd leave all  referential
> > integrity in the database and deal with the errors thrown  when
> > referential integrity is violated in the application. PostgreSQL  is
> > designed to handle these kinds of issues. Anything you code in  your
> > application is more likely to contain bugs or miss corner cases  that
> > would allow referential integrity to be violated. PostgreSQL has  been
> > pounded on for years by a great many users and developers,  making the
> > likelihood of bugs still remaining much smaller.
>
> I strictly agree with Michael here.
>
> > Of course, you can add some referential integrity checks in your
> > application code, but those should be in addition to your database-
> > level checks.
>
> Agree. It does make sense to have reference checks in the UI or
> application level for the sake of better error handling, but the
> database should be the mandatory judge.
>
> There's another advantage of database based checking: Should there ever
> be the need of a different application working on the same database (e.
> G. an "expert level UI", or some connector that connects / synchronizes
> to another software, or a data import tool), database based constraints
> cannot be broken opposed to application based ones.
>
> HTH,
> Markus
> --
> Markus Schaber | Logical Tracking&Tracing International AG
> Dipl. Inf.     | Software Development GIS
>
> Fight against software patents in EU! www.ffii.org
www.nosoftwarepatents.org
>


Re: FOREIGN KEYS vs PERFORMANCE

From
"Merlin Moncure"
Date:
On 4/11/06, Rodrigo Sakai <rodrigo.sakai@zanthus.com.br> wrote:
>
>   Hi,
>
>   I think this is an old question, but I want to know if it really is well
> worth to not create some foreign keys an deal with the referential integrity
> at application-level?????
>   Specifically, the system we are developing is a server/cliente
> architecture that the server is the database and the fat client is an
> application developed in DELPHI!!!
>
>   Thanks in advance!!

Delphi IMO is the best RAD win32 IDE ever invented (especially when
paired the very excellent Zeos connection objects).  However, for
purposes of data management ,imperative languages, Delphi included,
simply suck.  Great form editor though.

merlin

Re: FOREIGN KEYS vs PERFORMANCE

From
Michael Glaesemann
Date:
On Apr 12, 2006, at 23:49 , Rodrigo Sakai wrote:

>   Thanks for all help!! But my problem is with performance, I agree
> with all
> of you, the RI must be maintained by the database, because a bunch of
> reasons that everyone knows!

You've gotten a variety of good advice from a number of people. For
more specific advice (e.g., for your particular situation), it would
be very helpful if you could provide examples of queries that aren't
performing well for you (including table schema and explain analyze
output).

Michael Glaesemann
grzm myrealbox com




Re: FOREIGN KEYS vs PERFORMANCE

From
PFC
Date:
    What kinds of operations are made slow by foreign key checks ? Is it :
    - Simple checks on INSERT ?
    - Simple checks on UPDATE ?
    - Cascaded deletes ?
    - Cascaded updates ?
    - Locks ?
    - Something else ?

    Foreign keys are to ensure that the value in a column is always part of a
specific set (the referenced table). If this set changes very rarely, like
the list of countries in the world, or the list of states in a country, or
the various possible states an order can be in (received, processed,
shipped...) then it's alright to skip the check if you're sure your
application will insert a valid value. For some other situations, doing
the check in the application will need some queries and could be slower
(and certainly will be more complicated...)

    Are you sure you're not missing a few indexes, which would then force
fkey checks to use sequential scans ?

>   Thanks for all help!! But my problem is with performance, I agree with
> all
> of you, the RI must be maintained by the database, because a bunch of
> reasons that everyone knows!
>   But, I'm dealing with a very huge database that servers more than 200
> clientes at the same time, and because of it, each manipulation (delete,
> insert, update, select) on the database have a poor performance. So, if
> we
> deal with RI in each client station, we take this work off the database!
>   The application is an ERP developed with DELPHI + (postgresql or
> oracle or
> sql server)!!

Re: FOREIGN KEYS vs PERFORMANCE

From
Scott Marlowe
Date:
On Wed, 2006-04-12 at 09:49, Rodrigo Sakai wrote:
>   Thanks for all help!! But my problem is with performance, I agree with all
> of you, the RI must be maintained by the database, because a bunch of
> reasons that everyone knows!
>   But, I'm dealing with a very huge database that servers more than 200
> clientes at the same time, and because of it, each manipulation (delete,
> insert, update, select) on the database have a poor performance. So, if we
> deal with RI in each client station, we take this work off the database!
>   The application is an ERP developed with DELPHI + (postgresql or oracle or
> sql server)!!

These are separate issues.

One is performance of PostgreSQL handling FK->PK relationships.
PostgreSQL, in my experience, is quite fast at this.  However, there are
ways you can set up FK->PK relationships that are non-optimal and will
result in poor performance.  FK->PK relationships are generally fastest
when they are 1-1 and based on integer types.  If there's a type
mismatch, or you use slower types, like large text fields, or numerics,
you may have poor performance.  Give us a sample of your schema where
you're having problems, let us help you troubleshoot your performance.

High parallel load is another issue.  No matter where you put your
FK->PK relationship handling, having 200+ users connected at the same
time and manipulating your database is a heavy load.

Handling FK->PK relationships in software often is vulnerable to race
conditions.  Like so:  (T1 and T2 are different "threads)

T1: select id from mastertable where id=99; -- check for row
T2: delete from mastertable where id=99; -- delete a row
T1: insert into slavetable values (....); -- whoops!  No master

If we change the T1 to select for update, we now have the overhead that
most FK->PK relationships have.

What version of PostgreSQL are you running.  Older versions had much
poorer performance than newer versions when updating FK->PK
relationships.

Don't assume that application level FK->PK relationships will be faster
AND as good as the ones at database level.  It's quite possible that
they're faster for you because you're cutting corners, referentially
speaking, and your data will wind up incoherent over time.

Also, you may be dealing with a database that is IO bound, and moving
the FK checks to software is only a short stop gap, and as the machine
hits the IO performance ceiling, you'll have the same problem again,
need a bigger machine, and have incoherent data.  I.e. the same problem,
plus a few more, and have spent a lot of time spinning your wheels going
a short distance.

Re: FOREIGN KEYS vs PERFORMANCE

From
"Jim C. Nasby"
Date:
On Wed, Apr 12, 2006 at 09:22:52AM +0200, PFC wrote:
>
> >>  I think this is an old question, but I want to know if it really is
> >>well worth to not create some foreign keys an deal with the referential
> >>integrity at application-level?????
>
>     Trust me : do it in the application and you'll enter a world of
>     hurt. I'm  doing it with some mysql apps, and it's a nightmare ; doing
> cascaded  delete's by hand, etc, you always forget something, you have to
> modify a  million places in your code everytime you add a new table, your
> ORM  bloats, you get to write cleanup cron scripts which take forever to
> run,  your website crashes etc.

Well, yeah, thats typical for MySQL sites, but what's that have to do
with RI?

Sorry, couldn't resist. :P

Re: FOREIGN KEYS vs PERFORMANCE

From
"Jim C. Nasby"
Date:
On Wed, Apr 12, 2006 at 07:45:17AM -0700, Craig A. James wrote:
> All good advice, but... there are no absolutes in this world.
> Application-enforced referential integrity makes sense if (and probably
> ONLY if):
>
> 1. You have only one application that modifies the data.  (Otherwise, you
> have to duplicate the rules across many applications, leading to a
> code-maintenance nightmare).

You forgot something:

1a: You know that there will never, ever, ever, ever, be any other
application that wants to talk to the database.

I know tons of people that get burned because they go with something
that's "good enough for now", and then regret that decision for years to
come.

> 2. If your application crashes and leaves a mess, it's not a catastrophe,
> and you have a good way to clean it up.  For example, a bank shouldn't do
> this, but it might be OK for a computer-aided-design application, or the
> backend of a news web site.
>
> 3. You have application-specific knowledge about when you can skip
> referential integrity and thereby greatly improve performance.  For
> example, you may have batch operations where large numbers of rows are
> temporarily inconsistent.
>
> If your application doesn't meet ALL of these criteria, you probably should
> use the database for referential integrity.
>
> Craig
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faq
>

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: FOREIGN KEYS vs PERFORMANCE

From
"Craig A. James"
Date:
Jim C. Nasby wrote:
>>1. You have only one application that modifies the data.  (Otherwise, you
>>have to duplicate the rules across many applications, leading to a
>>code-maintenance nightmare).
>
> You forgot something:
>
> 1a: You know that there will never, ever, ever, ever, be any other
> application that wants to talk to the database.
>
> I know tons of people that get burned because they go with something
> that's "good enough for now", and then regret that decision for years to
> come.

No, I don't agree with this.  Too many people waste time designing for "what if..." scenarios that never happen.  You
don'twant to be dumb and design something that locks out a foreseeable and likely future need, but referential
integritydoesn't meet this criterion.  There's nothing to keep you from changing from app-managed to database-managed
referentialintegrity if your needs change. 

Design for your current requirements.


Let us be of good cheer, remembering that the misfortunes hardest to bear are
those which never happen.        - James Russell Lowell (1819-1891)

Therefore do not be anxious about tomorrow, for tomorrow will be anxious for
itself.  Let the day's own trouble be sufficient for the day.
                    - Matthew 6:34

Craig

Re: FOREIGN KEYS vs PERFORMANCE

From
"Jim C. Nasby"
Date:
On Wed, Apr 12, 2006 at 10:36:28AM -0700, Craig A. James wrote:
> Jim C. Nasby wrote:
> >>1. You have only one application that modifies the data.  (Otherwise, you
> >>have to duplicate the rules across many applications, leading to a
> >>code-maintenance nightmare).
> >
> >You forgot something:
> >
> >1a: You know that there will never, ever, ever, ever, be any other
> >application that wants to talk to the database.
> >
> >I know tons of people that get burned because they go with something
> >that's "good enough for now", and then regret that decision for years to
> >come.
>
> No, I don't agree with this.  Too many people waste time designing for
> "what if..." scenarios that never happen.  You don't want to be dumb and
> design something that locks out a foreseeable and likely future need, but
> referential integrity doesn't meet this criterion.  There's nothing to keep
> you from changing from app-managed to database-managed referential
> integrity if your needs change.

In this case your argument makes no sense, because you will spend far
more time re-creating RI capability inside an application than if you
just use what the database offers natively.

It's certainly true that you don't want to over-engineer for no reason,
but many times choices are made to save a very small amount of time or
hassle up-front, and those choices become extremely painful later.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: FOREIGN KEYS vs PERFORMANCE

From
"Craig A. James"
Date:
Jim C. Nasby wrote:
>>No, I don't agree with this.  Too many people waste time designing for
>>"what if..." scenarios that never happen.  You don't want to be dumb and
>>design something that locks out a foreseeable and likely future need, but
>>referential integrity doesn't meet this criterion.  There's nothing to keep
>>you from changing from app-managed to database-managed referential
>>integrity if your needs change.
>
> In this case your argument makes no sense, because you will spend far
> more time re-creating RI capability inside an application than if you
> just use what the database offers natively.

But one of the specific conditions in my original response was, "You have application-specific knowledge about when you
canskip referential integrity and thereby greatly improve performance."  If you can't do that, I agree with you. 

Anyway, this discussion is probably going on too long, and I'm partly to blame.  I think we all agree that in almost
allsituations, using the database to do referential integrity is the right choice, and that you should only violate
thisrule if you have a really, really good reason, and you've thought out the implications carefully, and you know you
mayhave to pay a steep price later if your requirements change. 

Craig

Re: FOREIGN KEYS vs PERFORMANCE

From
"Rodrigo Sakai"
Date:
  Thanks for all responses! I agree with most of you, and say that the RI is
best maintened by Database ! Performance must be improved in other ways
(indexes, hardware, etc)!


----- Original Message -----
From: "Jim C. Nasby" <jnasby@pervasive.com>
To: "Craig A. James" <cjames@modgraph-usa.com>
Cc: "PFC" <lists@peufeu.com>; "Michael Glaesemann" <grzm@myrealbox.com>;
"Rodrigo Sakai" <rodrigo.sakai@zanthus.com.br>;
<pgsql-performance@postgresql.org>
Sent: Wednesday, April 12, 2006 5:59 PM
Subject: Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE


> On Wed, Apr 12, 2006 at 10:36:28AM -0700, Craig A. James wrote:
> > Jim C. Nasby wrote:
> > >>1. You have only one application that modifies the data.  (Otherwise,
you
> > >>have to duplicate the rules across many applications, leading to a
> > >>code-maintenance nightmare).
> > >
> > >You forgot something:
> > >
> > >1a: You know that there will never, ever, ever, ever, be any other
> > >application that wants to talk to the database.
> > >
> > >I know tons of people that get burned because they go with something
> > >that's "good enough for now", and then regret that decision for years
to
> > >come.
> >
> > No, I don't agree with this.  Too many people waste time designing for
> > "what if..." scenarios that never happen.  You don't want to be dumb and
> > design something that locks out a foreseeable and likely future need,
but
> > referential integrity doesn't meet this criterion.  There's nothing to
keep
> > you from changing from app-managed to database-managed referential
> > integrity if your needs change.
>
> In this case your argument makes no sense, because you will spend far
> more time re-creating RI capability inside an application than if you
> just use what the database offers natively.
>
> It's certainly true that you don't want to over-engineer for no reason,
> but many times choices are made to save a very small amount of time or
> hassle up-front, and those choices become extremely painful later.
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>