Thread: Foreign keys

Foreign keys

From
"MAR - Secretariado Geral"
Date:
Hi everybody,
 
First of all i'de like to apolagize cause my poor english. After this, i shuould say that i beleavee a year ago i brought this problem to the community but i donn't remember some answering about it. The problem is:
 
Every time a users misses a external refrenced key the PGSql raises an exception.
Well as far as i realise if we had 5 or 10 Foreign keys during an Insert/Update transaction only exception should be raised reporting all erros/messages after last external refrenced field missed at one time,not one by one.
Well, in order to implement this idea we will need to desable the built-in refencial integrety and build it all by your self- all the validation (look-ups etc..) before insert/update If tg_op='insert' or tg_op='update'  then as people do with non relational Databases - all hand-made. Well, this is very hard to beleave!!! I must be missing something.
 
Please i'must be wrong can some one explain me what i'm missing?
 
Thanks in Advance 
 
Mário Reis

Re: Foreign keys

From
Chris Mair
Date:
> First of all i'de like to apolagize cause my poor english. After this,
> i shuould say that i beleavee a year ago i brought this problem to the
> community but i donn't remember some answering about it. The problem
> is:
>  
> Every time a users misses a external refrenced key the PGSql raises an
> exception. 
> Well as far as i realise if we had 5 or 10 Foreign keys
> during an Insert/Update transaction only exception should be raised
> reporting all erros/messages after last external refrenced field
> missed at one time,not one by one.
> Well, in order to implement this idea we will need to desable the
> built-in refencial integrety and build it all by your self- all the
> validation (look-ups etc..) before insert/update If tg_op='insert' or
> tg_op='update'  then as people do with non relational Databases - all
> hand-made. Well, this is very hard to beleave!!! I must be missing
> something.
>  
> Please i'must be wrong can some one explain me what i'm missing?

When there is a constraint violation, the current transaction is rolled
back anyhow. 

What's the purpose of letting you insert 1000 records, then, at the end
say: "hah, all is rolled back becauase the 2nd record was invalid".
PG justly throws the exception immediately to let you know it's futile
inserting 998 more records.

I don't see a problem here.

Bye,
Chris.



-- 

Chris Mair
http://www.1006.org



Re: Foreign keys

From
Stefan Kaltenbrunner
Date:
MAR - Secretariado Geral wrote:
> Hi everybody,
>  
> First of all i'de like to apolagize cause my poor english. After this, i
> shuould say that i beleavee a year ago i brought this problem to the
> community but i donn't remember some answering about it. The problem is:
>  
> Every time a users misses a external refrenced key the PGSql raises an
> exception.
> Well as far as i realise if we had 5 or 10 Foreign keys
> during an Insert/Update transaction only exception should be raised
> reporting all erros/messages after last external refrenced field missed
> at one time,not one by one.
> Well, in order to implement this idea we will need to desable the
> built-in refencial integrety and build it all by your self- all the
> validation (look-ups etc..) before insert/update If tg_op='insert' or
> tg_op='update'  then as people do with non relational Databases - all
> hand-made. Well, this is very hard to beleave!!! I must be missing
> something.
>  
> Please i'must be wrong can some one explain me what i'm missing?

I'm not sure what you are complining about exactly but maybe you want to
declare your FK as DEFERRABLE INITIALLY DEFERRED ?
That way the constraint checking happens at the end of the transaction
and not immediately


Stefan


Re: Foreign keys

From
Gregory Stark
Date:
Chris Mair <chrisnospam@1006.org> writes:

> What's the purpose of letting you insert 1000 records, then, at the end
> say: "hah, all is rolled back becauase the 2nd record was invalid".
> PG justly throws the exception immediately to let you know it's futile
> inserting 998 more records.

Well there's plenty of cases where people want that and we support it with
deferred constraints.

However the OP sounds like he wants something else. I think what he wants is
when he inserts a record and it fails due to foreign key constraints to report
all the violated constraints, not just the first one found.

I never run into this problem myself because I think of foreign key
constraints as more akin to C assertions. They're a backstop to make sure the
application is working correctly. I never write code that expects foreign key
constraint errors and tries to handle them.

But there's nothing saying that's the only approach. The feature request seems
pretty reasonable to me. I'm not sure how hard it would be with the ri
triggers as written. I'm not sure there's anywhere for triggers to store their
"return values" so I'm unclear this can even be done using triggers.

But to answer his original question: yes that's the way Postgres works and if
you want to report all the violations together you'll have to check them
yourself.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Foreign keys

From
Stephan Szabo
Date:
On Sun, 10 Sep 2006, Gregory Stark wrote:

> Chris Mair <chrisnospam@1006.org> writes:
>
> > What's the purpose of letting you insert 1000 records, then, at the end
> > say: "hah, all is rolled back becauase the 2nd record was invalid".
> > PG justly throws the exception immediately to let you know it's futile
> > inserting 998 more records.
>
> Well there's plenty of cases where people want that and we support it with
> deferred constraints.
>
> However the OP sounds like he wants something else. I think what he wants is
> when he inserts a record and it fails due to foreign key constraints to report
> all the violated constraints, not just the first one found.
>
> I never run into this problem myself because I think of foreign key
> constraints as more akin to C assertions. They're a backstop to make sure the
> application is working correctly. I never write code that expects foreign key
> constraint errors and tries to handle them.
>
> But there's nothing saying that's the only approach. The feature request seems
> pretty reasonable to me. I'm not sure how hard it would be with the ri
> triggers as written. I'm not sure there's anywhere for triggers to store their
> "return values" so I'm unclear this can even be done using triggers.

I think if we were going to do this that all the constraint violations for
unique, not null, check and foreign keys should be handled similarly, so
we'd probably want something more general than just a way for the ri
triggers to do this. I don't have a good idea of the right solution for
that though.


Re: Foreign keys

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> I think if we were going to do this that all the constraint violations for
> unique, not null, check and foreign keys should be handled similarly, so
> we'd probably want something more general than just a way for the ri
> triggers to do this. I don't have a good idea of the right solution for
> that though.

It seems pretty unwieldy to me: it's not hard to imagine a long INSERT
throwing millions of separate foreign-key errors before it's done, for
instance.  And then there's the cascading-errors problem, ie, bogus
reports that happen because some prior step failed ... not least being
your client crashing and failing to tell you anything about what
happened because it ran out of memory for the error list.

My advice is to rethink the client code that wants such a behavior.
        regards, tom lane


Re: Foreign keys

From
Gregory Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > I think if we were going to do this that all the constraint violations for
> > unique, not null, check and foreign keys should be handled similarly, so
> > we'd probably want something more general than just a way for the ri
> > triggers to do this. I don't have a good idea of the right solution for
> > that though.
> 
> It seems pretty unwieldy to me: it's not hard to imagine a long INSERT
> throwing millions of separate foreign-key errors before it's done, for
> instance.  And then there's the cascading-errors problem, ie, bogus
> reports that happen because some prior step failed ... not least being
> your client crashing and failing to tell you anything about what
> happened because it ran out of memory for the error list.
> 
> My advice is to rethink the client code that wants such a behavior.

Well you're still talking about the case of multiple queries deferring all
constraint checks to the end of the transaction. I'm not sure what the
original poster had in mind but that's not how I read it and it wasn't what I
was speculating about.

I was only thinking of situations like:

INSERT INTO TABLE USER (name, department, zipcode) VALUES ('Tom', 0, '00000');

We'll fail with an error like "violates foreign key constraint
user_department_fkey". It won't say anything about the zipcode also being
invalid.

I sure hate UIs that give you one error at a time so you have to keep fixing
one problem, clicking ok again, only to have yet another error pop up, rinse,
lather, repeat until you finally get all the problems sorted out.

Now I've never actually run into this because as I mention I always treated
the database constraints as assertion checks independent of the application
which usually enforces stricter conditions anyways. But I could see someone
arguing that having two independent sets of code implementing the same set of
conditions is poor.

In any case the same logic that leads to it being desirable to report all the
errors to the user in a UI and not just report them one by one also applies to
the database. I'm not sure it's the most important issue in the world, but it
does seem like a "it would be nice" feature if it reported all the errors in
the statement, not just the first one it finds.

-- 
greg



Re: Foreign keys

From
"Joshua D. Drake"
Date:
> In any case the same logic that leads to it being desirable to report all the
> errors to the user in a UI and not just report them one by one also applies to
> the database. I'm not sure it's the most important issue in the world, but it
> does seem like a "it would be nice" feature if it reported all the errors in
> the statement, not just the first one it finds.
> 

Seems kind of extraneous to me. I am guessing it would cause yet further 
overhead with our foreign key checks.

My testing shows that the use of foreign keys on high velocity single 
transaction loads, can cause easily a 50% reduction in performance. Why 
add to that? What we need to be doing is finding a way to decrease the 
impact of foreign key checks.

Sincerely,

Joshua D. Drake

-- 
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240   Providing the most comprehensive  PostgreSQL
solutionssince 1997             http://www.commandprompt.com/
 




Re: Foreign keys

From
Stephan Szabo
Date:
On Sun, 10 Sep 2006, Gregory Stark wrote:

> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > > I think if we were going to do this that all the constraint violations for
> > > unique, not null, check and foreign keys should be handled similarly, so
> > > we'd probably want something more general than just a way for the ri
> > > triggers to do this. I don't have a good idea of the right solution for
> > > that though.
> >
> > It seems pretty unwieldy to me: it's not hard to imagine a long INSERT
> > throwing millions of separate foreign-key errors before it's done, for
> > instance.  And then there's the cascading-errors problem, ie, bogus
> > reports that happen because some prior step failed ... not least being
> > your client crashing and failing to tell you anything about what
> > happened because it ran out of memory for the error list.
> >
> > My advice is to rethink the client code that wants such a behavior.
>
> Well you're still talking about the case of multiple queries deferring all
> constraint checks to the end of the transaction.

Well, or insert ... select or update or delete. Most "deferred"
conditions can happen within one statement as well.

> In any case the same logic that leads to it being desirable to report all the
> errors to the user in a UI and not just report them one by one also applies to
> the database. I'm not sure it's the most important issue in the world, but it
> does seem like a "it would be nice" feature if it reported all the errors in
> the statement, not just the first one it finds.

SQL seems to have a notion of setting the size of the diagnostics area for
a transaction to hold a number of conditions. There are a few odd bits,
for example it's mostly unordered, but the sqlcode returned must match to
the first condition and we presumably want to make sure that if there are
any errors that we return an exception sql code not a completion one.


Re: Foreign keys

From
Kevin Brown
Date:
Joshua D. Drake wrote:
> 
> >In any case the same logic that leads to it being desirable to report all 
> >the
> >errors to the user in a UI and not just report them one by one also 
> >applies to
> >the database. I'm not sure it's the most important issue in the world, but 
> >it
> >does seem like a "it would be nice" feature if it reported all the errors 
> >in
> >the statement, not just the first one it finds.
> >
> 
> Seems kind of extraneous to me. I am guessing it would cause yet further 
> overhead with our foreign key checks.

But in this case, it would be (or should be) overhead only in the case
of failure.  In the case of success, all the constraints are checked
anyway -- they just succeed.

I would expect that the number of applications for which a constraint
violation is the norm and not the exception is very small.


But Tom's concern is a valid one.  I expect a reasonable compromise
would be to record and show the errors for only the non-deferred
constraints in the currently executing statement, because after that
point the transaction is in an error state anyway and thus can't
continue without a rollback to a savepoint.  It probably wouldn't make
sense to evaluate the deferred constraints within the erroring
statement anyway -- they're deferred, which by definition means they
don't get evaluated until commit, so evaluating them at failure time
could easily show errors that are only there because subsequent
statements never got executed.

As for the deferred constraints, it might be reasonable to show errors
only up to some limit (controlled by a GUC, perhaps), with the default
limit being 1, which is what we have now.  Otherwise you run the risk
of throwing millions of errors, which is surely not desirable.  The
downside to this is until you've hit the limit, you have to evaluate
*all* the deferred constraints, which could take a while, whereas the
current setup will return immediately upon encountering the first
constraint error.



> My testing shows that the use of foreign keys on high velocity single 
> transaction loads, can cause easily a 50% reduction in performance. Why 
> add to that? What we need to be doing is finding a way to decrease the 
> impact of foreign key checks.

I definitely agree here, but this should be independent of how foreign
key failures are handled once they're detected.  In other words, what
you're experiencing is the perfomance hit that comes from evaluating
the constraints, not from reporting the errors afterwards.


-- 
Kevin Brown                          kevin@sysexperts.com


Re: Foreign keys

From
"Jim C. Nasby"
Date:
On Sun, Sep 10, 2006 at 09:40:51AM -0700, Joshua D. Drake wrote:
> 
> >In any case the same logic that leads to it being desirable to report all 
> >the
> >errors to the user in a UI and not just report them one by one also 
> >applies to
> >the database. I'm not sure it's the most important issue in the world, but 
> >it
> >does seem like a "it would be nice" feature if it reported all the errors 
> >in
> >the statement, not just the first one it finds.
> >
> 
> Seems kind of extraneous to me. I am guessing it would cause yet further 
> overhead with our foreign key checks.
> 
> My testing shows that the use of foreign keys on high velocity single 
> transaction loads, can cause easily a 50% reduction in performance. Why 
> add to that? What we need to be doing is finding a way to decrease the 
> impact of foreign key checks.

IIRC, a big chunk of that overhead is simply having triggers on the
table. I tested it once and found something like a 30% overhead for
having a trigger that did nothing on insert. Granted, that was a simple
test on a single machine, but still...

Obviously one place to look is in the trigger code to see if there's
performance gains to be had there. But something else to consider is
moving away from using a general-purpose trigger framework to impliment
RI. I suspect a dedicate code path for RI could be a lot leaner than the
general-purpose trigger code is.
-- 
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)