Thread: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashes backend.)

Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashes backend.)

From
Tom Lane
Date:
ryan <ryan@bel.bc.ca> writes:
> I have a database which reliable crashes the database system.  When you
> try to do insert/update on the table 'machines' the backend crashes with
> what I think is signal 11.

The problem seems to be that you have a foreign-key trigger on
'machines' which refers to a non-existent primary-key relation:

> CREATE CONSTRAINT TRIGGER "siteid" AFTER INSERT OR UPDATE ON
> "machines" NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE
> PROCEDURE "RI_FKey_check_ins" ('siteid', 'machines', 'site',
> 'UNSPECIFIED', 'sites', 'siteid');

The 'site' argument is the name of the referenced table, and you
have no table named site.

There are at least two bugs here: the immediate cause of the crash
is lack of a check for heap_openr() failure in the RI trigger code,
but a larger question is why the system let you drop a table that
is the target of a referential integrity check (which I assume is
what you did to get into this state).

Anyway, dropping the siteid trigger, as well as any others that
refer to gone tables, ought to get you out of trouble for now.
Meanwhile the foreign-key boys have some work to do ...
        regards, tom lane


Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashes backend.)

From
JanWieck@t-online.de (Jan Wieck)
Date:
Tom Lane wrote:
>
> There are at least two bugs here: the immediate cause of the crash
> is lack of a check for heap_openr() failure in the RI trigger code,
   Exactly where is that check missing (if it still is)?

> but a larger question is why the system let you drop a table that
> is the target of a referential integrity check (which I assume is
> what you did to get into this state).
   For me too.

> Anyway, dropping the siteid trigger, as well as any others that
> refer to gone tables, ought to get you out of trouble for now.
> Meanwhile the foreign-key boys have some work to do ...
   That's exactly the purpose of pg_trigger.tgconstrrelid, which   is filled with the  opposite  relations  Oid  for
constraint  triggers.    In  RelationRemoveTriggers(),  which  is  called   during DROP TABLE, theres a scan for it.
That'swhere the
 
       DROP TABLE implicitly drops referential ...
   NOTICE message comes from. So I wonder how he got  into  that   state?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




JanWieck@t-online.de (Jan Wieck) writes:
> Tom Lane wrote:
>> There are at least two bugs here: the immediate cause of the crash
>> is lack of a check for heap_openr() failure in the RI trigger code,

>     Exactly where is that check missing (if it still is)?

The heap_openr calls with NoLock --- the way heap_open[r] are set up
is that there's an elog on open failure iff you request a lock, but
if you don't then you have to check for a NULL return explicitly.
Perhaps this coding convention is too error-prone and ought to be
changed to have two different routine names, say "heap_open[r]"
and "heap_open[r]_noerr".  Opinions anyone?

I had a note to myself that ri_triggers' use of NoLock was probably
a bug anyway.  Shouldn't it be acquiring *some* kind of lock on the
referenced relation?  Else someone might be deleting it out from
under you.

>> but a larger question is why the system let you drop a table that
>> is the target of a referential integrity check (which I assume is
>> what you did to get into this state).

>     For me too.

What about renaming as opposed to dropping?  Since the triggers are set
up to use names rather than OIDs, seems like they are vulnerable to a
rename.  Maybe they should be using table OIDs in their parameter lists.
(That'd make pg_dump's life harder however...)
        regards, tom lane


Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashes backend.)

From
Bruce Momjian
Date:
> JanWieck@t-online.de (Jan Wieck) writes:
> > Tom Lane wrote:
> >> There are at least two bugs here: the immediate cause of the crash
> >> is lack of a check for heap_openr() failure in the RI trigger code,
> 
> >     Exactly where is that check missing (if it still is)?
> 
> The heap_openr calls with NoLock --- the way heap_open[r] are set up
> is that there's an elog on open failure iff you request a lock, but
> if you don't then you have to check for a NULL return explicitly.
> Perhaps this coding convention is too error-prone and ought to be
> changed to have two different routine names, say "heap_open[r]"
> and "heap_open[r]_noerr".  Opinions anyone?

We already have heap_open and heap_openr.  Seems another is too hard. 
Better to give them a parameter to control it.  The API is confusing enough.



--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Perhaps this coding convention is too error-prone and ought to be
>> changed to have two different routine names, say "heap_open[r]"
>> and "heap_open[r]_noerr".  Opinions anyone?

> We already have heap_open and heap_openr.  Seems another is too hard.
> Better to give them a parameter to control it.  The API is confusing
> enough.

It is confusing, but we have here graphic evidence that the way it's
currently done is confusing.  In a quick check, I found several other
cases of the same error that must have crept in over the past year or
so.  So I'm now convinced that we'd better change the API of these
routines to make it crystal-clear whether you are getting a check for
open failure or not.

I like a different routine name better than a check-or-no-check
parameter.  If you invoke the no-check case then you *MUST* have a check
for failure return --- forgetting to do this is exactly the problem.
So I think it should be harder to get at the no-check case, and you
should have to write something that reminds you that the routine is not
checking for you.  Thus "heap_open_noerr" (I'm not particularly wedded
to that suffix, though, if anyone has a better idea for what to call
it).  A parameter would only be useful if the same calling code might
reasonably do different things at different times --- but either there's
a check following the call, or there's not.
        regards, tom lane


Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashes backend.)

From
Bruce Momjian
Date:
> I like a different routine name better than a check-or-no-check
> parameter.  If you invoke the no-check case then you *MUST* have a check
> for failure return --- forgetting to do this is exactly the problem.
> So I think it should be harder to get at the no-check case, and you
> should have to write something that reminds you that the routine is not
> checking for you.  Thus "heap_open_noerr" (I'm not particularly wedded
> to that suffix, though, if anyone has a better idea for what to call
> it).  A parameter would only be useful if the same calling code might
> reasonably do different things at different times --- but either there's
> a check following the call, or there's not.

OK.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

From
"Stephan Szabo"
Date:
> > but a larger question is why the system let you drop a table that
> > is the target of a referential integrity check (which I assume is
> > what you did to get into this state).
>
>     For me too.
>
> > Anyway, dropping the siteid trigger, as well as any others that
> > refer to gone tables, ought to get you out of trouble for now.
> > Meanwhile the foreign-key boys have some work to do ...
>
>     That's exactly the purpose of pg_trigger.tgconstrrelid, which
>     is filled with the  opposite  relations  Oid  for  constraint
>     triggers.    In  RelationRemoveTriggers(),  which  is  called
>     during DROP TABLE, theres a scan for it. That's where the
>
>         DROP TABLE implicitly drops referential ...
>
>     NOTICE message comes from. So I wonder how he got  into  that
>     state?

I don't know in his case, but I think you could get into this state
from a partial restore from pg_dump.  If you restore one of the
two tables, and create the constraint trigger for the RI_FKey_check_ins
but the other table doesn't really exist, it will crash. I just tried it on
a 7.0.2 system by making a table with an int and then defining the
check_ins trigger manually with create constraint trigger with a bad
referenced table.

Also, I realized something else that is a little wierd.  When a temporary
table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?



Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

From
JanWieck@t-online.de (Jan Wieck)
Date:
Stephan Szabo wrote:
>
> Also, I realized something else that is a little wierd.  When a temporary
> table shadows a permanent table that you've made a foreign key reference
> to, which table should it be going to check the constraint?
>
   Outch  -  that hurts. Haven't checked it yet, but from what I   have  in  memory  it  should  be  a  possibility  to
violate   constraints.
 
   Damned.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




Re: Foreign key bugs + other problems

From
"Stephan Szabo"
Date:
> Stephan Szabo wrote:
> >
> > Also, I realized something else that is a little wierd.  When a
temporary
> > table shadows a permanent table that you've made a foreign key reference
> > to, which table should it be going to check the constraint?

>     Outch  -  that hurts. Haven't checked it yet, but from what I
>     have  in  memory  it  should  be  a  possibility  to  violate
>     constraints.

Yeah, I realized it when I was going in on AlterTableAddConstraint that I
need to prevent constraints referencing temporary tables on permanent
tables, and then I realized that the shadowing is a problem.  Also, this is
a problem for other users too, what about people who log things to
other tables via rules and triggers?   At least you can't shadow the
system catalogs :-)

I think that schemas might help, if you assume that at creation time of a
rule or constraint you must qualify any tables being used in a way that
prevents misunderstandings, since temporary tables live in a different
system defined schema assuming that schema.tablename is not
shadowed, only the unadorned tablename. In the FK case, I think
that the idea of moving to keeping oids would probably be the way
to go (that way the table is very explicitly defined as a particular one).
Not that this will help right now since I don't think we can make an
SPI request that will handle it.

Or, you might be able to make a case that you CANNOT shadow a table
that is referenced by a constraint (due to the permanent table constraints
cannot reference a temporary table restriction).  Since the creation of
the temp table would break the restriction, it is illegal.

-----

In an unrelated problem.  Well, I was thinking, well, maybe we could for
this transaction during the execution of the trigger, rename the temp table
and then rename it back.  Noone else should see the change (right?) because
we're not comitted and that user isn't doing anything else while we're
checking.
However, this tickles another problem.  It seems that you can't rename a
temp
table.  In fact it does something bad:

create table z (a int);
create temp table z (a int);
alter table z rename to zz;
select * from z;
ERROR: relation 'z' does not exist
select * from zz;
- 0 rows
\q
<enter again>
select * from z;
NOTICE:  mdopen: couldn't open z: No such file or directory
NOTICE:  RelationIdBuildRelation: smgropen(z): Input/output error
NOTICE:  mdopen: couldn't open z: No such file or directory
NOTICE:  mdopen: couldn't open z: No such file or directory
NOTICE:  mdopen: couldn't open z: No such file or directory
NOTICE:  mdopen: couldn't open z: No such file or directory
ERROR:  cannot open relation z
select * from zz;
- 0 rows






Re: Foreign key bugs + other problems

From
JanWieck@t-online.de (Jan Wieck)
Date:
Stephan Szabo wrote:
> > Stephan Szabo wrote:
> > >
> > > Also, I realized something else that is a little wierd.  When a
> temporary
> > > table shadows a permanent table that you've made a foreign key reference
> > > to, which table should it be going to check the constraint?
>
> >     Outch  -  that hurts. Haven't checked it yet, but from what I
> >     have  in  memory  it  should  be  a  possibility  to  violate
> >     constraints.
>
> Yeah, I realized it when I was going in on AlterTableAddConstraint that I
> need to prevent constraints referencing temporary tables on permanent
> tables, and then I realized that the shadowing is a problem.  Also, this is
> a problem for other users too, what about people who log things to
> other tables via rules and triggers?   At least you can't shadow the
> system catalogs :-)
   I  think triggers are in general problematic in this context.   They usually use SPI and name the tables in the
querystring.  If  a  trigger  uses  saved  plans  (as  RI  does  as much as   possible), the problem  is  gone  if  the
trigger  has  been   invoked  once and prepared all the plans. But if it's invoked   the first time while a temp table
exists,it'll do the  wrong   things and save the wrong plan for future invocations.
 
   Rules  aren't  affected,  because they refer to tables by OID   allways.

> [...]
>
> Or, you might be able to make a case that you CANNOT shadow a table
> that is referenced by a constraint (due to the permanent table constraints
> cannot reference a temporary table restriction).  Since the creation of
> the temp table would break the restriction, it is illegal.
   Good point. What does the standard say about it? Can a table,   referred   to  by  foreign  key  constraints  or
referential  actions, be shadowed by a temp table?
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




"Stephan Szabo" <sszabo@kick.com> writes:
> Also, I realized something else that is a little wierd.  When a temporary
> table shadows a permanent table that you've made a foreign key reference
> to, which table should it be going to check the constraint?

Seems to me it should certainly be going to the permanent table, which
is another argument in favor of making the link via OID not table name.
The existing code will get this wrong.
        regards, tom lane


Re: Foreign key bugs + other problems

From
Bruce Momjian
Date:
> > Or, you might be able to make a case that you CANNOT shadow a table
> > that is referenced by a constraint (due to the permanent table constraints
> > cannot reference a temporary table restriction).  Since the creation of
> > the temp table would break the restriction, it is illegal.
> 
>     Good point. What does the standard say about it? Can a table,
>     referred   to  by  foreign  key  constraints  or  referential
>     actions, be shadowed by a temp table?

Maybe we should prevent shadowing of a table that is used as a foreign
key.


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Foreign key bugs + other problems

From
"Stephan Szabo"
Date:
>     Rules  aren't  affected,  because they refer to tables by OID
>     always.
Ah, that's good. I hadn't tried that variation because I don't really
use rules that much other than select rules for views.

> > Or, you might be able to make a case that you CANNOT shadow a table
> > that is referenced by a constraint (due to the permanent table
constraints
> > cannot reference a temporary table restriction).  Since the creation of
> > the temp table would break the restriction, it is illegal.
>
>     Good point. What does the standard say about it? Can a table,
>     referred   to  by  foreign  key  constraints  or  referential
>     actions, be shadowed by a temp table?
Well, that's the question.  I don't see anything in the spec saying that it
can't precisely.  It's a question of wording on the rules about the
constraint.

11.8 Syntax rule 4 Case a)
If the referencing table is a permanent base table, then the referenced
table
shall be a persistant base table.

So, if you shadowed the table by a temp table, would this no longer be true
(and therefore the create is illegal) or are you supposed to imply that you
still reference the persistant base table despite the fact it is shadowed?
I'd
guess the latter because it's a syntax rule (I had thought it was a general
rule until I went to look it up).




Re: Foreign key bugs + other problems

From
Tom Lane
Date:
"Stephan Szabo" <sszabo@kick.com> writes:
> However, this tickles another problem.  It seems that you can't rename
> a temp table.  In fact it does something bad:

Are you using current CVS?  I thought I'd fixed that recently.
        regards, tom lane


Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

From
"Stephan Szabo"
Date:
> "Stephan Szabo" <sszabo@kick.com> writes:
> > Also, I realized something else that is a little wierd.  When a
temporary
> > table shadows a permanent table that you've made a foreign key reference
> > to, which table should it be going to check the constraint?
>
> Seems to me it should certainly be going to the permanent table, which
> is another argument in favor of making the link via OID not table name.
> The existing code will get this wrong.

Can I force the SPI query that's being generated to use the permanent
table rather than the shadowed table when they have the same name?
If not, then storing the oid isn't sufficient without moving away from
SPI.  I do agree that storing the oids is a good idea (and am planning to
change it unless someone comes up with a compelling reason not to)
since the only way via something like SPI that I can think of is once
we have schemas, using schemaname.tablename which may not be
shadowed by the temp table and it'll just be easier for everyone
involved if we store the oid.



Re: Foreign key bugs + other problems

From
"Stephan Szabo"
Date:
The machine I tried it on is only a 7.0.2 system.  My CVS system
is at home on a unconnected dialup, so I can't try it from work.  
I'll try it when I get back tonight.


> "Stephan Szabo" <sszabo@kick.com> writes:
> > However, this tickles another problem.  It seems that you can't rename
> > a temp table.  In fact it does something bad:
> 
> Are you using current CVS?  I thought I'd fixed that recently.




Re: Foreign key bugs + other problems

From
JanWieck@t-online.de (Jan Wieck)
Date:
Bruce Momjian wrote:
> > > Or, you might be able to make a case that you CANNOT shadow a table
> > > that is referenced by a constraint (due to the permanent table constraints
> > > cannot reference a temporary table restriction).  Since the creation of
> > > the temp table would break the restriction, it is illegal.
> >
> >     Good point. What does the standard say about it? Can a table,
> >     referred   to  by  foreign  key  constraints  or  referential
> >     actions, be shadowed by a temp table?
>
> Maybe we should prevent shadowing of a table that is used as a foreign
> key.
   And  open  another  DOS  attack  possibility? Each user could   potentially create a permanent table of a name used
asa temp   table somewhere inside of an application. Then create another   referencing it and boom - the application
goesdown. Not that   good IMHO.
 
   I  think  we  need  to hand OID's to the RI triggers and have   some special syntax to  tell  the  parser  that  a
table or   attribute is named by OID in the actual query.
 
   Will  cause  alot  of  headaches for the next pg_dump/restore   cycle.
   (Bruce: still waiting for your call)


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

From
JanWieck@t-online.de (Jan Wieck)
Date:
Tom Lane wrote:
> "Stephan Szabo" <sszabo@kick.com> writes:
> > Also, I realized something else that is a little wierd.  When a temporary
> > table shadows a permanent table that you've made a foreign key reference
> > to, which table should it be going to check the constraint?
>
> Seems to me it should certainly be going to the permanent table, which
> is another argument in favor of making the link via OID not table name.
> The existing code will get this wrong.
   But  even if the trigger knows the OID of the table to query,   can it prepare a plan to do so via SPI? I think no.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




Re: Foreign key bugs + other problems

From
Mike Mascari
Date:
Jan Wieck wrote:
> > Maybe we should prevent shadowing of a table that is used as a foreign
> > key.
> 
>     And  open  another  DOS  attack  possibility? Each user could
>     potentially create a permanent table of a name used as a temp
>     table somewhere inside of an application. Then create another
>     referencing it and boom - the application goes down. Not that
>     good IMHO.
> 
>     I  think  we  need  to hand OID's to the RI triggers and have
>     some special syntax to  tell  the  parser  that  a  table  or
>     attribute is named by OID in the actual query.
> 
>     Will  cause  alot  of  headaches for the next pg_dump/restore
>     cycle.
> 
>     (Bruce: still waiting for your call)
> 
> Jan

Just curious, but couldn't this also be done in the TRUNCATE
TABLE case? Where the consensus was to disallow TRUNCATE TABLE on
a table whose attribute was the foreign key of another relation?
It seems that proper implementation of SCHEMA and a finer
granularity of GRANT/REVOKE is need to address these issues.

Mike Mascari