Thread: Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

JanWieck@t-online.de (Jan Wieck) writes:
> Tom Lane wrote:
>> There are at least two bugs here: the immediate cause of the crash
>> is lack of a check for heap_openr() failure in the RI trigger code,

>     Exactly where is that check missing (if it still is)?

The heap_openr calls with NoLock --- the way heap_open[r] are set up
is that there's an elog on open failure iff you request a lock, but
if you don't then you have to check for a NULL return explicitly.
Perhaps this coding convention is too error-prone and ought to be
changed to have two different routine names, say "heap_open[r]"
and "heap_open[r]_noerr".  Opinions anyone?

I had a note to myself that ri_triggers' use of NoLock was probably
a bug anyway.  Shouldn't it be acquiring *some* kind of lock on the
referenced relation?  Else someone might be deleting it out from
under you.

>> but a larger question is why the system let you drop a table that
>> is the target of a referential integrity check (which I assume is
>> what you did to get into this state).

>     For me too.

What about renaming as opposed to dropping?  Since the triggers are set
up to use names rather than OIDs, seems like they are vulnerable to a
rename.  Maybe they should be using table OIDs in their parameter lists.
(That'd make pg_dump's life harder however...)
        regards, tom lane


Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

From
Bruce Momjian
Date:
> JanWieck@t-online.de (Jan Wieck) writes:
> > Tom Lane wrote:
> >> There are at least two bugs here: the immediate cause of the crash
> >> is lack of a check for heap_openr() failure in the RI trigger code,
> 
> >     Exactly where is that check missing (if it still is)?
> 
> The heap_openr calls with NoLock --- the way heap_open[r] are set up
> is that there's an elog on open failure iff you request a lock, but
> if you don't then you have to check for a NULL return explicitly.
> Perhaps this coding convention is too error-prone and ought to be
> changed to have two different routine names, say "heap_open[r]"
> and "heap_open[r]_noerr".  Opinions anyone?

We already have heap_open and heap_openr.  Seems another is too hard. 
Better to give them a parameter to control it.  The API is confusing enough.



--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Perhaps this coding convention is too error-prone and ought to be
>> changed to have two different routine names, say "heap_open[r]"
>> and "heap_open[r]_noerr".  Opinions anyone?

> We already have heap_open and heap_openr.  Seems another is too hard.
> Better to give them a parameter to control it.  The API is confusing
> enough.

It is confusing, but we have here graphic evidence that the way it's
currently done is confusing.  In a quick check, I found several other
cases of the same error that must have crept in over the past year or
so.  So I'm now convinced that we'd better change the API of these
routines to make it crystal-clear whether you are getting a check for
open failure or not.

I like a different routine name better than a check-or-no-check
parameter.  If you invoke the no-check case then you *MUST* have a check
for failure return --- forgetting to do this is exactly the problem.
So I think it should be harder to get at the no-check case, and you
should have to write something that reminds you that the routine is not
checking for you.  Thus "heap_open_noerr" (I'm not particularly wedded
to that suffix, though, if anyone has a better idea for what to call
it).  A parameter would only be useful if the same calling code might
reasonably do different things at different times --- but either there's
a check following the call, or there's not.
        regards, tom lane


Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

From
Bruce Momjian
Date:
> I like a different routine name better than a check-or-no-check
> parameter.  If you invoke the no-check case then you *MUST* have a check
> for failure return --- forgetting to do this is exactly the problem.
> So I think it should be harder to get at the no-check case, and you
> should have to write something that reminds you that the routine is not
> checking for you.  Thus "heap_open_noerr" (I'm not particularly wedded
> to that suffix, though, if anyone has a better idea for what to call
> it).  A parameter would only be useful if the same calling code might
> reasonably do different things at different times --- but either there's
> a check following the call, or there's not.

OK.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

From
JanWieck@t-online.de (Jan Wieck)
Date:
Tom Lane wrote:
>
> >> but a larger question is why the system let you drop a table that
> >> is the target of a referential integrity check (which I assume is
> >> what you did to get into this state).
>
> >     For me too.
>
> What about renaming as opposed to dropping?  Since the triggers are set
> up to use names rather than OIDs, seems like they are vulnerable to a
> rename.  Maybe they should be using table OIDs in their parameter lists.
> (That'd make pg_dump's life harder however...)

    That  at least shows how he might have gotten there. And yes,
    they need to either keep track of renamings or use OID's.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
JanWieck@t-online.de (Jan Wieck) writes:
> Tom Lane wrote:
>> What about renaming as opposed to dropping?  Since the triggers are set
>> up to use names rather than OIDs, seems like they are vulnerable to a
>> rename.  Maybe they should be using table OIDs in their parameter lists.
>> (That'd make pg_dump's life harder however...)

>     That  at least shows how he might have gotten there. And yes,
>     they need to either keep track of renamings or use OID's.

I got mail from Ryan earlier admitting that he'd hand-edited a dump file
and reloaded it, so that was how his triggers got out of sync with the
table names.  But still, it sounds like we need to fix the RENAME case.

            regards, tom lane