Re: pgsql/src backend/tcop/postgres.c include/misc ... - Mailing list pgsql-committers

From Tom Lane
Subject Re: pgsql/src backend/tcop/postgres.c include/misc ...
Date
Msg-id 26178.1010340954@sss.pgh.pa.us
Whole thread Raw
In response to Re: pgsql/src backend/tcop/postgres.c include/misc ...  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
List pgsql-committers
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>> Doesn't bother me a whole lot; I don't think that's what the die
>> interrupt is for.  In my mind the main reason die() exists is to
>> behave reasonably when the system is being shut down and init has
>> sent SIGTERM to all processes.

> In my mind the main reason die() exists is to kill individual
> backends which seems to be in trouble without causing
> the database-wide restart.

[ raises eyebrow ]  That isn't recommended procedure or even documented
anywhere, AFAIR.  But if you are using SIGTERM for that purpose then you
should certainly want some confidence that the killed backend hasn't
left things in a bad state.

> Before 7.1 QueryCancel flag was checked at the points
> CHECK_FOR_INTERRUPTS are currently placed.
> But the QueryCancel flag had nothing to do with die
> interrupts.

Indeed, and before 7.1 killing a backend with SIGTERM at a random time
was horribly dangerous.  I did a bunch of retail patching at one point:

2001-01-12 16:53  tgl

    * src/: backend/access/heap/heapam.c,
    backend/access/nbtree/nbtinsert.c, backend/access/nbtree/nbtpage.c,
    backend/access/transam/xact.c, backend/access/transam/xlog.c,
    backend/commands/sequence.c, backend/commands/vacuum.c,
    backend/storage/buffer/bufmgr.c, backend/storage/file/fd.c,
    backend/storage/ipc/spin.c, backend/storage/lmgr/proc.c,
    backend/tcop/postgres.c, backend/utils/cache/temprel.c,
    backend/utils/init/postinit.c, backend/utils/mmgr/aset.c,
    include/access/xlog.h, include/utils/elog.h: Add more
    critical-section calls: all code sections that hold spinlocks are
    now critical sections, so as to ensure die() won't interrupt us
    while we are munging shared-memory data structures.  Avoid insecure
    intermediate states in some code that proc_exit will call, like
    palloc/pfree.  Rename START/END_CRIT_CODE to
    START/END_CRIT_SECTION, since that seems to be what people tend to
    call them anyway, and make them be called with () like a function
    call, in hopes of not confusing pg_indent.  I doubt that this is
    sufficient to make SIGTERM safe anywhere; there's just too much
    code that could get invoked during proc_exit().

and then gave up and proposed the current scheme.  See

http://archives.postgresql.org/pgsql-hackers/2001-01/msg00147.php
http://fts.postgresql.org/db/mw/msg.html?mid=1277517
http://fts.postgresql.org/db/mw/msg.html?mid=1277531

The notion of checking at every spinlock boundary seems to have been
tossed out without much thought in
http://fts.postgresql.org/db/mw/msg.html?mid=1277532
As the instigator of that approach I feel entitled to decide that
it wasn't such a good idea after all ;-).

I put a bunch of CHECK_FOR_INTERRUPTS() into various outer loops
yesterday (vacuum, create index, sort --- btw, there are no spinlocks
grabbed or released while sorting, so the old approach didn't help
there anyway).  I'm happy to throw in more as we discover we need 'em.
But I don't think it's a good idea to have them occur implicitly in
operations as low-level as grabbing or releasing spinlocks.

            regards, tom lane

pgsql-committers by date:

Previous
From: momjian@postgresql.org
Date:
Subject: pgsql/doc/src/sgml/ref ecpg-ref.sgml
Next
From: tgl@postgresql.org
Date:
Subject: pgsql/src/backend/postmaster postmaster.c