Thread: pgsql/src backend/tcop/postgres.c include/misc ...
CVSROOT: /cvsroot Module name: pgsql Changes by: tgl@postgresql.org 02/01/01 18:16:22 Modified files: src/backend/tcop: postgres.c src/include : miscadmin.h Log message: Do not accept interrupts in RESUME_INTERRUPTS() and END_CRIT_SECTION() macros, but only at explicit CHECK_FOR_INTERRUPTS() calls. Not clear whether overenthusiastic acceptance of interrupts accounts for any real bugs, but it definitely seems risky and unnecessary.
tgl@postgresql.org wrote: > > CVSROOT: /cvsroot > Module name: pgsql > Changes by: tgl@postgresql.org 02/01/01 18:16:22 > > Modified files: > src/backend/tcop: postgres.c > src/include : miscadmin.h > > Log message: > Do not accept interrupts in RESUME_INTERRUPTS() and END_CRIT_SECTION() > macros, but only at explicit CHECK_FOR_INTERRUPTS() calls. Does this change also affect the SIGTERM handling (not only query-cancel) ? regards, Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > Oh I see. But this seems to change the behabior significantly > at least for die signals. Well, it considerably reduces the number of places at which either signal will be accepted, but that's exactly the point. The code as written was accepting the signals in many more places than we envisioned in the original discussion, and I'm unconvinced that that's safe. AFAIK this should at worst increase the interrupt response time from order-of-microseconds to order-of-milliseconds, so I'm not especially worried. Sub-second response time is plenty good enough for either kind of interrupt, IMHO. regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > Oh I see. But this seems to change the behabior significantly > > at least for die signals. > > Well, it considerably reduces the number of places at which either > signal will be accepted, but that's exactly the point. The code > as written was accepting the signals in many more places than we > envisioned in the original discussion, and I'm unconvinced that > that's safe. > > AFAIK this should at worst increase the interrupt response time > from order-of-microseconds to order-of-milliseconds, so I'm not > especially worried. Sub-second response time is plenty good enough > for either kind of interrupt, IMHO. When are cancel or die interrupts accepted while executing a long query ? regards, Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > ExecProcNode doesn't return unless > it finds a tuple which satisfies the qualification. Oh, of course you are right. I guess there should be CHECK_FOR_INTERRUPTS in the per-tuple loops of nodeSeqscan.c and nodeIndexscan.c, as well. Joins should be okay, since they get their inputs from lower-level ExecProcNode calls. regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > ExecProcNode doesn't return unless > > it finds a tuple which satisfies the qualification. > > Oh, of course you are right. I guess there should be > CHECK_FOR_INTERRUPTS in the per-tuple loops of nodeSeqscan.c > and nodeIndexscan.c, as well. Joins should be okay, since they > get their inputs from lower-level ExecProcNode calls. Now I realize that 7.1 already changed the handling of die interrupts fundamentally. For example we can't kill the backend which is in a trouble with an infinite loop. Was it an intended change ? regards, Hiroshi Inoue
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > Now I realize that 7.1 already changed the handling of > die interrupts fundamentally. For example we can't kill > the backend which is in a trouble with an infinite loop. > Was it an intended change ? Doesn't bother me a whole lot; I don't think that's what the die interrupt is for. In my mind the main reason die() exists is to behave reasonably when the system is being shut down and init has sent SIGTERM to all processes. Shared memory needs to be left in a good state because the postmaster is going to try to run a checkpoint. (Otherwise we'd just SIGQUIT all the backends.) Therefore, it's more important to have a clean shutdown than to have an instantaneous one. regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > > Now I realize that 7.1 already changed the handling of > > die interrupts fundamentally. For example we can't kill > > the backend which is in a trouble with an infinite loop. > > Was it an intended change ? > > Doesn't bother me a whole lot; I don't think that's what the die > interrupt is for. In my mind the main reason die() exists is to > behave reasonably when the system is being shut down and init has > sent SIGTERM to all processes. In my mind the main reason die() exists is to kill individual backends which seems to be in trouble without causing the database-wide restart. Before 7.1 QueryCancel flag was checked at the points CHECK_FOR_INTERRUPTS are currently placed. But the QueryCancel flag had nothing to do with die interrupts. regards, Hiroshi Inoue
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> Doesn't bother me a whole lot; I don't think that's what the die >> interrupt is for. In my mind the main reason die() exists is to >> behave reasonably when the system is being shut down and init has >> sent SIGTERM to all processes. > In my mind the main reason die() exists is to kill individual > backends which seems to be in trouble without causing > the database-wide restart. [ raises eyebrow ] That isn't recommended procedure or even documented anywhere, AFAIR. But if you are using SIGTERM for that purpose then you should certainly want some confidence that the killed backend hasn't left things in a bad state. > Before 7.1 QueryCancel flag was checked at the points > CHECK_FOR_INTERRUPTS are currently placed. > But the QueryCancel flag had nothing to do with die > interrupts. Indeed, and before 7.1 killing a backend with SIGTERM at a random time was horribly dangerous. I did a bunch of retail patching at one point: 2001-01-12 16:53 tgl * src/: backend/access/heap/heapam.c, backend/access/nbtree/nbtinsert.c, backend/access/nbtree/nbtpage.c, backend/access/transam/xact.c, backend/access/transam/xlog.c, backend/commands/sequence.c, backend/commands/vacuum.c, backend/storage/buffer/bufmgr.c, backend/storage/file/fd.c, backend/storage/ipc/spin.c, backend/storage/lmgr/proc.c, backend/tcop/postgres.c, backend/utils/cache/temprel.c, backend/utils/init/postinit.c, backend/utils/mmgr/aset.c, include/access/xlog.h, include/utils/elog.h: Add more critical-section calls: all code sections that hold spinlocks are now critical sections, so as to ensure die() won't interrupt us while we are munging shared-memory data structures. Avoid insecure intermediate states in some code that proc_exit will call, like palloc/pfree. Rename START/END_CRIT_CODE to START/END_CRIT_SECTION, since that seems to be what people tend to call them anyway, and make them be called with () like a function call, in hopes of not confusing pg_indent. I doubt that this is sufficient to make SIGTERM safe anywhere; there's just too much code that could get invoked during proc_exit(). and then gave up and proposed the current scheme. See http://archives.postgresql.org/pgsql-hackers/2001-01/msg00147.php http://fts.postgresql.org/db/mw/msg.html?mid=1277517 http://fts.postgresql.org/db/mw/msg.html?mid=1277531 The notion of checking at every spinlock boundary seems to have been tossed out without much thought in http://fts.postgresql.org/db/mw/msg.html?mid=1277532 As the instigator of that approach I feel entitled to decide that it wasn't such a good idea after all ;-). I put a bunch of CHECK_FOR_INTERRUPTS() into various outer loops yesterday (vacuum, create index, sort --- btw, there are no spinlocks grabbed or released while sorting, so the old approach didn't help there anyway). I'm happy to throw in more as we discover we need 'em. But I don't think it's a good idea to have them occur implicitly in operations as low-level as grabbing or releasing spinlocks. regards, tom lane
Tom Lane wrote: > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > >> Doesn't bother me a whole lot; I don't think that's what the die > >> interrupt is for. In my mind the main reason die() exists is to > >> behave reasonably when the system is being shut down and init has > >> sent SIGTERM to all processes. > > > In my mind the main reason die() exists is to kill individual > > backends which seems to be in trouble without causing > > the database-wide restart. > > [ raises eyebrow ] That isn't recommended procedure or even documented > anywhere, AFAIR. I don't call it a dbms unless it has a will to limit a trouble locally. Anyway it seems too late to complain. I was foolish enough to have overlooked the very significant change that introduced the dominant ImmediateInterruptOK variable. Sigh... Where were my eyes ? regards, Hiroshi Inoue
Tom Lane wrote: > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > > > Before 7.1 QueryCancel flag was checked at the points > > CHECK_FOR_INTERRUPTS are currently placed. > > But the QueryCancel flag had nothing to do with die > > interrupts. > > Indeed, and before 7.1 killing a backend with SIGTERM at a random time > was horribly dangerous. I did a bunch of retail patching at one point: > > 2001-01-12 16:53 tgl > > * src/: backend/access/heap/heapam.c, > backend/access/nbtree/nbtinsert.c, backend/access/nbtree/nbtpage.c, > backend/access/transam/xact.c, backend/access/transam/xlog.c, > backend/commands/sequence.c, backend/commands/vacuum.c, > backend/storage/buffer/bufmgr.c, backend/storage/file/fd.c, > backend/storage/ipc/spin.c, backend/storage/lmgr/proc.c, > backend/tcop/postgres.c, backend/utils/cache/temprel.c, > backend/utils/init/postinit.c, backend/utils/mmgr/aset.c, > include/access/xlog.h, include/utils/elog.h: Add more > critical-section calls: all code sections that hold spinlocks are > now critical sections, so as to ensure die() won't interrupt us > while we are munging shared-memory data structures. Avoid insecure > intermediate states in some code that proc_exit will call, like > palloc/pfree. Rename START/END_CRIT_CODE to > START/END_CRIT_SECTION, since that seems to be what people tend to > call them anyway, and make them be called with () like a function > call, in hopes of not confusing pg_indent. I doubt that this is > sufficient to make SIGTERM safe anywhere; there's just too much > code that could get invoked during proc_exit(). > > and then gave up and proposed the current scheme. I think the much more significant change is the following one not the above one. ImmediateInterruptOK flag was introduced and the flag is set to false except when the backends are idle. I must have checked and objected to the change then. regards, Hiroshi Inoue CVSROOT: /home/projects/pgsql/cvsroot Module name: pgsql Changes by: tgl@hub.org 01/01/14 00:08:17 Modified files: src/backend/access/nbtree: nbtinsert.c src/backend/access/transam: xact.c xlog.c src/backend/bootstrap: bootstrap.c src/backend/commands: vacuum.c analyze.c copy.c src/backend/executor: execProcnode.c src/backend/storage/buffer: bufmgr.c s_lock.c src/backend/storage/ipc: ipc.c spin.c src/backend/storage/lmgr: lock.c proc.c src/backend/tcop: postgres.c src/backend/utils/error: elog.c src/backend/utils/init: globals.c src/include/access: xlog.h src/include : miscadmin.h src/include/storage: proc.h ipc.h src/include/tcop: tcopprot.h src/include/utils: elog.h src/interfaces/ecpg/preproc: pgc.l Log message: Restructure backend SIGINT/SIGTERM handling so that 'die' interrupts are treated more like 'cancel' interrupts: the signal handler sets a flag that is examined at well-defined spots, rather than trying to cope with an interrupt that might happen anywhere. See pghackers discussion of 1/12/01.
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > I think the much more significant change is the following > one not the above one. Well, yes; the "above" change was just a last-gasp attempt to make the old scheme work, whereas the "following" change introduced the new scheme. But there is no way that we can allow proc_exit to be invoked at any random instant; it has to be constrained to places where all the infrastructure state is consistent. (Or at least all the state of the infrastructure modules that proc_exit will invoke. But that covers an awful lot of territory.) In particular, since routines invoked during proc_exit rely on both shared and private state, I don't believe it can be safe to allow proc_exit at any spinlock boundary. We could possibly make it safe by introducing enough HOLD/RESUME_INTERRUPTS calls --- but that direction requires that we be forever vigilant to ensure that we've locked interrupts everywhere we have to; and there's no way to notice if we've failed to. I'd much rather take the other approach of allowing the interrupt only at certain specified spots. The failure mode here is that a cancel/die interrupt doesn't get handled as soon as you'd like. I consider that better than the failure mode of the other approach, which could be undetected data corruption. regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > I think the much more significant change is the following > > one not the above one. > > Well, yes; the "above" change was just a last-gasp attempt to make the > old scheme work, whereas the "following" change introduced the new > scheme. As far as I see, the introduction of the ImmediateInterruptOK flag made HOLD/RESUME_INTERRUPTS scheme pretty meaningless. Does 'die' interrupts still really need HOLD/RESUME_INTERRUPTS scheme ? If 'die' interrupts are only for normal shutdown, even LockWaitCancel() isn't needed. regards, Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > For example I think RESUME_INTERRUPTS should > have been > #define RESUME_INTERRUPTS() \ > do { \ > Assert(InterruptHoldoffCount > 0); \ > InterruptHoldoffCount--; \ > ! if (ImmediateInterruptOK && InterruptPending) \ > ProcessInterrupts(); \ > } while(0) But that's only useful if ImmediateInterruptOK is true most of the time; which I think is far too risky an approach. We'd end up having to put HOLD/RESUME_INTERRUPTS calls in an awful lot of places. Right now it's true that HOLD/RESUME_INTERRUPTS isn't absolutely necessary; we could eliminate it as long as we were very careful about where we put CHECK_FOR_INTERRUPTS calls. However, I like having it as an extra (and very cheap) measure of security that an interrupt won't be accepted in critical sections of code. Basically it lets us be slightly less worried about where the CHECK_FOR_INTERRUPTS calls can safely go. > In my impression 1 year ago you introduced > 2 pretty opposed schemes in a few days. I prefer to think of 'em as "complementary" schemes ;-). > What I meant was to not accept 'die' interrupts immdiately > while waiting for a lock. The lock would be released > naturally by other backends. That would work if we only cared about using "die" for system-wide shutdown; but aren't you the one arguing that it should have other uses? If I can't use "die" to kick a selected backend off a lock, I wouldn't think retail "die" interrupts would be very useful... regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > As far as I see, the introduction of the ImmediateInterruptOK > > flag made HOLD/RESUME_INTERRUPTS scheme pretty meaningless. > > Not at all. The point of HOLD_INTERRUPTS is to disable any > CHECK_FOR_INTERRUPTS call that might be issued by subroutines > you call. That's very different from ImmediateInterruptOK, which > can be set true only in *extremely* limited areas wherein we can > fully understand the behavior of executing the cancel/die request > in the signal handler. As you know there weren't so many CHECK_FOR_INTERRUPTS as we are worried about and we can check ImmediateInterruptOK if necessary. For example I think RESUME_INTERRUPTS should have been #define RESUME_INTERRUPTS() \ do { \ Assert(InterruptHoldoffCount > 0); \ InterruptHoldoffCount--; \ ! if (ImmediateInterruptOK && InterruptPending) \ ProcessInterrupts(); \ } while(0) from the first. In my impression 1 year ago you introduced 2 pretty opposed schemes in a few days. > > > Does 'die' interrupts still really need HOLD/RESUME_INTERRUPTS > > scheme ? If 'die' interrupts are only for normal shutdown, > > even LockWaitCancel() isn't needed. > > It's needed for cancels. Possibly we could skip it during shutdown, > but trying to do that seems risky and pointless. (If we skip it > then we are leaving the lock-manager shared memory in a bad state, > which is exactly what die() should not do.) What I meant was to not accept 'die' interrupts immdiately while waiting for a lock. The lock would be released naturally by other backends. regards, Hiroshi Inoue
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > For example I think RESUME_INTERRUPTS should > > have been > > > #define RESUME_INTERRUPTS() \ > > do { \ > > Assert(InterruptHoldoffCount > 0); \ > > InterruptHoldoffCount--; \ > > ! if (ImmediateInterruptOK && InterruptPending) \ > > ProcessInterrupts(); \ > > } while(0) > > But that's only useful if ImmediateInterruptOK is true most of the time; But it seems what your scheme(ImmediateInterruptOK) requires. > > In my impression 1 year ago you introduced > > 2 pretty opposed schemes in a few days. > > I prefer to think of 'em as "complementary" schemes ;-). You made a supplementary change first and a very significant change second. Who would think the first change has little meaning ? In addition I could find no explantion about ImmediateInterruptOK whereas there seems a nice description about HOLD/RESUME_INTERRUPTS scheme. > > > What I meant was to not accept 'die' interrupts immdiately > > while waiting for a lock. The lock would be released > > naturally by other backends. > > That would work if we only cared about using "die" for system-wide > shutdown; but aren't you the one arguing that it should have other > uses? If I can't use "die" to kick a selected backend off a lock, > I wouldn't think retail "die" interrupts would be very useful... Of cource I'd not like to limit the use of 'die' interrupts for normal shutdown. However other developers seem to be comforatable with it. regards, Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > When are cancel or die interrupts accepted while > executing a long query ? CHECK_FOR_INTERRUPTS() calls. The call in ExecProcNode should alone suffice for most queries, I think. regards, tom lane
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > Does this change also affect the SIGTERM handling > (not only query-cancel) ? Yes. Cancel and die interrupts are accepted at the same places. regards, tom lane
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > Cancel interrupts are accepted at CHECK_FOR_INTERRUPTS > or while waiting for a lock. > Die interrupts are accepted at CHECK_FOR_INTERRUPTS or > at the time when they are received while interrupts > are allowed. Is my understanding right ? No, I don't think so. Both types of interrupts can be honored during the signal handler only if ImmediateInterruptOK is true. It is true *only* while waiting for a lock or while waiting for client input. regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > Cancel interrupts are accepted at CHECK_FOR_INTERRUPTS > > or while waiting for a lock. > > Die interrupts are accepted at CHECK_FOR_INTERRUPTS or > > at the time when they are received while interrupts > > are allowed. Is my understanding right ? > > No, I don't think so. Both types of interrupts can be honored during > the signal handler only if ImmediateInterruptOK is true. It is true > *only* while waiting for a lock or while waiting for client input. Oh I see. But this seems to change the behabior significantly at least for die signals. regards, Hiroshi Inoue
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > Does this change also affect the SIGTERM handling > > (not only query-cancel) ? > > Yes. Cancel and die interrupts are accepted at the same places. Cancel interrupts are accepted at CHECK_FOR_INTERRUPTS or while waiting for a lock. Die interrupts are accepted at CHECK_FOR_INTERRUPTS or at the time when they are received while interrupts are allowed. Is my understanding right ? If so, isn't it strange ? regards, Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > As far as I see, the introduction of the ImmediateInterruptOK > flag made HOLD/RESUME_INTERRUPTS scheme pretty meaningless. Not at all. The point of HOLD_INTERRUPTS is to disable any CHECK_FOR_INTERRUPTS call that might be issued by subroutines you call. That's very different from ImmediateInterruptOK, which can be set true only in *extremely* limited areas wherein we can fully understand the behavior of executing the cancel/die request in the signal handler. > Does 'die' interrupts still really need HOLD/RESUME_INTERRUPTS > scheme ? If 'die' interrupts are only for normal shutdown, > even LockWaitCancel() isn't needed. It's needed for cancels. Possibly we could skip it during shutdown, but trying to do that seems risky and pointless. (If we skip it then we are leaving the lock-manager shared memory in a bad state, which is exactly what die() should not do.) regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > When are cancel or die interrupts accepted while > > executing a long query ? > > CHECK_FOR_INTERRUPTS() calls. The call in ExecProcNode should alone > suffice for most queries, I think. I couldn't cancel VACUUM using current cvs. I couldn't cancel simple SELECT which returns no row either. ... IIRC cancel interrupts before 7.1 was so but die interrupts has never been so. regards, Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > I couldn't cancel VACUUM using current cvs. Hm. There should probably be a CHECK_FOR_INTERRUPTS in the per- tuple (or at worst per-page) loops in vacuum. > I couldn't cancel simple SELECT which returns no row either. How long could that take? regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > > I couldn't cancel simple SELECT which returns no row either. > > How long could that take? Hmm I may be misunderstanding your point. The larger the table size is, the longer it should take. For examle it takes about 10000 milliseconds in an offhand test case. ExecProcNode doesn't return unless it finds a tuple which satisfies the qualification. regards, Hiroshi Inoue