Thread: Re: possible bug not in open items
On Thu, 2009-03-26 at 21:45 -0400, Bruce Momjian wrote: > > http://archives.postgresql.org/pgsql-bugs/2009-03/msg00062.php > > > > It may or may not be a real bug, but I didn't receive any response. If > > you think it might be a bug, can you please add it to the open items? > > Hmm, odd I don't have it either; can you repost it? The docs say: "SIGINT -- The server disallows new connections and sends all existing server processes SIGTERM, which will cause them to abort their current transactions and exit promptly." http://www.postgresql.org/docs/8.3/static/server-shutdown.html If you have an open COPY and no data is moving, it simply won't terminate it. You can terminate it with ctrl-C from psql, but not a SIGINT to the postmaster or a SIGINT or SIGTERM to the backend. Regards, Jeff Davis
Jeff Davis wrote: > The docs say: > > "SIGINT -- The server disallows new connections and sends all existing > server processes SIGTERM, which will cause them to abort their current > transactions and exit promptly." > > http://www.postgresql.org/docs/8.3/static/server-shutdown.html > > If you have an open COPY and no data is moving, it simply won't > terminate it. You can terminate it with ctrl-C from psql, but not a > SIGINT to the postmaster or a SIGINT or SIGTERM to the backend. I actually started to looked at this when you posted, but was pre-empted with something else. Looking at the code, there's comments that actually use COPY FROM STDIN as an example of things that should not be interrupted by signals: > /* > * (2) Allow asynchronous signals to be executed immediately if they > * come in while we are waiting for client input. (This must be > * conditional since we don't want, say, reads on behalf of COPY FROM > * STDIN doing the same thing.) > */ > QueryCancelPending = false; /* forget any earlier CANCEL signal */ > DoingCommandRead = true; > > /* > * (3) read a command (loop blocks here) > */ > firstchar = ReadCommand(&input_message); But in light of your bug report, that doesn't seem quite right. We don't want to enable notify or catchup interrupts during COPY FROM, but we should react to fast shutdown and query cancel. I'm not too familiar with this code, but I think we could just enable ImmediateInterruptOK in CopyGetData(). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Jeff Davis wrote: > "SIGINT -- The server disallows new connections and sends all existing > server processes SIGTERM, which will cause them to abort their current > transactions and exit promptly." > > http://www.postgresql.org/docs/8.3/static/server-shutdown.html > > If you have an open COPY and no data is moving, it simply won't > terminate it. You can terminate it with ctrl-C from psql, but not a > SIGINT to the postmaster or a SIGINT or SIGTERM to the backend. Tracking and grepping for pq_get* functions, there's one more place that does a blocking read like that: reading the function oid and args in a fastpath function call. Using v2 protocol. That has got to be deprecated enough to not worry about :-). Then again, it wouldn't be hard to put set ImmediateInterruptOK there as well, for the sake of completeness. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > I'm not too familiar with this code, but I think we could just enable > ImmediateInterruptOK in CopyGetData(). Only if you are wanting to break things. The reason we don't allow client read to be interrupted is the fear of losing protocol sync on an incomplete message. For the SIGTERM case this would (probably) be okay, since we aren't going to pay any more attention to the client anyway, but accepting SIGINT there is right out. regards, tom lane
On Fri, 2009-03-27 at 15:43 -0400, Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > > I'm not too familiar with this code, but I think we could just enable > > ImmediateInterruptOK in CopyGetData(). > > Only if you are wanting to break things. > > The reason we don't allow client read to be interrupted is the fear of > losing protocol sync on an incomplete message. For the SIGTERM case > this would (probably) be okay, since we aren't going to pay any more > attention to the client anyway, but accepting SIGINT there is right out. > That's perfectly acceptable to me. I'm only concerned about the shutdown case, and that's the only case that's in conflict with the docs. Regards, Jeff Davis
On Fri, 2009-03-27 at 15:43 -0400, Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > > I'm not too familiar with this code, but I think we could just enable > > ImmediateInterruptOK in CopyGetData(). > > Only if you are wanting to break things. > Doesn't DoingCommandRead protect us in the SIGINT case? Regards, Jeff Davis
Where are we on this? --------------------------------------------------------------------------- Heikki Linnakangas wrote: > Jeff Davis wrote: > > "SIGINT -- The server disallows new connections and sends all existing > > server processes SIGTERM, which will cause them to abort their current > > transactions and exit promptly." > > > > http://www.postgresql.org/docs/8.3/static/server-shutdown.html > > > > If you have an open COPY and no data is moving, it simply won't > > terminate it. You can terminate it with ctrl-C from psql, but not a > > SIGINT to the postmaster or a SIGINT or SIGTERM to the backend. > > Tracking and grepping for pq_get* functions, there's one more place that > does a blocking read like that: reading the function oid and args in a > fastpath function call. Using v2 protocol. That has got to be deprecated > enough to not worry about :-). Then again, it wouldn't be hard to put > set ImmediateInterruptOK there as well, for the sake of completeness. > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > Where are we on this? Pretty much nowhere --- there's no proposed patch, and I don't think it's exactly trivial. Do you want to put it on TODO? regards, tom lane
On Thu, 2009-04-09 at 12:59 -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Where are we on this? > > Pretty much nowhere --- there's no proposed patch, and I don't think > it's exactly trivial. Do you want to put it on TODO? Here is a patch that does what I think Heikki was suggesting. If a proper fix is non-trivial, then I assume there's some problem with my patch, but I'll post it for the archives anyway. I don't see any obvious protocol synchronization problem, and it looks like DoingCommandRead protects against that in the case of SIGINT to a backend. And in the case of SIGTERM to a backend, the connection will be terminated anyway. Regards, Jeff Davis
Attachment
Jeff Davis <pgsql@j-davis.com> writes: > Here is a patch that does what I think Heikki was suggesting. If a > proper fix is non-trivial, then I assume there's some problem with my > patch, but I'll post it for the archives anyway. This patch is so wrong that it's scary. You can't have ImmediateInterruptOK true over the duration of any significant amount of backend processing --- as an example, if you take control away in the middle of a malloc call, you'll probably be left with a corrupt malloc arena. It doesn't even work to try to take control away while control is inside, say, the OpenSSL code. The reason we have the prepare_for_client_read/client_read_ended hooks is to allow the flag to be turned on over a sufficiently narrow scope --- to wit, the recv() kernel call and nothing else --- that it's safe. AFAICS, a safe patch for this has got to involve teaching those hooks about COPY mode. regards, tom lane
On Fri, 2009-04-10 at 14:47 -0400, Tom Lane wrote: > This patch is so wrong that it's scary. You can't have > ImmediateInterruptOK true over the duration of any significant amount of > backend processing --- as an example, if you take control away in the > middle of a malloc call, you'll probably be left with a corrupt malloc > arena. > Thank you for the explanation. My initial thinking was that either DoingCommandRead would protect us (for SIGINT to the backend), or we were going to terminate the process anyway (for SIGTERM). But it sounds like it leaves us in a state so unsafe that we can't even abort the transaction nicely. Regards, Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > Thank you for the explanation. My initial thinking was that either > DoingCommandRead would protect us (for SIGINT to the backend), or we > were going to terminate the process anyway (for SIGTERM). But it sounds > like it leaves us in a state so unsafe that we can't even abort the > transaction nicely. Well, we could presumably do exit(1) regardless. But if the idea is to have a clean shutdown, you have to get through proc_exit(), and that requires essentially all the backend subsystems to be alive and undamaged. regards, tom lane
Was this ever addressed? --------------------------------------------------------------------------- Jeff Davis wrote: > On Thu, 2009-03-26 at 21:45 -0400, Bruce Momjian wrote: > > > http://archives.postgresql.org/pgsql-bugs/2009-03/msg00062.php > > > > > > It may or may not be a real bug, but I didn't receive any response. If > > > you think it might be a bug, can you please add it to the open items? > > > > Hmm, odd I don't have it either; can you repost it? > > The docs say: > > "SIGINT -- The server disallows new connections and sends all existing > server processes SIGTERM, which will cause them to abort their current > transactions and exit promptly." > > http://www.postgresql.org/docs/8.3/static/server-shutdown.html > > If you have an open COPY and no data is moving, it simply won't > terminate it. You can terminate it with ctrl-C from psql, but not a > SIGINT to the postmaster or a SIGINT or SIGTERM to the backend. > > Regards, > Jeff Davis -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard drive, Christ can be your backup. +
On Thu, 2010-02-25 at 23:15 -0500, Bruce Momjian wrote: > Was this ever addressed? > It doesn't appear to be fixed, and I don't see it on the TODO, either. Should we add it there? Regards, Jeff Davis > --------------------------------------------------------------------------- > > Jeff Davis wrote: > > On Thu, 2009-03-26 at 21:45 -0400, Bruce Momjian wrote: > > > > http://archives.postgresql.org/pgsql-bugs/2009-03/msg00062.php > > > > > > > > It may or may not be a real bug, but I didn't receive any response. If > > > > you think it might be a bug, can you please add it to the open items? > > > > > > Hmm, odd I don't have it either; can you repost it? > > > > The docs say: > > > > "SIGINT -- The server disallows new connections and sends all existing > > server processes SIGTERM, which will cause them to abort their current > > transactions and exit promptly." > > > > http://www.postgresql.org/docs/8.3/static/server-shutdown.html > > > > If you have an open COPY and no data is moving, it simply won't > > terminate it. You can terminate it with ctrl-C from psql, but not a > > SIGINT to the postmaster or a SIGINT or SIGTERM to the backend. > > > > Regards, > > Jeff Davis >
Jeff Davis <pgsql@j-davis.com> writes: > On Thu, 2010-02-25 at 23:15 -0500, Bruce Momjian wrote: >> Was this ever addressed? > It doesn't appear to be fixed, and I don't see it on the TODO, either. > Should we add it there? +1. It likely wouldn't be real hard to fix, but given the lack of field complaints I'm not thinking we need to treat it as urgent. regards, tom lane
Tom Lane wrote: > Jeff Davis <pgsql@j-davis.com> writes: > > On Thu, 2010-02-25 at 23:15 -0500, Bruce Momjian wrote: > >> Was this ever addressed? > > > It doesn't appear to be fixed, and I don't see it on the TODO, either. > > Should we add it there? > > +1. It likely wouldn't be real hard to fix, but given the lack of field > complaints I'm not thinking we need to treat it as urgent. Added to TODO: Allow a stalled COPY to exit if the backend is terminated * http://archives.postgresql.org/pgsql-bugs/2009-04/msg00067.php -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do