Re: plperlu stored procedure seems to freeze for a minute - Mailing list pgsql-general

From Peter J. Holzer
Subject Re: plperlu stored procedure seems to freeze for a minute
Date
Msg-id 20151202152613.GB10220@hjp.at
In response to Re: plperlu stored procedure seems to freeze for a minute  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Responses Re: plperlu stored procedure seems to freeze for a minute  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-general
On 2015-12-01 20:55:02 +0100, Peter J. Holzer wrote:
> On 2015-12-01 18:58:31 +0100, Peter J. Holzer wrote:
> > I suspect such an interaction because I cannot reproduce the problem
> > outside of a stored procedure. A standalone Perl script doing the same
> > requests doesn't get a timeout.
[...]
> The strace doesn't show a reason for the SIGALRM, though. No alarm(2) or
> setitimer(2) system call (I connected strace to a running postgres
> process just after I got the prompt from "psql" and before I typed
> "select * from mb_search('export');" (I used a different (but very
> similar) stored procedure for those tests because it is much easier to
> find a search which is slow enough to trigger a timeout at least
> sometimes than a data request (which normally finishes in
> milliseconds))).
>
> So I guess my next task will be to find out where that SIGALRM comes
> from and/or whether I can just restart the zmq_msg_recv if it happens.

Ok, I think I know where that SIGALRM comes from: It's the
AuthenticationTimeout. What I'm seeing in strace (if I attach it early
enough) is that during authentication the postgres worker process calls
setitimer with a 60 second timeout twice. This matches the comment in
backend/postmaster/postmaster.c:

         * Note: AuthenticationTimeout is applied here while waiting for the
         * startup packet, and then again in InitPostgres for the duration of any
         * authentication operations.  So a hostile client could tie up the
         * process for nearly twice AuthenticationTimeout before we kick him off.

As explained in backend/utils/misc/timeout.c, the underlying timers are
never cancelled: if a timeout is cancelled, postgres just notices when the
signal arrives that it has nothing to do and resumes whatever it was doing.

This is also what I'm seeing: 60 seconds after start, the process
receives a SIGALRM.
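
To illustrate the mechanism described above, here is a minimal sketch in Python (not PostgreSQL's actual code). It assumes a Unix system; the names `active_timeouts`, `on_alarm` and `demo` are hypothetical. A one-shot interval timer is armed, the logical timeout is then dropped without touching the timer, and the SIGALRM still arrives later with nothing left to do.

```python
import signal
import time

# Hypothetical bookkeeping, loosely modelled on the behaviour described above:
# a logical timeout can be disabled while the OS timer keeps running.
active_timeouts = set()
stats = {"signals": 0, "handled": 0}


def on_alarm(signum, frame):
    """SIGALRM handler: act only if some timeout is still enabled."""
    stats["signals"] += 1
    if active_timeouts:
        stats["handled"] += 1
        active_timeouts.clear()
    # Otherwise there is nothing to do; the interrupted code just continues.


def demo(delay=0.05):
    """Arm a one-shot timer, drop the logical timeout, wait for the signal."""
    stats["signals"] = stats["handled"] = 0
    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    try:
        active_timeouts.add("authentication")
        signal.setitimer(signal.ITIMER_REAL, delay)  # arm the OS timer
        active_timeouts.discard("authentication")    # cancelled logically only
        deadline = time.monotonic() + 1.0
        while stats["signals"] == 0 and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # clean up after the demo
        signal.signal(signal.SIGALRM, old_handler)
    return dict(stats)
```

The handler itself is harmless here; the trouble only starts if the signal lands while the process is blocked in a system call made by some other library.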

If the process is idle or in a "normal" SQL statement at the time, that's
not a problem. But if it is in one of my stored procedures, which is
currently calling a ØMQ function that is waiting for some I/O
(zmq_msg_recv(), most likely), that call gets interrupted and returns an
error which my code doesn't know how to handle (yet). So the error gets
passed back to the user.
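
One way to handle this is simply to restart the receive when it was interrupted by a signal. The following is a rough sketch of that idea in Python rather than the plperlu code in question; `recv_retrying` and `FlakyReceiver` are hypothetical names, and the receiver is a stand-in that simulates an EINTR failure instead of calling ØMQ.

```python
import errno


def recv_retrying(recv, max_retries=5):
    """Call recv() and retry if it was interrupted by a signal (EINTR)."""
    for attempt in range(max_retries + 1):
        try:
            return recv()
        except InterruptedError:
            if attempt == max_retries:
                raise  # give up and let the caller see the interruption


class FlakyReceiver:
    """Stand-in for a blocking receive that gets interrupted a few times."""

    def __init__(self, interruptions, payload):
        self.interruptions = interruptions
        self.payload = payload
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.interruptions:
            raise InterruptedError(errno.EINTR, "Interrupted system call")
        return self.payload


rx = FlakyReceiver(interruptions=1, payload=b"reply")
reply = recv_retrying(rx)  # first call is interrupted, second one succeeds
```

A real version would probably also need to track how much of the request timeout is left before retrying, so that repeated interruptions cannot extend the wait indefinitely.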

A strange interaction between postgres and ØMQ indeed. But now that I
know what's causing it I can handle that. Thanks for your patience.

    hp


--
   _  | Peter J. Holzer    | I want to forget all about both belts and
|_|_) |                    | suspenders; instead, I want to buy pants
| |   | hjp@hjp.at         | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/
