Re: Idle processes chewing up CPU? - Mailing list pgsql-general

From Brendan Hill
Subject Re: Idle processes chewing up CPU?
Date
Msg-id 005401ca88fa$70ac6bf0$520543d0$@net
In response to Re: Idle processes chewing up CPU?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi Tom,

I think I've confirmed the fix. Using a dirty disconnect generator, I was
able to reliably recreate the problem within about 30-60 seconds. The
symptoms were the same as before, except that the problem occurred around
SSL_write instead of SSL_read. I assume this was due to the artificial nature
of the dirty disconnect (it is easier for the client to artificially break the
connection while waiting/receiving than while sending).

The solution you proposed solved it for SSL_write (ran for 30 minutes, no
runaway processes), and I think it's safe to assume SSL_read too. So I
suggest two additions:

====================================================
rloop:
+        errno = 0;

        n = SSL_read(port->ssl, ptr, len);
        err = SSL_get_error(port->ssl, n);
        switch (err)
        {
            case SSL_ERROR_NONE:
                port->count += n;
                break;
====================================================

And:

====================================================
wloop:
+        errno = 0;

        n = SSL_write(port->ssl, ptr, len);
        err = SSL_get_error(port->ssl, n);
        switch (err)
        {
            case SSL_ERROR_NONE:
                port->count += n;
                break;
====================================================
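
For anyone following the thread, here is a minimal standalone sketch of why
the reset matters. It is not the PostgreSQL code: fake_secure_read(),
count_attempts() and stale_errno_demo() are hypothetical names made up for
illustration, and the retry cap exists only so the demo terminates. It assumes
the failure mode Tom described, where the read fails with -1 without setting
errno, while the caller retries whenever it sees EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for secure_read(): always fails with -1 and never
 * sets errno itself. With reset_errno set, it clears errno first, which is
 * what the proposed "errno = 0" line does. */
static int fake_secure_read(int reset_errno)
{
    if (reset_errno)
        errno = 0;
    return -1;
}

/* Hypothetical caller in the style of a retry-on-EINTR loop. It starts with
 * a stale EINTR in errno and returns how many reads were attempted, capped
 * at max_attempts so the demo cannot spin forever. */
int count_attempts(int reset_errno, int max_attempts)
{
    int attempts = 0;

    errno = EINTR;              /* stale value left by some earlier call */
    while (attempts < max_attempts)
    {
        int r = fake_secure_read(reset_errno);

        attempts++;
        if (r < 0 && errno == EINTR)
            continue;           /* looks like an interrupted call: retry */
        break;                  /* success or any other error: stop */
    }
    return attempts;
}

/* Prints both cases and checks the expected counts. */
void stale_errno_demo(void)
{
    int without_reset = count_attempts(0, 1000);
    int with_reset = count_attempts(1, 1000);

    printf("without reset: %d attempts\n", without_reset);
    printf("with reset:    %d attempts\n", with_reset);
    assert(without_reset == 1000);  /* only the cap stopped the loop */
    assert(with_reset == 1);        /* failure reported on the first try */
}
```

Calling stale_errno_demo() from a main() should show the capped count for
the case without the reset and a single attempt with it, which matches what I
saw: without the reset the backend spins, with it the failure is reported
straight away.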

I'm not comfortable running my own compiled version in production (it was
rather difficult to get it working), so I'm interested to know when the next
release is planned. We can test beta copies on a non-critical load balancing
server if necessary.

Cheers,
-Brendan



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Sunday, 27 September 2009 2:42 PM
To: Brendan Hill
Cc: 'Craig Ringer'; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Idle processes chewing up CPU?

"Brendan Hill" <brendanh@jims.net> writes:
> Makes sense to me. Seems to be happening rarely now.

> I'm not all that familiar with the open source process, is this likely to be
> included in the next release version?

Can you confirm that that change actually fixes the problem you're
seeing?  I'm happy to apply it if it does, but I'd like to know that
the problem is dealt with.

            regards, tom lane


> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, 21 September 2009 5:25 AM
> To: Brendan Hill
> Cc: 'Craig Ringer'; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Idle processes chewing up CPU?

> "Brendan Hill" <brendanh@jims.net> writes:
>> My best interpretation is that an SSL client dirty disconnected while
>> running a request. This caused an infinite loop in pq_recvbuf(), calling
>> secure_read(), triggering my_sock_read() over and over. Calling
>> SSL_get_error() in secure_read() returns 10045 (either connection reset, or
>> WSAEOPNOTSUPP, I'm not sure) - after this, pq_recvbuf() appears to think
>> errno=EINTR has occurred, so it immediately tries again.

> I wonder if this would be a good idea:

>   #ifdef USE_SSL
>       if (port->ssl)
>       {
>           int            err;

>   rloop:
> +        errno = 0;
>           n = SSL_read(port->ssl, ptr, len);
>           err = SSL_get_error(port->ssl, n);
>           switch (err)
>           {
>               case SSL_ERROR_NONE:
>                   port->count += n;
>                   break;

> It looks to me like the basic issue is that pq_recvbuf is expecting
> a relevant value of errno when secure_read returns -1, and there's
> some path in the Windows case where errno doesn't get set, and if
> it just happens to have been EINTR then we've got a loop.

>             regards, tom lane



