Thread: PgSQL 9.1: Warning - error 10061 on Windows, no error on Linux - but connection is broken

I just tracked down a problem that turned out not to be in PostgreSQL
(although I suspected PostgreSQL the whole time).

I'm recording it here because it may save others on the list a lot of
time; my problem is already solved.

Today our servers were very busy, and we suddenly started getting HTTP
500 and 502 errors from our Java server after a select, update or insert.
- The web server logs showed no errors.
- In the PostgreSQL logs on the Windows server I saw "winsock error
10061"; on the Linux server I found no evidence of the problem.

After digging for an hour, I discovered that our connection pool (max 100
connections, 50 idle) had been configured (probably by me) to drop
connections if they don't return within 2 milliseconds (maxWait="2")...

HUGE mistake. I changed the connection pool parameter to 60 seconds
(maxWait="60000"), and the problem went away.
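
To illustrate why a 2 ms wait fails under load, here is a minimal sketch.
It is a toy model, not the API of any real connection pool: the class
ToyPool and its methods are made up for illustration, and it assumes that
maxWait caps how long a caller waits for a free connection (check your
pool's documentation for the exact semantics).

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for a bounded connection pool; hypothetical, not a real pool API. */
public class ToyPool {
    private final Semaphore permits;
    private final long maxWaitMillis;

    public ToyPool(int maxActive, long maxWaitMillis) {
        this.permits = new Semaphore(maxActive, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Returns true if a "connection" became available within maxWaitMillis. */
    public boolean borrow() {
        try {
            return permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Hands a "connection" back to the pool. */
    public void giveBack() {
        permits.release();
    }

    /**
     * Takes the only connection, holds it for holdMillis on another thread,
     * and reports whether a second caller obtained one within maxWaitMillis.
     */
    public static boolean secondCallerSucceeds(long maxWaitMillis, long holdMillis) {
        ToyPool pool = new ToyPool(1, maxWaitMillis);
        pool.borrow(); // first caller takes the only connection
        Thread holder = new Thread(() -> {
            try {
                Thread.sleep(holdMillis); // simulate a query that takes a while
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            pool.giveBack();
        });
        holder.start();
        boolean ok = pool.borrow(); // second caller waits at most maxWaitMillis
        try {
            holder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("maxWait=2     -> " + secondCallerSucceeds(2, 200));
        System.out.println("maxWait=60000 -> " + secondCallerSucceeds(60000, 200));
    }
}
```

With the single connection held for 200 ms, the 2 ms caller gives up,
while the 60000 ms caller gets the connection once it is handed back.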

Just my 2c,

Edson Richter


You do not log failed connection attempts from your Java application?

Your desire is commendable but is your only advice: "don't set connection
timeout to 2ms"?

What could these products (not you, by setting up better logging) do to
minimize the amount of time you had to spend diagnosing the problem?  If
they already can be configured to do so, and were not in your case, what
configuration option values would have helped you to diagnose more quickly
(so others do not disable/change those settings and/or why you thought to
change them in the first place)?

David J.


> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Edson Richter
> Sent: Friday, December 14, 2012 2:58 PM
> To: pgsql-general
> Subject: [GENERAL] PgSQL 9.1: Warning - error 10061 on Windows, no error
> on Linux - but connection is broken



On 14/12/2012 18:14, David Johnston wrote:
> You do not log failed connection attempts from your Java application?
Can you imagine 100 users attempting (and failing) to get a connection
every 2 milliseconds? That would just bring down the whole server :-)
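
One way to record such failures without writing a log line for every
attempt is to count them and emit a periodic summary. The sketch below is
only an illustration: ThrottledFailureLog is a hypothetical helper, not
part of any pool or logging library, and the caller passes in the clock
value (for example System.currentTimeMillis()).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: counts failures and emits at most one summary per interval. */
public class ThrottledFailureLog {
    private final long intervalMillis;
    private final AtomicLong failures = new AtomicLong();
    private final AtomicLong lastReport;

    public ThrottledFailureLog(long intervalMillis, long nowMillis) {
        this.intervalMillis = intervalMillis;
        this.lastReport = new AtomicLong(nowMillis);
    }

    /** Records one failure; returns a summary line if one is due, otherwise null. */
    public String recordFailure(long nowMillis) {
        failures.incrementAndGet();
        long last = lastReport.get();
        // Only the thread that wins the compareAndSet emits the summary.
        if (nowMillis - last >= intervalMillis && lastReport.compareAndSet(last, nowMillis)) {
            long n = failures.getAndSet(0);
            return n + " failed attempts to get a pooled connection in the last "
                    + (nowMillis - last) + " ms";
        }
        return null;
    }

    public static void main(String[] args) {
        ThrottledFailureLog log = new ThrottledFailureLog(1000, 0);
        for (long t = 0; t <= 2000; t += 2) { // simulate one failure every 2 ms
            String line = log.recordFailure(t);
            if (line != null) {
                System.out.println(line);
            }
        }
    }
}
```

The cost per failure is one counter increment, and the log grows by at
most one line per interval however many callers fail.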

>
> Your desire is commendable but is your only advice: "don't set connection
> timeout to 2ms"?

Actually, I spent an hour searching for error 10061 on the Internet
and in this mailing list's archives. Everyone was pointing to "DLL
problems in Windows", reinstalling the operating system, etc.

My short recommendation is to check whether the connection is being
dropped by the connection pool too soon.
It may save someone else an hour of troubleshooting...

> What could these products (not you, by setting up better logging) do to
> minimize the amount of time you had to spend diagnosing the problem?  If
> they already can be configured to do so, and were not in your case, what
> configuration option values would have helped you to diagnose more quickly
> (so others do not disable/change those settings and/or why you thought to
> change them in the first place)?

This is an interesting question.
First, I don't know if there is better logging to set up (I can't afford
a higher log level on production servers).
As an improvement, IMHO, the JDBC pool tooling should standardize its
parameters, because some are defined in seconds and others in milliseconds.
When I set maxWait="2", I thought it meant "2 seconds". After
re-reading the documentation, I realized it is in milliseconds. In that
same documentation, the very next parameter is set in seconds. So that
was the cause of the confusion.
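
One way to guard against this kind of unit mix-up is to keep the timeout
as a java.time.Duration in your own code and convert it to milliseconds
only at the point where the pool setting is written. This is just a
sketch: PoolTimeouts, maxWaitAttribute and the 100 ms sanity threshold
are hypothetical names and values, not part of any pool's API.

```java
import java.time.Duration;

/** Hypothetical helper that makes the unit of a pool timeout explicit. */
public final class PoolTimeouts {
    // Assumed lower bound; anything shorter is treated as a likely unit mistake.
    private static final Duration MIN_SANE_WAIT = Duration.ofMillis(100);

    private PoolTimeouts() {
    }

    /** Converts a Duration to the millisecond string a maxWait-style setting expects. */
    public static String maxWaitAttribute(Duration wait) {
        if (wait.compareTo(MIN_SANE_WAIT) < 0) {
            throw new IllegalArgumentException(
                    "maxWait of " + wait.toMillis() + " ms looks too short; wrong unit?");
        }
        return Long.toString(wait.toMillis());
    }

    public static void main(String[] args) {
        // Intended value, written with its unit.
        System.out.println("maxWait=" + maxWaitAttribute(Duration.ofSeconds(60)));
        // The original mistake: 2 interpreted as milliseconds.
        try {
            maxWaitAttribute(Duration.ofMillis(2));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Writing Duration.ofSeconds(60) at the call site states the unit next to
the number, so a bare "2" can no longer be read two different ways.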

Anyway, I hope this advice saves someone else time.

Regards,

Edson
