Thread: Postmaster fails in select() in strange way

Postmaster fails in select() in strange way

From
Zbigniew Zagórski
Date:
Hi,

I've encountered a problem that is probably similar to the one described in
   http://archives.postgresql.org/pgsql-general/2005-08/msg00847.php
but I have more information.

After some time (about 1000-3000 connections, each running one transaction
with 1-50 selects; updates are rare) the postmaster stops accepting
connections (but it is still alive and silently waits for its children).

Snippet from the logs at that moment:
----
<snip> postgres[89874]: [1-1] LOG:  XX000: select() failed in postmaster: Inappropriate ioctl for device
<snip> postgres[89874]: [1-2] LOCATION:  ServerLoop, postmaster.c:1183
----
The children (I'm not sure of the proper name; I mean the child processes of
the postmaster) are still alive and established connections work fine, but no
new connection can be established.

After closing all connections, postmaster exits leaving no message in
logs - these above are last before postmaster dies.

Also, there are no other messages or errors in the logs, strange or otherwise.

When I start the postmaster again, I see this in the logs:
---
<snip> postgres[13035]: [1-1] LOG:  00000: database system was interrupted at 2005-08-18 16:12:10 CEST
<snip> postgres[13035]: [1-2] LOCATION:  StartupXLOG, xlog.c:4063
<snip> postgres[13035]: [2-1] LOG:  00000: checkpoint record is at 0/BFE8748
<snip> postgres[13035]: [2-2] LOCATION:  StartupXLOG, xlog.c:4132
<snip> postgres[13035]: [3-1] LOG:  00000: redo record is at 0/BFE8748; undo record is at 0/0; shutdown FALSE
<snip> postgres[13035]: [3-2] LOCATION:  StartupXLOG, xlog.c:4160
<snip> postgres[13035]: [4-1] LOG:  00000: next transaction ID: 688728; next OID: 639822
<snip> postgres[13035]: [4-2] LOCATION:  StartupXLOG, xlog.c:4163
<snip> postgres[13035]: [5-1] LOG:  00000: database system was not properly shut down; automatic recovery in progress
<snip> postgres[13035]: [5-2] LOCATION:  StartupXLOG, xlog.c:4219
<snip> postgres[13035]: [6-1] LOG:  00000: record with zero length at 0/BFE8784
<snip> postgres[13035]: [6-2] LOCATION:  ReadRecord, xlog.c:2496
<snip> postgres[13035]: [7-1] LOG:  00000: redo is not required
<snip> postgres[13035]: [7-2] LOCATION:  StartupXLOG, xlog.c:4321
<snip> postgres[13035]: [8-1] LOG:  00000: database system is ready
<snip> postgres[13035]: [8-2] LOCATION:  StartupXLOG, xlog.c:4526
---
Looks OK, I think.

The strangest part of this is that ENOTTY (the errno code for
'Inappropriate ioctl for device') is not listed in the FreeBSD manual
page for select(2).
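
To make the mismatch concrete, here is a small Python sketch that maps an errno value to its symbolic name and message and checks it against a set of errors commonly documented for select(2). The set DOCUMENTED_SELECT_ERRNOS and the helper describe_select_errno are hypothetical names introduced for illustration, and the set itself is an assumption; the manual page on the affected system is the authority.

```python
import errno
import os

# Errors commonly documented for select(2). This set is an assumption made
# for illustration; check the manual page on your own platform.
DOCUMENTED_SELECT_ERRNOS = {errno.EBADF, errno.EFAULT, errno.EINTR, errno.EINVAL}


def describe_select_errno(code):
    """Return (symbolic name, message text, documented?) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code), code in DOCUMENTED_SELECT_ERRNOS


if __name__ == "__main__":
    # ENOTTY is the code behind the message seen in the postmaster log.
    print(describe_select_errno(errno.ENOTTY))
    print(describe_select_errno(errno.EINTR))
```

Under that assumed set, ENOTTY comes back flagged as undocumented while EINTR does not, which is why the log message looks so out of place.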

In the previous thread, Csaba Nagy wrote:
 > Is it possible that you're application is not closing connections, and
 > the server has a limit on connection count, and that is reached in a few
I'm sure that my application closes all connections correctly.

Does it look like OS or PostgreSQL bug?

Platform:
   PostgreSQL: psql (PostgreSQL) 8.0.3
   FreeBSD:    5.2-RELEASE

Thanks, Greetings.
--
:: zbigg ::::::::::::: Zbigniew Zagórski :::::::::::::::::
::::::: zzbigg (at) o2 (dot) pl ::: GG:5280474 :::::::::::
: 2B OR (NOT 2B) That is the question. The answer is FF. :

Re: Postmaster fails in select() in strange way

From
Tom Lane
Date:
Zbigniew Zagórski <zbigg@filmzone.pl> writes:
> postgres[89874]: [1-1] LOG:  XX000: select() failed in postmaster: Inappropriate ioctl for device

Wow, that's bizarre.

> After closing all connections, postmaster exits leaving no message in
> logs - these above are last before postmaster dies.

Yeah, the postmaster just throws up its hands and quits.  Everything
you've said follows directly from the unexpected select() failure.
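
The pattern behind this can be sketched in Python as follows. This is a simplified illustration and not PostgreSQL's actual code: server_loop, handle_ready, select_fn, and the RETRYABLE set are hypothetical names, and treating only EINTR and EAGAIN as retryable is an assumption.

```python
import errno
import select

# Transient errors after which retrying select() is reasonable (assumed set).
RETRYABLE = {errno.EINTR, errno.EAGAIN}


def server_loop(listen_socks, handle_ready, select_fn=select.select, log=print):
    """Wait for connections until select() fails with a non-retryable error.

    Returns the errno value that ended the loop.
    """
    while True:
        try:
            readable, _, _ = select_fn(listen_socks, [], [], 60.0)
        except OSError as exc:
            if exc.errno in RETRYABLE:
                continue  # interrupted or transient: just try again
            # Anything else is unexpected: log it and stop accepting.
            log(f"select() failed in server loop: {exc.strerror}")
            return exc.errno
        for sock in readable:
            handle_ready(sock)
```

Once such a loop returns, nothing accepts new connections any more, while children that were already started keep serving their clients. That would match the symptoms reported: existing sessions work, new ones fail, and the parent exits after the last child is gone.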

> Does it look like OS or PostgreSQL bug?

I'd say definitely an OS bug.  Time to enlist some FreeBSD hackers.

            regards, tom lane