Re: BUG #15438: Standby corruption after "Too many open files in system" error - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #15438: Standby corruption after "Too many open files in system" error
Date
Msg-id 20181022175732.awdhznheldlwm5ho@alap3.anarazel.de
In response to Re: BUG #15438: Standby corruption after "Too many open files in system" error  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BUG #15438: Standby corruption after "Too many open files in system" error  (Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>)
List pgsql-bugs
Hi,

On 2018-10-22 14:45:04 -0300, Alvaro Herrera wrote:
> On 2018-Oct-18, PG Bug reporting form wrote:
> 
> > The primary was working the whole time and recreating the standby replica,
> > after configuring the user limits, seems to solve the issue.

When you say corruption, do you mean that the data was corrupted
afterwards? Or just that the standby spewed errors until you restarted
with adjusted limits?


> Hmm, we recently fixed a bug where file descriptors were being leaked,
> after 10.5 was tagged (commit e00f4b68dc87 [1]) but that was in logical
> decoding, and the symptoms seem completely different anyway.  I'm not
> sure it would affect you.  The epoll_create stuff in latch.c is pretty
> young though.
> 
> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e00f4b68dc878dcee46833a742844346daa1e3c8
> [2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=fa31b6f4e9696f3c9777bf4ec2faea822826ce9f

I'd assume this hit epoll_create mainly because the file descriptor
used for epoll doesn't go through fd.c. That means we don't first
release an fd when we're at our limit. That's supposed to be catered
for by a reserve of fds, but that reserve isn't that large.
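
To make the failure mode concrete, here is a toy model of the accounting
described above. It is only a sketch: FdModel, managed_open_fd() and
direct_open_fd() are hypothetical names invented for illustration, not
fd.c's actual data structures or functions, and descriptors are merely
counted rather than really opened.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
typedef struct FdModel
{
    int os_limit;      /* descriptors the OS lets this process hold */
    int reserve;       /* slots the managed layer deliberately leaves free */
    int managed_open;  /* descriptors opened via the managed layer */
    int direct_open;   /* descriptors opened behind the managed layer's back */
} FdModel;

/* The managed layer only allows itself os_limit minus the reserve. */
static int
managed_budget(const FdModel *m)
{
    return m->os_limit - m->reserve;
}

/*
 * Managed open: when the budget is exhausted, close one cached descriptor
 * first (modelled as a decrement), then open the new one.
 */
static bool
managed_open_fd(FdModel *m)
{
    if (m->managed_open >= managed_budget(m))
    {
        if (m->managed_open == 0)
            return false;       /* nothing to evict */
        m->managed_open--;      /* stands in for closing a cached file */
    }
    m->managed_open++;
    return true;
}

/*
 * Direct open (think epoll_create): nothing is evicted, so it succeeds
 * only while the OS still has a free slot, i.e. while the reserve lasts.
 */
static bool
direct_open_fd(FdModel *m)
{
    if (m->managed_open + m->direct_open >= m->os_limit)
        return false;           /* a real call would fail, e.g. with EMFILE */
    m->direct_open++;
    return true;
}
```

With os_limit = 10 and reserve = 2, any number of managed opens succeeds
(the count settles at 8), but only two direct opens fit before the third
one fails.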

There have previously been discussions about whether we can make fd.c
handle fds that need to be created outside of fd.c's remit (by making
sure there's space, and then tracking them, similar to
OpenTransientFile()).
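
A minimal sketch of that "make room, then track" idea follows. It is not
a proposed patch: FdBudget, make_room(), open_cached(),
reserve_external_fd() and release_external_fd() are all hypothetical
names, and closing a cached file is modelled as a counter decrement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical accounting structure, for illustration only. */
typedef struct FdBudget
{
    int max_safe;     /* total descriptors the layer allows itself */
    int cached_open;  /* evictable descriptors owned by the layer */
    int external;     /* descriptors created elsewhere but counted here */
} FdBudget;

/* Ensure one slot is free, evicting a cached descriptor if necessary. */
static bool
make_room(FdBudget *b)
{
    if (b->cached_open + b->external < b->max_safe)
        return true;
    if (b->cached_open == 0)
        return false;       /* everything left is external, cannot evict */
    b->cached_open--;       /* stands in for closing a cached file */
    return true;
}

/* Ordinary open through the layer. */
static bool
open_cached(FdBudget *b)
{
    if (!make_room(b))
        return false;
    b->cached_open++;
    return true;
}

/*
 * Called before creating a descriptor outside the layer (for example
 * before epoll_create): frees a slot if needed and records the usage.
 */
static bool
reserve_external_fd(FdBudget *b)
{
    if (!make_room(b))
        return false;
    b->external++;
    return true;
}

/* Called after the external descriptor has been closed. */
static void
release_external_fd(FdBudget *b)
{
    assert(b->external > 0);
    b->external--;
}
```

The caller would invoke reserve_external_fd() before the system call and
release_external_fd() after closing the descriptor, so the external fd
counts against the same budget as the cached ones instead of eating into
the reserve.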

Greetings,

Andres Freund

