Re: EINTR causes panic (data dir on btrfs) - Mailing list pgsql-general

From Alvaro Herrera
Subject Re: EINTR causes panic (data dir on btrfs)
Date
Msg-id 20160513170457.GA19032@alvherre.pgsql
In response to EINTR causes panic (data dir on btrfs)  (Gustavo Lopes <gustavo@thehyve.nl>)
Responses Re: EINTR causes panic (data dir on btrfs)
Re: EINTR causes panic (data dir on btrfs)
List pgsql-general
Gustavo Lopes wrote:
> Every few weeks, I'm getting an error like this:
>
> > 2015-02-11 15:31:00 CET PANIC: could not write to log file 00000001000000070000007D at offset 1335296, length 8192: Interrupted system call
> > 2015-02-11 15:31:00 CET STATEMENT: COMMIT
> > 2015-02-11 15:31:17 CET LOG: server process (PID 8390) was terminated by signal 6: Aborted
> > 2015-02-11 15:31:17 CET DETAIL: Failed process was running: COMMIT
> > 2015-02-11 15:31:17 CET LOG: terminating any other active server processes
> > 2015-02-11 15:31:17 CET WARNING: terminating connection because of crash of another server process
>
> I'm running the Ubuntu 9.3.4-1 package on a 3.2.13 kernel.
>
> Is there any solution for this? The code generating the error seems to
> be this:
>
> >             if (write(openLogFile, from, nbytes) != nbytes)
> >             {
> >                 /* if write didn't set errno, assume no disk space */
> >                 if (errno == 0)
> >                     errno = ENOSPC;
> >                 ereport(PANIC,
> >                         (errcode_for_file_access(),
> >                          errmsg("could not write to log file %s "
> >                                 "at offset %u, length %lu: %m",
> >                                 XLogFileNameP(ThisTimeLineID, openLogSegNo),
> >                                 openLogOff, (unsigned long) nbytes)));
> >             }
>
> which strikes me as a bit strange (but there may be data consistency
> issues I'm not aware of). Why wouldn't postgres retry on EINTR or even
> allow return values of write() lower than nbytes (and then continue in a
> loop)?
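
For reference, the kind of loop the question describes could be sketched
roughly as follows. This is only an illustration: write_fully is a
hypothetical helper name, not a PostgreSQL function, and nothing here
reflects how the WAL code is actually written. It retries when write()
fails with EINTR and keeps going after a short write.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_fully (hypothetical helper): write exactly nbytes from buf to fd,
 * retrying on EINTR and continuing after short writes.
 * Returns 0 on success, -1 on failure with errno set.
 */
int
write_fully(int fd, const void *buf, size_t nbytes)
{
    const char *p = buf;
    size_t      left = nbytes;

    while (left > 0)
    {
        ssize_t     written = write(fd, p, left);

        if (written < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything: retry */
            return -1;          /* genuine error, errno was set by write() */
        }
        if (written == 0)
        {
            /* no progress and no error: assume no disk space, like the quoted code */
            errno = ENOSPC;
            return -1;
        }
        p += written;           /* short write: advance and write the rest */
        left -= (size_t) written;
    }
    return 0;
}
```

A caller would replace the single write() comparison with a check such as
"if (write_fully(fd, from, nbytes) != 0)" and report the error only then.
Whether that is acceptable for WAL writes is a separate question that this
sketch does not answer.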

I happened to notice this report from 15 months ago, which didn't get
any response.  Did you find a solution to this problem?  I would first
blame btrfs, mostly because I've never heard of anyone with this problem
on more mainstream filesystems.  As I recall, we use SA_RESTART almost
everywhere so we don't expect EINTR anywhere.
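
To illustrate what SA_RESTART means here, a minimal sketch of installing a
signal handler with that flag is shown below. The helper name
install_restarting_handler is made up for this example and is not taken
from the PostgreSQL source. With the flag set, the kernel restarts certain
interrupted system calls itself instead of returning EINTR to the caller;
not every call is covered on every platform, so this is an illustration
rather than a guarantee.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/*
 * install_restarting_handler (hypothetical helper): install "handler" for
 * signal "signo" with SA_RESTART set, so that system calls interrupted by
 * this signal are restarted by the kernel where it supports doing so.
 * Returns 0 on success, -1 on failure with errno set by sigaction().
 */
int
install_restarting_handler(int signo, void (*handler) (int))
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);  /* block no additional signals in the handler */
    act.sa_flags = SA_RESTART;  /* ask for interrupted calls to be restarted */

    return sigaction(signo, &act, NULL);
}
```

A handler installed without SA_RESTART (sa_flags = 0) is the case in which
a blocking write() can be expected to fail with EINTR when the signal
arrives.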

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

