Thread: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Bruce Klein
Date:
Just in case this helps the next person who can't figure out why their postgres server won't start today:

If you are running Postgres inside Microsoft WSL (at least on Ubuntu, maybe on others too), and just picked up a software update to version 11.2, you will need to go into your /etc/postgresql.conf file and set fsync=off. 

This took me a while to fix because the error message you get if you don't is the generic:

terminating connection because of crash of another server process 2015-07-15 20:18:37 UTC The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
 
I spent a long time trying to completely uninstall and reinstall, etc. to recover from the "crash", although I don't think there ever was one, and the message appears on first use of the create database command even on a completely clean install.

I don't know if this is possible/reasonable, but if the database code could automatically turn fsync off on WSL it might save the next users some trouble.
Bruce Klein <brucek@gmail.com> writes:
> If you are running Postgres inside Microsoft WSL (at least on Ubuntu, maybe
> on others too), and just picked up a software update to version 11.2, you
> will need to go into your /etc/postgresql.conf file and set fsync=off.

Hm.  Probably this is some unexpected problem with the
panic-on-fsync-failure change; although that still leaves some things
unexplained, because if fsync is failing for you now, why didn't it fail
before?  Anyway, you might try experimenting with data_sync_retry,
instead of running with scissors by turning off fsync altogether.
See first item in the release notes:

https://www.postgresql.org/docs/11/release-11-2.html

Also, we'd quite like to hear more details; can you find any PANIC
messages in the server log?

            regards, tom lane


Thanks Tom. I feel like I'm in a little over my head here, but I'll try to help as I can.

With fsync off, everything appears to run as it did before on 11.1.

With fsync default/on, the problem is easily reproducible by trying to create a database. I believe the very first time I saw it it was with a routine query but I'm not 100% sure.

psql-11.2=> create database testdb;
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
SSL SYSCALL error: EOF detected
The connection to the server was lost. Attempting reset: Failed.
!>

Here are the entries from the log:
2019-02-14 15:06:08.218 DST [8398] PANIC:  could not flush dirty data: Function not implemented
2019-02-14 15:06:08.218 DST [8396] LOG:  checkpointer process (PID 8398) was terminated by signal 6: Aborted
2019-02-14 15:06:08.218 DST [8396] LOG:  terminating any other active server processes
2019-02-14 15:06:08.218 DST [8422] homestead@homestead WARNING:  terminating connection because of crash of another server process
2019-02-14 15:06:08.218 DST [8422] homestead@homestead DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-02-14 15:06:08.218 DST [8422] homestead@homestead HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-02-14 15:06:08.218 DST [8401] WARNING:  terminating connection because of crash of another server process
2019-02-14 15:06:08.218 DST [8401] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-02-14 15:06:08.218 DST [8401] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-02-14 15:06:08.241 DST [8396] LOG:  all server processes terminated; reinitializing
2019-02-14 15:06:08.259 DST [8433] LOG:  database system was interrupted; last known up at 2019-02-14 15:05:30 DST
2019-02-14 15:06:08.259 DST [8433] PANIC:  could not flush dirty data: Function not implemented
2019-02-14 15:06:08.264 DST [8396] LOG:  startup process (PID 8433) was terminated by signal 6: Aborted
2019-02-14 15:06:08.264 DST [8396] LOG:  aborting startup due to startup process failure
2019-02-14 15:06:08.266 DST [8434] homestead@homestead FATAL:  the database system is in recovery mode
2019-02-14 15:06:08.268 DST [8396] LOG:  database system is shut down

As to why it worked before, I don't think fsync() ever worked on WSL. There were places where you'd see warnings about it in 11.1; they just wouldn't crash the server.

As to the "running with scissors" risk, I'm going to guess the most common use case for WSL is as a personal dev box where all the data is disposable anyway. That's the case for me at least.

Best,
Bruce

>  In 11.1 did you see the message "WARNING: could not flush dirty data: Function not implemented"
Yes

re: the discussions of O/S and filesystem in that thread:
I am not qualified to describe the implementation of WSL, but I believe it is neither pure Ubuntu running on metal nor a virtual machine hosted on Windows. I believe what the Microsoft folks have done is implement something around the driver/kernel layer that fools Ubuntu into thinking it is connected to the hardware it expects, while it is ultimately still running on top of a Windows kernel and Windows filesystem. That includes stubbing out, or otherwise presenting an appearance of implementing, some functions (perhaps fsync()) that it really doesn't implement. Note that I believe this is fundamentally different from the approach of the old Cygwin and similar projects, i.e. WSL does not involve recompiling on top of Windows-specific libraries, etc. If any of these details are important to anyone, you should verify them from a more credible source.

If it matters, the Ubuntu version I am running on WSL now is 16.04.5.


On Thu, Feb 14, 2019 at 3:44 PM Ravi Krishna <srkrishna100@aol.com> wrote:
Hi Bruce,

Check my earlier thread on PG 10.5 on Ubuntu Bash with WSL.

https://www.postgresql.org/message-id/1301077575.68539.1535929075959%40mail.yahoo.com

In 11.1 did you see the message "WARNING: could not flush dirty data: Function not implemented"

regards

Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Fri, Feb 15, 2019 at 2:56 PM Bruce Klein <brucek@gmail.com> wrote:
> >  In 11.1 did you see the message "WARNING: could not flush dirty data: Function not implemented"
> Yes

I wonder if this is coming from sync_file_range(), which is not
implemented on WSL according to random intergoogling, but probably
appears as present to our configure script.  I find it harder to
believe they didn't implement fsync().


--
Thomas Munro
http://www.enterprisedb.com


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Fri, Feb 15, 2019 at 3:56 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Fri, Feb 15, 2019 at 2:56 PM Bruce Klein <brucek@gmail.com> wrote:
> > >  In 11.1 did you see the message "WARNING: could not flush dirty data: Function not implemented"
> > Yes
>
> I wonder if this is coming from sync_file_range(), which is not
> implemented on WSL according to random intergoogling, but probably
> appears as present to our configure script.  I find it harder to
> believe they didn't implement fsync().

Here is a place where people go to complain about that:

https://github.com/Microsoft/WSL/issues/645

I suppose we could tolerate ENOSYS.

-- 
Thomas Munro
http://www.enterprisedb.com


Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> On Fri, Feb 15, 2019 at 2:56 PM Bruce Klein <brucek@gmail.com> wrote:
>>> In 11.1 did you see the message "WARNING: could not flush dirty data: Function not implemented"
>> Yes

> Here is a place where people go to complain about that:
> https://github.com/Microsoft/WSL/issues/645
> I suppose we could tolerate ENOSYS.

What I'm not grasping here is why you considered that sync_file_range
failure should be treated as a reason to PANIC in the first place?
Surely it is not fsync(), nor some facsimile thereof.  In fact, if
any of the branches in pg_flush_data really need the data_sync_elevel
treatment, somebody's mental model of that operation needs adjustment.
Maybe it's mine.

            regards, tom lane


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Fri, Feb 15, 2019 at 5:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> >> On Fri, Feb 15, 2019 at 2:56 PM Bruce Klein <brucek@gmail.com> wrote:
> >>> In 11.1 did you see the message "WARNING: could not flush dirty data: Function not implemented"
> >> Yes
>
> > Here is a place where people go to complain about that:
> > https://github.com/Microsoft/WSL/issues/645
> > I suppose we could tolerate ENOSYS.
>
> What I'm not grasping here is why you considered that sync_file_range
> failure should be treated as a reason to PANIC in the first place?
> Surely it is not fsync(), nor some facsimile thereof.  In fact, if
> any of the branches in pg_flush_data really need the data_sync_elevel
> treatment, somebody's mental model of that operation needs adjustment.
> Maybe it's mine.

My thinking was that sync_file_range() might in its current, future or
alternative (WSL, ...) implementation eat an error that would
otherwise reach fsync(), due to the single-flag error state treatment
we see in several OSes (older Linux, also recent Linux via the 'seen'
flag that we rely on to receive errors that happened before we opened
the fd).  Should we be inspecting the Linux source or asking for
assurances from Linux hackers that that can't happen?   Perhaps it
behaves more like fdatasync() with the SYNC_FILE_RANGE_WAIT_* flags (=
can clear seen flag), but more like fadvise() without (can't touch
it)?  I don't know, and I didn't want to base my choice on what it
looks like it currently does in the Linux tree.  Without guarantees
from standards (not relevant here) or man pages (which note only that
EIO is possible), I made what I thought was an appropriately
pessimistic choice.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Andres Freund
Date:
Hi,

On 2019-02-14 19:48:05 -0500, Tom Lane wrote:
> Bruce Klein <brucek@gmail.com> writes:
> > If you are running Postgres inside Microsoft WSL (at least on Ubuntu, maybe
> > on others too), and just picked up a software update to version 11.2, you
> > will need to go into your /etc/postgresql.conf file and set fsync=off.
> 
> Hm.  Probably this is some unexpected problem with the
> panic-on-fsync-failure change; although that still leaves some things
> unexplained, because if fsync is failing for you now, why didn't it fail
> before?  Anyway, you might try experimenting with data_sync_retry,
> instead of running with scissors by turning off fsync altogether.
> See first item in the release notes:
> 
> https://www.postgresql.org/docs/11/release-11-2.html
> 
> Also, we'd quite like to hear more details; can you find any PANIC
> messages in the server log?

I suspect that's because WSL has an empty implementation of
sync_file_range(), i.e. it unconditionally returns ENOSYS. But as
configure detects it, we still emit calls for it.  I guess we ought to
except ENOSYS for the cases where we do panic-on-fsync-failure?

You temporarily can work around it, mostly, by setting
checkpoint_flush_after = 0 and bgwriter_flush_after = 0.

Greetings,

Andres Freund


Andres Freund <andres@anarazel.de> writes:
> I suspect that's because WSL has an empty implementation of
> sync_file_range(), i.e. it unconditionally returns ENOSYS. But as
> configure detects it, we still emit calls for it.  I guess we ought to
> except ENOSYS for the cases where we do panic-on-fsync-failure?

I'm of the opinion that we shouldn't be panicking for sync_file_range
failure, period.

            regards, tom lane


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Andres Freund
Date:

On February 15, 2019 9:13:10 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Andres Freund <andres@anarazel.de> writes:
>> I suspect that's because WSL has an empty implementation of
>> sync_file_range(), i.e. it unconditionally returns ENOSYS. But as
>> configure detects it, we still emit calls for it.  I guess we ought to
>> except ENOSYS for the cases where we do panic-on-fsync-failure?
>
>I'm of the opinion that we shouldn't be panicking for sync_file_range
>failure, period.

With some flags it's strictly required, it does "eat" errors depending on the flags. So I'm not sure I understand?

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Andres Freund <andres@anarazel.de> writes:
> On February 15, 2019 9:13:10 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm of the opinion that we shouldn't be panicking for sync_file_range
>> failure, period.

> With some flags it's strictly required, it does "eat" errors depending on the flags. So I'm not sure I understand?

Really?  The specification says that it starts I/O, not that it waits
around for any to finish.

The bigger picture here is that this set of patches seems to have moved
us too far in the direction of defending against hypothetical kernel
bugs, and too far away from real-world usability.  I am not happy with
the tradeoff.

            regards, tom lane


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Andres Freund
Date:

On February 15, 2019 9:44:50 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Andres Freund <andres@anarazel.de> writes:
>> On February 15, 2019 9:13:10 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm of the opinion that we shouldn't be panicking for sync_file_range
>>> failure, period.
>
>> With some flags it's strictly required, it does "eat" errors depending
>> on the flags. So I'm not sure I understand?
>
>Really?  The specification says that it starts I/O, not that it waits
>around for any to finish.

That depends on the flags you pass in. By memory I don't think it eats an error with our flags in recent kernels, but
I'm not sure.

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Sat, Feb 16, 2019 at 6:50 AM Andres Freund <andres@anarazel.de> wrote:
> On February 15, 2019 9:44:50 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >Andres Freund <andres@anarazel.de> writes:
> >> On February 15, 2019 9:13:10 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> I'm of the opinion that we shouldn't be panicking for sync_file_range
> >>> failure, period.
> >
> >> With some flags it's strictly required, it does "eat" errors depending
> >> on the flags. So I'm not sure I understand?
> >
> >Really?  The specification says that it starts I/O, not that it waits
> >around for any to finish.
>
> That depends on the flags you pass in. By memory I don't think it eats an error with our flags in recent kernels, but
> I'm not sure.

Right, there was some discussion of that, and I didn't (and still
don't) think it'd be wise to rely on undocumented knowledge about
which flags can eat errors based on a drive-by reading of a particular
snapshot of the Linux tree.  The man page says it can return EIO; I
think we should assume that it might actually do that.

BTW I had a report from someone on IRC that PostgreSQL breaks in other
ways (not yet understood) if you build it directly on WSL/Ubuntu.  I
guess the OP is reporting about a .deb that was built on a real Linux
system.  I'm vaguely familiar with these types of problems from other
platforms (Linux syscall emulation on FreeBSD and Sun-ish systems, and
also I'm old enough to remember people doing SCO SysV syscall
emulation on Linux systems back before certain valuable software was
available natively); it's possible that you get ENOSYS on other
emulators too, considering that other kernels don't seem to have a
sync_file_range()-like facility, but probably no one cares, since
there is no reason to run PostgreSQL on a syscall emulator when you
can run it natively.  This is a bit different though: I guess people
want to be able to develop Linux-stack stuff on company-issued Windows
computers for later deployment on Linux servers; someone interested in
this would ideally make it work and set up a build farm animal to tell
us when we break it.  It would probably require only minimal changes,
but considering that no one bothered to complain about PostgreSQL
spewing scary looking warnings on WSL for years, it's not too
surprising that we didn't consider this case before.  A bit like the
nightjar case, the PANIC patch revealed a pre-existing problem that
had gone unreported and needs some work, but it doesn't seem like a
very good reason to roll back that part of the change completely IMHO.

-- 
Thomas Munro
http://www.enterprisedb.com


>  I guess the OP is reporting about a .deb that was built on a real Linux system

Yes, I (OP) installed via:
  % wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
  % sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -sc)-pgdg main" > /etc/apt/sources.list.d/PostgreSQL.list'
  % sudo apt update
  % sudo apt-get install postgresql-11

> no one bothered to complain about PostgreSQL spewing scary looking warnings on WSL for years

At least you weren't spamming a once-per-second(!) log entry about a missing function call like one of my other packages did (can't remember, maybe it was nginx?)

WSL still feels early and if you're going to try it, you get used to annoyances like that. I'm glad Microsoft is trying though and I hope with time and support they get all the way there because developers who have enterprise or other reasons to be on Windows instead of Mac desktops deserve to have decent unix tools too. Warts and all I still find it overall more convenient and fluid than my previous VirtualBox / vagrant solution.    

On 2/15/19 4:04 PM, Bruce Klein wrote:
[snip]
> I'm glad Microsoft is trying though

If Steve "Linux is a cancer" Ballmer were dead, he'd be spinning in his grave...

-- 
Angular momentum makes the world go 'round.


On Fri, Feb 15, 2019 at 1:34 AM Bruce Klein <brucek@gmail.com> wrote:

If you are running Postgres inside Microsoft WSL

Who is WSL for?
This is primarily a tool for developers ...
-----------------------
 
One problem with WSL is that the I/O performance is not good, and that might never be solved. So WSL is not meant for production use.

WSL is called a "compatibility layer". When running WSL there is no Linux kernel, even though "uname" says so. It is like WINE, where one can run Windows binaries on Linux but there is no Windows OS.

That said, WSL is a great tool for developers. Better than Cygwin.

./hans
Thomas Munro <thomas.munro@enterprisedb.com> writes:
>>> Really?  The specification says that it starts I/O, not that it waits
>>> around for any to finish.

> Right, there was some discussion of that, and I didn't (and still
> don't) think it'd be wise to rely on undocumented knowledge about
> which flags can eat errors based on a drive-by reading of a particular
> snapshot of the Linux tree.  The man page says it can return EIO; I
> think we should assume that it might actually do that.

I had a thought about this: maybe we should restrict the scope of this
behavior to be "panic on EIO", not "panic on anything within hailing
distance of fsync".

The direction you and Andres seem to want to go in is to add a pile of
unprincipled exception cases, which seems like a recipe for constant
pain to me.  I think we might be better off with a whitelist of errnos
that mean trouble, instead of a blacklist of some that don't.  I'm
especially troubled by the idea that blacklisting some errnos might
reduce to ignoring them completely, which would be a step backwards
from our pre-PANIC behavior.

            regards, tom lane


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Sun, Feb 17, 2019 at 4:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> >>> Really?  The specification says that it starts I/O, not that it waits
> >>> around for any to finish.
>
> > Right, there was some discussion of that, and I didn't (and still
> > don't) think it'd be wise to rely on undocumented knowledge about
> > which flags can eat errors based on a drive-by reading of a particular
> > snapshot of the Linux tree.  The man page says it can return EIO; I
> > think we should assume that it might actually do that.
>
> I had a thought about this: maybe we should restrict the scope of this
> behavior to be "panic on EIO", not "panic on anything within hailing
> distance of fsync".
>
> The direction you and Andres seem to want to go in is to add a pile of
> unprincipled exception cases, which seems like a recipe for constant
> pain to me.  I think we might be better off with a whitelist of errnos
> that mean trouble, instead of a blacklist of some that don't.  I'm
> especially troubled by the idea that blacklisting some errnos might
> reduce to ignoring them completely, which would be a step backwards
> from our pre-PANIC behavior.

Hmm.  Well, at least ENOSPC should be treated the same way as EIO.
Here's an experiment that seems to confirm some speculations about NFS
on Linux from the earlier threads:

$ uname -a
Linux debian 4.18.0-3-amd64 #1 SMP Debian 4.18.20-2 (2018-11-23)
x86_64 GNU/Linux
$ dpkg -l nfs-kernel-server | tail -1
ii  nfs-kernel-server 1:1.3.4-2.4  amd64        support for NFS kernel server

First, set up a 10MB loop-back filesystem:

$ dd if=/dev/zero of=/tmp/10mb.loopback bs=1024 count=10000
$ sudo losetup /dev/loop0 /tmp/10mb.loopback
$ sudo mkfs -t ext3 -m 1 -v /dev/loop0
...
$ sudo mkdir /mnt/test_loopback
$ sudo mount -t ext3 /dev/loop0 /mnt/test_loopback

Then, export that via NFS:

$ tail -1 /etc/exports
/mnt/test_loopback localhost(rw,sync,no_subtree_check)
$ sudo exportfs -av
exporting localhost:/mnt/test_loopback

Next, mount that over NFS:

$ sudo mkdir /mnt/test_loopback_remote
$ sudo mount localhost:/mnt/test_loopback /mnt/test_loopback_remote

Now, fill up the whole disk with a file full of newlines:

$ sudo mkdir /mnt/test_loopback/dir
$ sudo chown $USER:$USER /mnt/test_loopback/dir
$ tr "\000" "\n" < /dev/zero > /mnt/test_loopback_remote/dir/file
tr: write error: No space left on device
tr: write error
$ df -h /mnt/test_loopback*
Filesystem                    Size  Used Avail Use% Mounted on
/dev/loop0                    8.5M  8.4M     0 100% /mnt/test_loopback
localhost:/mnt/test_loopback  8.5M  8.4M     0 100% /mnt/test_loopback_remote

Now, run a program that appends a greeting and then calls fsync() twice:

$ cat test.c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        int fd, rc;

        fd = open("/mnt/test_loopback_remote/dir/file", O_RDWR | O_APPEND);
        if (fd < 0)
        {
                perror("open");
                return 1;
        }
        rc = write(fd, "hello world\n", 12);
        if (rc < 0)
                perror("write");
        else if (rc < 12)
                fprintf(stderr, "only managed to write %d bytes\n", rc);
        rc = fsync(fd);
        if (rc < 0)
                perror("fsync 1");
        rc = fsync(fd);
        if (rc < 0)
                perror("fsync 2");
        rc = close(fd);
        if (rc < 0)
                perror("close");

        return 0;
}
$ cc test.c
$ ./a.out
fsync 1: No space left on device
$

The write() and the second fsync() reported success.  Great, let's go
and look at our precious data, both through NFS and locally:

$ tail -3 /mnt/test_loopback_remote/dir/file



$ tail -3 /mnt/test_loopback/dir/file



$

It's gone.  If you try it again with a file containing just a few
newlines so there is free space, it works correctly and you see the
appended greeting.  Perhaps the same sort of thing might happen with
remote EDQUOT, but I haven't tried that.  Perhaps there are some
things that could be tuned that would avoid that?

(Some speculation about NFS:  To avoid data-loss from running out of
disk space, I think PostgreSQL requires either a filesystem that
reserves space when we're extending a file, so that we can exclude the
possibility of ENOSPC before we evict data from our own shared
buffers, or a page cache that doesn't drop dirty flags or whole
buffers on failure so we can meaningfully retry once space becomes
available.  As far as I know, the former would be theoretically
possible with NFS, if the client and server are using NFSv4.2+ with
ALLOCATE support and glibc and kernel versions both support true
fallocate() and pass it all the way through, but current versions
either don't support fallocate() on NFS files at all (this 4.18 kernel
doesn't) or sometimes emulate it by writing zeroes, which is useless
for remote space reservation purposes and (according to some sources I
found) there is currently no reliable way to find out about that
through libc.  If that situation improves, we'd still need to do
explicit fallocate() on our side to reserve space, and it'd probably
be slow so we might have to work in bigger chunks to amortise the
latency.)
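
The "reserve space in bigger chunks" idea from the parenthetical above could be sketched roughly as follows. This is only an illustration, not PostgreSQL code: ensure_reserved() and RESERVE_CHUNK are hypothetical names, and it uses posix_fallocate(), which glibc may emulate by writing zeroes when the filesystem lacks native support (the very case the message says is useless for remote reservation on NFS).

```c
#include <fcntl.h>
#include <sys/types.h>

#define RESERVE_CHUNK ((off_t) 1024 * 1024)   /* hypothetical chunk size */

/*
 * Make sure the file has space reserved up to at least needed_end.
 * *reserved_end tracks how far earlier calls already reserved.  Space is
 * requested in RESERVE_CHUNK multiples so that most calls return without
 * making a system call.  Returns 0 on success or an error number such as
 * ENOSPC on failure.
 */
int ensure_reserved(int fd, off_t *reserved_end, off_t needed_end)
{
    off_t new_end;
    int   rc;

    if (needed_end <= *reserved_end)
        return 0;               /* already covered by an earlier chunk */

    /* Round the target up to the next chunk boundary. */
    new_end = ((needed_end + RESERVE_CHUNK - 1) / RESERVE_CHUNK) * RESERVE_CHUNK;

    /* posix_fallocate() returns the error number directly; errno is not set. */
    rc = posix_fallocate(fd, *reserved_end, new_end - *reserved_end);
    if (rc == 0)
        *reserved_end = new_end;
    return rc;
}
```

A caller would invoke this before dirtying a new block, so that running out of space shows up at reservation time rather than at fsync() time.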

Returning to your question of how to decide whether to have an errno
include-list or an exclude-list for our new panic behaviour, I think
we should tolerate ENOSYS as a special case for sync_file_range()
only, because:

1.  We don't actually need sync_file_range() at all for correct
operation (unlike fsync()).
2.  ENOSYS is the only errno that very explicitly says "this didn't go
anywhere near Linux filesystem code".  Therefore, it definitely didn't
eat any errors relating to jettisoned data.  (Ok, perhaps EBADF and
EINVAL tell you something similar, but they also imply a serious bug
in the calling code, having lost track of file descriptors or
something.)

So far I still think that we should panic if fsync() returns any error
number at all.  For sync_file_range(), it sounds like maybe you think
we should leave the warning-spewing in there for ENOSYS, to do exactly
what we did before on principle since that's what back-branches are
all about?  Something like:

  ereport(errno == ENOSYS ? WARNING : data_sync_elevel(WARNING),

Perhaps for master we could skip it completely, or somehow warn just
once, and/or switch to one of our other implementations at runtime?  I
don't really have a strong view on that, not being a user of that
system.  Will they ever implement it?  Are there other systems we care
about that don't implement it?  (Android?)
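
The one-line ereport() idea above can be spelled out as a small standalone sketch. It assumes that data_sync_elevel() promotes the level to PANIC unless data_sync_retry is enabled, as the 11.2 release notes describe. The LogLevel enum and both function names are illustrative stand-ins, not PostgreSQL's definitions.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins for PostgreSQL's log levels. */
typedef enum { LVL_WARNING, LVL_PANIC } LogLevel;

/*
 * Models the idea behind data_sync_elevel(): keep the caller's level when
 * retrying after a failed sync is allowed, otherwise promote to PANIC.
 */
LogLevel sketch_data_sync_elevel(LogLevel elevel, bool data_sync_retry)
{
    return data_sync_retry ? elevel : LVL_PANIC;
}

/*
 * Level for a failed sync_file_range().  ENOSYS means the call never
 * reached filesystem code, so it stays a WARNING as before 11.2; every
 * other errno is treated like a failed fsync().
 */
LogLevel flush_failure_level(int err, bool data_sync_retry)
{
    if (err == ENOSYS)
        return LVL_WARNING;
    return sketch_data_sync_elevel(LVL_WARNING, data_sync_retry);
}
```

With data_sync_retry off, this gives WARNING for ENOSYS and PANIC for EIO, which is the back-branch behaviour the message proposes.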


--
Thomas Munro
http://www.enterprisedb.com


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
"Ravi Krishna"
Date:
If this one appears in the list, then it means the problem is with AOL.



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Andres Freund
Date:
Hi,

On 2019-02-17 23:29:09 +1300, Thomas Munro wrote:
> Hmm.  Well, at least ENOSPC should be treated the same way as EIO.
> Here's an experiment that seems to confirm some speculations about NFS
> on Linux from the earlier threads:

I wish we had a good way to have test scripts in the tree for something
like that, but using postgres. Unfortunately it's not easy to write
portable setup scripts for it.


> So far I still think that we should panic if fsync() returns any error
> number at all.  For sync_file_range(), it sounds like maybe you think
> we should leave the warning-spewing in there for ENOSYS, to do exactly
> what we did before on principle since that's what back-branches are
> all about?  Something like:
> 
>   ereport(errno == ENOSYS ? WARNING : data_sync_elevel(WARNING),
> 
> Perhaps for master we could skip it completely, or somehow warn just
> once, and/or switch to one of our other implementations at runtime?  I
> don't really have a strong view on that, not being a user of that
> system.  Will they ever implement it?  Are there other systems we care
> about that don't implement it?  (Android?)

I'm not sure I see much need for leaving the warning in out of
principle. Feels like we ought to call sync_file_range() once at
postmaster startup and then just force-disable the flush GUCs if it's
not available?
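
A minimal sketch of that startup-probe idea, not the code that was eventually committed. To keep it portable, the real sync_file_range() call is replaced by a function pointer, and every name here (probe_flush_fn, probe_flush_settings, probe_at_startup, the two stubs) is hypothetical. The struct merely stands in for the *_flush_after settings.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a flush call such as sync_file_range():
 * returns 0 on success, -1 with errno set on failure. */
typedef int (*probe_flush_fn) (int fd);

/* Hypothetical container for the *_flush_after settings. */
struct probe_flush_settings
{
    int checkpoint_flush_after;
    int bgwriter_flush_after;
    int backend_flush_after;
};

/* Returns false only when the call reports ENOSYS, i.e. the kernel
 * (or emulator) does not implement it at all.  Any other result is
 * treated as "implemented"; real errors are handled elsewhere. */
bool
probe_flush_supported(probe_flush_fn fn, int fd)
{
    errno = 0;
    if (fn(fd) != 0 && errno == ENOSYS)
        return false;
    return true;
}

/* Run once at startup: if the call is missing, zero the settings so
 * that no later code path attempts the flush. */
void
probe_at_startup(probe_flush_fn fn, int fd, struct probe_flush_settings *s)
{
    if (!probe_flush_supported(fn, fd))
    {
        s->checkpoint_flush_after = 0;
        s->bgwriter_flush_after = 0;
        s->backend_flush_after = 0;
    }
}

/* Test doubles: one behaves like a kernel lacking the syscall, the
 * other like a kernel where it works. */
int
probe_stub_enosys(int fd)
{
    (void) fd;
    errno = ENOSYS;
    return -1;
}

int
probe_stub_ok(int fd)
{
    (void) fd;
    return 0;
}
```

With this shape no errno filtering is needed at the call sites, since they are never reached when the probe fails.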

Greetings,

Andres Freund


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Michael Paquier
Date:
On Sun, Feb 17, 2019 at 10:54:54AM -0800, Andres Freund wrote:
> On 2019-02-17 23:29:09 +1300, Thomas Munro wrote:
>> Hmm.  Well, at least ENOSPC should be treated the same way as EIO.
>> Here's an experiment that seems to confirm some speculations about NFS
>> on Linux from the earlier threads:
>
> I wish we had a good way to have test scripts in the tree for something
> like that, but using postgres. Unfortunately it's not easy to write
> portable setup scripts for it.

Yes, it seems to me as well that ENOSPC should be treated the same way
as EIO.  Just looking at the code for data_sync_retry, we should really
have some errno filtering.
--
Michael

Attachment

Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Mon, Feb 18, 2019 at 2:19 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Sun, Feb 17, 2019 at 10:54:54AM -0800, Andres Freund wrote:
> > On 2019-02-17 23:29:09 +1300, Thomas Munro wrote:
> >> Hmm.  Well, at least ENOSPC should be treated the same way as EIO.
> >> Here's an experiment that seems to confirm some speculations about NFS
> >> on Linux from the earlier threads:
> >
> > I wish we had a good way to have test scripts in the tree for something
> > like that, but using postgres. Unfortunately it's not easy to write
> > portable setup scripts for it.
>
> Yes, it seems to me as well that ENOSPC should be treated the same way
> as EIO.  Just looking at the code for data_sync_retry, we should really
> have some errno filtering.

I agree with you up to a point:  It would make some amount of sense
for data_sync_elevel() not to promote to PANIC if errno == ENOSYS;
then for sync_file_range() you'd get WARNING and for fsync() you'd get
ERROR (since that's what those call sites pass in) on hypothetical
kernels that lack those syscalls.  As I argued earlier, ENOSYS seems
to be the only errno that we know for sure to be non-destructive to
the page cache since it promises it didn't run any kernel code at all
(or rather didn't make it past the front door), so it's the *only*
errno that belongs on such a whitelist IMHO.  That would get us back
to where we were for WSL users in 11.1.

The question is whether we want to go further than that and provide a
better experience for WSL users, now that we know that it was already
spewing warnings.  One way to do that might be not to bother with
errno filtering at all, but instead (as Andres suggested) do a test of
whether sync_file_range() is implemented on this kernel/emulator at
startup and if not, just disable it somehow.  Then we don't need any
filtering.

Here is a restatement of my rationale for not including errno
filtering in the first place:  Take a look at the documented errnos
for fsync() on Linux.  Which of those tell us that it's sane to retry?
 AFAICS they all either mean that it ran filesystem code that is known
to toast your data on error (data loss has already occurred), or your
file descriptor is bogus (code somewhere is seriously busted and all
bets are off).
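
For illustration, the ENOSYS-only whitelist described above could look roughly like this. It is a sketch under stated assumptions: the real data_sync_elevel() takes only the caller's level, so the extra errno parameter, the sketch_ names, and the three-value level enum are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical log levels, ordered by severity. */
enum sketch_elevel
{
    SKETCH_WARNING,
    SKETCH_ERROR,
    SKETCH_PANIC
};

/* Stand-in for the data_sync_retry setting (off by default). */
bool sketch_data_sync_retry = false;

/* Decide how loudly to report a failed sync-type call.
 *
 * ENOSYS means the call never reached filesystem code, so it cannot
 * have dropped dirty data: keep the level the call site asked for.
 * Every other errno is promoted to PANIC unless retries are enabled. */
int
sketch_data_sync_elevel(int elevel, int saved_errno)
{
    if (saved_errno == ENOSYS)
        return elevel;
    return sketch_data_sync_retry ? elevel : SKETCH_PANIC;
}
```

A sync_file_range() call site passing WARNING would then still get WARNING on ENOSYS, and an fsync() call site passing ERROR would get ERROR, while EIO and ENOSPC keep producing PANIC.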

-- 
Thomas Munro
http://www.enterprisedb.com


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Ravi Krishna
Date:
Are there any plans to support PG on WSL ?  Just curious.



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Andres Freund
Date:
On 2019-02-18 10:33:50 -0500, Ravi Krishna wrote:
> Are there any plans to support PG on WSL ?  Just curious.

I think if somebody wanted to start investing efforts to improve testing
of that setup, and then fix the resulting issues, nobody would seriously
object. But also most people working on PG are already busy.  Please
feel free to create a buildfarm test machine with postgres running on
WSL.

- Andres


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Tue, Feb 19, 2019 at 6:01 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-02-18 10:33:50 -0500, Ravi Krishna wrote:
> > Are there any plans to support PG on WSL ?  Just curious.

Hi Ravi,

I definitely want to fix this particular issue for 11.3.

> I think if somebody wanted to start investing efforts to improve testing
> of that setup, and then fix the resulting issues, nobody would seriously
> object. But also most people working on PG are already busy.  Please
> feel free to create a buildfarm test machine with postgres running on
> WSL.

Right, the first step would be for a WSL user to figure out what's
wrong with builds on the WSL and show us how to fix it; I heard
through the grapevine that if you try it, initdb doesn't work (it must
be something pretty subtle in the configure phase or something like
that, since the Ubuntu .deb apparently works, except for the issue
reported in this thread).  Then, confirm that they're happy with
whatever patch we come up with.  Then if someone wants to make sure we
don't accidentally break it in future, yeah, a build farm animal would
help a lot.

Here's a starter patch that shows one of the approaches discussed.  It
gets WSL users to a better place than they were before, by suppressing
further warnings after the first one.
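
The attached patch itself is not reproduced in this archive. As a rough sketch of the "warn once, then stop trying" idea only, with the flush call abstracted behind a function pointer and all names (once_flush_fn, once_flush_data, the counters, the stub) hypothetical:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for sync_file_range(): 0 on success,
 * -1 with errno set on failure. */
typedef int (*once_flush_fn) (int fd);

/* Counters so the behaviour can be observed; a real implementation
 * would call ereport() instead of counting. */
int once_warnings_emitted = 0;
int once_stub_calls = 0;

void
once_flush_data(once_flush_fn fn, int fd)
{
    /* Remembered for the life of the process. */
    static bool not_implemented = false;

    if (not_implemented)
        return;                 /* skip the call entirely from now on */

    if (fn(fd) != 0)
    {
        /* Check errno, not the return code: the call returns -1 and
         * reports the reason through errno. */
        if (errno == ENOSYS)
        {
            not_implemented = true;
            once_warnings_emitted++;    /* single WARNING */
        }
        /* Other errnos would go through the usual
         * data_sync_elevel() handling; omitted here. */
    }
}

/* Test double that behaves like a kernel lacking the syscall. */
int
once_stub_enosys(int fd)
{
    (void) fd;
    once_stub_calls++;
    errno = ENOSYS;
    return -1;
}
```

Calling once_flush_data() repeatedly with the ENOSYS stub invokes the stub once and records one warning.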

-- 
Thomas Munro
http://www.enterprisedb.com


Attachment

Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
James Sewell
Date:
> Right, the first step would be for a WSL user to figure out what's
> wrong with builds on the WSL and show us how to fix it; I heard
> through the grapevine that if you try it, initdb doesn't work (it must
> be something pretty subtle in the configure phase or something like
> that, since the Ubuntu .deb apparently works, except for the issue
> reported in this thread).

That's correct - initdb doesn't work when you've built on WSL as *somehow* HAVE_FDATASYNC is set to 1 by configure - but it ends up not being included by #ifdef blocks. This causes the following PANIC:

PANIC:  unrecognized wal_sync_method: 1

This happens because wal_sync_method is set to 1, but in src/backend/access/transam/xlog.c the matching case in the switch is inside the #ifdef, so it never gets checked.

--
James


The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Andres Freund
Date:
Hi,

On 2019-02-19 11:50:36 +1100, James Sewell wrote:
> >
> > Right, the first step would be for a WSL user to figure out what's
> > wrong with builds on the WSL and show us how to fix it; I heard
> > through the grapevine that if you try it, initdb doesn't work (it must
> > be something pretty subtle in the configure phase or something like
> > that, since the Ubuntu .deb apparently works, except for the issue
> > reported in this thread).
> 
> 
> That's correct - initdb doesn't work when you've built on WSL as
> *somehow* HAVE_FDATASYNC is set to 1 by configure - but it ends up not
> being included by #ifdef blocks. This causes the following PANIC

What do you mean by "not being included by #ifdef blocks"? The only
guard in issue_xlog_fsync() is #ifdef HAVE_FDATASYNC, which ought to be
independent of any includes?  I can see how this'd go wrong if configure
did *not* detect fdatasync, because then

#if defined(PLATFORM_DEFAULT_SYNC_METHOD)
#define DEFAULT_SYNC_METHOD        PLATFORM_DEFAULT_SYNC_METHOD

would trigger, which we explicitly set for linux:

/*
 * Set the default wal_sync_method to fdatasync.  With recent Linux versions,
 * xlogdefs.h's normal rules will prefer open_datasync, which (a) doesn't
 * perform better and (b) causes outright failures on ext4 data=journal
 * filesystems, because those don't support O_DIRECT.
 */
#define PLATFORM_DEFAULT_SYNC_METHOD    SYNC_METHOD_FDATASYNC
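
To make the failure mode concrete, here is a small sketch of how a default of fdatasync combined with a compiled-out case would end in "unrecognized wal_sync_method: 1". It is not the xlog.c code: the SKETCH_ names are hypothetical, the numeric values are chosen only to match the PANIC message quoted in this thread, and a return of -1 stands in for the PANIC.

```c
/* Hypothetical method numbers; 1 matches the value in the PANIC
 * message reported earlier in the thread. */
enum
{
    SKETCH_SYNC_METHOD_FSYNC = 0,
    SKETCH_SYNC_METHOD_FDATASYNC = 1
};

/* SKETCH_HAVE_FDATASYNC is deliberately left undefined here, as if
 * configure had not detected fdatasync(). */

/* Returns 0 if the method is handled, -1 where the real code would
 * report an unrecognized wal_sync_method. */
int
sketch_issue_sync(int method)
{
    switch (method)
    {
        case SKETCH_SYNC_METHOD_FSYNC:
            return 0;
#ifdef SKETCH_HAVE_FDATASYNC
        case SKETCH_SYNC_METHOD_FDATASYNC:
            return 0;
#endif
        default:
            return -1;
    }
}
```

With the macro undefined, the platform default still selects method 1, but the switch has no case for it and falls through to the default branch.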

Greetings,

Andres Freund


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
James Sewell
Date:

> What do you mean by "not being included by #ifdef blocks"? The only
> guard in issue_xlog_fsync() is #ifdef HAVE_FDATASYNC, which ought to be
> independent of any includes?  I can see how this'd go wrong if configure
> did *not* detect fdatasync, because then

And now this looks like it works again from a clean build - something screwy with WSL perhaps? Or me?

Either way, I can't reproduce - annoyingly. 
 



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
James Sewell
Date:

> Here's a starter patch that shows one of the approaches discussed.  It
> gets WSL users to a better place than they were before, by suppressing
> further warnings after the first one.

This wasn't quite right; updated to check errno for ENOSYS (not rc).

This compiles and stops the panic on WSL (with a single warning). 

I haven't tested if a version compiled on Linux will behave the same way - but based on the error messages in the top post it looks like the behavior is the same.

--
James 


Attachment

Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Tue, Feb 19, 2019 at 5:16 PM James Sewell <james.sewell@jirotech.com> wrote:
>> Here's a starter patch that shows one of the approaches discussed.  It
>> gets WSL users to a better place than they were before, by suppressing
>> further warnings after the first one.
>
> This wasn't quite right; updated to check errno for ENOSYS (not rc).
>
> This compiles and stops the panic on WSL (with a single warning).
>
> I haven't tested if a version compiled on Linux will behave the same way - but based on the error messages in the top post it looks like the behavior is the same.

Great.  Thanks for testing, and for the fix!  Well that all sounds
like good news: it corrects the behaviour from 11.2, and also improves
on the previous behaviour which I'd have accepted as a bug if anyone
had reported it.  So the next problem is that we don't have a
consensus on whether this is the right approach, so I don't feel like
I can commit it yet.  Does anyone want to make another concrete proposal?

-- 
Thomas Munro
https://enterprisedb.com


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
On Tue, Feb 19, 2019 at 5:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Feb 19, 2019 at 5:16 PM James Sewell <james.sewell@jirotech.com> wrote:
> >> Here's a starter patch that shows one of the approaches discussed.  It
> >> gets WSL users to a better place than they were before, by suppressing
> >> further warnings after the first one.
> >
> > This wasn't quite right; updated to check errno for ENOSYS (not rc).
> >
> > This compiles and stops the panic on WSL (with a single warning).
> >
> > I haven't tested if a version compiled on Linux will behave the same way - but based on the error messages in the top post it looks like the behavior is the same.
>
> Great.  Thanks for testing, and for the fix!  Well that all sounds
> like good news: it corrects the behaviour from 11.2, and also improves
> on the previous behaviour which I'd have accepted as a bug if anyone
> had reported it.  So the next problem is that we don't have a
> consensus on whether this is the right approach, so I don't feel like
> I can commit it yet.  Does anyone want to make another concrete proposal?

Ok, here's the version I'm planning to push soon if there are no objections.
Re-adding Bruce to the thread, as I just noticed the CC list got
pruned at some point in this thread.

-- 
Thomas Munro
https://enterprisedb.com

Attachment
Sounds good to me. Thank you!

On Fri, Feb 22, 2019 at 11:47 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Feb 19, 2019 at 5:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Tue, Feb 19, 2019 at 5:16 PM James Sewell <james.sewell@jirotech.com> wrote:
> > >> Here's a starter patch that shows one of the approaches discussed.  It
> > >> gets WSL users to a better place than they were before, by suppressing
> > >> further warnings after the first one.
> > >
> > > This wasn't quite right; updated to check errno for ENOSYS (not rc).
> > >
> > > This compiles and stops the panic on WSL (with a single warning).
> > >
> > > I haven't tested if a version compiled on Linux will behave the same way - but based on the error messages in the top post it looks like the behavior is the same.
> >
> > Great.  Thanks for testing, and for the fix!  Well that all sounds
> > like good news: it corrects the behaviour from 11.2, and also improves
> > on the previous behaviour which I'd have accepted as a bug if anyone
> > had reported it.  So the next problem is that we don't have a
> > consensus on whether this is the right approach, so I don't feel like
> > I can commit it yet.  Does anyone want to make another concrete proposal?
>
> Ok, here's the version I'm planning to push soon if there are no objections.
> Re-adding Bruce to the thread, as I just noticed the CC list got
> pruned at some point in this thread.
>
> --
> Thomas Munro
> https://enterprisedb.com

Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Thomas Munro
Date:
>> > Great.  Thanks for testing, and for the fix!  Well that all sounds
>> > like good news: it corrects the behaviour from 11.2, and also improves
>> > on the previous behaviour which I'd have accepted as a bug if anyone
>> > had reported it.  So the next problem is that we don't have a
>> > consensus on whether this is the right approach, so I don't feel like
>> > I can commit it yet.  Does anyone want to make another concrete proposal?
>>
>> Ok, here's the version I'm planning to push soon if there are no objections.
>> Re-adding Bruce to the thread, as I just noticed the CC list got
>> pruned at some point in this thread.

Pushed.

I also noticed that the call to sync_file_range() in file_utils.c used
by fsync_pgdata() ignores the return code, and in the 9.4 and 9.5
branches, the call in pg_flush_data() ignores the return code too.
This inconsistency should be fixed; I'll think about which direction
it should be fixed in (either we are convinced that
sync_file_range(SYNC_FILE_RANGE_WRITE) is non-destructive of error
state or we aren't, and should handle it everywhere), and maybe start
a new -hackers thread.

-- 
Thomas Munro
https://enterprisedb.com


Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

From
Hendrickx Pablo
Date:
Who areb


From: Thomas Munro <thomas.munro@gmail.com>
Sent: Tuesday, February 19, 2019 5:31:26 AM
To: James Sewell
Cc: Andres Freund; Ravi Krishna; pgsql-general@lists.postgresql.org
Subject: Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2
 
On Tue, Feb 19, 2019 at 5:16 PM James Sewell <james.sewell@jirotech.com> wrote:
>> Here's a starter patch that shows one of the approaches discussed.  It
>> gets WSL users to a better place than they were before, by suppressing
>> further warnings after the first one.
>
> This wasn't quite right; updated to check errno for ENOSYS (not rc).
>
> This compiles and stops the panic on WSL (with a single warning).
>
> I haven't tested if a version compiled on Linux will behave the same way - but based on the error messages in the top post it looks like the behavior is the same.

Great.  Thanks for testing, and for the fix!  Well that all sounds
like good news: it corrects the behaviour from 11.2, and also improves
on the previous behaviour which I'd have accepted as a bug if anyone
had reported it.  So the next problem is that we don't have a
consensus on whether this is the right approach, so I don't feel like
I can commit it yet.  Does anyone want to make another concrete proposal?

--
Thomas Munro
https://enterprisedb.com