Thread: BUG #15636: PostgreSQL 11.1 pg_basebackup backup to a CIFS destination throws fsync error at end of backup

The following bug has been logged on the website:

Bug reference:      15636
Logged by:          John Klann
Email address:      jk7255@gmail.com
PostgreSQL version: 11.1
Operating system:   Red Hat Enterprise Linux Server release 7.5
Description:

Issue:
    - PostgreSQL 11.1 pg_basebackup and pg_dump parallel database backup to a
CIFS destination throws fsync error at the very end of the backup.
    - Command:  pg_basebackup -D /cifs/backups/<backupDirectoryName> -U
backupuser -Ft -Z 1 -X fetch -p 5432
    - Error: pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>": Invalid argument

Details:
We are preparing to move to PostgreSQL 11.1 from 9.3.x, this move will be a
complete rebuild on new hardware. After setting up the new hardware
installing/configuring Rhel 7.5 (updated), installing/configuring PostgreSQL
11.1 I tested our migration process, parallel dump --> parellel restoring
databases worked without issue.

I started testing backups and that is when I came across the fsync error. I
have seen references that PostgreSQL does not support storing the data
directory on CIFS due to similar issues. Although I have not found any
reference to backing up to CIFS not being supported. I am able to fully
restore and recovery from these backups no issue and based off the research
I have done I would suspect some sort of issue of cifs not supporting the
fsync call on the containing directory level. I put examples below of all of
the testing I have performed that has also lead me to this conclusion.

Environment:
    - Server
        ○ Model: Dell R740 
        ○ RAM: 768 GB RDIMM 2666MT/s
        ○ Processor: Intel Xeon Gold 6146 3.2G 24.75MB
            § 2 Nodes, 12 cores each, HT = total cores of 48
        ○ Storage:
            § OS: local ssd in raid 
            § pgdata, pgwal, pglog: each on their own dedicated EMC XIO luns attached
via 16 8Gbps paths (2x QLogic 2562, Dual Port 8Gb Optical Fibre Channel
HBAs) 
                □ XIO is XtremIO V1 in two brick clustered configuration
        ○ OS:
            § uname -a:
                □  Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018
x86_64 x86_64 x86_64 GNU/Linux
            § cat /etc/redhat-release:
                □ Red Hat Enterprise Linux Server release 7.5 (Maipo)
        ○ PostgreSQL
            § PostgreSQL 11.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-28), 64-bit
            § Custom Configs:
                [postgres@servername data1]$ cat postgresql.auto.conf 
                # Do not edit this file manually!
                # It will be overwritten by the ALTER SYSTEM command.
                port = '5432'
                listen_addresses = '0.0.0.0'
                work_mem = '8MB'
                maintenance_work_mem = '1GB'
                random_page_cost = '1.0'
                track_functions = 'all'
                wal_buffers = '-1'
                checkpoint_timeout = '10min'
                checkpoint_completion_target = '0.9'
                checkpoint_warning = '30s'
                log_destination = 'csvlog'
                log_directory = '/dbalog/data1'
                log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
                log_min_messages = 'error'
                log_min_error_statement = 'error'
                log_line_prefix = '[%u] [%d] [%h] [%m] [%p]:>'
                log_rotation_size = '10MB'
                log_statement = 'ddl'
                shared_buffers = '8GB'
                max_connections = '500'
                effective_cache_size = '589824 MB'
                wal_level = 'replica'
                max_wal_senders = '2'
                archive_mode = 'on'
                archive_command = 'test ! -f /pgxlog1/data1/%f || cp /pgxlog1/data1/%f
/cifs/backups/<backupDirectoryName>/dmp/archive/%f'
                log_disconnections = 'on'
                standard_conforming_strings = 'off'
            § Databases
                □ 4 dbs 
                    ® 3 - very small < 3 GB total
                    ® 1 - 964.67 GB
                □ Load: OLTP, DW mix
    - CIFS (Backup destination)
        ○ Windows 2012 R2 (last patched September/2018)
        ○ VNX Block Storage lun configured for CIFS using NTFS

Testing/Reproduction:
    - T1:
        ○ basebackup to windows cifs
            § Command:
                □ pg_basebackup -D /cifs/backups/<backupDirectoryName> -U backupuser -Ft
-Z 1 -X fetch -p 5432
            § Error: pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>": Invalid argument
    - T2:
        ○ Same backup as T1 with much less data - same result
    - T3:
        ○ Same as T2 with -N (--no-sync) option - no error (fairly obvious why)
    - T4 
        ○ basebackup same dataset as T2, no tar, no compression going to windows
cifs
            § Command:
                □  pg_basebackup -D /cifs/backups/dbadb1linbos/5432/dmp/basebkp -U
backupuser -X fetch -p 5432
            § Same error seems to happen on all directories:
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/base/1": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/base/13877": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/base/13878": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/base/16397": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/base/16660": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/base": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/global": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/log": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_commit_ts": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_dynshmem": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_logical/mappings": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_logical/snapshots": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_logical": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_multixact/members": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_multixact/offsets": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_multixact": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_notify": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_replslot": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_serial": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_snapshots": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_stat": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_stat_tmp": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_subtrans": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_tblspc": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_twophase": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_wal/archive_status": Invalid
argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_wal": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_xact": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp": Invalid argument
                pg_basebackup: could not fsync file
"/cifs/backups/<backupDirectoryName>/basebkp/pg_tblspc": Invalid argument
    - T5
        ○ pg_dump single threaded backup of a database to windows cifs - no
error
    - T6
        ○ pg_dump multithreaded backup of a database to windows cifs - same fsync
error on containing directory
            § Command: pg_dump -p 5432 -j 16 -f
/cifs/backups/<backupDirectoryName>/dmp/DBA_02132019_133120  -U backupuser
-Fd -d DBA
            § Error: 
                □ pg_dump: could not fsync file
"/cifs/backups/<backupDirectoryName>/dmp/DBA_02132019_133120": Invalid
argument
    - T7
        ○ Same as T6 but with --no-sync option - no error
    - T8
        ○ Same as T1 but to local storage (XIO SAN attached Lun ext4)
            § No error
    - T9
        ○ Same as T1 but to linux CIFS share with XIO SAN attached lun ext4 (same
version of linux)
            § Same Fsync error
    - T10
        ○ Same as T1 but to linux NFS share with XIO SAN attached lun ext4 (same
version of linux)
            § No Error
Questions:
    - Is backing up to CIFS supported?
    - Based on research and other reported issues that the error may mean that
cifs handles this call differently or already performs this action itself,
should this error bring the integrity of the backup into question?
        ○ In what scenarios would it bring the integrity into question?
        ○ Issue reference, see bug #6372 thread:
https://www.postgresql.org/message-id/1149.1325535272%40sss.pgh.pa.us
    - Is there a workaround or configuration that we could use that maintains
using fsync to our current windows CIFS configuration?
    - Is there a better way to check integrity of backup rather than restoring
and performing a dump backup?
    - Recommended steps forward?


On Fri, Feb 15, 2019 at 10:15 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
>                                 pg_basebackup: could not fsync file
> "/cifs/backups/<backupDirectoryName>/basebkp/base/1": Invalid argument
>                                 pg_basebackup: could not fsync file

Hmm, it looks like your system gives EINVAL when you try to fsync a
directory.  Perhaps we should teach fsync__fname() about that here:

        /*
         * Some OSes don't allow us to fsync directories at all, so we
can ignore
         * those errors. Anything else needs to be reported.
         */
        if (returncode != 0 && !(isdir && errno == EBADF))
        {
                fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
                                progname, fname, strerror(errno));
                (void) close(fd);
                return -1;
        }

EINVAL actually makes more sense to me than EBADF for a filesystem
that can't fsync directories.  From POSIX: EINVAL = "The fildes
argument does not refer to a file on which this operation is
possible." vs EBADF "The fildes argument is not a valid descriptor."
It *is* a valid descriptor, it's just not a valid operation
(apparently).

Quick googling on the topic tells me that CIFS directory operations
are "synchronous", so fsync'ing isn't necessary.  However, they only
made it silently do nothing in a recent version:

https://github.com/torvalds/linux/commit/6e70c267e68d77679534dcf4aaf84e66f2cf1425

Presumably before that you get EINVAL because there is no handler
registered.  The commit message even mentions that this was breaking
stuff like us.

-- 
Thomas Munro
http://www.enterprisedb.com


+adding back to thread

On Thu, Feb 14, 2019 at 5:49 PM John Klann <jk7255@gmail.com> wrote:
Thanks Thomas that certainly makes sense considering the commit comment as well: "Directory operations under CIFS/SMB2/SMB3 are synchronous, so fsync()".

I think will probably not use the --no-sync option then and handle for the EINVAL messages.


On Thu, Feb 14, 2019 at 5:31 PM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
Hi John,
Unfortunately that applies only to directories, not to regular files
inside them.
--no-sync should get you past the problem as a workaround for now, but
then if your server(s) crash/lose power in the following seconds the
data might not be on disk.  Fsync just means that the program doesn't
return until the data is flushed, but within some short period of time
the data will be flushed anyway, and power failure in the short window
before that is really unlikely, it's more a theoretical issue that we
database hackers like to worry about.

On Fri, Feb 15, 2019 at 11:22 AM John Klann <jk7255@gmail.com> wrote:
>
> Ah thank you this all makes sense.
>
> If CIFS is synchronous and and fsync'ing is not necessary then running with the --no-sync option should be safe and possibly more performant correct?
>
> John
>
> On Thu, Feb 14, 2019 at 5:06 PM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>
>> On Fri, Feb 15, 2019 at 10:15 AM PG Bug reporting form
>> <noreply@postgresql.org> wrote:
>> >                                 pg_basebackup: could not fsync file
>> > "/cifs/backups/<backupDirectoryName>/basebkp/base/1": Invalid argument
>> >                                 pg_basebackup: could not fsync file
>>
>> Hmm, it looks like your system gives EINVAL when you try to fsync a
>> directory.  Perhaps we should teach fsync__fname() about that here:
>>
>>         /*
>>          * Some OSes don't allow us to fsync directories at all, so we
>> can ignore
>>          * those errors. Anything else needs to be reported.
>>          */
>>         if (returncode != 0 && !(isdir && errno == EBADF))
>>         {
>>                 fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
>>                                 progname, fname, strerror(errno));
>>                 (void) close(fd);
>>                 return -1;
>>         }
>>
>> EINVAL actually makes more sense to me than EBADF for a filesystem
>> that can't fsync directories.  From POSIX: EINVAL = "The fildes
>> argument does not refer to a file on which this operation is
>> possible." vs EBADF "The fildes argument is not a valid descriptor."
>> It *is* a valid descriptor, it's just not a valid operation
>> (apparently).
>>
>> Quick googling on the topic tells me that CIFS directory operations
>> are "synchronous", so fsync'ing isn't necessary.  However, they only
>> made it silently do nothing in a recent version:
>>
>> https://github.com/torvalds/linux/commit/6e70c267e68d77679534dcf4aaf84e66f2cf1425
>>
>> Presumably before that you get EINVAL because there is no handler
>> registered.  The commit message even mentions that this was breaking
>> stuff like us.
>>
>> --
>> Thomas Munro
>> http://www.enterprisedb.com



--
Thomas Munro
http://www.enterprisedb.com
On Fri, Feb 15, 2019 at 11:53 AM John Klann <jk7255@gmail.com> wrote:
>>> >> Hmm, it looks like your system gives EINVAL when you try to fsync a
>>> >> directory.  Perhaps we should teach fsync__fname() about that here:
>>> >>
>>> >>         /*
>>> >>          * Some OSes don't allow us to fsync directories at all, so we
>>> >> can ignore
>>> >>          * those errors. Anything else needs to be reported.
>>> >>          */
>>> >>         if (returncode != 0 && !(isdir && errno == EBADF))
>>> >>         {
>>> >>                 fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
>>> >>                                 progname, fname, strerror(errno));
>>> >>                 (void) close(fd);
>>> >>                 return -1;
>>> >>         }

Here is a patch like that.  Any chance you could test it?  Happy to
send instructions on how to do that if necessary.  Does anyone think
it'd be a bad idea to do this, and possibly even back-patch it?  We
have tolerated EBADF since 7d7db18a (which OS is that for?)

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment
Happy to test it, I will need the instructions if you can. It's for Rhel 7.5 Maipo latest. (Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018
x86_64 x86_64 x86_64 GNU/Linux)

Thanks.


On Thu, Feb 14, 2019, 23:11 Thomas Munro <thomas.munro@enterprisedb.com wrote:
On Fri, Feb 15, 2019 at 11:53 AM John Klann <jk7255@gmail.com> wrote:
>>> >> Hmm, it looks like your system gives EINVAL when you try to fsync a
>>> >> directory.  Perhaps we should teach fsync__fname() about that here:
>>> >>
>>> >>         /*
>>> >>          * Some OSes don't allow us to fsync directories at all, so we
>>> >> can ignore
>>> >>          * those errors. Anything else needs to be reported.
>>> >>          */
>>> >>         if (returncode != 0 && !(isdir && errno == EBADF))
>>> >>         {
>>> >>                 fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
>>> >>                                 progname, fname, strerror(errno));
>>> >>                 (void) close(fd);
>>> >>                 return -1;
>>> >>         }

Here is a patch like that.  Any chance you could test it?  Happy to
send instructions on how to do that if necessary.  Does anyone think
it'd be a bad idea to do this, and possibly even back-patch it?  We
have tolerated EBADF since 7d7db18a (which OS is that for?)

--
Thomas Munro
http://www.enterprisedb.com
On Fri, Feb 15, 2019 at 5:27 PM John Klann <jk7255@gmail.com> wrote:
> Happy to test it, I will need the instructions if you can. It's for Rhel 7.5 Maipo latest. (Linux
3.10.0-862.14.4.el7.x86_64#1 SMP Fri Sep 21 09:07:21 UTC 2018
 
> x86_64 x86_64 x86_64 GNU/Linux)

Thanks!  Use yum to install gcc, make, bison, yacc, git,
readline-devel, zlib-devel.  (Hope I'm not forgetting anything
here...)

Pull down the sources:

$ git clone https://github.com/postgres/postgres.git

Go into the source directory:

$ cd postgres

Let's say we want to test with the 11 branch (so you can test against
an existing 11.x development environment database you might have
installed the usual way with packages, as opposed to the master branch
where we are developing 12).  Switch to that branch:

$ git checkout REL_11_STABLE

Create a new branch for this experiment:

$ git checkout -b tolerate-einval

Apply the patch (adjust path to wherever you put the patch):

$ git am 0001-Tolerate-EINVAL-when-calling-fsync-on-a-directory.patch

Configure, build, install (the prefix is where it will install the
resulting binaries, adjust to taste, -j is how many CPUs you have if
you want to build fast):

$ mkdir $HOME/install
$ ./configure --prefix=$HOME/install

$ make -s -j4 && make -s install

Optionally, run a self-test to see if everything is sane:

$ make -s check

Now you should have binaries like $HOME/install/bin/pg_basebackup and
you can test if they work on CIFS.

-- 
Thomas Munro
http://www.enterprisedb.com


Great thank you I will let you know how is goes.

John

On Thu, Feb 14, 2019, 23:47 Thomas Munro <thomas.munro@enterprisedb.com wrote:
On Fri, Feb 15, 2019 at 5:27 PM John Klann <jk7255@gmail.com> wrote:
> Happy to test it, I will need the instructions if you can. It's for Rhel 7.5 Maipo latest. (Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018
> x86_64 x86_64 x86_64 GNU/Linux)

Thanks!  Use yum to install gcc, make, bison, yacc, git,
readline-devel, zlib-devel.  (Hope I'm not forgetting anything
here...)

Pull down the sources:

$ git clone https://github.com/postgres/postgres.git

Go into the source directory:

$ cd postgres

Let's say we want to test with the 11 branch (so you can test against
an existing 11.x development environment database you might have
installed the usual way with packages, as opposed to the master branch
where we are developing 12).  Switch to that branch:

$ git checkout REL_11_STABLE

Create a new branch for this experiment:

$ git checkout -b tolerate-einval

Apply the patch (adjust path to wherever you put the patch):

$ git am 0001-Tolerate-EINVAL-when-calling-fsync-on-a-directory.patch

Configure, build, install (the prefix is where it will install the
resulting binaries, adjust to taste, -j is how many CPUs you have if
you want to build fast):

$ mkdir $HOME/install
$ ./configure --prefix=$HOME/install

$ make -s -j4 && make -s install

Optionally, run a self-test to see if everything is sane:

$ make -s check

Now you should have binaries like $HOME/install/bin/pg_basebackup and
you can test if they work on CIFS.

--
Thomas Munro
http://www.enterprisedb.com
Hi Thomas,

I was able to test the patch without issue and it appears to have worked see below. Let me know if there is further testing I can do or logging you would like.

existing bin:
[postgres@server]$ /usr/pgsql-11/bin/pg_basebackup -D /cifs/backups/99testNoPatch -U backupuser -Ft -Z 1 -X fetch -p 5433                 
pg_basebackup: could not fsync file "/cifs/backups/99testNoPatch": Invalid argument
patched bin:
[postgres@server]$ /home/postgres/patchinstall/bin/pg_basebackup -D /cifs/backups/99testWithPatch -U backupuser -Ft -Z 1 -X fetch -p 5433 
[postgres@server]$ 

I also restored the from the patched backup no issues:
2019-02-15 10:29:12.426 EST, LOG, "ending log output to stderr",,"Future log output will go to log destination ""csvlog""."
2019-02-15 10:29:12.431 EST, LOG, "database system was interrupted; last known up at 2019-02-15 10:18:00 EST"
2019-02-15 10:29:12.605 EST, LOG, "starting point-in-time recovery to 2019-02-15 10:29:09-05"
2019-02-15 10:29:12.796 EST, LOG, "restored log file ""000000010000000000000008"" from archive"
2019-02-15 10:29:12.845 EST, LOG, "redo starts at 0/8000028"
2019-02-15 10:29:12.848 EST, LOG, "consistent recovery state reached at 0/80000F8"
2019-02-15 10:29:12.849 EST, LOG, "database system is ready to accept read only connections"
2019-02-15 10:29:13.013 EST, LOG, "restored log file ""000000010000000000000009"" from archive"
2019-02-15 10:29:13.039 EST, LOG, "redo done at 0/9000140"
2019-02-15 10:29:13.135 EST, LOG, "restored log file ""000000010000000000000009"" from archive"
2019-02-15 10:29:13.188 EST, LOG, "selected new timeline ID: 3"
2019-02-15 10:29:13.228 EST, LOG, "archive recovery complete"
2019-02-15 10:29:13.363 EST, LOG, "database system is ready to accept connections"

Thanks,

John


On Fri, Feb 15, 2019 at 12:53 AM John Klann <jk7255@gmail.com> wrote:
Great thank you I will let you know how is goes.

John

On Thu, Feb 14, 2019, 23:47 Thomas Munro <thomas.munro@enterprisedb.com wrote:
On Fri, Feb 15, 2019 at 5:27 PM John Klann <jk7255@gmail.com> wrote:
> Happy to test it, I will need the instructions if you can. It's for Rhel 7.5 Maipo latest. (Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018
> x86_64 x86_64 x86_64 GNU/Linux)

Thanks!  Use yum to install gcc, make, bison, yacc, git,
readline-devel, zlib-devel.  (Hope I'm not forgetting anything
here...)

Pull down the sources:

$ git clone https://github.com/postgres/postgres.git

Go into the source directory:

$ cd postgres

Let's say we want to test with the 11 branch (so you can test against
an existing 11.x development environment database you might have
installed the usual way with packages, as opposed to the master branch
where we are developing 12).  Switch to that branch:

$ git checkout REL_11_STABLE

Create a new branch for this experiment:

$ git checkout -b tolerate-einval

Apply the patch (adjust path to wherever you put the patch):

$ git am 0001-Tolerate-EINVAL-when-calling-fsync-on-a-directory.patch

Configure, build, install (the prefix is where it will install the
resulting binaries, adjust to taste, -j is how many CPUs you have if
you want to build fast):

$ mkdir $HOME/install
$ ./configure --prefix=$HOME/install

$ make -s -j4 && make -s install

Optionally, run a self-test to see if everything is sane:

$ make -s check

Now you should have binaries like $HOME/install/bin/pg_basebackup and
you can test if they work on CIFS.

--
Thomas Munro
http://www.enterprisedb.com
On Sat, Feb 16, 2019 at 4:39 AM John Klann <jk7255@gmail.com> wrote:
> I was able to test the patch without issue and it appears to have worked see below. Let me know if there is further
testingI can do or logging you would like.
 

Thanks.  So that leaves the question for the list: should we accept
this as a bug and consider back-patching it?  Our relationship with
fsync() is ... complicated at the moment.  Considering that we already
tolerated EBADF for directories on some other (forgotten?) operating
system, I think I would vote +1 for doing that ... if I didn't know
that they'd fixed this on the Linux side, and I see now that they've
back-patched that even to long term kernel 3.16:

https://cdn.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.16.60

You mentioned that you were on RHEL 7.5, which ships with a 3.10
kernel.  When you eventually move to RHEL 8 (4.x) or even install an
alternative newer kernel for 7.x, the problem will hopefully go away
by itself.  Considering that, the lack of other complaints, and the
availability of a workaround (--no-sync followed by "sync"), I'm not
so sure it's worth committing this.

PS I found an earlier discussion of this topic that didn't go
anywhere:  https://www.postgresql.org/message-id/flat/E1RhkjA-0005bq-B6%40wrigleys.postgresql.org

-- 
Thomas Munro
http://www.enterprisedb.com


On Mon, Feb 18, 2019 at 02:34:15PM +1300, Thomas Munro wrote:
> You mentioned that you were on RHEL 7.5, which ships with a 3.10
> kernel.  When you eventually move to RHEL 8 (4.x) or even install an
> alternative newer kernel for 7.x, the problem will hopefully go away
> by itself.  Considering that, the lack of other complaints, and the
> availability of a workaround (--no-sync followed by "sync"), I'm not
> so sure it's worth committing this.

If we get to ignore EINVAL, then we won't know that fsync has failed
even in cases where the caller has done an incorrect thing by using a
special file, which sounds like a bad idea for the core code as much
as any plugins calling that?
--
Michael

Attachment
On Mon, Feb 18, 2019 at 3:55 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Mon, Feb 18, 2019 at 02:34:15PM +1300, Thomas Munro wrote:
> > ...  Considering that, the lack of other complaints, and the
> > availability of a workaround (--no-sync followed by "sync"), I'm not
> > so sure it's worth committing this.
>
> If we get to ignore EINVAL, then we won't know that fsync has failed
> even in cases where the caller has done an incorrect thing by using a
> special file, which sounds like a bad idea for the core code as much
> as any plugins calling that?

Well the theory is that it's OK because it's only for directories (the
same case in which case we already tolerate EBADF (for an OS that
nobody documented so we don't really know if/when we can take it
out)).  But I agree with you that the patch does not spark joy.

-- 
Thomas Munro
http://www.enterprisedb.com


On Mon, Feb 18, 2019 at 04:10:58PM +1300, Thomas Munro wrote:
> Well the theory is that it's OK because it's only for directories (the
> same case in which case we already tolerate EBADF (for an OS that
> nobody documented so we don't really know if/when we can take it
> out)).  But I agree with you that the patch does not spark joy.

If I would vote for this patch, that would be -0.1, meaning that I am
not strongly opposed to it, but I don't see any strong point either to
do something about it as there are downsides.  When in doubt about a
back-patch, I use as basic rule to never back-patch so as we are sure
that users avoid any surprises and keep relying on the existing
design.
--
Michael

Attachment
Greetings,

* Thomas Munro (thomas.munro@enterprisedb.com) wrote:
> On Sat, Feb 16, 2019 at 4:39 AM John Klann <jk7255@gmail.com> wrote:
> > I was able to test the patch without issue and it appears to have worked see below. Let me know if there is further
testingI can do or logging you would like. 
>
> Thanks.  So that leaves the question for the list: should we accept
> this as a bug and consider back-patching it?  Our relationship with
> fsync() is ... complicated at the moment.

Yes, we should accept this as a bug and back-patch it, imv.

> Considering that we already
> tolerated EBADF for directories on some other (forgotten?) operating
> system, I think I would vote +1 for doing that ... if I didn't know
> that they'd fixed this on the Linux side, and I see now that they've
> back-patched that even to long term kernel 3.16:
>
> https://cdn.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.16.60

Which seems like it's just another good reason why we should also
back-patch it.

> PS I found an earlier discussion of this topic that didn't go
> anywhere:  https://www.postgresql.org/message-id/flat/E1RhkjA-0005bq-B6%40wrigleys.postgresql.org

As does this.

I would be much more concerned about a change like this if we either
didn't know it was a directory that we got fsync() EINVAL for.  Since we
know it's a directory and we know there's filesystems out there which do
synchronous metadata changes and we know they return EINVAL in those
cases, I'm inclined to say we should allow it.  In the above thread, Tom
mentioned the standard, which pretty clearly distinguishes between "this
thing you gave to fsync() doesn't support synchronization" and "an I/O
error occured" and it's really the latter that we care about here.

Further, saying "just use --no-sync" is *not* a workaround.  That's
throwing the baby out with the bathwater.  A workaround would be an
option like "--do-not-complain-about-directory-fsync-einval", but
suggesting that all fsync's be removed is terrible.

Thanks!

Stephen

Attachment
Greetings,

* PG Bug reporting form (noreply@postgresql.org) wrote:
>                 archive_mode = 'on'
>                 archive_command = 'test ! -f /pgxlog1/data1/%f || cp /pgxlog1/data1/%f
> /cifs/backups/<backupDirectoryName>/dmp/archive/%f'

I realize you were complaining about pg_basebackup here, but since you
shared your config, I wanted to point out that the above is an
absolutely *terrible* archive_command, *especially* when you've got a
network filesystem involved.

There's no guarantee with that archive command that the WAL makes it to
the backup server or that it's completely written out on the backup
server before the PG server removes the WAL- if something happened (a
network hiccup, the CIFS server being rebooted, or the PG server
restarting) there's a real possibility that you'd lose some portion of
your WAL, which could make a backup invalid, or could make it so you
can't do PITR.

Please don't roll your own backup solution like this.

Thanks!

Stephen

Attachment
Thanks Stephen for pointing this out.

John

On Mon, Feb 18, 2019 at 11:26 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* PG Bug reporting form (noreply@postgresql.org) wrote:
>                               archive_mode = 'on'
>                               archive_command = 'test ! -f /pgxlog1/data1/%f || cp /pgxlog1/data1/%f
> /cifs/backups/<backupDirectoryName>/dmp/archive/%f'

I realize you were complaining about pg_basebackup here, but since you
shared your config, I wanted to point out that the above is an
absolutely *terrible* archive_command, *especially* when you've got a
network filesystem involved.

There's no guarantee with that archive command that the WAL makes it to
the backup server or that it's completely written out on the backup
server before the PG server removes the WAL- if something happened (a
network hiccup, the CIFS server being rebooted, or the PG server
restarting) there's a real possibility that you'd lose some portion of
your WAL, which could make a backup invalid, or could make it so you
can't do PITR.

Please don't roll your own backup solution like this.

Thanks!

Stephen
Hi,

So, for this patch tolerating EINVAL for fsync of directories (but not
files), enabling eg pg_backbackup on CIFS on eg Linux 3.10 to work,
tested by John, we have:

Stephen: +1
Michael: -0.1
Thomas: -0.1

Any other opinions?

Just in case you think it's strange that I'm voting against my own
patch: I'd probably be for it if I hadn't discovered that they've
fixed this in Linux 3.16 and later so that it succeeds.  It's
apparently not in the default RHEL 6 and 7 kernels, though, and the
latter could be around for a while.  I'm not entirely sure what amount
of work we should be doing to tolerate problems that are fixed in a
newer versions.  One argument is that a 3.10 user who cares about this
should petition RH to back-port the fix into that kernel.  (The nearby
WSL thread has some things in common but is a more clear cut case IMV
because there is no fix available on the WSL side.)

-- 
Thomas Munro
https://enterprisedb.com


Hi,

On 2019-02-23 11:20:02 +1300, Thomas Munro wrote:
> So, for this patch tolerating EINVAL for fsync of directories (but not
> files), enabling eg pg_backbackup on CIFS on eg Linux 3.10 to work,
> tested by John, we have:
> 
> Stephen: +1
> Michael: -0.1
> Thomas: -0.1
> 
> Any other opinions?
> 
> Just in case you think it's strange that I'm voting against my own
> patch: I'd probably be for it if I hadn't discovered that they've
> fixed this in Linux 3.16 and later so that it succeeds.  It's
> apparently not in the default RHEL 6 and 7 kernels, though, and the
> latter could be around for a while.  I'm not entirely sure what amount
> of work we should be doing to tolerate problems that are fixed in a
> newer versions.  One argument is that a 3.10 user who cares about this
> should petition RH to back-port the fix into that kernel.  (The nearby
> WSL thread has some things in common but is a more clear cut case IMV
> because there is no fix available on the WSL side.)

Hm, given the fact that EINVAL doesn't appear to be triggerable by data
level issues, I don't see much reason not to allow it. I mean it's
really stupid that cifs ever returns it, but there's going to be a lot
of users running things on older kernel for the forseeable future.

Greetings,

Andres Freund


On Sat, Feb 23, 2019 at 11:28 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-02-23 11:20:02 +1300, Thomas Munro wrote:
> > So, for this patch tolerating EINVAL for fsync of directories (but not
> > files), enabling eg pg_backbackup on CIFS on eg Linux 3.10 to work,
> > tested by John, we have:
> >
> > Stephen: +1
> > Michael: -0.1
> > Thomas: -0.1
> >
> > Any other opinions?
> >
> > Just in case you think it's strange that I'm voting against my own
> > patch: I'd probably be for it if I hadn't discovered that they've
> > fixed this in Linux 3.16 and later so that it succeeds.  It's
> > apparently not in the default RHEL 6 and 7 kernels, though, and the
> > latter could be around for a while.  I'm not entirely sure what amount
> > of work we should be doing to tolerate problems that are fixed in a
> > newer versions.  One argument is that a 3.10 user who cares about this
> > should petition RH to back-port the fix into that kernel.  (The nearby
> > WSL thread has some things in common but is a more clear cut case IMV
> > because there is no fix available on the WSL side.)
>
> Hm, given the fact that EINVAL doesn't appear to be triggerable by data
> level issues, I don't see much reason not to allow it. I mean it's
> really stupid that cifs ever returns it, but there's going to be a lot
> of users running things on older kernel for the forseeable future.

Thanks.  Pushed.

FTR here is the discussion about the pre-existing tolerance of EBADF
for directories:

https://www.postgresql.org/message-id/flat/20100215005057.5AC8F7541C5%40cvs.postgresql.org

-- 
Thomas Munro
https://enterprisedb.com