Thread: [BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there
[BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there
From
Raphael Hertzog
Date:
[ 2nd try after subscription to pgsql-bugs ]

Hello,

PostgreSQL 10 no longer works on a (Kali) live system where the root filesystem is an overlayfs with an underlying squashfs filesystem (where postgresql and its initial file structure is present) and a writable tmpfs overlay.

When you try to create a new database you get this failure:

    createdb: database creation failed: ERROR: checkpoint request failed
    HINT: Consult recent messages in the server log for details.

And in the server log you have this:

    ERROR: could not fsync file "pg_commit_ts": Invalid argument

When you strace the postgresql checkpointer process you see this:

    # strace -f -p 31599
    select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout)
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    open("pg_xact", O_RDONLY) = 3
    fsync(3) = 0
    close(3) = 0
    open("pg_commit_ts", O_RDONLY) = 3
    fsync(3) = -1 EINVAL (Invalid argument)
    close(3) = 0
    rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0
    write(2, "2017-11-07 09:47:38.580 UTC [315"..., 98) = 98

The reason why the second fsync() fails is because the pg_commit_ts directory has not had any change since its creation in the initial image. It is thus stored in the read-only squashfs filesystem and has not yet been copied up in the writable tmpfs (which does support fsync). In this case, overlayfs delegates the fsync() call to the read-only squashfs filesystem which returns EINVAL as it does not support such an operation.

This has been explained by the overlayfs upstream developer (to which I reported this bug initially, thinking it was an overlayfs regression):
https://marc.info/?l=linux-unionfs&m=151005246512873&w=2
https://marc.info/?l=linux-unionfs&m=151005699414227&w=2

My request is thus that PostgreSQL should fsync that directory only after it has made changes to the directory or its content. PostgreSQL 9.6 was working fine in the same setup and I would like PostgreSQL 10 to do the same. :)

I'm ccing Teodor Sigaev <teodor@sigaev.ru> because I believe that the problematic fsync() has been added by him in this commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1b02be21f271db6bd3cd43abb23fa596fcb6bac3

Cheers,
--
Raphaël Hertzog ◈ Writer/Consultant ◈ Debian Developer

Discover the Debian Administrator's Handbook:
→ https://debian-handbook.info/get/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
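To make the request above concrete, here is a minimal C sketch of "fsync a directory only after something was written to it". It is not PostgreSQL code: `struct tracked_dir`, `fsync_dir()` and `sync_if_dirty()` are hypothetical names invented for illustration, and the thread does not say how PostgreSQL would track such changes. `fsync_dir()` only repeats the open/fsync/close sequence visible in the strace output.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* make fsync() visible under strict -std= modes */
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one directory; not a PostgreSQL structure. */
struct tracked_dir {
    const char *path;
    bool dirty;  /* set by the caller whenever it writes inside the directory */
};

/*
 * Same syscall sequence as in the strace: open(O_RDONLY), fsync(), close().
 * Returns 0 on success, -1 on failure with errno describing the failure.
 */
static int fsync_dir(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = fsync(fd);
    int saved_errno = errno;  /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}

/*
 * Skip the fsync entirely when nothing was written since the last sync,
 * so an untouched directory is never handed to the lower filesystem.
 */
static int sync_if_dirty(struct tracked_dir *dir)
{
    if (!dir->dirty)
        return 0;
    if (fsync_dir(dir->path) != 0)
        return -1;
    dir->dirty = false;
    return 0;
}
```

With this shape, a directory that was never modified is skipped at checkpoint time, while a modified one (which overlayfs would already have copied up to the writable layer) is synced as before. The cost is that every code path writing into the directory has to remember to set the flag.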
Re: [BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there
From
Alvaro Herrera
Date:
Raphael Hertzog wrote:

> PostgreSQL 10 no longer works on a (Kali) live system where the
> root filesystem is an overlayfs with an underlying squashfs
> filesystem (where postgresql and its initial file structure
> is present) and a writable tmpfs overlay.

Please create a machine that works this way and get it added to the buildfarm, so that this sort of thing doesn't surprise us in the future months after the fact.

https://wiki.postgresql.org/wiki/Buildfarm

Thanks

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there
From
Michael Paquier
Date:
On Tue, Nov 7, 2017 at 10:54 PM, Raphael Hertzog <hertzog@debian.org> wrote:
> This has been explained by the overlayfs upstream developer
> (to which I reported this bug initially, thinking it was an
> overlayfs regression):
> https://marc.info/?l=linux-unionfs&m=151005246512873&w=2
> https://marc.info/?l=linux-unionfs&m=151005699414227&w=2
>
> My request is thus that PostgreSQL should fsync that directory only after
> it has made changes to the directory or its content. PostgreSQL 9.6 was
> working fine in the same setup and I would like PostgreSQL 10 to do the
> same. :)
>
> I'm ccing Teodor Sigaev <teodor@sigaev.ru> because I believe that
> the problematic fsync() has been added by him in this commit:
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1b02be21f271db6bd3cd43abb23fa596fcb6bac3

Hm. I am wondering if we should change fsync_fname_ext() so that EINVAL is considered a no-op. EIO and EINTR should really be caught with a proper error, but I am not sure about this one. Thoughts?

--
Michael
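The idea floated above can be sketched in C as follows. This is not the body of fsync_fname_ext() and makes no claim about how PostgreSQL implements it: `fsync_errno_is_ignorable()` and `fsync_path_tolerant()` are hypothetical helpers, and the policy (ignore EINVAL, report everything else, including EIO and EINTR) is only one reading of the proposal.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* make fsync() visible under strict -std= modes */
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical policy: EINVAL is read as "fsync is not supported here" and
 * ignored; every other errno (EIO, EINTR, ...) remains a real error.
 */
static bool fsync_errno_is_ignorable(int err)
{
    return err == EINVAL;
}

/*
 * Open the path read-only and fsync it.
 * Returns 0 if the sync succeeded or failed with an ignorable errno,
 * -1 (after printing a message to stderr) for any other failure.
 */
static int fsync_path_tolerant(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        fprintf(stderr, "could not open \"%s\": %s\n", path, strerror(errno));
        return -1;
    }

    int rc = 0;
    if (fsync(fd) != 0 && !fsync_errno_is_ignorable(errno)) {
        fprintf(stderr, "could not fsync file \"%s\": %s\n", path, strerror(errno));
        rc = -1;
    }

    close(fd);
    return rc;
}
```

The trade-off is that such a policy cannot tell the harmless case reported here (an untouched directory on a read-only lower layer) from any other situation in which a filesystem returns EINVAL, so an unexpected failure could be silently hidden.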
Re: [BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there
From
Stephen Frost
Date:
Alvaro, Raphael,

* Alvaro Herrera (alvherre@alvh.no-ip.org) wrote:
> Raphael Hertzog wrote:
>
> > PostgreSQL 10 no longer works on a (Kali) live system where the
> > root filesystem is an overlayfs with an underlying squashfs
> > filesystem (where postgresql and its initial file structure
> > is present) and a writable tmpfs overlay.
>
> Please create a machine that works this way and get it added to the
> buildfarm, so that this sort of thing doesn't surprise us in the future
> months after the fact.

While I agree with this, I'm not entirely convinced that this isn't an issue with the implementation of the underlying filesystem after all. I haven't had a chance to go read those other bug reports, but my fsync() manpage pretty clearly seems to say that fsync should only be returning EINVAL if it's called on a special file (FIFO, pipe, et al). There's certainly no indication that it's ok for the same file to sometimes support fsync() and other times *not* support fsync(). That's pretty bizarre. Why wouldn't it make sense for the filesystem to realize it's a no-op if there's been no changes?

Thanks!

Stephen
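The documented case mentioned above, fsync() on a special file, can be observed with a small C sketch. `fsync_errno_on_pipe()` is a hypothetical helper written for this illustration. On Linux the call is expected to fail with EINVAL; other systems may report a different error, so the code returns whatever errno it sees and does not assume a value.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* make fsync() visible under strict -std= modes */
#endif

#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Call fsync() on the write end of a freshly created pipe.
 * Returns the errno reported by fsync(), 0 if fsync() unexpectedly
 * succeeded, or -1 if the pipe could not be created.
 */
static int fsync_errno_on_pipe(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;
    assert(fds[0] >= 0 && fds[1] >= 0);

    int result = 0;
    if (fsync(fds[1]) != 0)
        result = errno;  /* read errno before close() can change it */

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

A pipe fails the same way on every call, which is the behaviour the manpage describes. The report in this thread differs: the same directory rejects fsync() before its first modification and, per the overlayfs explanation quoted earlier, accepts it once copied up to the tmpfs layer.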