[BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there - Mailing list pgsql-bugs

From Raphael Hertzog
Subject [BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there
Date
Msg-id 20171107135454.lbelbbvfgadljmuj@home.ouaza.com
Responses Re: [BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: [BUGS] Fails to work on live images due to fsync() on pg_commit_ts before doing any write there  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-bugs
[ 2nd try after subscription to pgsql-bugs ]

Hello,

PostgreSQL 10 no longer works on a (Kali) live system where the
root filesystem is an overlayfs with an underlying squashfs
filesystem (where postgresql and its initial file structure
are present) and a writable tmpfs overlay.

When you try to create a new database you get this failure:
createdb: database creation failed: ERROR: checkpoint request failed
HINT: Consult recent messages in the server log for details.

And in the server log you have this:
ERROR: could not fsync file "pg_commit_ts": Invalid argument

When you strace the postgresql checkpointer process you see
this:
# strace -f -p 31599
select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout)
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
open("pg_xact", O_RDONLY)               = 3
fsync(3)                                = 0
close(3)                                = 0
open("pg_commit_ts", O_RDONLY)          = 3
fsync(3)                                = -1 EINVAL (Invalid argument)
close(3)                                = 0
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0
write(2, "2017-11-07 09:47:38.580 UTC [315"..., 98) = 98

The second fsync() fails because the pg_commit_ts directory has
not changed since it was created in the initial image. It is thus
still stored in the read-only squashfs filesystem and has not yet
been copied up to the writable tmpfs (which does support fsync).
In this case, overlayfs delegates the fsync() call to the read-only
squashfs filesystem, which returns EINVAL as it does not support
such an operation.
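
For anyone who wants to check a given directory without attaching strace,
here is a minimal sketch that repeats the open()/fsync()/close() sequence
seen in the strace output above. The helper name fsync_path() is my own
invention for illustration and is not a PostgreSQL function; it only
reports which errno, if any, the kernel hands back.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): open `path` read-only and
 * fsync() it, mirroring the sequence issued by the checkpointer.
 * Returns 0 on success, or the errno value of the call that failed.
 */
int fsync_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;           /* e.g. ENOENT if the path does not exist */

    int err = 0;
    if (fsync(fd) != 0)
        err = errno;            /* EINVAL in the case reported here */

    close(fd);
    return err;
}
```

Called as fsync_path("pg_commit_ts") from inside the data directory, this
should return EINVAL on the setup described above as long as the directory
has not been copied up, and 0 once it lives in the writable layer.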

This has been explained by the overlayfs upstream developer
(to whom I initially reported this bug, thinking it was an
overlayfs regression):
https://marc.info/?l=linux-unionfs&m=151005246512873&w=2
https://marc.info/?l=linux-unionfs&m=151005699414227&w=2

My request is thus that PostgreSQL should fsync that directory only after
it has made changes to the directory or its content. PostgreSQL 9.6 was
working fine in the same setup and I would like PostgreSQL 10 to do the
same. :)
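
To make the request concrete, here is a rough sketch of the behaviour I have
in mind: remember whether anything was written under a directory, and skip
the fsync() at checkpoint time when nothing was. This is only an
illustration under my own assumptions. The struct tracked_dir type and both
functions are hypothetical names, not PostgreSQL code, and the real fix may
well look different.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one directory (not a PostgreSQL structure). */
struct tracked_dir {
    const char *path;   /* directory to sync, e.g. "pg_commit_ts" */
    bool dirty;         /* true once something was written under it */
};

/* To be called whenever a file in the directory is created or modified. */
void tracked_dir_mark_dirty(struct tracked_dir *d)
{
    d->dirty = true;
}

/*
 * fsync() the directory only if it was modified since the last successful
 * sync. Returns 0 on success or when there is nothing to do, -1 on error
 * (errno is set by the failing open() or fsync()).
 */
int tracked_dir_checkpoint(struct tracked_dir *d)
{
    if (!d->dirty)
        return 0;               /* untouched: no syscall is issued at all */

    int fd = open(d->path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = fsync(fd);
    int saved_errno = errno;    /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;

    if (rc == 0)
        d->dirty = false;       /* clean again until the next write */
    return rc;
}
```

With such a scheme an untouched pg_commit_ts would never be fsync()ed while
it still sits on squashfs, and the first write would copy it up to tmpfs
before any fsync() is attempted.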

I'm ccing Teodor Sigaev <teodor@sigaev.ru> because I believe that
the problematic fsync() has been added by him in this commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1b02be21f271db6bd3cd43abb23fa596fcb6bac3

Cheers,
-- 
Raphaël Hertzog ◈ Writer/Consultant ◈ Debian Developer

Discover the Debian Administrator's Handbook:
→ https://debian-handbook.info/get/



