On Thu, Jul 15, 2021 at 09:35:52PM +0900, Michael Paquier wrote:
> For this one, I'll try to test harder on my own host. I am curious to
> see if the other Windows members running the TAP tests have anything
> to say. Looking at the code of zlib, this would come from gz_zero()
> in gzflush(), which could blow up on a write() in gz_comp().
bowerbird has just failed for the second time in a row on EACCES, so
there is more here than meets the eye. Looking at the code, I think I
have spotted what it is and the buildfarm logs give a very good hint:
# Running: pg_receivewal -D
:/prog/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal
--verbose --endpos 0/3000028 --compress 1
pg_receivewal: starting log streaming at 0/2000000 (timeline 1)
pg_receivewal: fatal: could not fsync existing write-ahead log file
"000000010000000000000002.partial": Permission denied
not ok 20 - streaming some WAL using ZLIB compression
--compress is used, yet the sync fails on a non-compressed segment.
Looking at the code, open_walfile() gets confused when handling an
existing .partial segment, while walmethods.c relies on
dir_data->compression in all the places that matter. So that's a
legit bug, one that happens only when successive pg_receivewal runs
mix the compression and non-compression modes.
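To illustrate the mismatch, here is a minimal standalone sketch. It is
not the actual receivelog.c or walmethods.c code: partial_name() and
leftover_matches_mode() are hypothetical helpers, and the sketch assumes
that a compressed run names its in-progress segment with a ".gz.partial"
suffix while a non-compressed run uses plain ".partial".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the on-disk name of an in-progress segment.
 * Assumption: compression > 0 adds ".gz" before the ".partial" suffix.
 */
static void
partial_name(char *dst, size_t dstlen, const char *segname, int compression)
{
	assert(dstlen > 0);
	snprintf(dst, dstlen, "%s%s.partial", segname,
			 compression > 0 ? ".gz" : "");
}

/*
 * Hypothetical check: does a file left behind by an earlier run have the
 * name that the current run, with its own compression setting, would use?
 */
static bool
leftover_matches_mode(const char *leftover, const char *segname,
					  int compression)
{
	char		expected[128];

	partial_name(expected, sizeof(expected), segname, compression);
	return strcmp(leftover, expected) == 0;
}
```

With the segment from the log above,
leftover_matches_mode("000000010000000000000002.partial",
"000000010000000000000002", 1) returns false. That is the mixed-mode
case: the file on disk comes from a non-compressed run, while the
current run is configured for compression.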
I am amazed that the other buildfarm members are not complaining, to
be honest. jacana runs this TAP test with MinGW and ZLIB, and does
not complain.
--
Michael