Thread: pg_upgrade cannot create btrfs clones on linux kernel 6.8.0

pg_upgrade cannot create btrfs clones on linux kernel 6.8.0

From
Michael Misiewicz
Date:
Hello all, 

I am attempting to upgrade a PostgreSQL cluster from 16 to 17. I'm running Ubuntu 24.04 on x86_64, with PostgreSQL installed from the PostgreSQL apt repositories.

I'm invoking pg_upgrade like so:
/usr/lib/postgresql/17/bin/pg_upgrade --clone \
  -d /var/lib/postgresql/16/main \
  -b /usr/lib/postgresql/16/bin \
  -B /usr/lib/postgresql/17/bin \
  -D /var/lib/postgresql/17/main \
  -o '-c config_file=/etc/postgresql/16/main/postgresql.conf' \
  -O '-c config_file=/etc/postgresql/17/main/postgresql.conf'

and it fails with:
could not clone file between old and new data directories: Invalid argument
Failure, exiting

I ran the following strace command to find out where the problem is happening:
sudo -u postgres strace -f -e trace=copy_file_range,clone,ioctl,%file -s 2000 -v \
  /usr/lib/postgresql/17/bin/pg_upgrade --clone \
  -d /var/lib/postgresql/16/main \
  -b /usr/lib/postgresql/16/bin \
  -B /usr/lib/postgresql/17/bin \
  -D /var/lib/postgresql/17/main \
  -o '-c config_file=/etc/postgresql/16/main/postgresql.conf' \
  -O '-c config_file=/etc/postgresql/17/main/postgresql.conf' \
  2>/tmp/pg_strace.log

And grepping through the log I found what looks like the problem here:
--
[pid 1153372] openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 8
[pid 1153372] openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libpq.so.5", O_RDONLY|O_CLOEXEC) = 8
[pid 1153261] unlink("/var/lib/postgresql/17/main/PG_VERSION.clonetest") = 0
[pid 1153261] openat(AT_FDCWD, "/var/lib/postgresql/16/main/PG_VERSION", O_RDONLY) = 4
[pid 1153261] openat(AT_FDCWD, "/var/lib/postgresql/17/main/PG_VERSION.clonetest", O_RDWR|O_CREAT|O_EXCL, 0600) = 5
[pid 1153261] ioctl(5, BTRFS_IOC_CLONE or FICLONE, 4) = -1 EINVAL (Invalid argument)

I believe this can be traced back to commit 3a769d8239afdc003c91a56d2d8d5adfadacda5d.
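
The failing step in the trace (open the old PG_VERSION read-only, create a new file with O_EXCL, issue FICLONE on it) can be tried outside pg_upgrade. The following is a minimal Python sketch of that sequence, not pg_upgrade's actual code; the function name try_clone is made up for illustration, and the FICLONE value assumes the usual Linux ioctl encoding from <linux/fs.h>.

```python
import errno
import fcntl
import os

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int). Linux-only assumption.
FICLONE = 0x40049409


def try_clone(src_path, dst_path):
    """Try to reflink src_path into a newly created dst_path.

    Returns (True, None) on success, or (False, errno_name) if the
    ioctl fails. On failure the half-created destination is removed.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        # O_EXCL mirrors the trace: the destination must not exist yet.
        dst_fd = os.open(dst_path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    except OSError:
        os.close(src_fd)
        raise
    try:
        # Same call shape as in the strace output: ioctl(dst, FICLONE, src).
        fcntl.ioctl(dst_fd, FICLONE, src_fd)
    except OSError as exc:
        os.unlink(dst_path)
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(dst_fd)
        os.close(src_fd)
    return True, None
```

Run as the postgres user with the two paths from the trace (the old PG_VERSION as the source and a scratch name in the new data directory as the destination), a result of (False, 'EINVAL') would match what pg_upgrade reports, while running it against other source and destination directories on the same filesystem may help narrow down which side triggers the error.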

I have validated that my filesystem supports reflinks:
$  dd of=tempfile if=/dev/random bs=1M count=24
24+0 records in
24+0 records out
25165824 bytes (25 MB, 24 MiB) copied, 0.0806355 s, 312 MB/s
$  cp --reflink=always tempfile tempfile.reflink
$  btrfs filesystem du -s tempfile tempfile.reflink
     Total   Exclusive  Set shared  Filename
  24.00MiB       0.00B    24.00MiB  tempfile
  24.00MiB       0.00B    24.00MiB  tempfile.reflink

And there's nothing interesting in `dmesg`. 

I ran this same command today on a macOS system (using APFS) and it worked great. I have no idea how to fix this problem and I'm curious if anyone has any pointers.

Thanks,
Michael

Re: pg_upgrade cannot create btrfs clones on linux kernel 6.8.0

From
Tomas Vondra
Date:
On 12/30/24 01:36, Michael Misiewicz wrote:
> ...
> I have validated that my filesystem supports reflinks:
> 
> $  dd of=tempfile if=/dev/random bs=1M count=24
> 24+0 records in
> 24+0 records out
> 25165824 bytes (25 MB, 24 MiB) copied, 0.0806355 s, 312 MB/s
> $  cp --reflink=always tempfile tempfile.reflink
> $  btrfs filesystem du -s tempfile tempfile.reflink
>      Total   Exclusive  Set shared  Filename
>   24.00MiB       0.00B    24.00MiB  tempfile
>   24.00MiB       0.00B    24.00MiB  tempfile.reflink
> 
> 
> And there's nothing interesting in `dmesg`. 
> 
> I ran this same command today on a macOS system (using apfs) and it
> worked great. I have no idea how to fix this problem and I'm curious if
> anyone has any pointers. 
> 

What does lsattr say about the source files?

$ lsattr /var/lib/postgresql/16/main/PG_VERSION

Chances are the "C" attribute is set, i.e. NOCOW. In that case I get
exactly the same failure:

  could not clone file between old and new data directories: \
  Invalid argument


regards

-- 
Tomas Vondra




Re: pg_upgrade cannot create btrfs clones on linux kernel 6.8.0

From
Michael Misiewicz
Date:
Tomas, 

Thanks for the great suggestion. That does seem like a likely cause, but it is not the case here. (I was thinking that if the bit were set, it might indicate an issue with the Debian packages.)


$  sudo lsattr /var/lib/postgresql/16/main/PG_VERSION
---------------------- /var/lib/postgresql/16/main/PG_VERSION


Michael

On December 29, 2024, Tomas Vondra <tomas@vondra.me> wrote:
> On 12/30/24 01:36, Michael Misiewicz wrote:
> > ...
> > I have validated that my filesystem supports reflinks:
> >
> > $ dd of=tempfile if=/dev/random bs=1M count=24
> > 24+0 records in
> > 24+0 records out
> > 25165824 bytes (25 MB, 24 MiB) copied, 0.0806355 s, 312 MB/s
> > $ cp --reflink=always tempfile tempfile.reflink
> > $ btrfs filesystem du -s tempfile tempfile.reflink
> >      Total   Exclusive  Set shared  Filename
> >   24.00MiB       0.00B    24.00MiB  tempfile
> >   24.00MiB       0.00B    24.00MiB  tempfile.reflink
> >
> > And there's nothing interesting in `dmesg`.
> >
> > I ran this same command today on a macOS system (using apfs) and it
> > worked great. I have no idea how to fix this problem and I'm curious if
> > anyone has any pointers.
>
> What does lsattr say about the source files?
>
> $ lsattr /var/lib/postgresql/16/main/PG_VERSION
>
> Chances are there is "C" attribute set, i.e. NOCOW. In that case I get
> exactly the same failure :
>
>  could not clone file between old and new data directories: \
>  Invalid argument
>
> regards
>
> --
> Tomas Vondra