Thread: BUG #18971: Server passes an invalid (indirect) path in PGDATA to the external program
BUG #18971: Server passes an invalid (indirect) path in PGDATA to the external program
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18971 Logged by: Dmitry Kovalenko Email address: d.kovalenko@postgrespro.ru PostgreSQL version: 18beta1 Operating system: Ubuntu 2024.04 Description: Hello, PG version is 18beta1 (bbccf7ec) OS is Ubunty 24.04 ---------- SHORT DESCRIPTION Server passes an invalid (indirect) path in PGDATA to the external program. Server must provide to the external programs a right, absolute path in PGDATA. ---------- FULL DESCRIPTION / STEPS 1) Our SPECIAL test starts PG with an indirect path to the database cluster files: CWD is: "/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01" COMMAND is: pg_ctl start -D "../../../../../../../../../../../..//home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/restored/data" -l "../../../../../../../../../../../..//home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/restored/logs/postgres.log" postgresql.auto.conf of this cluster has the following lines: ------ [begin of postgresql.auto.conf] archive_mode = 'on' archive_command = 'exec "/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/builddir/src/pg_probackup3" archive-push -B /home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/backup --instance=node --compress-algorithm=zlib --log-level-console=trace --no-sync --wal-file-path=%p --wal-file-name=%f' recovery_target_timeline = 'current' recovery_target_action = 'pause' restore_command = '"/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/builddir/src/pg_probackup3" archive-get -B "/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/backup" --instance "node" --wal-file-path=%p --wal-file-name=%f' ------ [/end of postgresql.auto.conf] 2) pc_ctl starts the postmaster with this PWD: "/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01" So, the path to the cluster files in "-D" is valid and is: "/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/restored/data" 3) During start Postgres runs the external program pg_probackup3 with the following commandline (I got it with "ps -eo args | grep probackup"): sh -c -- "/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/builddir/src/pg_probackup3" archive-get -B "/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/backup" --instance "node" --wal-file-path=pg_wal/RECOVERYXLOG --wal-file-name=000000010000000000000002 This command runs the next process: /home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/builddir/src/pg_probackup3 archive-get -B /home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/backup --instance node --wal-file-path=pg_wal/RECOVERYXLOG --wal-file-name=000000010000000000000002 4) pg_probackup3 process has the following environment variables: PWD="/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/restored/data" PGDATA="../../../../../../../../../../../../home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/restored/data" ---------- PROBLEM PGDATA (4) is invalid for pg_probackup3. Because PWD of probackup (4) != PWD of server (2). When probackup is trying to use this PGDATA, it fails. Server must provide to the external programs a right, absolute path in PGDATA. Thanks&Regards, Dmitry Kovalenko, Postgres Professional, Russia
Re: BUG #18971: Server passes an invalid (indirect) path in PGDATA to the external program
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > 4) pg_probackup3 process has the following environment variables: > PWD="/home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/restored/data" > PGDATA="../../../../../../../../../../../../home/dima/MY/Work/CurrentTasks/2025/PBCKP-2289/_Iter003/001--Vanilla-master/Probackup3/work01/tests/tmp_dirs/ArchiveTest/test_archive_get_relative_path/restored/data" > ---------- PROBLEM > PGDATA (4) is invalid for pg_probackup3. I do not think this is a Postgres bug. The PGDATA environment variable is not canonical, it is just the default to be used if you don't specify a -D switch on the postmaster/pg_ctl command line. In fact, it might not be set at all. (I see that pg_ctl does set it, but pg_ctl is not the only way to start the postmaster.) Therefore, relying on PGDATA in a restore_command script isn't safe. You should be relying on the current working directory (PWD) instead. regards, tom lane
Re: BUG #18971: Server passes an invalid (indirect) path in PGDATA to the external program
From
Dmitry Kovalenko
Date:
Hello Tom, > I do not think this is a Postgres bug. The PGDATA environment > variable is not canonical, it is just the default to be used if > you don't specify a -D switch on the postmaster/pg_ctl command line. > In fact, it might not be set at all. (I see that pg_ctl does set > it, but pg_ctl is not the only way to start the postmaster.) > Therefore, relying on PGDATA in a restore_command script isn't safe. > You should be relying on the current working directory (PWD) instead. A test executes pg_ctl with the correct path to cluster and the correct CWD. pg_ctl (and so on) executes an external program with inconsistent PGDATA and CWD. This is clearly a problem on your side ) I wonder that it was not detected earlier. For my point of view, if you can't provider a valid path in PGDATA, do not do it at all. With Best Regards, Dmitry Kovalenko
Re: BUG #18971: Server passes an invalid (indirect) path in PGDATA to the external program
From
"David G. Johnston"
Date:
On Sat, Jun 28, 2025 at 1:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dmitry Kovalenko <d.kovalenko@postgrespro.ru> writes:
> For my point of view, if you can't provider a valid path in PGDATA, do
> not do it at all.
I'm not sure which part of "don't rely on PGDATA" you didn't get.
Under normal circumstances, Postgres itself doesn't set that at all.
If we have to tell someone to not rely on a value that we've set we've done something wrong. Either haven't explained how to use it correctly, we've set it in unusual circumstances when we should not have, or we are choosing a bad value to set it in the first place. Setting it to an absolute path would seem like the least problematic approach to take to make our system more usable.
David J.
Re: BUG #18971: Server passes an invalid (indirect) path in PGDATA to the external program
From
Laurenz Albe
Date:
On Sat, 2025-06-28 at 14:12 -0700, David G. Johnston wrote: > On Sat, Jun 28, 2025 at 1:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I'm not sure which part of "don't rely on PGDATA" you didn't get. > > Under normal circumstances, Postgres itself doesn't set that at all. > > If we have to tell someone to not rely on a value that we've set we've > done something wrong. I am not sure that that is an accurate description. Look at the code in pg_ctl.c: /* process command-line options */ while ((c = getopt_long(argc, argv, "cD:e:l:m:N:o:p:P:sS:t:U:wW", long_options, &option_index)) != -1) { switch (c) { case 'D': { char *pgdata_D; pgdata_D = pg_strdup(optarg); canonicalize_path(pgdata_D); setenv("PGDATA", pgdata_D, 1); /* * We could pass PGDATA just in an environment variable * but we do -D too for clearer postmaster 'ps' display */ pgdata_opt = psprintf("-D \"%s\" ", pgdata_D); free(pgdata_D); break; } So we take the value specified with the -D option, canonicalize_path() it and set it in the environment. canonicalize_path() does the following: /* * canonicalize_path() * * Clean up path by: * o make Win32 path use Unix slashes * o remove trailing quote on Win32 * o remove trailing slash * o remove duplicate (adjacent) separators * o remove '.' (unless path reduces to only '.') * o process '..' ourselves, removing it if possible * Modifies path in-place. So, essentially, Dmitry feeds a weird path to the -D option and then complains that PGDATA is set to a weird value. Yours, Laurenz Albe