Thread: pg_tablespace_location() failure with allow_in_place_tablespaces
Hi all, While playing with tablespaces and recovery in a TAP test, I have noticed that retrieving the location of a tablespace created with allow_in_place_tablespaces enabled fails in pg_tablespace_location(), because readlink() sees a directory in this case. The use may be limited to any automated testing and allow_in_place_tablespaces is a developer GUC, still it seems to me that there is an argument to allow the case rather than tweak any tests to hardcode a path with the tablespace OID. And any other code paths are able to handle such tablespaces, be they in recovery or in tablespace create/drop. A junction point is a directory on WIN32 as far as I recall, but pgreadlink() is here to ensure that we get the correct path on a source found as pgwin32_is_junction(), so we can rely on that. This stuff has led me to the attached. Thoughts? -- Michael
Attachment
At Fri, 4 Mar 2022 15:44:22 +0900, Michael Paquier <michael@paquier.xyz> wrote in > Hi all, > > While playing with tablespaces and recovery in a TAP test, I have > noticed that retrieving the location of a tablespace created with > allow_in_place_tablespaces enabled fails in pg_tablespace_location(), > because readlink() sees a directory in this case. ERROR: could not read symbolic link "pg_tblspc/16407": Invalid argument > The use may be limited to any automated testing and > allow_in_place_tablespaces is a developer GUC, still it seems to me > that there is an argument to allow the case rather than tweak any > tests to hardcode a path with the tablespace OID. And any other code > paths are able to handle such tablespaces, be they in recovery or in > tablespace create/drop. +1 > A junction point is a directory on WIN32 as far as I recall, but > pgreadlink() is here to ensure that we get the correct path on > a source found as pgwin32_is_junction(), so we can rely on that. This > stuff has led me to the attached. > > Thoughts? The function I think is expected to return a absolute path but it returns a relative path for in-place tablespaces. While it is apparently incovenient for general use, there might be a case where we want to know whether the tablespace is in-place or not. So I'm not sure which is better.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Fri, 04 Mar 2022 16:41:03 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Fri, 4 Mar 2022 15:44:22 +0900, Michael Paquier <michael@paquier.xyz> wrote in > > The use may be limited to any automated testing and > > allow_in_place_tablespaces is a developer GUC, still it seems to me > > that there is an argument to allow the case rather than tweak any > > tests to hardcode a path with the tablespace OID. And any other code > > paths are able to handle such tablespaces, be they in recovery or in > > tablespace create/drop. > > +1 By the way, regardless of the patch, I got an error from pg_basebackup for an in-place tablespace. pg_do_start_backup calls readlink believing pg_tblspc/* is always a symlink. # Running: pg_basebackup -D /home/horiguti/work/worktrees/tsp_replay_2/src/test/recovery/tmp_check/t_029_replay_tsp_drops_primary1_data/backup/my_backup -h/tmp/X8E4nbF4en -p 51584 --checkpoint fast --no-sync WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Fri, 04 Mar 2022 16:54:49 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Fri, 04 Mar 2022 16:41:03 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > > At Fri, 4 Mar 2022 15:44:22 +0900, Michael Paquier <michael@paquier.xyz> wrote in > > > The use may be limited to any automated testing and > > > allow_in_place_tablespaces is a developer GUC, still it seems to me > > > that there is an argument to allow the case rather than tweak any > > > tests to hardcode a path with the tablespace OID. And any other code > > > paths are able to handle such tablespaces, be they in recovery or in > > > tablespace create/drop. > > > > +1 > > By the way, regardless of the patch, I got an error from pg_basebackup > for an in-place tablespace. pg_do_start_backup calls readlink > believing pg_tblspc/* is always a symlink. > > # Running: pg_basebackup -D /home/horiguti/work/worktrees/tsp_replay_2/src/test/recovery/tmp_check/t_029_replay_tsp_drops_primary1_data/backup/my_backup -h/tmp/X8E4nbF4en -p 51584 --checkpoint fast --no-sync > WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument So now we know that there are three places that needs the same processing. pg_tablespace_location: this patch tries to fix sendDir: it already supports in-place tsp do_pg_start_backup: not supports in-place tsp yet. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Fri, 04 Mar 2022 17:28:45 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > > By the way, regardless of the patch, I got an error from pg_basebackup > > for an in-place tablespace. pg_do_start_backup calls readlink > > believing pg_tblspc/* is always a symlink. > > > > # Running: pg_basebackup -D /home/horiguti/work/worktrees/tsp_replay_2/src/test/recovery/tmp_check/t_029_replay_tsp_drops_primary1_data/backup/my_backup -h/tmp/X8E4nbF4en -p 51584 --checkpoint fast --no-sync > > WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument > > So now we know that there are three places that needs the same > processing. > > pg_tablespace_location: this patch tries to fix > sendDir: it already supports in-place tsp > do_pg_start_backup: not supports in-place tsp yet. And I made a quick hack on do_pg_start_backup. And I found that pg_basebackup copies in-place tablespaces under the *current directory*, which is not ok at all:( regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Mar 4, 2022 at 10:04 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > And I made a quick hack on do_pg_start_backup. And I found that > pg_basebackup copies in-place tablespaces under the *current > directory*, which is not ok at all:( Hmm. Which OS are you on? Looks OK here -- the "in place" tablespace gets copied as a directory under pg_tblspc, no symlink: postgres=# set allow_in_place_tablespaces = on; SET postgres=# create tablespace ts1 location ''; CREATE TABLESPACE postgres=# create table t (i int) tablespace ts1; CREATE TABLE postgres=# insert into t values (1), (2); INSERT 0 2 postgres=# create user replication replication; CREATE ROLE $ pg_basebackup --user replication -D pgdata2 $ ls -slaph pgdata/pg_tblspc/ total 4.0K 0 drwx------ 3 tmunro tmunro 19 Mar 4 23:16 ./ 4.0K drwx------ 19 tmunro tmunro 4.0K Mar 4 23:16 ../ 0 drwx------ 3 tmunro tmunro 29 Mar 4 23:16 16384/ $ ls -slaph pgdata2/pg_tblspc/ total 4.0K 0 drwx------ 3 tmunro tmunro 19 Mar 4 23:16 ./ 4.0K drwx------ 19 tmunro tmunro 4.0K Mar 4 23:16 ../ 0 drwx------ 3 tmunro tmunro 29 Mar 4 23:16 16384/ $ ls -slaph pgdata/pg_tblspc/16384/PG_15_202203031/5/ total 8.0K 0 drwx------ 2 tmunro tmunro 19 Mar 4 23:16 ./ 0 drwx------ 3 tmunro tmunro 15 Mar 4 23:16 ../ 8.0K -rw------- 1 tmunro tmunro 8.0K Mar 4 23:16 16385 $ ls -slaph pgdata2/pg_tblspc/16384/PG_15_202203031/5/ total 8.0K 0 drwx------ 2 tmunro tmunro 19 Mar 4 23:16 ./ 0 drwx------ 3 tmunro tmunro 15 Mar 4 23:16 ../ 8.0K -rw------- 1 tmunro tmunro 8.0K Mar 4 23:16 16385 The warning from readlink() while making the mapping file isn't ideal, and perhaps we should suppress that with something like the attached. Or does the missing map file entry break something on Windows?
Attachment
On Fri, Mar 04, 2022 at 06:04:00PM +0900, Kyotaro Horiguchi wrote: > And I made a quick hack on do_pg_start_backup. And I found that > pg_basebackup copies in-place tablespaces under the *current > directory*, which is not ok at all:( Yeah, I have noticed that as well while testing such configurations a couple of hours ago, but I am not sure yet how much we need to care about that as in-place tablespaces are included in the main data directory anyway, which would be fine for most test purposes we usually care about. Perhaps this has an impact on the patch posted on the thread that wants to improve the guarantees around tablespace directory structures, but I have not studied this thread much to have an opinion. And it is Friday. -- Michael
Attachment
At Fri, 4 Mar 2022 23:26:43 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > On Fri, Mar 4, 2022 at 10:04 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > And I made a quick hack on do_pg_start_backup. And I found that > > pg_basebackup copies in-place tablespaces under the *current > > directory*, which is not ok at all:( > > Hmm. Which OS are you on? Looks OK here -- the "in place" tablespace > gets copied as a directory under pg_tblspc, no symlink: > The warning from readlink() while making the mapping file isn't ideal, > and perhaps we should suppress that with something like the attached. > Or does the missing map file entry break something on Windows? Ah.. Ok, somehow I thought that pg_basebackup failed for readlink failure and the tweak I made made things worse. I got to make it work. Thanks! regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Mar 04, 2022 at 03:44:22PM +0900, Michael Paquier wrote: > The use may be limited to any automated testing and > allow_in_place_tablespaces is a developer GUC, still it seems to me > that there is an argument to allow the case rather than tweak any > tests to hardcode a path with the tablespace OID. And any other code > paths are able to handle such tablespaces, be they in recovery or in > tablespace create/drop. > > A junction point is a directory on WIN32 as far as I recall, but > pgreadlink() is here to ensure that we get the correct path on > a source found as pgwin32_is_junction(), so we can rely on that. This > stuff has led me to the attached. Thomas, I'd rather fix this for the sake of the tests. One point is that the function returns a relative path for in-place tablespaces, but it would be easy enough to append a DataDir. What do you think? -- Michael
Attachment
On Fri, Mar 04, 2022 at 11:26:43PM +1300, Thomas Munro wrote: > The warning from readlink() while making the mapping file isn't ideal, > and perhaps we should suppress that with something like the attached. > Or does the missing map file entry break something on Windows? > @@ -8292,6 +8293,10 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p, > > snprintf(fullpath, sizeof(fullpath), "pg_tblspc/%s", de->d_name); > > + /* Skip in-place tablespaces (testing use only) */ > + if (get_dirent_type(fullpath, de, false, ERROR) == PGFILETYPE_DIR) > + continue; I saw the warning when testing base backups with in-place tablespaces and it did not annoy me much, but, yes, that can be confusing. Junction points are directories, no? Are you sure that this works correctly on WIN32? It seems to me that we'd better use readlink() only for entries in pg_tlbspc/ that are PGFILETYPE_LNK on non-WIN32 and pgwin32_is_junction() on WIN32. -- Michael
Attachment
On Tue, Mar 8, 2022 at 12:58 AM Michael Paquier <michael@paquier.xyz> wrote: > On Fri, Mar 04, 2022 at 11:26:43PM +1300, Thomas Munro wrote: > > + /* Skip in-place tablespaces (testing use only) */ > > + if (get_dirent_type(fullpath, de, false, ERROR) == PGFILETYPE_DIR) > > + continue; > > I saw the warning when testing base backups with in-place tablespaces > and it did not annoy me much, but, yes, that can be confusing. > > Junction points are directories, no? Are you sure that this works > correctly on WIN32? It seems to me that we'd better use readlink() > only for entries in pg_tlbspc/ that are PGFILETYPE_LNK on non-WIN32 > and pgwin32_is_junction() on WIN32. Thanks, you're right. Test on a Win10 VM. Here's a new version.
Attachment
On Tue, Mar 8, 2022 at 10:39 AM Thomas Munro <thomas.munro@gmail.com> wrote: > Test on a Win10 VM. Erm, "Tested" (as in, I tested), I meant to write...
At Tue, 8 Mar 2022 10:39:06 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > On Tue, Mar 8, 2022 at 12:58 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Mar 04, 2022 at 11:26:43PM +1300, Thomas Munro wrote: > > > + /* Skip in-place tablespaces (testing use only) */ > > > + if (get_dirent_type(fullpath, de, false, ERROR) == PGFILETYPE_DIR) > > > + continue; > > > > I saw the warning when testing base backups with in-place tablespaces > > and it did not annoy me much, but, yes, that can be confusing. > > > > Junction points are directories, no? Are you sure that this works > > correctly on WIN32? It seems to me that we'd better use readlink() > > only for entries in pg_tlbspc/ that are PGFILETYPE_LNK on non-WIN32 > > and pgwin32_is_junction() on WIN32. > > Thanks, you're right. Test on a Win10 VM. Here's a new version. Thanks! It works for me on CentOS8 and Windows11. FYI, on Windows11, pg_basebackup didn't work correctly without the patch. So this looks like fixing an undiscovered bug as well. === > pg_basebackup -D copy WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument pg_basebackup: error: tar member has empty name > dir copy Volume in drive C has no label. Volume serial number: 10C6-4BA6 Directory of c:\..\copy 2022/03/08 09:53 <DIR> . 2022/03/08 09:53 <DIR> .. 2022/03/08 09:53 0 nbase.tar 2022/03/08 09:53 <DIR> pg_wal 1 File(s) 0 bytes 3 Dir(s) 171,920,613,376 bytes free regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Mar 08, 2022 at 10:06:50AM +0900, Kyotaro Horiguchi wrote: > At Tue, 8 Mar 2022 10:39:06 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in >> Thanks, you're right. Test on a Win10 VM. Here's a new version. Looks fine to me. > FYI, on Windows11, pg_basebackup didn't work correctly without the > patch. So this looks like fixing an undiscovered bug as well. Well, that's not really a long-time bug but just a side effect of in-place tablespaces because we don't use them in many test cases yet, is it? >> pg_basebackup -D copy > WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument > pg_basebackup: error: tar member has empty name > > 1 File(s) 0 bytes > 3 Dir(s) 171,920,613,376 bytes free That's a lot of free space. -- Michael
Attachment
At Tue, 8 Mar 2022 10:28:46 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Tue, Mar 08, 2022 at 10:06:50AM +0900, Kyotaro Horiguchi wrote: > > At Tue, 8 Mar 2022 10:39:06 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > >> Thanks, you're right. Test on a Win10 VM. Here's a new version. > > Looks fine to me. > > > FYI, on Windows11, pg_basebackup didn't work correctly without the > > patch. So this looks like fixing an undiscovered bug as well. > > Well, that's not really a long-time bug but just a side effect of > in-place tablespaces because we don't use them in many test cases > yet, is it? No, we don't. So just FYI. > >> pg_basebackup -D copy > > WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument > > pg_basebackup: error: tar member has empty name > > > > 1 File(s) 0 bytes > > 3 Dir(s) 171,920,613,376 bytes free > > That's a lot of free space. The laptop has a 512GB storage, so 160GB is pretty normal, maybe. 129GB of the storage is used by some VMs.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Mar 8, 2022 at 4:01 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Tue, 8 Mar 2022 10:28:46 +0900, Michael Paquier <michael@paquier.xyz> wrote in > > On Tue, Mar 08, 2022 at 10:06:50AM +0900, Kyotaro Horiguchi wrote: > > > At Tue, 8 Mar 2022 10:39:06 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > > >> Thanks, you're right. Test on a Win10 VM. Here's a new version. > > > > Looks fine to me. > > > > > FYI, on Windows11, pg_basebackup didn't work correctly without the > > > patch. So this looks like fixing an undiscovered bug as well. > > > > Well, that's not really a long-time bug but just a side effect of > > in-place tablespaces because we don't use them in many test cases > > yet, is it? > > No, we don't. So just FYI. Ok, I pushed the fix for pg_basebackup. As for the complaint about pg_tablespace_location() failing, would it be better to return an empty string? That's what was passed in as LOCATION. Something like the attached.
Attachment
On Tue, Mar 15, 2022 at 2:33 PM Thomas Munro <thomas.munro@gmail.com> wrote: > As for the complaint about pg_tablespace_location() failing, would it > be better to return an empty string? That's what was passed in as > LOCATION. Something like the attached. (Hrrmm, the contract for pgwin32_is_junction() is a little weird: false means "success, but no" and also "failure, you should check errno". But we never do.)
On Tue, Mar 15, 2022 at 02:33:17PM +1300, Thomas Munro wrote: > Ok, I pushed the fix for pg_basebackup. > > As for the complaint about pg_tablespace_location() failing, would it > be better to return an empty string? That's what was passed in as > LOCATION. Something like the attached. Hmm, I don't think so. The point of the function is to be able to know the location of a tablespace at SQL level so as we don't have any need to hardcode its location within any external tests (be it a pg_regress test or a TAP test) based on how in-place tablespace paths are built in the backend, so I think that we'd better report either a relative path from data_directory or an absolute path, but not an empty string. In any case, I'd suggest to add a regression test. What I have sent upthread would be portable enough. -- Michael
Attachment
On Tue, Mar 15, 2022 at 2:50 PM Michael Paquier <michael@paquier.xyz> wrote: > On Tue, Mar 15, 2022 at 02:33:17PM +1300, Thomas Munro wrote: > > As for the complaint about pg_tablespace_location() failing, would it > > be better to return an empty string? That's what was passed in as > > LOCATION. Something like the attached. > > Hmm, I don't think so. The point of the function is to be able to > know the location of a tablespace at SQL level so as we don't have any > need to hardcode its location within any external tests (be it a > pg_regress test or a TAP test) based on how in-place tablespace paths > are built in the backend, so I think that we'd better report either a > relative path from data_directory or an absolute path, but not an > empty string. > > In any case, I'd suggest to add a regression test. What I have sent > upthread would be portable enough. Fair enough. No objections here.
On Tue, Mar 15, 2022 at 03:55:56PM +1300, Thomas Munro wrote: > On Tue, Mar 15, 2022 at 2:50 PM Michael Paquier <michael@paquier.xyz> wrote: >> On Tue, Mar 15, 2022 at 02:33:17PM +1300, Thomas Munro wrote: >> > As for the complaint about pg_tablespace_location() failing, would it >> > be better to return an empty string? That's what was passed in as >> > LOCATION. Something like the attached. >> >> Hmm, I don't think so. The point of the function is to be able to >> know the location of a tablespace at SQL level so as we don't have any >> need to hardcode its location within any external tests (be it a >> pg_regress test or a TAP test) based on how in-place tablespace paths >> are built in the backend, so I think that we'd better report either a >> relative path from data_directory or an absolute path, but not an >> empty string. >> >> In any case, I'd suggest to add a regression test. What I have sent >> upthread would be portable enough. > > Fair enough. No objections here. So, which one of a relative path or an absolute path do you think would be better for the user? My preference tends toward the relative path, as we know that all those tablespaces stay in pg_tblspc/ so one can make the difference with normal tablespaces more easily. The barrier is thin, though :p -- Michael
Attachment
On Tue, Mar 15, 2022 at 10:30 PM Michael Paquier <michael@paquier.xyz> wrote: > So, which one of a relative path or an absolute path do you think > would be better for the user? My preference tends toward the relative > path, as we know that all those tablespaces stay in pg_tblspc/ so one > can make the difference with normal tablespaces more easily. The > barrier is thin, though :p Sounds good to me.
At Tue, 15 Mar 2022 23:16:52 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > On Tue, Mar 15, 2022 at 10:30 PM Michael Paquier <michael@paquier.xyz> wrote: > > So, which one of a relative path or an absolute path do you think > > would be better for the user? My preference tends toward the relative > > path, as we know that all those tablespaces stay in pg_tblspc/ so one > > can make the difference with normal tablespaces more easily. The > > barrier is thin, though :p > > Sounds good to me. +1. Desn't the doc need to mention that? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Mar 16, 2022 at 10:34:15AM +0900, Kyotaro Horiguchi wrote: > +1. Desn't the doc need to mention that? Yes, I agree that it makes sense to add a note, even if allow_in_place_tablespaces is a developer option. I have added the following paragraph in the docs: + A full path of the symbolic link in <filename>pg_tblspc/</filename> + is returned. A relative path to the data directory is returned + for tablespaces created with + <xref linkend="guc-allow-in-place-tablespaces"/> enabled. Another thing that was annoying in the first version of the patch is the useless call to lstat() on Windows, not needed because it is possible to rely just on pgwin32_is_junction() to check if readlink() should be called or not. This leads me to the revised version attached. What do you think? -- Michael
Attachment
At Wed, 16 Mar 2022 15:42:52 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Wed, Mar 16, 2022 at 10:34:15AM +0900, Kyotaro Horiguchi wrote: > > +1. Desn't the doc need to mention that? > > Yes, I agree that it makes sense to add a note, even if > allow_in_place_tablespaces is a developer option. I have added the > following paragraph in the docs: > + A full path of the symbolic link in <filename>pg_tblspc/</filename> > + is returned. A relative path to the data directory is returned > + for tablespaces created with > + <xref linkend="guc-allow-in-place-tablespaces"/> enabled. I'm not sure that the "of the symbolic link in pg_tblspc/" is needed. And allow_in_place_tablespaces alone doesn't create in-place tablesapce. So this might need rethink at least for the second point. > Another thing that was annoying in the first version of the patch is > the useless call to lstat() on Windows, not needed because it is > possible to rely just on pgwin32_is_junction() to check if readlink() > should be called or not. Agreed. And v2 looks cleaner. The test detects the lack of the feature. It successfully builds and runs on Rocky8 and Windows11. > This leads me to the revised version attached. What do you think? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Mar 16, 2022 at 05:15:58PM +0900, Kyotaro Horiguchi wrote: > I'm not sure that the "of the symbolic link in pg_tblspc/" is > needed. And allow_in_place_tablespaces alone doesn't create in-place > tablespace. So this might need rethink at least for the second point. Surely this can be improved. I was not satisfied with this paragraph after re-reading it this morning, so I have just removed it, rewording slightly the part for in-place tablespaces that is still necessary. > Agreed. And v2 looks cleaner. > > The test detects the lack of the feature. > It successfully builds and runs on Rocky8 and Windows11. Thanks for the review. After a second look, it seemed fine so I have applied it. (I'll try to jump on the tablespace patch for recovery soon-ish-ly if nobody beats me to it.) -- Michael
Attachment
On Thu, Mar 17, 2022 at 3:53 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Mar 16, 2022 at 05:15:58PM +0900, Kyotaro Horiguchi wrote: > > I'm not sure that the "of the symbolic link in pg_tblspc/" is > > needed. And allow_in_place_tablespaces alone doesn't create in-place > > tablespace. So this might need rethink at least for the second point. > > Surely this can be improved. I was not satisfied with this paragraph > after re-reading it this morning, so I have just removed it, rewording > slightly the part for in-place tablespaces that is still necessary. + <para> + A relative path to the data directory is returned for tablespaces + created when <xref linkend="guc-allow-in-place-tablespaces"/> is + enabled. + </para> + </entry> I think what Horiguchi-san was pointing out above is that you need to enable the GUC *and* say LOCATION '', which the new paragraph doesn't capture. What do you think about this: A path relative to the data directory is returned for in-place tablespaces (see <xref ...>).
On Thu, Mar 17, 2022 at 04:34:30PM +1300, Thomas Munro wrote: > I think what Horiguchi-san was pointing out above is that you need to > enable the GUC *and* say LOCATION '', which the new paragraph doesn't > capture. What do you think about this: > > A path relative to the data directory is returned for in-place > tablespaces (see <xref ...>). An issue I have with this wording is that we give nowhere in the docs an explanation of about the term "in-place tablespace", even if it can be guessed from the name of the GUC. Another idea would be something like that: "A relative path to the data directory is returned for tablespaces created with an empty location string specified in the CREATE TABLESPACE query when allow_in_place_tablespaces is enabled (see link blah)." But perhaps that's just me being overly pedantic :p -- Michael
Attachment
On Thu, Mar 17, 2022 at 7:18 PM Michael Paquier <michael@paquier.xyz> wrote: > On Thu, Mar 17, 2022 at 04:34:30PM +1300, Thomas Munro wrote: > > I think what Horiguchi-san was pointing out above is that you need to > > enable the GUC *and* say LOCATION '', which the new paragraph doesn't > > capture. What do you think about this: > > > > A path relative to the data directory is returned for in-place > > tablespaces (see <xref ...>). > > An issue I have with this wording is that we give nowhere in the docs > an explanation of about the term "in-place tablespace", even if it can > be guessed from the name of the GUC. Maybe we don't need this paragraph at all. Who is it aimed at? > Another idea would be something like that: > "A relative path to the data directory is returned for tablespaces > created with an empty location string specified in the CREATE > TABLESPACE query when allow_in_place_tablespaces is enabled (see link > blah)." > > But perhaps that's just me being overly pedantic :p I don't really want to spill details of this developer-only stuff onto more manual sections... It's not really helping users if we confuse them with irrelevant details of a feature they shouldn't be using, is it? And the existing treatment "Returns the file system path that this tablespace is located in" is not invalidated by this special case, so maybe we shouldn't mention it?
On Thu, Mar 17, 2022 at 07:55:30PM +1300, Thomas Munro wrote: > I don't really want to spill details of this developer-only stuff onto > more manual sections... It's not really helping users if we confuse > them with irrelevant details of a feature they shouldn't be using, is > it? And the existing treatment "Returns the file system path that > this tablespace is located in" is not invalidated by this special > case, so maybe we shouldn't mention it? Right, I see your point. The existing description is not wrong either. Fine by me to just drop all that. -- Michael
Attachment
At Thu, 17 Mar 2022 16:39:52 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Thu, Mar 17, 2022 at 07:55:30PM +1300, Thomas Munro wrote: > > I don't really want to spill details of this developer-only stuff onto > > more manual sections... It's not really helping users if we confuse > > them with irrelevant details of a feature they shouldn't be using, is > > it? And the existing treatment "Returns the file system path that > > this tablespace is located in" is not invalidated by this special > > case, so maybe we shouldn't mention it? > > Right, I see your point. The existing description is not wrong > either. Fine by me to just drop all that. +1. Sorry for my otiose comment.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Mar 8, 2022 at 12:58 AM Michael Paquier <michael@paquier.xyz> wrote: > Junction points are directories, no? Are you sure that this works > correctly on WIN32? It seems to me that we'd better use readlink() > only for entries in pg_tlbspc/ that are PGFILETYPE_LNK on non-WIN32 > and pgwin32_is_junction() on WIN32. Hmm. So the code we finished up with in the tree looks like this: #ifdef WIN32 if (!pgwin32_is_junction(fullpath)) continue; #else if (get_dirent_type(fullpath, de, false, ERROR) != PGFILETYPE_LNK) continue; #endif As mentioned, I was unhappy with the lack of error checking for that interface, and I've started a new thread about that, but then I started wondering if we missed a trick here: get_dirent_type() contain code that wants to return PGFILETYPE_LNK for reparse points. Clearly it's not working, based on results reported in this thread. Is that explained by your comment above, "junction points _are_ directories", and we're testing the attribute flags in the wrong order here? if ((fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0) d->ret.d_type = DT_DIR; /* For reparse points dwReserved0 field will contain the ReparseTag */ else if ((fd.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0 && (fd.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)) d->ret.d_type = DT_LNK; else d->ret.d_type = DT_REG;
On Thu, Mar 24, 2022 at 04:41:30PM +1300, Thomas Munro wrote: > As mentioned, I was unhappy with the lack of error checking for that > interface, and I've started a new thread about that, but then I > started wondering if we missed a trick here: get_dirent_type() contain > code that wants to return PGFILETYPE_LNK for reparse points. Clearly > it's not working, based on results reported in this thread. Is that > explained by your comment above, "junction points _are_ directories", > and we're testing the attribute flags in the wrong order here? > > if ((fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0) > d->ret.d_type = DT_DIR; > /* For reparse points dwReserved0 field will contain the ReparseTag */ > else if ((fd.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0 && > (fd.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)) > d->ret.d_type = DT_LNK; > else > d->ret.d_type = DT_REG; Ah, good point. I have not tested on Windows so I am not 100% sure, but indeed it would make sense to reverse both conditions if a junction point happens to be marked as both FILE_ATTRIBUTE_DIRECTORY and FILE_ATTRIBUTE_REPARSE_POINT when scanning a directory. Based on a read of the the upstream docs, I guess that this is the case: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/ca28ec38-f155-4768-81d6-4bfeb8586fc9 Note the "A file or directory that has an associated reparse point." for the description of FILE_ATTRIBUTE_REPARSE_POINT. -- Michael
Attachment
On Sat, Mar 26, 2022 at 6:33 PM Michael Paquier <michael@paquier.xyz> wrote: > Ah, good point. I have not tested on Windows so I am not 100% sure, > but indeed it would make sense to reverse both conditions if a > junction point happens to be marked as both FILE_ATTRIBUTE_DIRECTORY > and FILE_ATTRIBUTE_REPARSE_POINT when scanning a directory. Based on > a read of the the upstream docs, I guess that this is the case: > https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/ca28ec38-f155-4768-81d6-4bfeb8586fc9 > > Note the "A file or directory that has an associated reparse point." > for the description of FILE_ATTRIBUTE_REPARSE_POINT. That leads to the attached patches, the first of which I'd want to back-patch. Unfortunately while testing this I realised there is something else wrong here: if you take a basebackup using tar format, in-place tablespaces are skipped (they should get their own OID.tar file, but they don't, also no error). While it wasn't one of my original goals for in-place tablespaces to work in every way (and I'm certain some external tools would be confused by them), it seems we're pretty close so we should probably figure out that piece of the puzzle. It may be obvious why but I didn't have time to dig into that today... perhaps instead of just skipping the readlink() we should be writing something different into the mapfile and then restoring as appropriate...
Attachment
On Wed, Mar 30, 2022 at 08:23:25PM +1300, Thomas Munro wrote: > That leads to the attached patches, the first of which I'd want to back-patch. Makes sense. - if ((fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0) - d->ret.d_type = DT_DIR; /* For reparse points dwReserved0 field will contain the ReparseTag */ - else if ((fd.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0 && - (fd.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)) + if ((fd.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0 && + (fd.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)) d->ret.d_type = DT_LNK; + else if ((fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0) + d->ret.d_type = DT_DIR; This should also work for plain files, so that looks fine to me. > Unfortunately while testing this I realised there is something else > wrong here: if you take a basebackup using tar format, in-place > tablespaces are skipped (they should get their own OID.tar file, but > they don't, also no error). While it wasn't one of my original goals > for in-place tablespaces to work in every way (and I'm certain some > external tools would be confused by them), it seems we're pretty close > so we should probably figure out that piece of the puzzle. It may be > obvious why but I didn't have time to dig into that today... perhaps > instead of just skipping the readlink() we should be writing something > different into the mapfile and then restoring as appropriate... Yeah, I saw that in-place tablespaces were part of the main tarball in base backups as we rely on the existence of a link to decide if the contents of a path should be separated in a different tarball or not. This does not strike me as a huge problem in itself, TBH, as the improvement would be limited to make sure that the base backups could be correctly restored with multiple tablespaces. And you can get pretty much the same amount of coverage to make sure that the backup contents are correct without fully restoring them. -- Michael
Attachment
On Thu, Mar 31, 2022 at 5:01 PM Michael Paquier <michael@paquier.xyz> wrote: > On Wed, Mar 30, 2022 at 08:23:25PM +1300, Thomas Munro wrote: > > That leads to the attached patches, the first of which I'd want to back-patch. > > Makes sense. > ... > This should also work for plain files, so that looks fine to me. Thanks. Pushed. Also CC'ing Alvaro who expressed an interest in this problem[1]. > > Unfortunately while testing this I realised there is something else > > wrong here: if you take a basebackup using tar format, in-place > > tablespaces are skipped (they should get their own OID.tar file, but > > they don't, also no error). While it wasn't one of my original goals > > for in-place tablespaces to work in every way (and I'm certain some > > external tools would be confused by them), it seems we're pretty close > > so we should probably figure out that piece of the puzzle. It may be > > obvious why but I didn't have time to dig into that today... perhaps > > instead of just skipping the readlink() we should be writing something > > different into the mapfile and then restoring as appropriate... > > Yeah, I saw that in-place tablespaces were part of the main tarball in > base backups as we rely on the existence of a link to decide if the > contents of a path should be separated in a different tarball or not. > This does not strike me as a huge problem in itself, TBH, as the > improvement would be limited to make sure that the base backups could > be correctly restored with multiple tablespaces. And you can get > pretty much the same amount of coverage to make sure that the backup > contents are correct without fully restoring them. Are they in the main tar file, or are they just missing? [1] https://postgr.es/m/20220721111751.x7hod2xgrd76xr5c%40alvherre.pgsql
On 2022-Jul-22, Thomas Munro wrote: > Thanks. Pushed. Also CC'ing Alvaro who expressed an interest in this > problem[1]. > [1] https://postgr.es/m/20220721111751.x7hod2xgrd76xr5c%40alvherre.pgsql Yay! Thanks. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Los dioses no protegen a los insensatos. Éstos reciben protección de otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)
On Fri, Jul 22, 2022 at 05:50:58PM +1200, Thomas Munro wrote: > On Thu, Mar 31, 2022 at 5:01 PM Michael Paquier <michael@paquier.xyz> wrote: >> Yeah, I saw that in-place tablespaces were part of the main tarball in >> base backups as we rely on the existence of a link to decide if the >> contents of a path should be separated in a different tarball or not. >> This does not strike me as a huge problem in itself, TBH, as the >> improvement would be limited to make sure that the base backups could >> be correctly restored with multiple tablespaces. And you can get >> pretty much the same amount of coverage to make sure that the backup >> contents are correct without fully restoring them. > > Are they in the main tar file, or are they just missing? So, coming back to this thread.. Sorry for the late reply. Something is still broken here with in-place tablespaces on HEAD. When taking a base backup in plain format, in-place tablespaces are correctly in the stream. However, when using the tar format, these are not streamed. c6f2f01 has cleaned the WARNING "could not read symbolic link", still we have the following, when having an in-place tablespace on a primary: - For a base backup in plain format, the in-place tablespace path is included in the base backup. - For a base backup in tar format, the in-place tablespace path is not included in the base backup. It is not in base.tar, and there is no additional tar file. c6f2f01 does not change this result. So they are missing, to answer your question. -- Michael