Re: standby recovery fails (tablespace related) (tentative patch and discussion) - Mailing list pgsql-hackers

From Paul Guo
Subject Re: standby recovery fails (tablespace related) (tentative patch and discussion)
Msg-id CAEET0ZGpfnTdRN4GCKPPPsFK03VnqiyGvyRPW+cY5STbdvyB0w@mail.gmail.com
In response to Re: standby recovery fails (tablespace related) (tentative patch and discussion)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: standby recovery fails (tablespace related) (tentative patch and discussion)  (Paul Guo <pguo@pivotal.io>)
List pgsql-hackers


On Fri, Jan 10, 2020 at 9:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2020-Jan-09, Alvaro Herrera wrote:
>
> > I looked at this a little while and was bothered by the perl changes; it
> > seems out of place to have RecursiveCopy be thinking about tablespaces,
> > which is way out of its league.  So I rewrote that to use a callback:
> > the PostgresNode code passes a callback that's in charge to handle the
> > case of a symlink.  Things look much more in place with that.  I didn't
> > verify that all places that should use this are filled.
> >
> > In 0002 I found adding a new function unnecessary: we can keep backwards
> > compat by checking 'ref' of the third argument.  With that we don't have
> > to add a new function.  (POD changes pending.)
>
> I forgot to add that something in these changes is broken (probably the
> symlink handling callback) so the tests fail, but I couldn't stay away
> from my daughter's birthday long enough to figure out what or how.  I'm
> on something else today, so if one of you can research and submit fixed
> versions, that'd be great.

Thanks,

I spent some time on this before getting off work today.

With the fix below, the 4th test now passes, but the 5th (the last one) hangs because of a PANIC during recovery:

(gdb) bt
#0  0x0000003397e32625 in raise () from /lib64/libc.so.6
#1  0x0000003397e33e05 in abort () from /lib64/libc.so.6
#2  0x0000000000a90506 in errfinish (dummy=0) at elog.c:590
#3  0x0000000000a92b4b in elog_finish (elevel=22, fmt=0xb2d580 "cannot find directory %s tablespace %d database %d") at elog.c:1465
#4  0x000000000057aa0a in XLogLogMissingDir (spcNode=16384, dbNode=0, path=0x1885100 "pg_tblspc/16384/PG_13_202001091/16389") at xlogutils.c:104
#5  0x000000000065e92e in dbase_redo (record=0x1841568) at dbcommands.c:2225
#6  0x000000000056ac94 in StartupXLOG () at xlog.c:7200


diff --git a/src/include/commands/dbcommands.h b/src/include/commands/dbcommands.h
index b71b400e700..f8f6d5ffd03 100644
--- a/src/include/commands/dbcommands.h
+++ b/src/include/commands/dbcommands.h
@@ -19,8 +19,6 @@
 #include "lib/stringinfo.h"
 #include "nodes/parsenodes.h"

-extern void CheckMissingDirs4DbaseRedo(void);
-
 extern Oid createdb(ParseState *pstate, const CreatedbStmt *stmt);
 extern void dropdb(const char *dbname, bool missing_ok, bool force);
 extern void DropDatabase(ParseState *pstate, DropdbStmt *stmt);
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e6e7ea505d9..4eef8bb1985 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -615,11 +615,11 @@ sub _srcsymlink
    my $srcrealdir = readlink($srcpath);

    opendir(my $dh, $srcrealdir);
-   while (readdir $dh)
+   while (my $entry = (readdir $dh))
    {
-       next if (/^\.\.?$/);
-       my $spath = "$srcrealdir/$_";
-       my $dpath = "$dstrealdir/$_";
+       next if ($entry eq '.' or $entry eq '..');
+       my $spath = "$srcrealdir/$entry";
+       my $dpath = "$dstrealdir/$entry";
        RecursiveCopy::copypath($spath, $dpath);
    }
    closedir $dh;
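
The PostgresNode.pm hunk above replaces a `$_`-based readdir loop with one that uses a named lexical. The email does not say why the original loop misbehaved, so the following is only a minimal standalone sketch of the directory-listing idiom, not the patched code. A documented Perl pitfall that may be relevant is that a bare `readdir` in a `while` condition assigns to `$_` only on Perl 5.12 and later. The helper name `list_entries` and the scratch file names are hypothetical; the sketch also tests definedness explicitly, so that an entry named "0" cannot end the loop early.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of PostgresNode.pm): return the sorted names
# found in $dir, excluding the '.' and '..' entries.
sub list_entries
{
	my ($dir) = @_;

	opendir(my $dh, $dir)
	  or die "could not open directory \"$dir\": $!";

	my @entries;

	# Keep each name in a lexical rather than the global $_, and test
	# definedness explicitly so a file named "0" does not stop the loop.
	while (defined(my $entry = readdir($dh)))
	{
		next if ($entry eq '.' or $entry eq '..');
		push @entries, $entry;
	}
	closedir $dh;

	my @sorted = sort @entries;
	return @sorted;
}

# Usage: create a scratch directory with a few empty files and list it.
my $dir = tempdir(CLEANUP => 1);
for my $name ('0', 'PG_VERSION', 'base')
{
	open(my $fh, '>', "$dir/$name")
	  or die "could not create \"$dir/$name\": $!";
	close $fh;
}
print join(' ', list_entries($dir)), "\n";
```

Comparing against '.' and '..' with `eq`, as the patch does, also avoids depending on `$_` for the regular-expression match.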
