Re: .ready and .done files considered harmful - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: .ready and .done files considered harmful
Date
Msg-id CAFiTN-ucivOLXNPehf0vj3mR63GEkiTatGjkcZ3B5e5_B6+tTg@mail.gmail.com
Whole thread Raw
In response to Re: .ready and .done files considered harmful  (Dipesh Pandit <dipesh.pandit@gmail.com>)
List pgsql-hackers
On Mon, Jul 19, 2021 at 5:43 PM Dipesh Pandit <dipesh.pandit@gmail.com> wrote:
>
> Hi,
>
> > I agree, I missed this part. The .history file should be given higher preference.
> > I will take care of it in the next patch.
>
> Archiver does not have access to shared memory and the current timeline ID
> is not available at archiver. In order to keep track of timeline switch we have
> to push a notification from backend to archiver.  Backend can send a signal
> to notify archiver about the timeline change. Archiver can register this
> notification and perform a full directory scan to make sure that archiving
> history files take precedence over archiving WAL files.

Yeah, that makes sense, some comments on v2.

1.
+pgarch_timeline_switch(SIGNAL_ARGS)
+{
+    int            save_errno = errno;
+
+    /* Set the flag to register a timeline switch */
+    timeline_switch = true;
+    SetLatch(MyLatch);
+

On the timeline switch, setting a flag should be enough, I don't think
that we need to wake up the archiver.  Because it will just waste the
scan cycle.  We have set the flag and that should be enough and let
the XLogArchiveNotify() wake this up when something is ready to be
archived and that time we will scan the directory first based on the
flag.


2.
+     */
+    if (XLogArchivingActive() && ArchiveRecoveryRequested)
+        XLogArchiveNotifyTLISwitch();
+
+
.....

 /*
+ * Signal archiver to notify timeline switch
+ */
+void
+XLogArchiveNotifyTLISwitch(void)
+{
+    if (IsUnderPostmaster)
+        PgArchNotifyTLISwitch();
+}

Why do we need multi level interfaces? I mean instead of calling first
XLogArchiveNotifyTLISwitch and then calling PgArchNotifyTLISwitch,
can't we directly call PgArchNotifyTLISwitch()?

3.
+        if (timeline_switch)
+        {
+            /* Perform a full directory scan in next cycle */
+            dirScan = true;
+            timeline_switch = false;
+        }

I suggest you can add some comments atop this check.

4.
+PgArchNotifyTLISwitch(void)
+{
+    int            arch_pgprocno = PgArch->pgprocno;
+
+    if (arch_pgprocno != INVALID_PGPROCNO)
+    {
+        int        archiver_pid = ProcGlobal->allProcs[arch_pgprocno].pid;
+
+        if (kill(archiver_pid, SIGINT) < 0)
+            elog(ERROR, "could not notify timeline change to archiver");


I think you should use %m in the error message so that it also prints
the OS error code.

5.
+/* Flag to specify a full directory scan to find next log file */
+static bool dirScan = true;

Why is this a global variable?  I mean whenever you enter the function
pgarch_ArchiverCopyLoop(), this can be set to true and after that you
can pass this as inout parameter to pgarch_readyXlog() there in it can
be conditionally set to false once we get some segment and whenever
the timeline switch we can set it back to the true.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: row filtering for logical replication
Next
From: Greg Nancarrow
Date:
Subject: Re: row filtering for logical replication