Re: recovery modules - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: recovery modules
Date
Msg-id Y+nopoSsxz0t3Orz@paquier.xyz
Whole thread Raw
In response to Re: recovery modules  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: recovery modules
List pgsql-hackers
On Thu, Feb 09, 2023 at 02:48:26PM -0800, Nathan Bossart wrote:
> On Thu, Feb 09, 2023 at 12:18:55PM -0800, Andres Freund wrote:
>>>  <programlisting>
>>> -typedef bool (*ArchiveCheckConfiguredCB) (void);
>>> +typedef bool (*ArchiveCheckConfiguredCB) (ArchiveModuleState *state);
>>>  </programlisting>
>>>
>>>      If <literal>true</literal> is returned, the server will proceed with
>>
>> Hm. I wonder if ArchiveCheckConfiguredCB() should actually work without the
>> state. We're not really doing anything yet, at that point, so it shouldn't
>> really need state?
>>
>> The reason I'm wondering is that I think we should consider calling this from
>> the GUC assignment hook, at least in postmaster. Which would make it more
>> convenient to not have state, I think?
>
> I agree that this callback should typically not need the state, but I'm not
> sure whether it fits into the assignment hook for archive_library.  This
> callback is primarily meant for situations when you have archiving enabled,
> but your module isn't configured yet (e.g., archive_command is empty).  In
> this case, we keep the WAL around, but we don't try to archive it until
> this hook returns true.  It's up to each module to define that criteria.  I
> can imagine someone introducing a GUC in their archive module that
> temporarily halts archiving via this callback.  In that case, calling it
> via a GUC assignment hook probably won't work.  In fact, checking whether
> archive_command is empty in that hook might not work either.

Keeping the state in the configure check callback does not strike me
as a problem, FWIW.

>>>  <programlisting>
>>> -typedef void (*ArchiveShutdownCB) (void);
>>> +typedef void (*ArchiveShutdownCB) (ArchiveModuleState *state);
>>>  </programlisting>
>>>     </para>
>>>    </sect2>
>>
>> Perhaps mention that this needs to free state it allocated in the
>> ArchiveModuleState, or it'll be leaked?
>
> done
>
> I left this out originally because the archiver exits shortly after calling
> this.  However, if you have DSM segments or something, it's probably wise
> to make sure those are cleaned up.  And I suppose we might not always exit
> immediately after this callback, so establishing the habit of freeing the
> state could be a good idea.  In addition to updating this part of the docs,
> I wrote a shutdown callback for basic_archive that frees its state.

This makes sense to me.  Still, DSM segments had better be cleaned up
with dsm_backend_shutdown().

+   basic_archive_context = data->context;
+   if (CurrentMemoryContext == basic_archive_context)
+       MemoryContextSwitchTo(TopMemoryContext);
+
+   if (MemoryContextIsValid(basic_archive_context))
+       MemoryContextDelete(basic_archive_context);

This is a bit confusing, because it means that we enter in the
shutdown callback with one context, but exit it under
TopMemoryContext.  Are you sure that this will be OK when there could
be multiple callbacks piled up with before_shmem_exit()?  shmem_exit()
has nothing specific to memory contexts.

Is putting the new headers in src/include/postmaster/ the best move in
the long term?  Perhaps that's fine, but I was wondering whether a new
location like archive/ would make more sense.  pg_arch.h being in the
postmaster section is fine.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Richard Guo
Date:
Subject: Re: Making Vars outer-join aware
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry