From: Andres Freund
Subject: Re: Weird failure with latches in curculio on v15
Date: 2023-02-25 19:00:31
Msg-id: 20230225190031.e3vesk22q5wpmmhc@awork3.anarazel.de
In response to: Re: Weird failure with latches in curculio on v15 (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Weird failure with latches in curculio on v15
List: pgsql-hackers

Hi,

On 2023-02-19 20:06:24 +0530, Robert Haas wrote:
> On Sun, Feb 19, 2023 at 2:45 AM Andres Freund <andres@anarazel.de> wrote:
> > To me that seems even simpler? Nothing but the archiver is supposed to create
> > .done files and nothing is supposed to remove .ready files without archiver
> > having created the .done files.  So the archiver process can scan
> > archive_status until it's done or until N archives have been collected, and
> > then process them at once?  Only the creation of the .done files would be
> > serial, but I don't think that's commonly a problem (and could be optimized as
> > well, by creating multiple files and then fsyncing them in a second pass,
> > avoiding N filesystem journal flushes).
> >
> > Maybe I am misunderstanding what you see as the problem?
> 
> Well right now the archiver process calls ArchiveFileCB when there's a
> file ready for archiving, and that process is supposed to archive the
> whole thing before returning. That pretty obviously seems to preclude
> having more than one file being archived at the same time. What
> callback structure do you have in mind to allow for that?

TBH, I think the current archive and restore module APIs aren't useful. I
think it was a mistake to add archive modules without having demonstrated that
one can do something useful with them that archive_command didn't already
do. If anything, archive modules have made it harder to improve archiving
performance via concurrency.

My point was that it's easy to have multiple archive commands in progress at
the same time, because we already have a queuing system, and that
archive_command is entirely compatible with doing that, because running
multiple subprocesses is pretty trivial. It wasn't that the archive API is
suitable for that.
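
Just to illustrate the point, something along these lines (obviously not
actual pgarch.c code; the segment names and the cp-based command are made
up stand-ins):

/*
 * Illustration only: fork one shell per pending segment, capping the
 * number of concurrently running archive_command children.
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CONCURRENT 4

int
main(void)
{
    /* stand-ins for segments collected from the .ready files in archive_status */
    const char *segments[] = {
        "000000010000000000000001",
        "000000010000000000000002",
        "000000010000000000000003",
    };
    int         nsegments = sizeof(segments) / sizeof(segments[0]);
    int         running = 0;

    for (int i = 0; i < nsegments; i++)
    {
        char        cmd[1024];
        pid_t       pid;

        /* the expanded archive_command; plain cp is just an example */
        snprintf(cmd, sizeof(cmd), "cp pg_wal/%s /mnt/archive/%s",
                 segments[i], segments[i]);

        if (running == MAX_CONCURRENT)
        {
            /* at the cap, reap one child before launching another */
            waitpid(-1, NULL, 0);
            running--;
        }

        pid = fork();
        if (pid == 0)
        {
            execl("/bin/sh", "sh", "-c", cmd, (char *) NULL);
            _exit(127);         /* exec failed */
        }
        else if (pid > 0)
            running++;
    }

    /* reap the stragglers; only then would the .done files be created */
    while (running-- > 0)
        waitpid(-1, NULL, 0);

    return 0;
}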


> I mean, my idea was to basically just have one big callback:
> ArchiverModuleMainLoopCB(). Which wouldn't return, or perhaps, would
> only return when archiving was totally caught up and there was nothing
> more to do right now. And then that callback could call functions like
> AreThereAnyMoreFilesIShouldBeArchivingAndIfYesWhatIsTheNextOne(). So
> it would call that function and it would find out about a file and
> start an HTTP session or whatever and then call that function again
> and start another HTTP session for the second file and so on until it
> had as much concurrency as it wanted. And then when it hit the
> concurrency limit, it would wait until at least one HTTP request
> finished. At that point it would call
> HeyEverybodyISuccessfullyArchivedAWalFile(), after which it could
> again ask for the next file and start a request for that one and so on
> and so forth.

> I don't really understand what the other possible model is here,
> honestly. Right now, control remains within the archive module for the
> entire time that a file is being archived. If we generalize the model
> to allow multiple files to be in the process of being archived at the
> same time, the archive module is going to need to have control as long
> as >= 1 of them are in progress, at least AFAICS. If you have some
> other idea how it would work, please explain it to me...

I don't think that a main loop approach is the only viable one. It might be
the most likely to succeed one though. As an alternative, consider something
like

typedef struct ArchiveFileState {
   int fd;                                          /* fd the module is waiting on */
   enum WaitFor { READ, WRITE, CONNECT } wait_for;  /* why it's waiting */
   void *file_private;                              /* per-file module state */
} ArchiveFileState;

typedef bool (*ArchiveFileStartCB)(ArchiveModuleState *state,
   ArchiveFileState *file_state,
   const char *file, const char *path);

typedef bool (*ArchiveFileContinueCB)(ArchiveModuleState *state,
   ArchiveFileState *file_state);
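
To make that concrete, a module-side start callback could look roughly like
this (entirely hypothetical and untested, with a made-up server address; a
real module would use its HTTP library's non-blocking connect):

/*
 * Hypothetical start callback for the API sketched above: kick off a
 * non-blocking connect() to an archive server, stash the fd, and hand
 * control back to the archiver.
 */
#include "postgres.h"

#include <errno.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#include "archive/archive_module.h"     /* ArchiveModuleState */

static bool
my_archive_file_start(ArchiveModuleState *state,
                      ArchiveFileState *file_state,
                      const char *file, const char *path)
{
    struct sockaddr_in addr;
    int         fd;

    /* SOCK_NONBLOCK is Linux-specific; use fcntl() elsewhere */
    fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (fd < 0)
        return false;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(443);
    inet_pton(AF_INET, "192.0.2.10", &addr.sin_addr);   /* made-up server */

    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 &&
        errno != EINPROGRESS)
    {
        close(fd);
        return false;
    }

    /* a non-blocking connect completes when the socket becomes writable */
    file_state->fd = fd;
    file_state->wait_for = CONNECT;
    file_state->file_private = pstrdup(path);

    return true;
}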

An archive module could open an HTTP connection, do IO until it blocks, put
the fd in file_state, and return. The main loop could run a big event loop
over all of the file descriptors and, whenever one of the FDs signals that IO
is ready, call ArchiveFileContinueCB() for that file.
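
The archiver side then wouldn't need much more than this (again just a
sketch, using plain poll() where the real thing would presumably use a
WaitEventSet, and assuming the continue callback returns false once its
file is fully archived):

#include <poll.h>

#define MAX_INFLIGHT 8

static void
archiver_event_loop(ArchiveModuleState *state,
                    ArchiveFileState *files, int nfiles,
                    ArchiveFileContinueCB continue_cb)
{
    struct pollfd pfds[MAX_INFLIGHT];

    while (nfiles > 0)
    {
        for (int i = 0; i < nfiles; i++)
        {
            pfds[i].fd = files[i].fd;
            /* CONNECT and WRITE both wait for writability */
            pfds[i].events = (files[i].wait_for == READ) ? POLLIN : POLLOUT;
        }

        if (poll(pfds, nfiles, -1) < 0)
            continue;           /* EINTR and such: just retry */

        /* walk backwards so swap-removal doesn't skip entries */
        for (int i = nfiles - 1; i >= 0; i--)
        {
            if (pfds[i].revents == 0)
                continue;

            /* let the module make progress on this file */
            if (!continue_cb(state, &files[i]))
            {
                /* done: this is where the .done file would be created */
                files[i] = files[--nfiles];
            }
        }
    }
}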

I don't know if that's better than ArchiverModuleMainLoopCB(). I can see both
advantages and disadvantages.

Greetings,

Andres Freund


