Re: parallelizing the archiver - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: parallelizing the archiver
Date
Msg-id CAOBaU_Ybyu3ror1UhoZP8hemnX-eA-kGtFa_ez+kRm4xdedEzQ@mail.gmail.com
In response to Re: parallelizing the archiver  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: parallelizing the archiver  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
On Fri, Sep 10, 2021 at 1:28 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> It's OK if external tool is responsible for concurrency. Do we want this
> complexity in core? Many users do not enable archiving at all.
> Maybe just add parallelism API for external tool?
> It's much easier to control concurrency in external tool than in PostgreSQL
> core. Maintaining parallel worker is tremendously harder than spawning
> goroutine, thread, task or whatever.

Yes, but it also means that it's up to every single archiving tool to
implement a somewhat hackish parallel version of an archive_command,
hoping that core won't break it.  If this problem is solved in
postgres core without an API change, then all existing tools will
automatically benefit from it (maybe not the ones that already have
hacks to make it parallel, but it seems easier to disable such a hack
than to implement one).
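
To make "solved in core without API change" a bit more concrete, here is a
rough Python sketch of the idea: the same, unmodified archive_command
template is simply run for several ready segments at once.  This is only an
illustration, not how the archiver is (or would be) implemented;
build_command, archive_one, archive_parallel and the workers parameter are
made-up names, and only the %p / %f / %% placeholders come from the
existing archive_command contract.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(template, wal_path, wal_name):
    """Expand %p (path), %f (file name) and %% (literal %) in the template."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            nxt = template[i + 1]
            if nxt == "p":
                out.append(wal_path)
            elif nxt == "f":
                out.append(wal_name)
            elif nxt == "%":
                out.append("%")
            else:
                # Unknown escape: keep it untouched (an assumption of this sketch).
                out.append("%" + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def archive_one(template, wal_dir, wal_name):
    """Run the command for one segment; exit status 0 means 'archived'."""
    cmd = build_command(template, f"{wal_dir}/{wal_name}", wal_name)
    return wal_name, subprocess.run(cmd, shell=True).returncode == 0


def archive_parallel(template, wal_dir, wal_names, workers=4):
    """Run the unchanged command for several segments concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda n: archive_one(template, wal_dir, n), wal_names)
        return dict(results)
```

The point of the sketch is that the tool behind the command sees exactly the
same invocation as today; only the caller decides how many run at once, and
a failed segment (False in the result) can be retried as usual.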

> External tool needs to know when xlog segment is ready and needs to report
> when it's done. Postgres should just ensure that external archiver/restorer
> is running.
> For example external tool could read xlog names from stdin and report
> finished files on stdout. I can prototype such tool swiftly :)
> E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment
> filenames on stdin. And no more listing of archive_status and hacky
> algorithms to predict next WAL name and completion time!

Yes, but that requires fundamental design changes for the archive
commands, right?  So while I agree it could be a better approach
overall, it seems like a longer-term option.  As far as I understand,
what Nathan suggested seems more likely to be achieved in pg15 and
could benefit a larger set of backup solutions.  That would give us
enough time to properly design a new archiving approach.
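
For comparison, the stdin/stdout protocol quoted above might look roughly
like this on the tool side.  Again just a Python sketch under assumptions:
the newline-delimited format, the serve function and the archive_fn
callback are all hypothetical, and nothing like this exists in core today.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def serve(infile, outfile, archive_fn, workers=4):
    """Read one segment name per line; write each name back once archived."""
    lock = threading.Lock()

    def handle(name):
        # Report only successes; an unreported segment is assumed to be
        # retried by the server (an assumption of this sketch).
        if archive_fn(name):
            with lock:
                outfile.write(name + "\n")
                outfile.flush()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for line in infile:
            name = line.strip()
            if name:
                pool.submit(handle, name)
    # Leaving the with-block waits for all in-flight segments to finish.
```

A real tool would call something like serve(sys.stdin, sys.stdout, upload)
and would also need error reporting and shutdown handling, which is part of
why this looks like the longer-term design work rather than a pg15 item.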


