On Wed, 2021-05-05 at 11:04 -0400, Robert Haas wrote:
> You might want to use pg_receivewal to save all of your WAL segments
> somewhere instead of relying on archive_command. It has, at the least,
> the advantage of working on the byte level rather than the segment
> level. But it seems to me that it is not entirely suitable as a
> substitute for archiving, for a couple of reasons. One is that as soon
> as it runs into a problem, it exits, which is not really what you want
> out of a daemon that's critical to the future availability of your
> system. Another is that you can't monitor it aside from looking at
> what it prints out, which is also not really what you want for a piece
> of critical infrastructure.
>
> The first problem seems somewhat more straightforward. Suppose we add
> a new command-line option, perhaps --daemon but we can bikeshed. If
> this option is specified, then it tries to keep going when it hits a
> problem, rather than just giving up. [...]

That sounds like a good idea.

I don't know what it takes to make that perfect (if such a thing exists),
but simply trying to re-establish database connections and dying when
we hit an I/O problem seems like a clear improvement.

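
To illustrate the behavior I mean, here is a minimal Python sketch. It is not
pg_receivewal code, and every name in it (stream_with_reconnect, stream_once,
max_attempts, base_delay) is hypothetical. It retries with a capped backoff
when the connection is lost, and lets any other I/O error (a full disk, say)
propagate so that the process dies.

```python
import time


def stream_with_reconnect(stream_once, max_attempts=5, base_delay=1.0,
                          sleep=time.sleep):
    """Run stream_once() and retry only when the connection is lost.

    stream_once is a hypothetical callable that connects and streams WAL
    until it finishes or fails. ConnectionError is a subclass of OSError,
    so catching it alone lets every other OSError (local I/O problems)
    propagate to the caller unchanged.
    """
    attempt = 0
    while True:
        try:
            return stream_once()
        except ConnectionError:
            attempt += 1
            if attempt >= max_attempts:
                raise  # give up after too many consecutive failures
            # Exponential backoff, capped at one minute.
            sleep(min(base_delay * 2 ** (attempt - 1), 60.0))
```

Whether the retry count should be bounded at all is a design choice; a real
daemon might prefer to retry forever and leave alerting to monitoring.
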
> The second problem is a bit more complex. [...]

If I wanted to monitor pg_receivewal, I'd have it use a replication
slot and monitor "pg_replication_slots" on the primary. That way I see
if there is a WAL sender process, and I can measure the lag in bytes.

What more could you want?

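
A rough sketch of such a check, in Python. The query uses the "active",
"active_pid" and "restart_lsn" columns of pg_replication_slots together with
pg_current_wal_lsn() and pg_wal_lsn_diff() (PostgreSQL 10 or later). The
helper names and the threshold are only assumptions for illustration, and
running the query is left to whatever client you use. The helpers do the same
arithmetic as pg_wal_lsn_diff() on the client side: an LSN such as
'16/B374D848' is a 64-bit position written as two hexadecimal halves.

```python
# Query to run on the primary; the slot name is passed as a parameter.
SLOT_QUERY = """
SELECT slot_name,
       active,
       active_pid,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_name = %s
"""


def parse_lsn(lsn):
    """Convert an LSN like '16/B374D848' to a 64-bit integer position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(current_lsn, restart_lsn):
    """Bytes of WAL between the slot's restart_lsn and the current LSN."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn)


def slot_is_healthy(active, lag, max_lag_bytes=64 * 1024 * 1024):
    """Hypothetical policy: a WAL sender is attached and the lag is bounded."""
    return bool(active) and lag <= max_lag_bytes
```

For example, lag_bytes('1/0', '0/FFFFFFFF') is 1, because the low half wraps
into the high half at 2^32.
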
Yours,
Laurenz Albe