Re: Sample archive_command is still problematic - Mailing list pgsql-docs

From MauMau
Subject Re: Sample archive_command is still problematic
Date
Msg-id 35326E59461948B394A861F69C272795@maumau
In response to Re: Sample archive_command is still problematic  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-docs
From: "Peter Eisentraut" <peter_e@gmx.net>
> I realize that there are about 128 different ways people set this up
> (which is itself a problem), but it appears to me that a solution like
> pg_copy only provides local copying, which implies the use of something
> like NFS.  Which may be OK, but then we'd need to get into the details
> of how to set up NFS properly for this.

Yes, I think the flexibility of archive_command is nice.  The problem I want
to address is that users don't have a simple way to reliably archive files
in very simple use cases -- local copying to local or network storage.
pg_copy is a low-level command to fill the gap.
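
To make "reliably archive" concrete for the simple local-copy case, here is
a minimal Python sketch.  It is not pg_copy's implementation, only an
illustration of the usual steps: refuse to overwrite an existing archive
file, copy to a temporary name, fsync the data, rename into place, and fsync
the directory.  The names archive_file and main and the ".tmp" suffix are
assumptions made up for this example.

```python
import os
import shutil
import sys


def archive_file(src: str, dst: str) -> None:
    """Copy src to dst and flush it to stable storage.

    Raises FileExistsError if dst already exists, so an archived
    segment is never silently overwritten.
    """
    if os.path.exists(dst):
        raise FileExistsError(f"archive file already exists: {dst}")

    dst_dir = os.path.dirname(os.path.abspath(dst))
    tmp = dst + ".tmp"  # hypothetical temporary name, same directory as dst

    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's buffer to the OS
        os.fsync(fout.fileno())  # ask the OS to write the data to disk

    # Readers never see a partially written file under the final name.
    os.rename(tmp, dst)

    # On POSIX, also fsync the directory so the rename itself is durable.
    if os.name == "posix":
        dir_fd = os.open(dst_dir, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


def main(argv: list) -> int:
    """Return 0 on success, nonzero on failure."""
    if len(argv) != 3:
        print("usage: archive_file.py SRC DST", file=sys.stderr)
        return 2
    try:
        archive_file(argv[1], argv[2])
    except OSError as exc:
        print(f"archive failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

Saved as a script (hypothetical name archive_file.py) with
sys.exit(main(sys.argv)) added at the bottom, it could be called as
archive_command = 'python3 /path/to/archive_file.py "%p" "/mnt/server/archivedir/%f"'.
The nonzero exit status on failure matters because the server treats that as
"not archived" and retries.  Note the existence check and the rename are not
atomic together, so this sketch assumes a single archiver writes to the
directory.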


> Also, I think you can get local copy+fsync with dd.

Yes, dd on Linux has sync options (e.g. conv=fsync).  But dd on Solaris
doesn't.  I can't find an equivalent command on Windows that is installed by
default.

> The alternatives of doing remote copying inside archive_command are also
> questionable if you have multiple standbys.

Yes, we may need an interface other than archive_command for archiving files
to multiple locations.  That's another issue.


> Basically, this whole interface is terrible.  Maybe it's time to phase
> it out and start looking into pg_receivexlog.

pg_receivexlog seems difficult to me.  Users have to start, stop, and
monitor pg_receivexlog.  That's burdensome.  For example, how do we start
pg_receivexlog easily on Windows when PostgreSQL is configured to
start/stop automatically on OS startup/shutdown as a Windows service?  In
addition, users have to be aware of connection slots (max_connections and
max_wal_senders) and replication slots.

pg_receivexlog imposes extra overhead even on simple use cases.  I want
backup-related facilities to use as few resources as possible.  For example,
with archive_command, the data flows like this:

disk -> OS cache -> copy command's buffer -> OS cache -> disk

OTOH, with pg_receivexlog:

disk -> OS cache -> walsender's buffer -> socket send buffer -> kernel
buffer? -> socket receive buffer -> pg_receivexlog's buffer -> OS cache ->
disk

For reference, \copy of psql is described like this:

Tip: This operation is not as efficient as the SQL COPY command because all
data must pass through the client/server connection. For large amounts of
data the SQL command might be preferable.

Regards
MauMau