Hi,
Right now, the session that starts the backup with pg_backup_start()
has to end it with pg_backup_stop() which returns the backup_label and
tablespace_map contents (commit 39969e2a1). If the backups were to be
taken using custom disk snapshot tools on production servers,
following are the high-level steps involved:
1) open a session
2) run pg_backup_start() using the same session opened in (1)
3) run custom disk snapshot tools which may, many-a-times, will copy
the entire data directory over the network
4) run pg_backup_stop() using the same session opened in (1)
Typically, step (3) takes a good amount of time in production
environments with terabytes or petabytes scale of data and keeping the
session alive from step (1) to (4) has overhead and it wastes the
resources. And the session can get closed for various reasons - idle
in session timeout, tcp/ip keepalive timeout, network problems etc.
All of these can render the backup useless.
What if the backup started by a session can also be closed by another
session? This seems to be achievable, if we can place the
backup_label, tablespace_map and other required session/backend level
contents in shared memory with the key as backup_label name. It's a
long way to go. The idea may be naive at this stage and there might be
something important that doesn't let us do the proposed solution. I
would like to hear more thoughts from the hackers.
Thanks to Sameer, Satya (cc-ed) for the offlist discussion.
Regards,
Bharath Rupireddy.