Re: refactoring basebackup.c - Mailing list pgsql-hackers

From Jeevan Ladhe
Subject Re: refactoring basebackup.c
Date
Msg-id CAOgcT0P0OGFfNhCKRUyV-rGMXk+JnsYUKGF3tPYhog2c1EQAmg@mail.gmail.com
Whole thread Raw
In response to Re: refactoring basebackup.c  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: refactoring basebackup.c  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Thanks, Robert for the patches.

I tried to take a backup using gzip compression and got a core.

$ pg_basebackup -t server:/tmp/data_gzip -Xnone --server-compression=gzip
NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.


The backtrace:

gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000558264bfc40a in bbsink_cleanup (sink=0x55826684b5f8) at ../../../src/include/replication/basebackup_sink.h:268
#2  0x0000558264bfc838 in bbsink_forward_cleanup (sink=0x55826684b710) at basebackup_sink.c:124
#3  0x0000558264bf4cab in bbsink_cleanup (sink=0x55826684b710) at ../../../src/include/replication/basebackup_sink.h:268
#4  0x0000558264bf7738 in SendBaseBackup (cmd=0x55826683bd10) at basebackup.c:1020
#5  0x0000558264c10915 in exec_replication_command (
    cmd_string=0x5582667bc580 "BASE_BACKUP ( LABEL 'pg_basebackup base backup',  PROGRESS,  MANIFEST 'yes',  TABLESPACE_MAP,  TARGET 'server',  TARGET_DETAIL '/tmp/data_g
zip',  COMPRESSION 'gzip')") at walsender.c:1731
#6  0x0000558264c8a69b in PostgresMain (dbname=0x5582667e84d8 "", username=0x5582667e84b8 "hadoop") at postgres.c:4493
#7  0x0000558264bb10a6 in BackendRun (port=0x5582667de160) at postmaster.c:4560
#8  0x0000558264bb098b in BackendStartup (port=0x5582667de160) at postmaster.c:4288
#9  0x0000558264bacb55 in ServerLoop () at postmaster.c:1801
#10 0x0000558264bac2ee in PostmasterMain (argc=3, argv=0x5582667b68c0) at postmaster.c:1473
#11 0x0000558264aa0950 in main (argc=3, argv=0x5582667b68c0) at main.c:198


bbsink_gzip_ops have the cleanup() callback set to NULL, and when the
bbsink_cleanup() callback is triggered, it tries to invoke a function that
is NULL. I think either bbsink_gzip_ops should set the cleanup callback
to bbsink_forward_cleanup or we should be calling the cleanup() callback
from PG_CATCH instead of PG_FINALLY()? But in the latter case, even if
we call from PG_CATCH, it will have a similar problem for gzip and other
sinks which may not need a custom cleanup() callback in case there is any
error before the backup could finish up normally.

I have attached a patch to fix this.

Thoughts?

Regards,
Jeevan Ladhe

On Tue, Oct 26, 2021 at 1:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Oct 15, 2021 at 8:05 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > You mean the way gzip allows us to use our own alloc and free functions
> > by means of providing the function pointers for them. Unfortunately,
> > no, LZ4 does not have that kind of provision. Maybe that makes a
> > good proposal for LZ4 library ;-).
> > I cannot think of another solution to it right away.
>
> OK. Will give it some thought.

Here's a new patch set. I've tried adding a "cleanup" callback to the
bbsink method and ensuring that it gets called even in case of an
error. The code for that is untested since I have no use for it with
the existing basebackup sink types, so let me know how it goes when
you try to use it for LZ4.

I've also added documentation for the new pg_basebackup options in
this version, and I fixed up a couple of these patches to be
pgindent-clean when they previously were not.

--
Robert Haas
EDB: http://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: when the startup process doesn't (logging startup delays)
Next
From: Nitin Jadhav
Date:
Subject: Re: when the startup process doesn't (logging startup delays)