Hello.
At Wed, 11 Dec 2019 17:32:05 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
> While reviewing the first patch in Asif Rehman's series of patches for
> parallel backup over at
> http://postgr.es/m/CADM=Jeg3ZN+kPQpiSfeWCXr=xgpLrq4cBQE5ZviUxygKq3VqiA@mail.gmail.com
> I discovered that commit 7117685461af50f50c03f43e6a622284c8d54694
> introduced a use of cancel_before_shmem_exit which falsifies the
> comment for that function. So you can cause a spurious WARNING in the
> logs by doing something like this, with max_prepared_transactions set
> to a non-zero value:
>
> select pg_start_backup('one', false, false);
> begin;
> prepare transaction 'nothing';
> select pg_stop_backup(false);
> \q
>
> in the server log:
> WARNING: aborting backup due to backend exiting before pg_stop_backup
> was called
>
> And you can cause an assertion failure like this:
>
> select pg_start_backup('one', false, false);
> begin;
> prepare transaction 'nothing';
> select pg_stop_backup(false);
> select pg_start_backup('two');
> \q
>
> We've discussed before the idea that it might be good to remove the
> limitation that before_shmem_exit() can only remove the
> most-recently-added callback, which would be one way to fix this
> problem, but I'd like to propose an alternative solution which I think
> will work out more nicely for the patch mentioned above: don't use
> cancel_before_shmem_exit, and just leave the callback registered for
> the lifetime of the backend. That requires some adjustment of the
> callback, since it needs to tolerate exclusive backup mode being in
> effect.
>
> The attached patch takes that approach. Thoughts welcome on (1) the
> approach and (2) whether to back-patch. I think there's little doubt
> that this is formally a bug, but it's a pretty minor one.
The direction seems reasonable, but the patch doesn't free up the
before_shmem_exec slot nor avoid duplicate registration of the
callback. Actually before_shmem_exit_list gets bloat with multiple
do_pg_abort_backup entries through repeated rounds of non-exclusive
backups.
As the result, if one ends a session while a non-exclusive backup is
active after closing the previous non-exclusive backup,
do_pg_abort_backup aborts for assertion failure.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center