Thread: Move backup-related code to xlogbackup.c/.h

Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
Hi,

xlog.c currently has ~9000 LOC, out of which ~700 LOC is backup
related, making the file really unmanageable. The commit
7d708093b7400327658a30d1aa1d5e284d37622c added new files
xlogbackup.c/.h for hosting all backup related code eventually. I
propose to move all the backp related code from xlog.c and xlogfuncs.c
to xlogbackup.c/.h. In doing so, I had to add a few Get/Set functions
for XLogCtl variables so that xlogbackup.c can use them.

I'm attaching a patch set where 0001 and 0002 move backup code from
xlogfuncs.c and xlog.c to xlogbackup.c/.h respectively. The advantage
is that all the core's backup code is in one single file making it
more readable and manageable while reducing the xlog.c's file size.

Thoughts?

Thanks Michael Paquier for suggesting to have new files for backup related code.

[1] https://www.postgresql.org/message-id/CALj2ACX0wjo%2B49hbUmvc_zT1zwdqFOQyhorN0Ox-Rk6v97Nejw%40mail.gmail.com

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Nathan Bossart
Date:
On Wed, Sep 28, 2022 at 08:16:08PM +0530, Bharath Rupireddy wrote:
> In doing so, I had to add a few Get/Set functions
> for XLogCtl variables so that xlogbackup.c can use them.

I would suggest moving this to a separate prerequisite patch that can be
reviewed independently from the patches that simply move code to a
different file.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Move backup-related code to xlogbackup.c/.h

From
Michael Paquier
Date:
On Tue, Oct 04, 2022 at 03:54:20PM -0700, Nathan Bossart wrote:
> I would suggest moving this to a separate prerequisite patch that can be
> reviewed independently from the patches that simply move code to a
> different file.

And FWIW, the SQL interfaces for pg_backup_start() and
pg_backup_stop() could stay in xlogfuncs.c.  This has the advantage to
centralize in the same file all the SQL-function-specific checks.
--
Michael

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Wed, Oct 5, 2022 at 1:20 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Oct 04, 2022 at 03:54:20PM -0700, Nathan Bossart wrote:
> > I would suggest moving this to a separate prerequisite patch that can be
> > reviewed independently from the patches that simply move code to a
> > different file.

I added the new functions in 0001 patch for ease of review.

> And FWIW, the SQL interfaces for pg_backup_start() and
> pg_backup_stop() could stay in xlogfuncs.c.  This has the advantage to
> centralize in the same file all the SQL-function-specific checks.

Agreed.

+extern void WALInsertLockAcquire(void);
+extern void WALInsertLockAcquireExclusive(void);
+extern void WALInsertLockRelease(void);
+extern void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);

Note that I had moved all WAL insert lock related functions to xlog.h
despite xlogbackup.c using 2 of them. This is done to keep all the
functions together.

Please review the attached v2 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
On 2022-Oct-05, Michael Paquier wrote:

> And FWIW, the SQL interfaces for pg_backup_start() and
> pg_backup_stop() could stay in xlogfuncs.c.  This has the advantage to
> centralize in the same file all the SQL-function-specific checks.

As I recall, that has the disadvantage that the API exposure is a bit
higher -- I mean, with the patch as originally posted, there was less
cross-inclusion of header files, but that is gone in the version Bharat
posted as reply to this.  I'm not sure if that's caused by *this*
comment, or even that it's terribly significant, but it seems worth
considering at least.

xlog.h is included by a lot of stuff, so it would be great if it
itself included the smallest set of other files possible.

... that said, looking at the chart in
https://doxygen.postgresql.org//xlog_8h.html looks like the only file
we'd avoid indirectly including is pgtime.h (in addition to xlogbackup.h
itself).


(It's strange that xlog.h seems to have become included into rel.h by
commit 848ef42bb8c7 that did not otherwise touch either rel.h nor xlog.h.)

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)



Re: Move backup-related code to xlogbackup.c/.h

From
Nathan Bossart
Date:
On Wed, Oct 05, 2022 at 03:22:01PM +0530, Bharath Rupireddy wrote:
>> On Tue, Oct 04, 2022 at 03:54:20PM -0700, Nathan Bossart wrote:
>> > I would suggest moving this to a separate prerequisite patch that can be
>> > reviewed independently from the patches that simply move code to a
>> > different file.
> 
> I added the new functions in 0001 patch for ease of review.

Can we also replace the relevant code with calls to these functions in
0001?  That way, we can more easily review the changes you are making to
this code separately from the large patch that just moves the code.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Move backup-related code to xlogbackup.c/.h

From
Andres Freund
Date:
Hi,

On 2022-10-05 15:22:01 +0530, Bharath Rupireddy wrote:
> +extern void WALInsertLockAcquire(void);
> +extern void WALInsertLockAcquireExclusive(void);
> +extern void WALInsertLockRelease(void);
> +extern void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
> 
> Note that I had moved all WAL insert lock related functions to xlog.h
> despite xlogbackup.c using 2 of them. This is done to keep all the
> functions together.
> 
> Please review the attached v2 patch set.

I'm doubtful it's a good idea to expose these to outside of xlog.c - they are
very low level, and it's very easy to break stuff by using them wrongly. IMO,
if that's necessary, the split isn't right.

Greetings,

Andres Freund



Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Thu, Oct 6, 2022 at 4:50 AM Andres Freund <andres@anarazel.de> wrote:
>
> I'm doubtful it's a good idea to expose these to outside of xlog.c - they are
> very low level, and it's very easy to break stuff by using them wrongly.

Hm. Here's the v3 patch set without exposing WAL insert lock related
functions. Please have a look.

On Thu, Oct 6, 2022 at 4:22 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> Can we also replace the relevant code with calls to these functions in
> 0001?  That way, we can more easily review the changes you are making to
> this code separately from the large patch that just moves the code.

Done. Please have a look at 0001.

On Wed, Oct 5, 2022 at 11:28 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Oct-05, Michael Paquier wrote:
>
> > And FWIW, the SQL interfaces for pg_backup_start() and
> > pg_backup_stop() could stay in xlogfuncs.c.  This has the advantage to
> > centralize in the same file all the SQL-function-specific checks.
>
> As I recall, that has the disadvantage that the API exposure is a bit
> higher -- I mean, with the patch as originally posted, there was less
> cross-inclusion of header files, but that is gone in the version Bharat
> posted as reply to this.  I'm not sure if that's caused by *this*
> comment, or even that it's terribly significant, but it seems worth
> considering at least.

FWIW, I'm attaching 0003 patch for moving backup functions from
xlogfuncs.c to xlogbackup.c. It's natural to have them there when
we're moving backup related things, this also reduces backup code
footprint. We can leave xlogfuncs.c for WAL related SQL-callable
functions.

Please review the attached v3 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
On 2022-Oct-06, Bharath Rupireddy wrote:

> On Thu, Oct 6, 2022 at 4:50 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > I'm doubtful it's a good idea to expose these to outside of xlog.c - they are
> > very low level, and it's very easy to break stuff by using them wrongly.
> 
> Hm. Here's the v3 patch set without exposing WAL insert lock related
> functions. Please have a look.

Hmm, I don't like your 0001 very much.  This sort of thing:

+/*
+ * Get the ControlFile.
+ */
+ControlFileData *
+GetControlFile(void)
+{
+       return ControlFile;
+}

looks too easy to misuse; what about locking?  Also, isn't the addition
of ControlFile as a variable in do_pg_backup_start going to cause shadow
variable warnings?  Given the locking requirements, I think it would be
feasible to copy stuff out of ControlFile under lock, then return the
copies.


+/*
+ * Increment runningBackups and forcePageWrites.
+ *
+ * NOTE: This function is tailor-made for use in xlogbackup.c. It doesn't set
+ * the respective XLogCtl members directly, and acquires and releases locks.
+ * Hence be careful when using it elsewhere.
+ */
+void
+SetXLogBackupRelatedInfo(void)

I understand that naming is difficult, but I think "Set foo Related
Info" seems way too vague.  And the comment says "it doesn't set stuff
directly", and then it goes and sets stuff directly.  What gives?

You added some commentary that these functions are tailor-made for
internal operations, and then declared them in the most public header
function that xlog has?  I think at the bare minimum, these prototypes
should be in xlog_internal.h, not xlog.h.


I didn't look at 0002 and 0003 other than to notice that xlogbackup.h is
no longer removed from xlog.h.  So what is the point of all this?

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Wed, Oct 12, 2022 at 1:04 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> > Hm. Here's the v3 patch set without exposing WAL insert lock related
> > functions. Please have a look.
>
> Hmm, I don't like your 0001 very much.  This sort of thing:

Thanks for reviewing.

> +ControlFileData *
> +GetControlFile(void)
>
> looks too easy to misuse; what about locking?  Also, isn't the addition
> of ControlFile as a variable in do_pg_backup_start going to cause shadow
> variable warnings?  Given the locking requirements, I think it would be
> feasible to copy stuff out of ControlFile under lock, then return the
> copies.

+1. Done that way.

> +/*
> + * Increment runningBackups and forcePageWrites.
> + *
> + * NOTE: This function is tailor-made for use in xlogbackup.c. It doesn't set
> + * the respective XLogCtl members directly, and acquires and releases locks.
> + * Hence be careful when using it elsewhere.
> + */
> +void
> +SetXLogBackupRelatedInfo(void)
>
> I understand that naming is difficult, but I think "Set foo Related
> Info" seems way too vague.

I've used SetXLogBackupActivity() and ResetXLogBackupActivity()
because they match with the members that these functions deal with.

> And the comment says "it doesn't set stuff
> directly", and then it goes and sets stuff directly.  What gives?

My bad. That comment was meant for the reset function above. However,
I've removed it entirely now because one can look at the function and
infer that the forcePageWrites isn't set directly but only when
runningBackups is 0.

> You added some commentary that these functions are tailor-made for
> internal operations, and then declared them in the most public header
> function that xlog has?  I think at the bare minimum, these prototypes
> should be in xlog_internal.h, not xlog.h.

I removed such comments. These are functions used by xlogbackup.c to
call back into xlog.c similar to the call back functions defined in
xlog.h for xlogrecovery.c. And, most of the XLogCtl set/get sort of
function declarations are in xlog.h. So, I'm retaining them in xlog.h.

> I didn't look at 0002 and 0003 other than to notice that xlogbackup.h is
> no longer removed from xlog.h.  So what is the point of all this?

The whole idea is to move as much as possible backup related code to
xlogbackup.c/.h because xlog.c has already grown.

I've earlier moved macros BACKUP_LABEL_FILE, TABLESPACE_MAP to
xlogbackup.h, but I think they're good to stay in xlog.h as they're
being used in xlog.c and xlogrecovery.c. This reduces the xlogbackup.h
footprint a bit - we don't need xlogbackup.h in xlogrecovery.c.

Another reason we need xlogbackup.h in xlog.h is for
SessionBackupState and it needs to be set before we release WAL insert
locks, see the comment [1]. Well, for this reason, should we move all
xlogbackup.c callbacks for xlog.c to xlog_internal.h? Or should we
just remove the SessionBackupState enum and convert
SESSION_BACKUP_NONE and SESSION_BACKUP_RUNNING to just macros in
xlogbackup.h and use integer type to pass the state across? I don't
know what's better here. Thoughts?

I'm attaching the v4 patch set, please review it further.

[1]
     * You might think that WALInsertLockRelease() can be called before
     * cleaning up session-level lock because session-level lock doesn't need
     * to be protected with WAL insertion lock. But since
     * CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be
     * cleaned up before it.
     */

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
On 2022-Oct-13, Bharath Rupireddy wrote:

> On Wed, Oct 12, 2022 at 1:04 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> > You added some commentary that these functions are tailor-made for
> > internal operations, and then declared them in the most public header
> > function that xlog has?  I think at the bare minimum, these prototypes
> > should be in xlog_internal.h, not xlog.h.
> 
> I removed such comments. These are functions used by xlogbackup.c to
> call back into xlog.c similar to the call back functions defined in
> xlog.h for xlogrecovery.c. And, most of the XLogCtl set/get sort of
> function declarations are in xlog.h. So, I'm retaining them in xlog.h.

As I see it, xlog.h is a header that exports XLOG manipulations to the
outside world (everything that produces WAL, as well as stuff that
controls operation); xlog_internal is the header that exports xlog*.c
internal stuff for other xlog*.c files and specialized frontends to use.
These new functions are internal to xlogbackup.c and xlog.c, so IMO they
belong in xlog_internal.h.

Stuff that is used from xlog.c only by xlogrecovery.c should also appear
in xlog_internal.h only, not xlog.h, so I suggest not to take that as
precedent.  Also, that file (xlogrecovery.c) is pretty new so we haven't
had time to nail down the .h layout yet.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Thu, Oct 13, 2022 at 3:12 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> As I see it, xlog.h is a header that exports XLOG manipulations to the
> outside world (everything that produces WAL, as well as stuff that
> controls operation); xlog_internal is the header that exports xlog*.c
> internal stuff for other xlog*.c files and specialized frontends to use.
> These new functions are internal to xlogbackup.c and xlog.c, so IMO they
> belong in xlog_internal.h.
>
> Stuff that is used from xlog.c only by xlogrecovery.c should also appear
> in xlog_internal.h only, not xlog.h, so I suggest not to take that as
> precedent.  Also, that file (xlogrecovery.c) is pretty new so we haven't
> had time to nail down the .h layout yet.

Hm. Agree. But, that requires us to include xlogbackup.h in
xlog_internal.h for SessionBackupState enum in
ResetXLogBackupActivity(). Is that okay?

SessionBackupState and it needs to be set before we release WAL insert
locks, see the comment [1]. Should we just remove the
SessionBackupState enum and convert SESSION_BACKUP_NONE and
SESSION_BACKUP_RUNNING to just macros in xlogbackup.h and use integer
type to pass the state across? I don't know what's better here. Do you
have any thoughts on this?

[1]
     * You might think that WALInsertLockRelease() can be called before
     * cleaning up session-level lock because session-level lock doesn't need
     * to be protected with WAL insertion lock. But since
     * CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be
     * cleaned up before it.
     */

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
On 2022-Oct-13, Bharath Rupireddy wrote:

> Hm. Agree. But, that requires us to include xlogbackup.h in
> xlog_internal.h for SessionBackupState enum in
> ResetXLogBackupActivity(). Is that okay?

It's not great, but it's not *that* bad, ISTM, mainly because
xlog_internal.h will affect less stuff than xlog.h.

> SessionBackupState and it needs to be set before we release WAL insert
> locks, see the comment [1].

I see.  Maybe we could keep that enum in xlog.h, instead.

While looking at how that works: I think calling a local variable
"session_backup_state" is super confusing, seeing that we have a
file-global variable called sessionBackupState.  I recommend naming the
local "newstate" or something along those lines instead.

I wonder why does pg_backup_start_callback() not change the backup state
before your patch.  This seems a gratuitous difference, or is it?  If
you change that code so that it also sets the status to BACKUP_NONE,
then you can pass a bare SessionBackupState to ResetXLogBackupActivity
rather than a pointer to one, which is a very strange arrangement that
exists only so that you can have a third state (NULL) meaning "don't
change state" -- that looks quite weird.

Alternatively, if you don't want or can't change
pg_backup_start_callback to pass a valid state value, another solution
might be to pass a separate boolean "change state".

But I would look at having another patch before your series that changes
pg_backup_start_callback to make the code identical for the three
callers, then you can simplify the patched code.

> Should we just remove the
> SessionBackupState enum and convert SESSION_BACKUP_NONE and
> SESSION_BACKUP_RUNNING to just macros in xlogbackup.h and use integer
> type to pass the state across? I don't know what's better here. Do you
> have any thoughts on this?

No, please, no passing of unadorned magic numbers.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: Move backup-related code to xlogbackup.c/.h

From
Robert Haas
Date:
On Thu, Oct 13, 2022 at 7:13 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On 2022-Oct-13, Bharath Rupireddy wrote:
> > Hm. Agree. But, that requires us to include xlogbackup.h in
> > xlog_internal.h for SessionBackupState enum in
> > ResetXLogBackupActivity(). Is that okay?
>
> It's not great, but it's not *that* bad, ISTM, mainly because
> xlog_internal.h will affect less stuff than xlog.h.

This is unfortunately a lot less true than I would like. I count 75
places where we #include "access/xlog.h" and 53 where we #include
"access/xlog_internal.h". And many of those are in frontend code. I
feel like the contents of xlog_internal.h are a bit too eclectic.
Maybe stuff that has to do with the on-disk directory structure, like
XLOGDIR and XLOG_FNAME_LEN, as well as stuff that has to do with where
bytes are located, like XLByteToSeg, should move to another file.
Besides that, which is the biggest part of the file, there's also
stuff that has to do with the page and record format generally (like
XLOG_PAGE_MAGIC and SizeOfXLogShortPHD) and stuff that is used for
certain specific WAL record types (like xl_parameter_change and
xl_restore_point) and some random rmgr-related things (like RmgrData
and the stuff that folllows) and the usual assortment of random GUCs
and global variables (like RecoveryTargetAction and
ArchiveRecoveryRequested). Maybe it doesn't make sense to split this
up into a thousand tiny little header files, but I think some
rethinking would be a good idea, because it really doesn't make much
sense to me to mix stuff that has to do with file-naming conventions,
which a bunch of frontend code needs to know about, together with a
bunch of backend-only things.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Thu, Oct 13, 2022 at 4:43 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>

Thanks for reviewing.

> > Hm. Agree. But, that requires us to include xlogbackup.h in
> > xlog_internal.h for SessionBackupState enum in
> > ResetXLogBackupActivity(). Is that okay?
>
> It's not great, but it's not *that* bad, ISTM, mainly because
> xlog_internal.h will affect less stuff than xlog.h.

Moved them to xlog_internal.h without xlogbackup.h included, please see below.

> > SessionBackupState and it needs to be set before we release WAL insert
> > locks, see the comment [1].
>
> I see.  Maybe we could keep that enum in xlog.h, instead.

It's not required now, please see below.

> While looking at how that works: I think calling a local variable
> "session_backup_state" is super confusing, seeing that we have a
> file-global variable called sessionBackupState.  I recommend naming the
> local "newstate" or something along those lines instead.
>
> I wonder why does pg_backup_start_callback() not change the backup state
> before your patch.  This seems a gratuitous difference, or is it?  If
> you change that code so that it also sets the status to BACKUP_NONE,
> then you can pass a bare SessionBackupState to ResetXLogBackupActivity
> rather than a pointer to one, which is a very strange arrangement that
> exists only so that you can have a third state (NULL) meaning "don't
> change state" -- that looks quite weird.
>
> Alternatively, if you don't want or can't change
> pg_backup_start_callback to pass a valid state value, another solution
> might be to pass a separate boolean "change state".
>
> But I would look at having another patch before your series that changes
> pg_backup_start_callback to make the code identical for the three
> callers, then you can simplify the patched code.

The pg_backup_start_callback() can just go ahead and reset
sessionBackupState. However, it leads us to the complete removal of
pg_backup_start_callback() itself and use do_pg_abort_backup()
consistently across, saving 20 LOC attached as v5-0001.

With this, the other patches would get simplified a bit too,
xlogbackup.h footprint got reduced now.

Please find the v5 patch-set. 0002-0004 moves the backup code to
xlogbackup.c/.h.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
On 2022-Oct-13, Bharath Rupireddy wrote:

> The pg_backup_start_callback() can just go ahead and reset
> sessionBackupState. However, it leads us to the complete removal of
> pg_backup_start_callback() itself and use do_pg_abort_backup()
> consistently across, saving 20 LOC attached as v5-0001.

OK, that's not bad -- but there is a fatal flaw here: do_pg_backup_start
only sets sessionBackupState *after* it has finished setting things up,
so if you only change it like this, do_pg_abort_backup will indeed run,
but it'll do nothing because it hits the "quick exit" test.  Therefore,
if a backup aborts while setting up, you'll keep running with forced
page writes until next postmaster crash or restart.  Not good.

ISTM we need to give another flag to the callback function besides
emit_warning: one that says whether to test sessionBackupState or not.
I suppose the easiest way to do it with no other changes is to turn
'arg' into a bitmask.
But alternatively, we could just remove emit_warning as a flag and have
the warning be emitted always; then we can use the boolean for the other
purpose.  I don't think the extra WARNING thrown during backup set-up is
going to be a problem, since it will mostly never be seen anyway (and if
you do see it, it's not a lie.)

However, what's most problematic about this patch is that it introduces
a pretty serious bug, yet that bug goes unnoticed if you just run the
builtin test suites.  I only noticed because I added an elog(ERROR,
"oops") in the area protected by ENSURE_ERROR_CLEANUP and a debug
elog(WARNING) in the cleanup area, then examined the server log after
the pg_basebackup test filed; but this is not very workable.  I wonder
what would be a good way to keep this in check.  The naive way seems to
be to run a pg_basebackup, have it abort partway through (how?), then
test the server and see if forced page writes are enabled or not.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"The problem with the facetime model is not just that it's demoralizing, but
that the people pretending to work interrupt the ones actually working."
                                                           (Paul Graham)



Re: Move backup-related code to xlogbackup.c/.h

From
Michael Paquier
Date:
On Fri, Oct 14, 2022 at 10:24:41AM +0200, Alvaro Herrera wrote:
> However, what's most problematic about this patch is that it introduces
> a pretty serious bug, yet that bug goes unnoticed if you just run the
> builtin test suites.  I only noticed because I added an elog(ERROR,
> "oops") in the area protected by ENSURE_ERROR_CLEANUP and a debug
> elog(WARNING) in the cleanup area, then examined the server log after
> the pg_basebackup test filed; but this is not very workable.  I wonder
> what would be a good way to keep this in check.  The naive way seems to
> be to run a pg_basebackup, have it abort partway through (how?), then
> test the server and see if forced page writes are enabled or not.

See around the bottom of 010_pg_basebackup.pl, where a combination of
IPC::Run::start('pg_basebackup') with --max-rate and
pg_terminate_backend() is able to achieve that.
--
Michael

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Fri, Oct 14, 2022 at 1:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Oct-13, Bharath Rupireddy wrote:
>
> > The pg_backup_start_callback() can just go ahead and reset
> > sessionBackupState. However, it leads us to the complete removal of
> > pg_backup_start_callback() itself and use do_pg_abort_backup()
> > consistently across, saving 20 LOC attached as v5-0001.
>
> OK, that's not bad -- but there is a fatal flaw here: do_pg_backup_start
> only sets sessionBackupState *after* it has finished setting things up,
> so if you only change it like this, do_pg_abort_backup will indeed run,
> but it'll do nothing because it hits the "quick exit" test.  Therefore,
> if a backup aborts while setting up, you'll keep running with forced
> page writes until next postmaster crash or restart.  Not good.

Ugh.

> ISTM we need to give another flag to the callback function besides
> emit_warning: one that says whether to test sessionBackupState or not.

I think this needs a new structure, something like below, which makes
things complex.
typedef struct pg_abort_backup_params
{
    /* This tells whether or not the do_pg_abort_backup callback can
quickly exit. */
    bool    can_quick_exit;
    /* This tells whether or not the do_pg_abort_backup callback can
emit a warning. */
    bool    emit_warning;
} pg_abort_backup_params;

> I suppose the easiest way to do it with no other changes is to turn
> 'arg' into a bitmask.

This one too isn't good IMO.

> But alternatively, we could just remove emit_warning as a flag and have
> the warning be emitted always; then we can use the boolean for the other
> purpose.  I don't think the extra WARNING thrown during backup set-up is
> going to be a problem, since it will mostly never be seen anyway (and if
> you do see it, it's not a lie.)

+1 for this.

Please review the v6 patch-set further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
On 2022-Oct-14, Bharath Rupireddy wrote:

> On Fri, Oct 14, 2022 at 1:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > But alternatively, we could just remove emit_warning as a flag and have
> > the warning be emitted always; then we can use the boolean for the other
> > purpose.  I don't think the extra WARNING thrown during backup set-up is
> > going to be a problem, since it will mostly never be seen anyway (and if
> > you do see it, it's not a lie.)
> 
> +1 for this.

OK, pushed 0001, but I modified it some more, because the flag is not
really a "quick exit" optimization but actually critical for
correctness; so I reworked the function to have an if block around it
rather than an early return, and I added an assert about the flag and
session backup state.  CI was green for it and on manual testing it
seems to work correctly.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No es bueno caminar con un hombre muerto"



Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
Another point before we move on with your 0002 is that forcePageWrites
is no longer useful and we can remove it, as per the attached.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Wed, Oct 19, 2022 at 2:30 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Another point before we move on with your 0002 is that forcePageWrites
> is no longer useful and we can remove it, as per the attached.

+1. The following comment enables us to rely on runningBackups and get
rid of forcePageWrites completely.

     * in progress. forcePageWrites is set to true when runningBackups is
     * non-zero. lastBackupStart is the latest checkpoint redo location used

When the standby is in recovery, calls to XLogInsertRecord() or
AdvanceXLInsertBuffer()) where forcePageWrites is being used, won't
happen, no?

      * Note that forcePageWrites has no effect during an online backup from
-     * the standby.
+     * the standby.   XXX what does this mean??

I removed the 2 more instances of forcePageWrites left-out and tweaked
the comments a little and attached 0002.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
On 2022-Oct-19, Bharath Rupireddy wrote:

> When the standby is in recovery, calls to XLogInsertRecord() or
> AdvanceXLInsertBuffer()) where forcePageWrites is being used, won't
> happen, no?
> 
>       * Note that forcePageWrites has no effect during an online backup from
> -     * the standby.
> +     * the standby.   XXX what does this mean??

Well, yes, but when looking at this comment I wonder why do I *care*
about this point.  I left this comment has you changed it, but I wonder
if we shouldn't just remove it.

> I removed the 2 more instances of forcePageWrites left-out and tweaked
> the comments a little and attached 0002.

Thanks for looking.  Pushed now.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"People get annoyed when you try to debug them."  (Larry Wall)



Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Wed, Oct 19, 2022 at 4:23 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Oct-19, Bharath Rupireddy wrote:
>
> > When the standby is in recovery, calls to XLogInsertRecord() or
> > AdvanceXLInsertBuffer()) where forcePageWrites is being used, won't
> > happen, no?
> >
> >       * Note that forcePageWrites has no effect during an online backup from
> > -     * the standby.
> > +     * the standby.   XXX what does this mean??
>
> Well, yes, but when looking at this comment I wonder why do I *care*
> about this point.  I left this comment has you changed it, but I wonder
> if we shouldn't just remove it.

Well,retaining that comment does no harm IMO.

> > I removed the 2 more instances of forcePageWrites left-out and tweaked
> > the comments a little and attached 0002.
>
> Thanks for looking.  Pushed now.

Thanks. I will rebase and post the other patches soon.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Wed, Oct 19, 2022 at 4:26 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Oct 19, 2022 at 4:23 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> > Thanks for looking.  Pushed now.
>
> Thanks. I will rebase and post the other patches soon.

Please review the attached v7 patch-set.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Alvaro Herrera
Date:
0001 seems mostly OK, but I don't like some of these new function names.
I see you've named them so that they are case-consistent with the name
of the struct member that they affect, but I don't think that's a good
criterion.  I propose

SetrunningBackups -> XLogBackupSetRunning()
ResetXLogBackupActivity -> XLogBackupNotRunning()
// or maybe SetNotRunning, or ResetRunning?  I prefer the one above
SetlastBackupStart -> XLogBackupSetLastStart()

GetlastFpwDisableRecPtr -> XLogGetLastFPWDisableRecptr()
GetminRecoveryPoint -> XLogGetMinRecoveryPoint()

I wouldn't say in the xlog_internal.h comment that these new functions
are for xlogbackup.c to use.  The API definition doesn't have to concern
itself with that.  Maybe one day xlogrecovery.c or some other xlog*.c
would like to call those functions, and then the comment becomes a lie;
and what for?


0002 is where the interesting stuff happens.  I have not reviewed that
part with any care, but it appears that set_backup_state is pretty much
useless.  Let's get rid of it instead of moving it.  Which also means
that we shouldn't introduce reset_backup_status in 0001, I suppose.
I think xlogfuncs.c is content with having just get_backup_status().

Speaking of which -- I'm not sure we really want to do 0003.
xlogfuncs.c is not a big file, the functions are not complex, and there
are no interesting interactions in those functions with the internals
(other than get_backup_status).  I see that Michael advised the same.
I propose we keep those functions where they are.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"I suspect most samba developers are already technically insane...
Of course, since many of them are Australians, you can't tell." (L. Torvalds)



Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Wed, Oct 19, 2022 at 6:30 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> 0001 seems mostly OK, but I don't like some of these new function names.
> I see you've named them so that they are case-consistent with the name
> of the struct member that they affect, but I don't think that's a good
> criterion.  I propose
>
> SetrunningBackups -> XLogBackupSetRunning()
> ResetXLogBackupActivity -> XLogBackupNotRunning()
> // or maybe SetNotRunning, or ResetRunning?  I prefer the one above
> SetlastBackupStart -> XLogBackupSetLastStart()
>
> GetlastFpwDisableRecPtr -> XLogGetLastFPWDisableRecptr()
> GetminRecoveryPoint -> XLogGetMinRecoveryPoint()

XLogBackupResetRunning() seemed better. +1 for above function names.

> I wouldn't say in the xlog_internal.h comment that these new functions
> are for xlogbackup.c to use.  The API definition doesn't have to concern
> itself with that.  Maybe one day xlogrecovery.c or some other xlog*.c
> would like to call those functions, and then the comment becomes a lie;
> and what for?

Removed.

> 0002 is where the interesting stuff happens.  I have not reviewed that
> part with any care, but it appears that set_backup_state is pretty much
> useless.  Let's get rid of it instead of moving it.  Which also means
> that we shouldn't introduce reset_backup_status in 0001, I suppose.
> I think xlogfuncs.c is content with having just get_backup_status().

There's no set_backup_state() at all. We need get_backup_status() for
xlogfuncs.c and basebackup.c and we need reset_backup_status() for
XLogBackupResetRunning() sitting in xlog.c.

> Speaking of which -- I'm not sure we really want to do 0003.
> xlogfuncs.c is not a big file, the functions are not complex, and there
> are no interesting interactions in those functions with the internals
> (other than get_backup_status).  I see that Michael advised the same.
> I propose we keep those functions where they are.

I'm okay either way.

Please see the attached v8 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Michael Paquier
Date:
On Wed, Oct 19, 2022 at 09:07:04PM +0530, Bharath Rupireddy wrote:
> XLogBackupResetRunning() seemed better. +1 for above function names.

I see what you are doing here.  XLogCtl would still live in xlog.c,
but we want to have functions that are able to manipulate some of its
fields.  I am not sure to like that much because it introduces a
circling dependency between xlog.c and xlogbackup.c.  As of HEAD,
xlog.c calls build_backup_content() from xlogbackup.c, which is fine
as xlog.c is kind of a central piece that feeds on the insert and
recovery pieces.  However your patch makes some code paths of
xlogbackup.c call routines from xlog.c, and I don't think that we
should do that.

> I'm okay either way.
>
> Please see the attached v8 patch set.

Among all that, CleanupBackupHistory() is different, still it has a
dependency with some of the archiving pieces..
--
Michael

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
Bharath Rupireddy
Date:
On Mon, Oct 24, 2022 at 1:00 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Oct 19, 2022 at 09:07:04PM +0530, Bharath Rupireddy wrote:
> > XLogBackupResetRunning() seemed better. +1 for above function names.
>
> I see what you are doing here.  XLogCtl would still live in xlog.c,
> but we want to have functions that are able to manipulate some of its
> fields.

Right.

> I am not sure to like that much because it introduces a
> circling dependency between xlog.c and xlogbackup.c.  As of HEAD,
> xlog.c calls build_backup_content() from xlogbackup.c, which is fine
> as xlog.c is kind of a central piece that feeds on the insert and
> recovery pieces.  However your patch makes some code paths of
> xlogbackup.c call routines from xlog.c, and I don't think that we
> should do that.

If you're talking about header file dependency, there's already header
file dependency between them - xlog.c includes xlogbackup.h for
build_backup_content() and xlogbackup.c includes xlog.h for
wal_segment_size. And, I think the same kind of dependency exists
between xlog.c and xlogrecovery.c.

Please note that we're trying to reduce xlog.c file size apart from
centralizing backup related code.

> > I'm okay either way.
> >
> > Please see the attached v8 patch set.
>
> Among all that, CleanupBackupHistory() is different, still it has a
> dependency with some of the archiving pieces..

Is there a problem with that? This function is used solely by backup
functions and it happens to use one of the archiving utility
functions. Please see the other archiving utility functions being used
elsewhere in the code, not only in xlog.c  -
for instance, KeepFileRestoredFromArchive() and XLogArchiveNotify().

I'm attaching the v9 patch set herewith after rebasing. Please review
it further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Move backup-related code to xlogbackup.c/.h

From
"Gregory Stark (as CFM)"
Date:
On Wed, 26 Oct 2022 at 02:08, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> I'm attaching the v9 patch set herewith after rebasing. Please review
> it further.

It looks like neither reviewer has been really convinced this is the
direction they want to go and I think that's why the thread has been
pretty dead since last October. I think people are pretty hesitant to
give bad news but I don't think we're doing you any favours having you
rebasing and rebasing and trying to justify specific code changes when
it looks like people are skeptical about the basic approach.

So I'm going to mark this Rejected for now. Perhaps a fresh approach
next release cycle starting with a discussion of the specific goals
rather than starting with a patch would be better.

-- 
Gregory Stark
As Commitfest Manager