Thread: WAL Archiving and base backup
Hello all, I've been upgrading some of our PostgreSQL instances from 9.x to 12.x and we're using a master/slave setup based on continuous archiving. I can read on https://www.postgresql.org/docs/12/continuous-archiving.html 25.3.3.2. Making an Exclusive Low-Level Backup that the exclusive method is deprecated. But then this leads me to the following problem: - Non-Exclusive Low-Level Backup requires that the functions pg_start_backup() and pg_stop_backup() be called in the *same* connection to the database - we use scripts to automate the creation of the backup and it's deployment on the slave, after a server maintenance for example - the example given in 25.3.6.1. Standalone Hot Backups still uses the exclusive (deprecated) method So I have this question, how to script the making of base backup for transfer on the slave server when the two SQL functions must be called in the same connection, in Bash for example; is this doable? Thanks for any pointers. Regards -- Issa
So I have this question, how to script the making of base backup for
transfer on the slave server when the two SQL functions must be called
in the same connection, in Bash for example; is this doable?
pg () {
local sql exp
sql="$1"
[ "$2" ] && { sql="$2"; exp="$1"; }
echo "$sql" >&${COPROC[1]}
read -u ${COPROC[0]} || return
[ "$exp" ] || return 0
[ "$REPLY" = "$exp" ] || return 64
return 0
}
pg BEGIN 'BEGIN;'
This sends a BEGIN command and expects the word BEGIN as reply.
if pg 'LOCK TABLE' '
LOCK TABLE some_table
IN SHARE ROW EXCLUSIVE MODE NOWAIT;
';
then
:
elif (( $? > 128 )); then # read timeout exceeded
die "Cannot lock some_table";
else
die "Unexpected error while locking some_table";
fi
unset TMOUT
pg '
SELECT coalesce(min(id), -1)
, coalesce(max(id), -1)
FROM some_table'\;
Now we want to read some data. So, TMOUT is unset. The REPLY variable will have the answer.
IFS=$'\t' read mn mx <<<"$REPLY"
And send \q to finish psql. If "set -e" mode is active, make sure to negate the result.
# expecting read to fail after \q. Hence the negation.
! pg '\q'
Thx a lot. I thought about it but was not so sure about having a complex script (compared to the very simple version when using the exclusive backup - but this this is deprecated...). I will test your option with the simpler version and post it back to it can maybe land in PostgreSQL documentation. Regards
Greetings, * Issa Gorissen (issa-gorissen@usa.net) wrote: > Thx a lot. I thought about it but was not so sure about having a complex > script (compared to the very simple version when using the exclusive backup > - but this this is deprecated...). > > I will test your option with the simpler version and post it back to it can > maybe land in PostgreSQL documentation. The PG docs show how the command works and that's it. The commands in the docs aren't intended to be actually used in production environments. Writing a full solution involves having a good understanding of the PG code and how WAL archiving, backups, et al, are done. If you're not familiar with this portion of the PG code base, I'd strongly suggest you look at using solutions written and maintained by folks who are. Trying to write documentation on how to develop a complete solution would be quite an effort and would certainly go beyond bash scripting and likely wouldn't end up getting used anyway- those who are developing such solutions are already reading through the actual code. Thanks, Stephen
Attachment
On 1/14/22 12:31 PM, Stephen Frost wrote: > Greetings, > > * Issa Gorissen (issa-gorissen@usa.net) wrote: >> Thx a lot. I thought about it but was not so sure about having a complex >> script (compared to the very simple version when using the exclusive backup >> - but this this is deprecated...). >> >> I will test your option with the simpler version and post it back to it can >> maybe land in PostgreSQL documentation. > The PG docs show how the command works and that's it. The commands > in the docs aren't intended to be actually used in production > environments. Writing a full solution involves having a good > understanding of the PG code and how WAL archiving, backups, et al, are > done. If you're not familiar with this portion of the PG code base, I'd > strongly suggest you look at using solutions written and maintained by > folks who are. Needing to read the PG source code to write a workable PITR recovery solution is a serious flaw in PG documentation (and why I use PgBackRest). The documentation of two other RDBMSs that I've worked with (Rdb/VMS and SQL Server) are perfectly clear on how to do such backups and restores with relatively small amounts of scripting. > Trying to write documentation on how to develop a complete solution > would be quite an effort and would certainly go beyond bash scripting > and likely wouldn't end up getting used anyway- those who are developing > such solutions are already reading through the actual code. > > Thanks, > > Stephen -- Angular momentum makes the world go 'round.
Greetings, * Ron (ronljohnsonjr@gmail.com) wrote: > On 1/14/22 12:31 PM, Stephen Frost wrote: > >* Issa Gorissen (issa-gorissen@usa.net) wrote: > >>Thx a lot. I thought about it but was not so sure about having a complex > >>script (compared to the very simple version when using the exclusive backup > >>- but this this is deprecated...). > >> > >>I will test your option with the simpler version and post it back to it can > >>maybe land in PostgreSQL documentation. > >The PG docs show how the command works and that's it. The commands > >in the docs aren't intended to be actually used in production > >environments. Writing a full solution involves having a good > >understanding of the PG code and how WAL archiving, backups, et al, are > >done. If you're not familiar with this portion of the PG code base, I'd > >strongly suggest you look at using solutions written and maintained by > >folks who are. > > Needing to read the PG source code to write a workable PITR recovery > solution is a serious flaw in PG documentation (and why I use PgBackRest). I disagree that it's a flaw in the documentation- it's an unfortunate reality of the current core code. We shouldn't be trying to provide documentation around how to write a tool like pgbackrest, we should, instead, have a tool like pgbackrest in core with its own documentation, as most other RDBMS's do. > The documentation of two other RDBMSs that I've worked with (Rdb/VMS and SQL > Server) are perfectly clear on how to do such backups and restores with > relatively small amounts of scripting. ... using tools which are purpose built to the task, no? Thanks, Stephen
Attachment
On 1/14/22 1:40 PM, Stephen Frost wrote: [snip] > We shouldn't be trying to provide > documentation around how to write a tool like pgbackrest, we should, > instead, have a tool like pgbackrest in core with its own documentation, > as most other RDBMS's do. That's an excellent solution to this problem. -- Angular momentum makes the world go 'round.
On 1/14/22 1:40 PM, Stephen Frost wrote:
[snip]
> We shouldn't be trying to provide
> documentation around how to write a tool like pgbackrest, we should,
> instead, have a tool like pgbackrest in core with its own documentation,
> as most other RDBMS's do.
That's an excellent solution to this problem.
>On 1/14/22 1:40 PM, Stephen Frost wrote: >snip] >> We shouldn't be trying to provide >> documentation around how to write a tool like pgbackrest, we should, >> instead, have a tool like pgbackrest in core with its own documentation, >> as most other RDBMS's do. >That's an excellent solution to this problem. While I know the reasons for not having something like pgbackrest in core, I think this is a major drawback for the project.People coming from commercial systems are used to have that tooling included by default. +1 for having that in core Regards Daniel
On 1/14/22 13:04, Daniel Westermann (DWE) wrote: >> On 1/14/22 1:40 PM, Stephen Frost wrote: >> snip] >>> We shouldn't be trying to provide >>> documentation around how to write a tool like pgbackrest, we should, >>> instead, have a tool like pgbackrest in core with its own documentation, >>> as most other RDBMS's do. > >> That's an excellent solution to this problem. > > While I know the reasons for not having something like pgbackrest in core, I think this is a major drawback for the project.People coming from commercial systems are used to have that tooling included by default. They are also used to paying a big whopping bill, which is why they go looking for alternatives. So like most things is it a tradeoff, sweat equity vs hard cash. > > +1 for having that in core > > Regards > Daniel > -- Adrian Klaver adrian.klaver@aklaver.com
On 1/14/22 3:42 PM, Adrian Klaver wrote: > On 1/14/22 13:04, Daniel Westermann (DWE) wrote: >>> On 1/14/22 1:40 PM, Stephen Frost wrote: >>> snip] >>>> We shouldn't be trying to provide >>>> documentation around how to write a tool like pgbackrest, we should, >>>> instead, have a tool like pgbackrest in core with its own documentation, >>>> as most other RDBMS's do. >> >>> That's an excellent solution to this problem. >> >> While I know the reasons for not having something like pgbackrest in >> core, I think this is a major drawback for the project. People coming >> from commercial systems are used to have that tooling included by default. > > They are also used to paying a big whopping bill, which is why they go > looking for alternatives. So like most things is it a tradeoff, sweat > equity vs hard cash. David Johnston's comment about the docs saying "use pgbackrest or barman, etc" is a good compromise. -- Angular momentum makes the world go 'round.
I still don't really understand what is so great about it. About its only redeeming feature is a declaration that "it is in core" and that newcomers can just default to it without thinking. I'd rather just play favorites and write "use pgbackrest" in our documentation. Or some hybrid approach where we don't just pick one but instead guide people to the community solutions that are out there. I don't think I really want the people responsible for core to spend time on writing end-user backup tooling. Their time is much more valuably spent working on the core product.David J.
Well, the "without thinking" part of your post can be rephrased as "ease of use". Do database administrators really need to think about which backup software to use? What kind of knowledge will such an evaluation provide? All commercial databases have some form of backup software included into the core database. After all, backup and restore are extremely important functions which IMHO should be provided along with the database software.
-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
On 1/15/22 16:23, Mladen Gogala wrote: > On 1/14/22 16:00, David G. Johnston wrote: >> I still don't really understand what is so great about it. About its >> only redeeming feature is a declaration that "it is in core" and that >> newcomers can just default to it without thinking. I'd rather just >> play favorites and write "use pgbackrest" in our documentation. Or >> some hybrid approach where we don't just pick one but instead guide >> people to the community solutions that are out there. I don't think I >> really want the people responsible for core to spend time on writing >> end-user backup tooling. Their time is much more valuably spent >> working on the core product. >> >> David J. >> > Well, the "without thinking" part of your post can be rephrased as "ease > of use". Do database administrators really need to think about which > backup software to use? What kind of knowledge will such an evaluation > provide? All commercial databases have some form of backup software > included into the core database. After all, backup and restore are > extremely important functions which IMHO should be provided along with > the database software. That is easy to say the reality is: Previous CommitFest: https://commitfest.postgresql.org/35/ Note the number of 'Moved to next CF' Current CommitFest ending 1/31/2022: https://commitfest.postgresql.org/36/ Note number of 'Needs review' This is for the current Core software. Adding more to that will just add to the above 'Moved to next CF' and 'Needs review'. > > > -- > Mladen Gogala > Database Consultant > Tel: (347) 321-1217 > https://dbwhisperer.wordpress.com > -- Adrian Klaver adrian.klaver@aklaver.com
On 1/14/22 16:00, David G. Johnston wrote:I still don't really understand what is so great about it. About its only redeeming feature is a declaration that "it is in core" and that newcomers can just default to it without thinking. I'd rather just play favorites and write "use pgbackrest" in our documentation. Or some hybrid approach where we don't just pick one but instead guide people to the community solutions that are out there. I don't think I really want the people responsible for core to spend time on writing end-user backup tooling. Their time is much more valuably spent working on the core product.David J.Well, the "without thinking" part of your post can be rephrased as "ease of use". Do database administrators really need to think about which backup software to use? What kind of knowledge will such an evaluation provide? All commercial databases have some form of backup software included into the core database. After all, backup and restore are extremely important functions which IMHO should be provided along with the database software.
Greetings,
* Issa Gorissen (issa-gorissen@usa.net) wrote:
> Thx a lot. I thought about it but was not so sure about having a complex
> script (compared to the very simple version when using the exclusive backup
> - but this this is deprecated...).
>
> I will test your option with the simpler version and post it back to it can
> maybe land in PostgreSQL documentation.
The PG docs show how the command works and that's it. The commands
in the docs aren't intended to be actually used in production
environments. Writing a full solution involves having a good
understanding of the PG code and how WAL archiving, backups, et al, are
done.
On 1/16/22 01:23, Mladen Gogala wrote: > On 1/14/22 16:00, David G. Johnston wrote: >> I still don't really understand what is so great about it. About its >> only redeeming feature is a declaration that "it is in core" and that >> newcomers can just default to it without thinking. I'd rather just >> play favorites and write "use pgbackrest" in our documentation. Or >> some hybrid approach where we don't just pick one but instead guide >> people to the community solutions that are out there. I don't think I >> really want the people responsible for core to spend time on writing >> end-user backup tooling. Their time is much more valuably spent >> working on the core product. >> >> David J. >> > Well, the "without thinking" part of your post can be rephrased as "ease > of use". Do database administrators really need to think about which > backup software to use? What kind of knowledge will such an evaluation > provide? All commercial databases have some form of backup software > included into the core database. After all, backup and restore are > extremely important functions which IMHO should be provided along with > the database software. > > > -- > Mladen Gogala > Database Consultant > Tel: (347) 321-1217 > https://dbwhisperer.wordpress.com > Just to avoid any misunderstanding. I am perfectly happy using the backup/restore with pg_dump and we use it for at least a decade with success for our need of backups. My question is really in the context of WAL archiving and preparing a slave instance with a base 'non-exclusive' backup from a script (to be precise, on reboot of the server running the slave instance) since the exclusive way is deprecated. Thx for all your valuable comments -- Issa
On 1/16/22 13:12, Issa Gorissen wrote: > > Just to avoid any misunderstanding. I am perfectly happy using the > backup/restore with pg_dump and we use it for at least a decade with > success for our need of backups. I am using pgbackrest. I used to use pg_rman, being well acquainted with rman backup tool belonging to another database system, but pg_rman was removing WAL archives before they were delivered to replica. I have replication slots from the OLTP database to the DW database with partitions, hash and bloom indexes and massive amount of processors, for parallel query. I need the replication to work. In addition to that, pgbackrest can employ parallelism while pg_rman cannot do that. I find pgbackrest by far the best Postgres backup system. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Greetings, * David G. Johnston (david.g.johnston@gmail.com) wrote: > On Fri, Jan 14, 2022 at 11:31 AM Stephen Frost <sfrost@snowman.net> wrote: > > * Issa Gorissen (issa-gorissen@usa.net) wrote: > > > Thx a lot. I thought about it but was not so sure about having a complex > > > script (compared to the very simple version when using the exclusive > > backup > > > - but this this is deprecated...). > > > > > > I will test your option with the simpler version and post it back to it > > can > > > maybe land in PostgreSQL documentation. > > > > The PG docs show how the command works and that's it. The commands > > in the docs aren't intended to be actually used in production > > environments. Writing a full solution involves having a good > > understanding of the PG code and how WAL archiving, backups, et al, are > > done. > > For all my suggestions of "use third-party where possible" I do think that > we should have, probably as part of pg_basebackup, a mode that performs a > filesystem copy of WAL to an archive location and verifies that the archive > is valid otherwise it write to the PostgreSQL log that there was a > problem. pg_basebackup should either be able to install that command (with > maybe some CLI prompts for settings or something) using ALTER SYSTEM. Our > documentation can then demonstrate the example usage of the archive_command > parameters as being a generic form of that command. We already have pg_receivewal, which is part of pg_basebackup, and is able to use a slot and such. I'm not sure that making pg_basebackup somehow also work as an archive command makes much sense- it's really intended to be a tool that's used remotely and that isn't going to work when being called out of archive_command. That is, if I'm on system B and doing pg_basebackup against system A, all of which happens using the PG replication protocol, how would pg_basebackup called out of archive_command be able to get the WAL over to system B..? Thanks, Stephen
Attachment
We already have pg_receivewal, which is part of pg_basebackup, and is
able to use a slot and such. I'm not sure that making pg_basebackup
somehow also work as an archive command makes much sense
Greetings, * David G. Johnston (david.g.johnston@gmail.com) wrote: > On Tue, Jan 18, 2022 at 10:53 AM Stephen Frost <sfrost@snowman.net> wrote: > > We already have pg_receivewal, which is part of pg_basebackup, and is > > able to use a slot and such. I'm not sure that making pg_basebackup > > somehow also work as an archive command makes much sense > > I suppose my proposal should have been: > > Create and document a new "PostgreSQL Server Application" [1] and name it: > pg_archive_wal > Advise people to set their archive_command to "pg_archive_wal > --path=/location/of/archive %p/%f > > 1. https://www.postgresql.org/docs/current/reference-server.html Or just link to a proper solution that already exists and includes support for object stores, multiple repositories, checksums all files and stores them in a manifest to validate them ... > Having pg_basebackup still prompt for permission to add that command to the > system via ALTER SYSTEM (probably will some other logic) seems doable. It doesn't seem to follow, to me anyway, that if you're using pg_basebackup that you'd want to use this proposed pg_archive_wal. Much more likely would be that you'd want to use pg_receivewal, which doesn't need any archive command to be set. Seems like a feature in need of a use case is what I'm getting at here. > Having created a base backup one still must decide on a wal archiving > strategy. There appear to be two options, though as far as I can tell if > one simply reads the documentation regarding backups they will not discover > the pg_receivewal option. I, not knowing of that option myself, have been > operating under the assumption that if one uses pg_basebackup that one > would be required to setup an archive_command as well. Adding a reference to pg_receivewal under the continuous archiving section would probably make sense, similar to how pg_basebackup is referenced from the base backup section. Might even make sense to have a 'low level API' section for WAL archiving which mentions archive_command ... or maybe not and rip out the existing 'low level API' section as it really isn't nearly detailed enough for someone to be able to write a proper tool and having it there implies that it does provide that. > The superior option is having a persistently running pg_receivewal command > on a server. As noted above, the documentation does not do this option > justice. Not sure that I'd say that it's the superior option, but it depends on the options that are being considered and if you're limiting those to "just what exists in core." > The alternative option is to set archive_command; which at present is also > poorly documented. My proposal above simply tries to improve on this. And > while that is a good and easy starting point if there is agreement on > pg_receivewal being a superior archiving option (leaving archive_command > unset) reworking the documentation to guide the inexperience PostgreSQL DBA > toward a "minimal but effective" backup procedure is needed. Adding a new PG server application isn't exactly what I'd call just a simple improvement to the documentation ... The existing documentation for base backups does, in fact, link to pg_basebackup to guide the new DBA to a solution for base backups that's minimal but effective. Doing PITR is getting beyond just the minimal, but even so, linking to pg_receivewal in a similar manner probably does make sense since we link to pg_basebackup, so I think I agree with you about that specific change. Hopefully, we can remove the long deprecated exclusive backup option and further simplify the backup documentation. Thanks, Stephen
Attachment
On 1/14/22 12:31 PM, Stephen Frost wrote:
> Greetings,
>
> * Issa Gorissen (issa-gorissen@usa.net) wrote:
>> Thx a lot. I thought about it but was not so sure about having a complex
>> script (compared to the very simple version when using the exclusive backup
>> - but this this is deprecated...).
>>
>> I will test your option with the simpler version and post it back to it can
>> maybe land in PostgreSQL documentation.
> The PG docs show how the command works and that's it. The commands
> in the docs aren't intended to be actually used in production
> environments. Writing a full solution involves having a good
> understanding of the PG code and how WAL archiving, backups, et al, are
> done. If you're not familiar with this portion of the PG code base, I'd
> strongly suggest you look at using solutions written and maintained by
> folks who are.
Needing to read the PG source code to write a workable PITR recovery
solution is a serious flaw in PG documentation (and why I use PgBackRest).
The documentation of two other RDBMSs that I've worked with (Rdb/VMS and SQL
Server) are perfectly clear on how to do such backups and restores with
relatively small amounts of scripting.
> Trying to write documentation on how to develop a complete solution
> would be quite an effort and would certainly go beyond bash scripting
> and likely wouldn't end up getting used anyway- those who are developing
> such solutions are already reading through the actual code.
>
> Thanks,
>
> Stephen
--
Angular momentum makes the world go 'round.