Thread: cleanup patches for incremental backup
Hi, I discovered that my patch to add WAL summarization added two new SQL-callable functions but failed to document them. 0001 fixes that. An outstanding item from the original thread was to write a better test for the not-yet-committed pg_walsummary utility. But it turned out that I couldn't do that, because there were some race conditions that couldn't easily be cured. 0002 therefore adds a new function pg_get_wal_summarizer_state() which returns various items of in-memory state related to WAL summarization. We had some brief discussion of this being desirable for other reasons; it's nice for users to be able to look at this information in case of trouble (say, if the summarizer is not keeping up). 0003 then adds the previously-proposed pg_walsummary utility, with tests that depend on 0002. 0004 attempts to fix some problems detected by Coverity and subsequent code inspection. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
Hello again, I have now committed 0001. I got some off-list review of 0004; specifically, Michael Paquier said it looked OK, and Tom Lane found another bug. So I've added a fix for that in what's now 0003. Here's v2. I plan to commit the rest of this fairly soon if there are no comments. ...Robert
Attachment
On Tue, Jan 9, 2024 at 1:18 PM Robert Haas <robertmhaas@gmail.com> wrote: > Here's v2. I plan to commit the rest of this fairly soon if there are > no comments. Done, with a minor adjustment to 0003. -- Robert Haas EDB: http://www.enterprisedb.com
RE: cleanup patches for incremental backup
From: "Shinoda, Noriyoshi (HPE Services Japan - FSIP)"
Hi, Thank you for developing the new tool. I have attached a patch that corrects the spelling of the --individual option in the SGML file. Regards, Noriyoshi Shinoda -----Original Message----- From: Robert Haas <robertmhaas@gmail.com> Sent: Friday, January 12, 2024 3:13 AM To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>; Jakub Wartak <jakub.wartak@enterprisedb.com> Subject: Re: cleanup patches for incremental backup On Tue, Jan 9, 2024 at 1:18 PM Robert Haas <robertmhaas@gmail.com> wrote: > Here's v2. I plan to commit the rest of this fairly soon if there are > no comments. Done, with a minor adjustment to 0003. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
On Thu, Jan 11, 2024 at 8:00 PM Shinoda, Noriyoshi (HPE Services Japan - FSIP) <noriyoshi.shinoda@hpe.com> wrote: > Thank you for developing the new tool. I have attached a patch that corrects the spelling of the --individual option in the SGML file. Thanks, committed. -- Robert Haas EDB: http://www.enterprisedb.com
Hello Robert, 12.01.2024 17:50, Robert Haas wrote: > On Thu, Jan 11, 2024 at 8:00 PM Shinoda, Noriyoshi (HPE Services Japan > - FSIP) <noriyoshi.shinoda@hpe.com> wrote: >> Thank you for developing the new tool. I have attached a patch that corrects the spelling of the --individual option in the SGML file. > Thanks, committed. I've found one more typo in the sgml: summarized_pid And one in a comment: sumamry A trivial fix is attached. Best regards, Alexander
Attachment
On Sat, Jan 13, 2024 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > I've found one more typo in the sgml: > summarized_pid > And one in a comment: > sumamry > > A trivial fix is attached. Thanks, committed. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, 15 Jan 2024 at 17:58, Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Jan 13, 2024 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > > I've found one more typo in the sgml: > > summarized_pid > > And one in a comment: > > sumamry > > > > A trivial fix is attached. > > Thanks, committed. Off-list I was notified that the new WAL summarizer process was not yet added to the glossary, so PFA a patch that does that. In passing, it also adds "incremental backup" to the glossary, and updates the documented types of backends in monitoring.sgml with the new backend type, too. Kind regards, Matthias van de Meent.
Attachment
On Mon, Jan 15, 2024 at 3:31 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > Off-list I was notified that the new WAL summarizer process was not > yet added to the glossary, so PFA a patch that does that. > In passing, it also adds "incremental backup" to the glossary, and > updates the documented types of backends in monitoring.sgml with the > new backend type, too. I wonder if it's possible that you sent the wrong version of this patch, because: (1) The docs don't build with this applied. I'm not sure if it's the only problem, but <glossterm linkend="glossary-db-cluster" is missing the closing >. (2) The changes to monitoring.sgml contain an unrelated change, about pg_stat_all_indexes.idx_scan. Also, I think the "For more information, see <xref linkend="whatever" />" bit should have a period after the markup tag, as we seem to do in other cases. One other thought is that the incremental backup only replaces relation files with incremental files, and it never does anything about FSM files. So the statement that it only contains data that was potentially changed isn't quite correct. It might be better to phrase it the other way around, i.e. it is like a full backup, except that some files can be replaced by incremental files which omit blocks to which no WAL-logged changes have been made. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, 16 Jan 2024 at 16:39, Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Jan 15, 2024 at 3:31 PM Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > > Off-list I was notified that the new WAL summarizer process was not > > yet added to the glossary, so PFA a patch that does that. > > In passing, it also adds "incremental backup" to the glossary, and > > updates the documented types of backends in monitoring.sgml with the > > new backend type, too. > > I wonder if it's possible that you sent the wrong version of this > patch, because: > > (1) The docs don't build with this applied. I'm not sure if it's the > only problem, but <glossterm linkend="glossary-db-cluster" is missing > the closing >. That's my mistake, I didn't check install-world yet due to unrelated issues building the docs. I've since sorted out these issues (this was a good stick to get that done), so this issue is fixed in the attached patch. > (2) The changes to monitoring.sgml contain an unrelated change, about > pg_stat_all_indexes.idx_scan. Thanks for noticing, fixed in attached. > Also, I think the "For more information, see <xref linkend="whatever" > /> bit should have a period after the markup tag, as we seem to do in > other cases. Fixed. > One other thought is that the incremental backup only replaces > relation files with incremental files, and it never does anything > about FSM files. So the statement that it only contains data that was > potentially changed isn't quite correct. It might be better to phrase > it the other way around i.e. it is like a full backup, except that > some files can be replaced by incremental files which omit blocks to > which no WAL-logged changes have been made. How about the attached? Kind regards, Matthias van de Meent
Attachment
On Tue, Jan 16, 2024 at 3:22 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > One other thought is that the incremental backup only replaces > > relation files with incremental files, and it never does anything > > about FSM files. So the statement that it only contains data that was > > potentially changed isn't quite correct. It might be better to phrase > > it the other way around i.e. it is like a full backup, except that > > some files can be replaced by incremental files which omit blocks to > > which no WAL-logged changes have been made. > > How about the attached? I like the direction. + A special <glossterm linkend="glossary-basebackup">base backup</glossterm> + that for some WAL-logged relations only contains the pages that were + modified since a previous backup, as opposed to the full relation data of + normal base backups. Like base backups, it is generated by the tool + <xref linkend="app-pgbasebackup"/>. Could we say "that for some files may contain only those pages that were modified since a previous backup, as opposed to the full contents of every file"? My thoughts are (1) there's no hard guarantee that an incremental backup will replace even one file with an incremental file, although in practice it is probably almost always going to happen and (2) pg_combinebackup would actually be totally fine with any file at all being sent incrementally; it's only that the server isn't smart enough to figure out how to do this with e.g. SLRU data right now. + To restore incremental backups the tool <xref linkend="app-pgcombinebackup"/> + is used, which combines the incremental backups with a base backup and + <glossterm linkend="glossary-wal">WAL</glossterm> to restore a + <glossterm linkend="glossary-db-cluster">database cluster</glossterm> to + a consistent state. I wondered if this needed to be clearer that the chain of backups could have length > 2. But on further reflection, I think it's fine, unless you feel otherwise. The rest LGTM. 
-- Robert Haas EDB: http://www.enterprisedb.com
On Tue, 16 Jan 2024 at 21:49, Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Jan 16, 2024 at 3:22 PM Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > + A special <glossterm linkend="glossary-basebackup">base backup</glossterm> > + that for some WAL-logged relations only contains the pages that were > + modified since a previous backup, as opposed to the full relation data of > + normal base backups. Like base backups, it is generated by the tool > + <xref linkend="app-pgbasebackup"/>. > > Could we say "that for some files may contain only those pages that > were modified since a previous backup, as opposed to the full contents > of every file"? Sure, added in attached. > + To restore incremental backups the tool <pgcombinebackup> > + is used, which combines the incremental backups with a base backup and > + [...] > I wondered if this needed to be clearer that the chain of backups > could have length > 2. But on further reflection, I think it's fine, > unless you feel otherwise. I removed "the" from the phrasing "the incremental backups", which makes it a bit less restricted. > The rest LGTM. In the latest patch I also fixed the casing of "Incremental Backup" to "... backup", to be in line with most other multi-word items. Thanks! Kind regards, Matthias van de Meent Neon (https://neon.tech)
Attachment
On Wed, Jan 17, 2024 at 1:42 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > Sure, added in attached. I think this mostly looks good now but I just realized that I think this needs rephrasing: + To restore incremental backups the tool <xref linkend="app-pgcombinebackup"/> + is used, which combines incremental backups with a base backup and + <glossterm linkend="glossary-wal">WAL</glossterm> to restore a + <glossterm linkend="glossary-db-cluster">database cluster</glossterm> to + a consistent state. The way this is worded, at least to me, it makes it sound like pg_combinebackup is going to do the WAL recovery for you, which it isn't. Maybe: To restore incremental backups the tool <xref linkend="app-pgcombinebackup"/> is used, which combines incremental backups with a base backup. Afterwards, recovery can use <glossterm linkend="glossary-wal">WAL</glossterm> to bring the <glossterm linkend="glossary-db-cluster">database cluster</glossterm> to a consistent state. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, 17 Jan 2024 at 21:10, Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Jan 17, 2024 at 1:42 PM Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > > Sure, added in attached. > > I think this mostly looks good now but I just realized that I think > this needs rephrasing: > > + To restore incremental backups the tool <xref > linkend="app-pgcombinebackup"/> > + is used, which combines incremental backups with a base backup and > + <glossterm linkend="glossary-wal">WAL</glossterm> to restore a > + <glossterm linkend="glossary-db-cluster">database cluster</glossterm> to > + a consistent state. > > The way this is worded, at least to me, it makes it sound like > pg_combinebackup is going to do the WAL recovery for you, which it > isn't. Maybe: > > To restore incremental backups the tool <xref > linkend="app-pgcombinebackup"/> is used, which combines incremental > backups with a base backup. Afterwards, recovery can use <glossterm > linkend="glossary-wal">WAL</glossterm> to bring the <glossterm > linkend="glossary-db-cluster">database cluster</glossterm> to a > consistent state. Sure, that's fine with me. Kind regards, Matthias van de Meent Neon (https://neon.tech)
On Thu, Jan 18, 2024 at 4:50 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > Sure, that's fine with me. OK, committed that way. -- Robert Haas EDB: http://www.enterprisedb.com
I'm seeing some recent buildfarm failures for pg_walsummary: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2024-01-14%2006%3A21%3A58 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2024-01-17%2021%3A10%3A36 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2024-01-20%2018%3A58%3A49 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=taipan&dt=2024-01-23%2002%3A46%3A57 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2024-01-23%2020%3A23%3A36 The signature looks nearly identical in each: # Failed test 'WAL summary file exists' # at t/002_blocks.pl line 79. # Failed test 'stdout shows block 0 modified' # at t/002_blocks.pl line 85. # '' # doesn't match '(?^m:FORK main: block 0$)' I haven't been able to reproduce the issue on my machine, and I haven't figured out precisely what is happening yet, but I wanted to make sure there is awareness. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Jan 24, 2024 at 12:08 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > I'm seeing some recent buildfarm failures for pg_walsummary: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2024-01-14%2006%3A21%3A58 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2024-01-17%2021%3A10%3A36 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2024-01-20%2018%3A58%3A49 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=taipan&dt=2024-01-23%2002%3A46%3A57 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2024-01-23%2020%3A23%3A36 > > The signature looks nearly identical in each: > > # Failed test 'WAL summary file exists' > # at t/002_blocks.pl line 79. > > # Failed test 'stdout shows block 0 modified' > # at t/002_blocks.pl line 85. > # '' > # doesn't match '(?^m:FORK main: block 0$)' > > I haven't been able to reproduce the issue on my machine, and I haven't > figured out precisely what is happening yet, but I wanted to make sure > there is awareness. This is weird. There's a little more detail in the log file, regress_log_002_blocks, e.g. from the first failure you linked: [11:18:20.683](96.787s) # before insert, summarized TLI 1 through 0/14E09D0 [11:18:21.188](0.505s) # after insert, summarized TLI 1 through 0/14E0D08 [11:18:21.326](0.138s) # examining summary for TLI 1 from 0/14E0D08 to 0/155BAF0 # 1 ... [11:18:21.349](0.000s) # got: 'pg_walsummary: error: could not open file "/home/nm/farm/gcc64/HEAD/pgsql.build/src/bin/pg_walsummary/tmp_check/t_002_blocks_node1_data/pgdata/pg_wal/summaries/0000000100000000014E0D0800000000155BAF0 # 1.summary": No such file or directory' The "examining summary" line is generated based on the output of pg_available_wal_summaries(). The way that works is that the server calls readdir(), disassembles the filename into a TLI and two LSNs, and returns the result. 
Then, a fraction of a second later, the test script reassembles those components into a filename and finds the file missing. If the logic to translate between filenames and TLIs & LSNs were incorrect, the test would fail consistently. So the only explanation that seems to fit the facts is the file disappearing out from under us. But that really shouldn't happen. We do have code to remove such files in MaybeRemoveOldWalSummaries(), but it's only supposed to be nuking files more than 10 days old. So I don't really have a theory here as to what could be happening. :-( -- Robert Haas EDB: http://www.enterprisedb.com
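The round trip described here — the server splitting a summary file name into a TLI and two LSNs, and the test reassembling the name from those components — can be sketched roughly as follows. This is a simplified, hypothetical stand-in for the server code, assuming (per the file names visible in the logs above) that a summary file name is 8 hex digits of timeline ID followed by 16 hex digits each for the start and end LSN, plus a ".summary" suffix:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch of the name <-> (TLI, start LSN, end LSN) round trip
 * for WAL summary files, assuming the naming scheme
 * "%08X%08X%08X%08X%08X.summary": timeline ID, then the high and low
 * 32 bits of the start LSN, then the high and low 32 bits of the end LSN.
 */
static int
parse_summary_filename(const char *name, unsigned *tli,
                       uint64_t *start_lsn, uint64_t *end_lsn)
{
    unsigned    start_hi, start_lo, end_hi, end_lo;

    /* 40 hex digits plus the 8-character ".summary" suffix */
    if (strlen(name) != 48 || strcmp(name + 40, ".summary") != 0)
        return 0;
    if (sscanf(name, "%8X%8X%8X%8X%8X",
               tli, &start_hi, &start_lo, &end_hi, &end_lo) != 5)
        return 0;
    *start_lsn = ((uint64_t) start_hi << 32) | start_lo;
    *end_lsn = ((uint64_t) end_hi << 32) | end_lo;
    return 1;
}

static void
make_summary_filename(char *buf, size_t len, unsigned tli,
                      uint64_t start_lsn, uint64_t end_lsn)
{
    snprintf(buf, len, "%08X%08X%08X%08X%08X.summary", tli,
             (unsigned) (start_lsn >> 32), (unsigned) start_lsn,
             (unsigned) (end_lsn >> 32), (unsigned) end_lsn);
}
```

Because each direction of this conversion is deterministic, a mismatch like the one in the test log points at the file itself disappearing (or never matching), not at the conversion logic.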
On Wed, Jan 24, 2024 at 12:46:16PM -0500, Robert Haas wrote: > The "examining summary" line is generated based on the output of > pg_available_wal_summaries(). The way that works is that the server > calls readdir(), disassembles the filename into a TLI and two LSNs, > and returns the result. Then, a fraction of a second later, the test > script reassembles those components into a filename and finds the file > missing. If the logic to translate between filenames and TLIs & LSNs > were incorrect, the test would fail consistently. So the only > explanation that seems to fit the facts is the file disappearing out > from under us. But that really shouldn't happen. We do have code to > remove such files in MaybeRemoveOldWalSummaries(), but it's only > supposed to be nuking files more than 10 days old. > > So I don't really have a theory here as to what could be happening. :-( There might be an overflow risk in the cutoff time calculation, but I doubt that's the root cause of these failures: /* * Files should only be removed if the last modification time precedes the * cutoff time we compute here. */ cutoff_time = time(NULL) - 60 * wal_summary_keep_time; Otherwise, I think we'll probably need to add some additional logging to figure out what is happening... -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Jan 24, 2024 at 1:05 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > There might be an overflow risk in the cutoff time calculation, but I doubt > that's the root cause of these failures: > > /* > * Files should only be removed if the last modification time precedes the > * cutoff time we compute here. > */ > cutoff_time = time(NULL) - 60 * wal_summary_keep_time; > > Otherwise, I think we'll probably need to add some additional logging to > figure out what is happening... Where, though? I suppose we could: 1. Change the server code so that it logs each WAL summary file removed at a log level high enough to show up in the test logs. 2. Change the TAP test so that it prints out readdir(WAL summary directory) at various points in the test. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Jan 24, 2024 at 02:08:08PM -0500, Robert Haas wrote: > On Wed, Jan 24, 2024 at 1:05 PM Nathan Bossart <nathandbossart@gmail.com> wrote: >> Otherwise, I think we'll probably need to add some additional logging to >> figure out what is happening... > > Where, though? I suppose we could: > > 1. Change the server code so that it logs each WAL summary file > removed at a log level high enough to show up in the test logs. > > 2. Change the TAP test so that it prints out readdir(WAL summary > directory) at various points in the test. That seems like a reasonable starting point. Even if it doesn't help determine the root cause, it should at least help rule out concurrent summary removal. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > That seems like a reasonable starting point. Even if it doesn't help > determine the root cause, it should at least help rule out concurrent > summary removal. Here is a patch for that. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
On Thu, Jan 25, 2024 at 10:06:41AM -0500, Robert Haas wrote: > On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart <nathandbossart@gmail.com> wrote: >> That seems like a reasonable starting point. Even if it doesn't help >> determine the root cause, it should at least help rule out concurrent >> summary removal. > > Here is a patch for that. LGTM. The only thing I might add is the cutoff_time in that LOG. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, Jan 25, 2024 at 11:08 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > On Thu, Jan 25, 2024 at 10:06:41AM -0500, Robert Haas wrote: > > On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > >> That seems like a reasonable starting point. Even if it doesn't help > >> determine the root cause, it should at least help rule out concurrent > >> summary removal. > > > > Here is a patch for that. > > LGTM. The only thing I might add is the cutoff_time in that LOG. Here is v2 with that addition. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
On Fri, Jan 26, 2024 at 11:04:37AM -0500, Robert Haas wrote: > Here is v2 with that addition. Looks reasonable. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Jan 26, 2024 at 12:39 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > On Fri, Jan 26, 2024 at 11:04:37AM -0500, Robert Haas wrote: > > Here is v2 with that addition. > > Looks reasonable. Thanks for the report & review. I have committed that version. -- Robert Haas EDB: http://www.enterprisedb.com
Hello Robert,
26.01.2024 21:37, Robert Haas wrote:
> On Fri, Jan 26, 2024 at 12:39 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> On Fri, Jan 26, 2024 at 11:04:37AM -0500, Robert Haas wrote:
>>> Here is v2 with that addition.
>> Looks reasonable.
> Thanks for the report & review. I have committed that version.
While trying to reproduce the 002_blocks test failure, I've encountered
another anomaly (or two):
make -s check -C src/bin/pg_walsummary/ PROVE_TESTS="t/002*" PROVE_FLAGS="--timer"
# +++ tap check in src/bin/pg_walsummary +++
[05:40:38] t/002_blocks.pl .. # poll_query_until timed out executing this query:
# SELECT EXISTS (
# SELECT * from pg_available_wal_summaries()
# WHERE tli = 0 AND end_lsn > '0/0'
# )
#
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
[05:40:38] t/002_blocks.pl .. ok 266739 ms ( 0.00 usr 0.01 sys + 17.51 cusr 26.79 csys = 44.31 CPU)
[05:45:05]
All tests successful.
Files=1, Tests=3, 267 wallclock secs ( 0.02 usr 0.02 sys + 17.51 cusr 26.79 csys = 44.34 CPU)
Result: PASS
It looks like the test may call pg_get_wal_summarizer_state() while
WalSummarizerCtl->initialized is still false, i.e. before the first
GetOldestUnsummarizedLSN() call.
I could reproduce the issue easily (within 10 test runs) with
pg_usleep(100000L);
added inside WalSummarizerMain() just below:
sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
But the fact that the test passes regardless of the timeout makes me
wonder: shouldn't a test fail when such a timeout occurs?
Best regards,
Alexander
24.01.2024 20:46, Robert Haas wrote: > This is weird. There's a little more detail in the log file, > regress_log_002_blocks, e.g. from the first failure you linked: > > [11:18:20.683](96.787s) # before insert, summarized TLI 1 through 0/14E09D0 > [11:18:21.188](0.505s) # after insert, summarized TLI 1 through 0/14E0D08 > [11:18:21.326](0.138s) # examining summary for TLI 1 from 0/14E0D08 to 0/155BAF0 > # 1 > ... > [11:18:21.349](0.000s) # got: 'pg_walsummary: error: could > not open file "/home/nm/farm/gcc64/HEAD/pgsql.build/src/bin/pg_walsummary/tmp_check/t_002_blocks_node1_data/pgdata/pg_wal/summaries/0000000100000000014E0D0800000000155BAF0 > # 1.summary": No such file or directory' > > The "examining summary" line is generated based on the output of > pg_available_wal_summaries(). The way that works is that the server > calls readdir(), disassembles the filename into a TLI and two LSNs, > and returns the result. I'm puzzled by the "\n1" in the file name and in the "examining summary..." message. regress_log_002_blocks from the following successful test run on the same sungazer node contains: [15:21:58.924](0.106s) # examining summary for TLI 1 from 0/155BAE0 to 0/155E750 [15:21:58.925](0.001s) ok 1 - WAL summary file exists Best regards, Alexander
On Sat, Jan 27, 2024 at 11:00:01AM +0300, Alexander Lakhin wrote: > 24.01.2024 20:46, Robert Haas wrote: >> This is weird. There's a little more detail in the log file, >> regress_log_002_blocks, e.g. from the first failure you linked: >> >> [11:18:20.683](96.787s) # before insert, summarized TLI 1 through 0/14E09D0 >> [11:18:21.188](0.505s) # after insert, summarized TLI 1 through 0/14E0D08 >> [11:18:21.326](0.138s) # examining summary for TLI 1 from 0/14E0D08 to 0/155BAF0 >> # 1 >> ... >> [11:18:21.349](0.000s) # got: 'pg_walsummary: error: could >> not open file "/home/nm/farm/gcc64/HEAD/pgsql.build/src/bin/pg_walsummary/tmp_check/t_002_blocks_node1_data/pgdata/pg_wal/summaries/0000000100000000014E0D0800000000155BAF0 >> # 1.summary": No such file or directory' >> >> The "examining summary" line is generated based on the output of >> pg_available_wal_summaries(). The way that works is that the server >> calls readdir(), disassembles the filename into a TLI and two LSNs, >> and returns the result. > > I'm discouraged by "\n1" in the file name and in the > "examining summary..." message. > regress_log_002_blocks from the following successful test run on the same > sungazer node contains: > [15:21:58.924](0.106s) # examining summary for TLI 1 from 0/155BAE0 to 0/155E750 > [15:21:58.925](0.001s) ok 1 - WAL summary file exists Ah, I think this query: SELECT tli, start_lsn, end_lsn from pg_available_wal_summaries() WHERE tli = $summarized_tli AND end_lsn > '$summarized_lsn' is returning more than one row in some cases. I attached a quick sketch of an easy way to reproduce the issue as well as one way to fix it. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Sat, Jan 27, 2024 at 10:31:09AM -0600, Nathan Bossart wrote: > On Sat, Jan 27, 2024 at 11:00:01AM +0300, Alexander Lakhin wrote: >> I'm discouraged by "\n1" in the file name and in the >> "examining summary..." message. >> regress_log_002_blocks from the following successful test run on the same >> sungazer node contains: >> [15:21:58.924](0.106s) # examining summary for TLI 1 from 0/155BAE0 to 0/155E750 >> [15:21:58.925](0.001s) ok 1 - WAL summary file exists > > Ah, I think this query: > > SELECT tli, start_lsn, end_lsn from pg_available_wal_summaries() > WHERE tli = $summarized_tli AND end_lsn > '$summarized_lsn' > > is returning more than one row in some cases. I attached a quick sketch of > an easy way to reproduce the issue as well as one way to fix it. The buildfarm just caught a failure with the new logging in place: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-29%2018%3A09%3A10 I'm not totally sure my "fix" is correct, but I think this does confirm the theory. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Mon, Jan 29, 2024 at 1:21 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > Ah, I think this query: > > > > SELECT tli, start_lsn, end_lsn from pg_available_wal_summaries() > > WHERE tli = $summarized_tli AND end_lsn > '$summarized_lsn' > > > > is returning more than one row in some cases. I attached a quick sketch of > > an easy way to reproduce the issue as well as one way to fix it. > > The buildfarm just caught a failure with the new logging in place: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-29%2018%3A09%3A10 > > I'm not totally sure my "fix" is correct, but I think this does confirm the > theory. Ah. The possibility of ending up with TWO new WAL summaries never occurred to me. Things that never occurred to the developer are a leading cause of bugs, and so it was here. I'm wondering if what we need to do is run pg_walsummary on both summary files in that case. If we just pick one or the other, how do we know which one to pick? -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Jan 29, 2024 at 03:18:50PM -0500, Robert Haas wrote: > I'm wondering if what we need to do is run pg_walsummary on both > summary files in that case. If we just pick one or the other, how do > we know which one to pick? Even if we do that, isn't it possible that none of the summaries will include the change? Presently, we get the latest summarized LSN, make a change, and then wait for the next summary file with a greater LSN than what we saw before the change. But AFAICT there's no guarantee that means the change has been summarized yet, although the chances of that happening in a test are probably pretty small. Could we get the LSN before and after making the change and then inspect all summaries that include that LSN range? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
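The range-inspection idea above boils down to an interval check: each summary covers a contiguous span of WAL, and the test would need to look at every summary whose span overlaps the span written by the change. A minimal hedged sketch of that predicate, assuming summaries cover half-open intervals (start_lsn, end_lsn] as the ranges in the log output suggest (the function name here is illustrative, not PostgreSQL code):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch: does a summary covering WAL in (s_start, s_end]
 * overlap the range (r_start, r_end] touched by the test's change?
 * Under Nathan's proposal, every summary for which this returns true
 * would need to be fed to pg_walsummary and inspected.
 */
static bool
summary_overlaps_range(uint64_t s_start, uint64_t s_end,
                       uint64_t r_start, uint64_t r_end)
{
    /* Two half-open intervals overlap iff each starts before the
     * other ends. */
    return s_start < r_end && r_start < s_end;
}
```

Note that with half-open intervals, a summary ending exactly where the change's range begins does not overlap it, which matches the "end_lsn > ..." comparison used in the test's polling query.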
On Mon, Jan 29, 2024 at 4:13 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > On Mon, Jan 29, 2024 at 03:18:50PM -0500, Robert Haas wrote: > > I'm wondering if what we need to do is run pg_walsummary on both > > summary files in that case. If we just pick one or the other, how do > > we know which one to pick? > > Even if we do that, isn't it possible that none of the summaries will > include the change? Presently, we get the latest summarized LSN, make a > change, and then wait for the next summary file with a greater LSN than > what we saw before the change. But AFAICT there's no guarantee that means > the change has been summarized yet, although the chances of that happening > in a test are probably pretty small. > > Could we get the LSN before and after making the change and then inspect > all summaries that include that LSN range? The trick here is that each WAL summary file covers one checkpoint cycle. The intent of the test is to load data into the table, checkpoint, see what summaries exist, then update a row, checkpoint again, and see what summaries now exist. We expect one new summary because there's been one new checkpoint. When I was thinking about this yesterday, I was imagining that we were somehow getting an extra checkpoint in some cases. But it looks like it's actually an off-by-one situation. 
In https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-29%2018%3A09%3A10 the new files that show up between "after insert" and "after new summary" are: 00000001000000000152FAE000000000015AAAC8.summary (LSN distance ~500k) 00000001000000000152F7A8000000000152FAE0.summary (LSN distance 824 bytes) The checkpoint after the inserts says: LOG: checkpoint complete: wrote 14 buffers (10.9%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.956 s, sync=0.929 s, total=3.059 s; sync files=39, longest=0.373 s, average=0.024 s; distance=491 kB, estimate=491 kB; lsn=0/15AAB20, redo lsn=0/15AAAC8 And the checkpoint after the single-row update says: LOG: checkpoint complete: wrote 4 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.648 s, sync=0.355 s, total=2.798 s; sync files=3, longest=0.348 s, average=0.119 s; distance=11 kB, estimate=443 kB; lsn=0/15AD770, redo lsn=0/15AD718 So both of the new WAL summary files that are appearing here are from checkpoints that happened before the single-row update. The larger file is the one covering the 400 inserts, and the smaller one is the checkpoint before that. Which means that the "Wait for a new summary to show up." code isn't actually waiting long enough, and then the whole thing goes haywire. The problem is, I think, that this code naively thinks it can just wait for summarized_lsn and everything will be fine ... but that assumes we were caught up when we first measured the summarized_lsn, and that need not be so, because it takes some short but non-zero amount of time for the summarizer to catch up with the WAL generated during initdb. I think the solution here is to find a better way to wait for the inserts to be summarized, one that actually does wait for that to happen. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Jan 30, 2024 at 10:51 AM Robert Haas <robertmhaas@gmail.com> wrote: > I think the solution here is to find a better way to wait for the > inserts to be summarized, one that actually does wait for that to > happen. Here's a patch for that. I now think a7097ca630a25dd2248229f21ebce4968d85d10a was actually misguided, and served only to mask some of the failures caused by waiting for the WAL summary file. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
On Tue, Jan 30, 2024 at 11:52 AM Robert Haas <robertmhaas@gmail.com> wrote: > Here's a patch for that. I now think > a7097ca630a25dd2248229f21ebce4968d85d10a was actually misguided, and > served only to mask some of the failures caused by waiting for the WAL > summary file. Committed. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Jan 24, 2024 at 12:05:15PM -0600, Nathan Bossart wrote: > There might be an overflow risk in the cutoff time calculation, but I doubt > that's the root cause of these failures: > > /* > * Files should only be removed if the last modification time precedes the > * cutoff time we compute here. > */ > cutoff_time = time(NULL) - 60 * wal_summary_keep_time; I've attached a short patch for fixing this overflow risk. Specifically, it limits wal_summary_keep_time to INT_MAX / SECS_PER_MINUTE, just like log_rotation_age. I considered checking for overflow when we subtract the keep-time from the result of time(2), but AFAICT there's only a problem if time_t is unsigned, which Wikipedia leads me to believe is unusual [0], so I figured we might be able to just wait this one out until 2038. > Otherwise, I think we'll probably need to add some additional logging to > figure out what is happening... Separately, I suppose it's probably time to revert the temporary debugging code added by commit 5ddf997. I can craft a patch for that, too. [0] https://en.wikipedia.org/wiki/Unix_time#Representing_the_number -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
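The hazard being fixed is in the multiplication itself: with a plain int setting, 60 * wal_summary_keep_time wraps (undefined behavior) once the value exceeds INT_MAX / 60. A hedged sketch of the clamp described above — the cap mirrors how log_rotation_age is limited; the function and macro names here are illustrative, not the committed code:

```c
#include <limits.h>
#include <time.h>

#define SECS_PER_MINUTE 60

/*
 * Illustrative cap, mirroring log_rotation_age: if the keep time (in
 * minutes) never exceeds INT_MAX / SECS_PER_MINUTE, converting it to
 * seconds cannot overflow a signed int.
 */
#define KEEP_TIME_MAX_MINUTES (INT_MAX / SECS_PER_MINUTE)

static time_t
compute_cutoff_time(time_t now, int keep_time_minutes)
{
    /*
     * Safe so long as keep_time_minutes <= KEEP_TIME_MAX_MINUTES; the
     * cast widens the arithmetic to time_t before multiplying.
     */
    return now - (time_t) SECS_PER_MINUTE * keep_time_minutes;
}
```

The subtraction itself is then only a concern on platforms with an unsigned time_t, which, as noted above, is rare in practice.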
Attachment
On Thu, Mar 14, 2024 at 04:00:10PM -0500, Nathan Bossart wrote: > Separately, I suppose it's probably time to revert the temporary debugging > code adding by commit 5ddf997. I can craft a patch for that, too. As promised... -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, Mar 14, 2024 at 08:52:55PM -0500, Nathan Bossart wrote: > Subject: [PATCH v1 1/2] Revert "Temporary patch to help debug pg_walsummary > test failures." > Subject: [PATCH v1 2/2] Fix possible overflow in MaybeRemoveOldWalSummaries(). Assuming there are no objections or feedback, I plan to commit these two patches within the next couple of days. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Mar 19, 2024 at 01:15:02PM -0500, Nathan Bossart wrote: > Assuming there are no objections or feedback, I plan to commit these two > patches within the next couple of days. Committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Mar 20, 2024 at 2:35 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > On Tue, Mar 19, 2024 at 01:15:02PM -0500, Nathan Bossart wrote: > > Assuming there are no objections or feedback, I plan to commit these two > > patches within the next couple of days. > > Committed. Thanks. Sorry you had to clean up after me. :-( -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Mar 20, 2024 at 02:53:01PM -0400, Robert Haas wrote: > On Wed, Mar 20, 2024 at 2:35 PM Nathan Bossart <nathandbossart@gmail.com> wrote: >> Committed. > > Thanks. Sorry you had to clean up after me. :-( No worries. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com