Re: pgsql: Add contrib/pg_walinspect. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pgsql: Add contrib/pg_walinspect.
Date
Msg-id 295868.1651024073@sss.pgh.pa.us
Whole thread Raw
In response to Re: pgsql: Add contrib/pg_walinspect.  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: pgsql: Add contrib/pg_walinspect.  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Thomas Munro <thomas.munro@gmail.com> writes:
> Hrmph...  Are you sure you rebuilt the contrib module?   Assuming so,
> maybe it's failing in a different way for you and me.  For me, it
> always fails after this break is reached in xlogutil.c:

>             /* If asked, let's not wait for future WAL. */
>             if (!wait_for_wal)
>                 break;

Hmm.  For me, that statement is not reached at all in successful
(make installcheck) runs.  In a failing run, it's reached with
wait_for_wal = false, after which we get the "could not read WAL"
failure.  Usually that happens twice, as per attached.

            regards, tom lane

diff --git a/contrib/pg_walinspect/Makefile b/contrib/pg_walinspect/Makefile
index 960530e..85dcb3a 100644
--- a/contrib/pg_walinspect/Makefile
+++ b/contrib/pg_walinspect/Makefile
@@ -15,7 +15,6 @@ REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/pg_walinspect/walinspect.conf

 # Disabled because these tests require "wal_level=replica", which
 # some installcheck users do not have (e.g. buildfarm clients).
-NO_INSTALLCHECK = 1

 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cf5db23..33d2c73 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -373,6 +373,9 @@ XLogNextRecord(XLogReaderState *state, char **errormsg)
          */
         Assert(!XLogRecPtrIsInvalid(state->EndRecPtr));

+        if(!*errormsg)
+          *errormsg = "decode_queue_head is null";
+
         return NULL;
     }

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 4257026..4089efd 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -955,6 +955,8 @@ read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPagePtr,
             if (loc <= read_upto)
                 break;

+            elog(LOG, "let's wait for wal: %d", wait_for_wal);
+
             /* If asked, let's not wait for future WAL. */
             if (!wait_for_wal)
                 break;
2022-04-27 03:40:09.253 CEST postmaster[32569] LOG:  starting PostgreSQL 15devel on mipsel-unknown-linux-gnu, compiled
bygcc (Debian 4.9.2-10+deb8u1) 4.9.2, 32-bit 
2022-04-27 03:40:09.254 CEST postmaster[32569] LOG:  listening on Unix socket "/tmp/pg_regress-g0HWzi/.s.PGSQL.51696"
2022-04-27 03:40:09.284 CEST startup[32574] LOG:  database system was shut down at 2022-04-27 03:40:09 CEST
2022-04-27 03:40:09.314 CEST postmaster[32569] LOG:  database system is ready to accept connections
2022-04-27 03:40:12.073 CEST client backend[32598] pg_regress/pg_walinspect ERROR:  WAL start LSN must be less than end
LSN
2022-04-27 03:40:12.073 CEST client backend[32598] pg_regress/pg_walinspect STATEMENT:  SELECT COUNT(*) >= 0 AS ok FROM
pg_get_wal_records_info('0/1903EE8','0/1903E40'); 
2022-04-27 03:40:12.075 CEST client backend[32598] pg_regress/pg_walinspect ERROR:  WAL start LSN must be less than end
LSN
2022-04-27 03:40:12.075 CEST client backend[32598] pg_regress/pg_walinspect STATEMENT:  SELECT COUNT(*) >= 0 AS ok FROM
pg_get_wal_stats('0/1903EE8','0/1903E40'); 
2022-04-27 03:40:12.085 CEST client backend[32598] pg_regress/pg_walinspect LOG:  let's wait for wal: 0
2022-04-27 03:40:12.085 CEST client backend[32598] pg_regress/pg_walinspect STATEMENT:  SELECT COUNT(*) >= 0 AS ok FROM
pg_get_wal_records_info_till_end_of_wal('0/1903E40');
2022-04-27 03:40:12.085 CEST client backend[32598] pg_regress/pg_walinspect ERROR:  could not read WAL at 0/1903E40:
decode_queue_headis null 
2022-04-27 03:40:12.085 CEST client backend[32598] pg_regress/pg_walinspect STATEMENT:  SELECT COUNT(*) >= 0 AS ok FROM
pg_get_wal_records_info_till_end_of_wal('0/1903E40');
2022-04-27 03:40:12.091 CEST client backend[32598] pg_regress/pg_walinspect LOG:  let's wait for wal: 0
2022-04-27 03:40:12.091 CEST client backend[32598] pg_regress/pg_walinspect STATEMENT:  SELECT COUNT(*) >= 0 AS ok FROM
pg_get_wal_stats_till_end_of_wal('0/1903E40');
2022-04-27 03:40:12.091 CEST client backend[32598] pg_regress/pg_walinspect ERROR:  could not read WAL at 0/1903E40:
decode_queue_headis null 
2022-04-27 03:40:12.091 CEST client backend[32598] pg_regress/pg_walinspect STATEMENT:  SELECT COUNT(*) >= 0 AS ok FROM
pg_get_wal_stats_till_end_of_wal('0/1903E40');
2022-04-27 03:40:12.198 CEST postmaster[32569] LOG:  received fast shutdown request
2022-04-27 03:40:12.299 CEST postmaster[32569] LOG:  aborting any active transactions
2022-04-27 03:40:12.304 CEST postmaster[32569] LOG:  background worker "logical replication launcher" (PID 32577)
exitedwith exit code 1 
2022-04-27 03:40:12.306 CEST checkpointer[32572] LOG:  shutting down
2022-04-27 03:40:12.310 CEST checkpointer[32572] LOG:  checkpoint starting: shutdown immediate
2022-04-27 03:40:12.524 CEST checkpointer[32572] LOG:  checkpoint complete: wrote 784 buffers (4.8%); 0 WAL file(s)
added,0 removed, 0 recycled; write=0.199 s, sync=0.001 s, total=0.218 s; sync files=0, longest=0.000 s, average=0.000
s;distance=4197 kB, estimate=4197 kB 
2022-04-27 03:40:12.838 CEST postmaster[32569] LOG:  database system is shut down

pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Possible corruption by CreateRestartPoint at promotion
Next
From: Thomas Munro
Date:
Subject: Re: pgsql: Add contrib/pg_walinspect.