Re: Different compression methods for FPI - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Different compression methods for FPI
Date
Msg-id 20220905002320.GD31833@telsasoft.com
Whole thread Raw
In response to Re: Different compression methods for FPI  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Different compression methods for FPI
Re: Different compression methods for FPI
List pgsql-hackers
On Fri, Sep 02, 2022 at 06:55:11AM -0500, Justin Pryzby wrote:
> On Sun, Jun 13, 2021 at 08:24:12PM -0500, Justin Pryzby wrote:
> > In this patch series, I added compression information to the errcontext from
> > xlog_block_info(), and allow specifying compression levels like zlib-2.  I'll
> > rearrange that commit earlier if we decide that's desirable to include.
> 
> 4035cd5d4 added wal_compress=lz4 and
> e9537321a added wal_compress=zstd
> 
> Since 4035cd5d4, pg_waldump has shown compression info (and walinspect
> shows the same since 2258e76f9).
> 
> But if you try to replay WAL on a server which wasn't compiled with
> support for the requisite compression methods, it's a bit crummy that it
> doesn't include in the error message the reason *why* it failed to
> restore the image, if that's caused by the missing compression method.

That's also hitting an elog().

2022-09-04 15:56:29.916 CDT startup[2625] FATAL:  XX000: failed to restore block image
2022-09-04 15:56:29.916 CDT startup[2625] DETAIL:  image at 0/1D11CB8 compressed with zstd not supported by build,
block0
 
2022-09-04 15:56:29.916 CDT startup[2625] CONTEXT:  WAL redo at 0/1D11CB8 for Heap/DELETE: off 50 flags 0x00
KEYS_UPDATED; blkref #0: rel 1663/16384/2610, blk 4 FPW
 
2022-09-04 15:56:29.916 CDT startup[2625] LOCATION:  XLogReadBufferForRedoExtended, xlogutils.c:396

(gdb) bt
#0  report_invalid_record (state=0x555555e33ff0, fmt=0x555555b1c1a8 "image at %X/%X compressed with %s not supported by
build,block %d") at xlogreader.c:74
 
#1  0x00005555556beeec in RestoreBlockImage (record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000',
page=page@entry=0x7fffee9bdc00"") at xlogreader.c:2078
 
#2  0x00005555556c5d39 in XLogReadBufferForRedoExtended (record=record@entry=0x555555e33ff0, block_id=block_id@entry=0
'\000',mode=mode@entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock@entry=false, 
 
    buf=buf@entry=0x7fffffffd760) at xlogutils.c:395
#3  0x00005555556c5e4a in XLogReadBufferForRedo (record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000',
buf=buf@entry=0x7fffffffd760)at xlogutils.c:320
 
#4  0x000055555565bd5b in heap_xlog_delete (record=0x555555e33ff0) at heapam.c:9032
#5  0x00005555556622b7 in heap_redo (record=<optimized out>) at heapam.c:9836
#6  0x00005555556c15ed in ApplyWalRecord (xlogreader=0x555555e33ff0, record=record@entry=0x7fffee2b6820,
replayTLI=replayTLI@entry=0x7fffffffd870)at ../../../../src/include/access/xlog_internal.h:379
 
#7  0x00005555556c4c30 in PerformWalRecovery () at xlogrecovery.c:1725
#8  0x00005555556b7f23 in StartupXLOG () at xlog.c:5291
#9  0x0000555555ac4491 in InitPostgres (in_dbname=in_dbname@entry=0x555555e09390 "postgres", dboid=dboid@entry=0,
username=username@entry=0x555555dedda0"pryzbyj", useroid=useroid@entry=0, 
 
    load_session_libraries=<optimized out>, override_allow_connections=override_allow_connections@entry=false,
out_dbname=0x0)at postinit.c:731
 
#10 0x000055555598471f in PostgresMain (dbname=0x555555e09390 "postgres", username=username@entry=0x555555dedda0
"pryzbyj")at postgres.c:4085
 
#11 0x00005555559851b0 in PostgresSingleUserMain (argc=5, argv=0x555555de7530, username=0x555555dedda0 "pryzbyj") at
postgres.c:3986
#12 0x0000555555840533 in main (argc=5, argv=0x555555de7530) at main.c:194

I guess it should be promoted to an ereport(), since it's now a
user-facing error rathere than an internal one.

$ ./tmp_install.without-zstd/usr/local/pgsql/bin/postgres -D ./src/test/regress/tmp_check/data
2022-09-04 15:28:37.446 CDT postmaster[29964] LOG:  starting PostgreSQL 16devel on x86_64-pc-linux-gnu, compiled by gcc
(Ubuntu9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
 
2022-09-04 15:28:37.446 CDT postmaster[29964] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2022-09-04 15:28:37.528 CDT postmaster[29964] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-09-04 15:28:37.587 CDT startup[29972] LOG:  database system was interrupted while in recovery at 2022-09-04
15:27:44CDT
 
2022-09-04 15:28:37.587 CDT startup[29972] HINT:  This probably means that some data is corrupted and you will have to
usethe last backup for recovery.
 
2022-09-04 15:28:38.010 CDT startup[29972] LOG:  database system was not properly shut down; automatic recovery in
progress
2022-09-04 15:28:38.038 CDT startup[29972] LOG:  redo starts at 0/1D118C0
2022-09-04 15:28:38.039 CDT startup[29972] LOG:  could not stat file "pg_tblspc/16502": No such file or directory
2022-09-04 15:28:38.039 CDT startup[29972] CONTEXT:  WAL redo at 0/1D11970 for Tablespace/DROP: 16502
2022-09-04 15:28:38.039 CDT startup[29972] FATAL:  failed to restore block image
2022-09-04 15:28:38.039 CDT startup[29972] DETAIL:  image at 0/1D11CB8 compressed with zstd not supported by build,
block0
 
2022-09-04 15:28:38.039 CDT startup[29972] CONTEXT:  WAL redo at 0/1D11CB8 for Heap/DELETE: off 50 flags 0x00
KEYS_UPDATED; blkref #0: rel 1663/16384/2610, blk 4 FPW
 

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda22597fe..01c7454bcc7 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -393,7 +393,11 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
                                       prefetch_buffer);
         page = BufferGetPage(*buf);
         if (!RestoreBlockImage(record, block_id, page))
-            elog(ERROR, "failed to restore block image");
+            ereport(ERROR,
+                    errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                    errmsg("failed to restore block image"),
+                    errdetail("%s", record->errormsg_buf));
+
 
         /*
          * The page may be uninitialized. If so, we can't set the LSN because



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: POC: GROUP BY optimization
Next
From: Peter Smith
Date:
Subject: Re: Column Filtering in Logical Replication