pg_waldump erroneously outputs newline for FPWs, and another minorbug - Mailing list pgsql-hackers

From Andres Freund
Subject pg_waldump erroneously outputs newline for FPWs, and another minorbug
Date
Msg-id 20191029233341.4gnyau7e5v2lh5sc@alap3.anarazel.de
Whole thread Raw
Responses Re: pg_waldump erroneously outputs newline for FPWs, and anotherminor bug
Re: pg_waldump erroneously outputs newline for FPWs, and anotherminor bug
List pgsql-hackers
Hi,

When using -b, --bkp-details pg_waldump outputs an unnecessary newline
for blocks that contain an FPW.

In --bkp-details block references are output on their own lines, like:

rmgr: SPGist      len (rec/tot):   4348/  4348, tx:        980, lsn: 0/01985818, prev 0/01983850, desc: PICKSPLIT ndel
92;nins 93
 
        blkref #0: rel 1663/16384/16967 fork main blk 3
        blkref #1: rel 1663/16384/16967 fork main blk 6
        blkref #2: rel 1663/16384/16967 fork main blk 5
        blkref #3: rel 1663/16384/16967 fork main blk 1
rmgr: Heap        len (rec/tot):     69/    69, tx:        980, lsn: 0/01986930, prev 0/01985818, desc: INSERT off 2
flags0x00
 
        blkref #0: rel 1663/16384/16961 fork main blk 1

but unfortunately, when there's actually an FPW present, it looks like:

rmgr: XLOG        len (rec/tot):     75/ 11199, tx:        977, lsn: 0/019755E0, prev 0/0194EDD8, desc: FPI
        blkref #0: rel 1663/16384/16960 fork main blk 32 (FPW); hole: offset: 548, length: 4484

        blkref #1: rel 1663/16384/16960 fork main blk 33 (FPW); hole: offset: 548, length: 4484

        blkref #2: rel 1663/16384/16960 fork main blk 34 (FPW); hole: offset: 548, length: 4484

rmgr: Heap        len (rec/tot):    188/   188, tx:        977, lsn: 0/019781D0, prev 0/019755E0, desc: INPLACE off 23

which clearly seems unnecessary. Looking at the code it seems to me that

static void
XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
{
...
            printf("\tblkref #%u: rel %u/%u/%u fork %s blk %u",
                   block_id,
                   rnode.spcNode, rnode.dbNode, rnode.relNode,
                   forkNames[forknum],
                   blk);
            if (XLogRecHasBlockImage(record, block_id))
            {
                if (record->blocks[block_id].bimg_info &
                    BKPIMAGE_IS_COMPRESSED)
                {
                    printf(" (FPW%s); hole: offset: %u, length: %u, "
                           "compression saved: %u\n",
                           XLogRecBlockImageApply(record, block_id) ?
                           "" : " for WAL verification",
                           record->blocks[block_id].hole_offset,
                           record->blocks[block_id].hole_length,
                           BLCKSZ -
                           record->blocks[block_id].hole_length -
                           record->blocks[block_id].bimg_len);
                }
                else
                {
                    printf(" (FPW%s); hole: offset: %u, length: %u\n",
                           XLogRecBlockImageApply(record, block_id) ?
                           "" : " for WAL verification",
                           record->blocks[block_id].hole_offset,
                           record->blocks[block_id].hole_length);
                }
            }
            putchar('\n');

was intended to not actually print a newline in the printfs in the if
preceding the putchar.

This is a fairly longstanding bug, introduced in:

commit 2c03216d831160bedd72d45f712601b6f7d03f1c
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   2014-11-20 17:56:26 +0200

    Revamp the WAL record format.


Does anybody have an opinion about fixing it just in master or also
backpatching it? I guess there could be people having written parsers
for the waldump output?  I'm inclined to backpatch.


I also find a second minor bug:

static void
XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
{
...
    const char *id;
...
    id = desc->rm_identify(info);
    if (id == NULL)
        id = psprintf("UNKNOWN (%x)", info & ~XLR_INFO_MASK);
...
    printf("desc: %s ", id);

after that "id" is not referenced anymore. Which means we would leak
memory if there were a lot of UNKNOWN records. This is from
commit 604f7956b9460192222dd37bd3baea24cb669a47
Author: Andres Freund <andres@anarazel.de>
Date:   2014-09-22 16:48:14 +0200

    Improve code around the recently added rm_identify rmgr callback.

While not a lot of memory, it's not absurd to run pg_waldump against a
large amount of WAL, so backpatching seems mildly advised.

I'm inlined to think that the best fix is to just move the relevant code
to the callsite, and not psprintf'ing into a temporary buffer. We'd need
additional state to free the memory, as rm_identify returns a static
buffer.

So I'll make it

    id = desc->rm_identify(info);
    if (id == NULL)
        printf("desc: UNKNOWN (%x) ", info & ~XLR_INFO_MASK);
    else
        printf("desc: %s ", id);

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: PL/Python fails on new NetBSD/PPC 8.0 install
Next
From: Peter Geoghegan
Date:
Subject: Re: Connections hang indefinitely while taking a gin index's LWLockbuffer_content lock(PG10.7)