Re: "PANIC: could not open critical system index 2662" - twice - Mailing list pgsql-general

From Jeffrey Walton
Subject Re: "PANIC: could not open critical system index 2662" - twice
Msg-id CAH8yC8=dCVPt43NJGnYHKzf5vBg3BP8km6YsP85cAcvoiEpwyg@mail.gmail.com
In response to Re: "PANIC: could not open critical system index 2662" - twice  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: "PANIC: could not open critical system index 2662" - twice
List pgsql-general
On Sat, May 6, 2023 at 6:35 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Sat, May 6, 2023 at 9:58 PM Evgeny Morozov
> <postgresql3@realityexists.net> wrote:
> > Right - I should have realised that! base/1414389/2662 is indeed all
> > nulls, 32KB of them. I included the file anyway in
> > https://objective.realityexists.net/temp/pgstuff2.zip
>
> OK so it's not just page 0, you have 32KB or 4 pages of all zeroes.
> That's the expected length of that relation when copied from the
> initial template, and consistent with the pg_waldump output (it uses
> FPIs to copy blocks 0-3).  We can't see the block contents but we know
> that block 2 definitely is not all zeroes at that point because there
> are various modifications to it, which not only write non-zeroes but
> must surely have required a sane page 0.
>
> So it does indeed look like something unknown has replaced 32KB of
> data with 32KB of zeroes underneath us.

This may be related... I seem to recall the Gnulib folks talking about
a cp bug with sparse files. It looks like it may be fixed in coreutils
release 9.2 (2023-03-20):
https://github.com/coreutils/coreutils/blob/master/NEWS#L233

If I recall correctly, it had something to do with the way
copy_file_range worked (or, perhaps, did not work as expected).

According to the Gnulib docs
(https://www.gnu.org/software/gnulib/manual/html_node/copy_005ffile_005frange.html):

    This function has many problems on Linux
    kernel versions before 5.3
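
To see whether a given machine falls on the older side of either version
mentioned above, something like the following rough sketch might help. It
assumes GNU coreutils (for "cp --version" and "sort -V"); the version_lt
helper is a name made up for illustration, and the two thresholds (kernel
5.3, coreutils 9.2) are simply the ones quoted above, not a confirmed
diagnosis of this corruption.

```shell
#!/bin/sh
# version_lt A B: succeeds if version A sorts strictly before version B.
# (Hypothetical helper; relies on GNU "sort -V" for version ordering.)
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Kernel release with any distro suffix removed, e.g. 5.15.0-91-generic -> 5.15.0
kernel=$(uname -r | cut -d- -f1)

# Last field of the first line of "cp --version", e.g. "cp (GNU coreutils) 9.1" -> 9.1
coreutils=$(cp --version 2>/dev/null | head -n 1 | awk '{print $NF}')

if version_lt "$kernel" 5.3; then
    echo "kernel $kernel is older than 5.3"
fi

if [ -n "$coreutils" ] && version_lt "$coreutils" 9.2; then
    echo "coreutils $coreutils is older than 9.2"
fi
```

No output means neither version is older than its threshold; it does not
by itself rule anything in or out.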

> Are there more non-empty
> files that are all-zeroes?  Something like this might find them:
>
> for F in base/1414389/*
> do
>   if [ -s "$F" ] && ! xxd -p "$F" | grep -qEv '^(00)*$'
>   then
>     echo "$F"
>   fi
> done
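
If xxd is not installed, the same check can probably be done by comparing
each file against /dev/zero. This is only a sketch under a few assumptions:
cmp supports "-n" (GNU diffutils does), is_all_zero is a helper name
invented here, and the directory is the one from the loop quoted above.

```shell
#!/bin/sh
# is_all_zero FILE: succeeds if FILE is a non-empty regular file that
# contains only zero bytes. (Hypothetical helper for illustration.)
is_all_zero() {
    [ -f "$1" ] || return 1
    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$1") ))
    # Compare exactly $size bytes of the file with the same number of
    # zero bytes read from /dev/zero.
    [ "$size" -gt 0 ] && cmp -s -n "$size" "$1" /dev/zero
}

for F in base/1414389/*
do
    if is_all_zero "$F"
    then
        echo "$F"
    fi
done
```

Run from the data directory, this prints the same kind of list as the xxd
loop, without building a hex dump of every file.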