Re: "PANIC: could not open critical system index 2662" - twice - Mailing list pgsql-general

From	Dilip Kumar
Subject	Re: "PANIC: could not open critical system index 2662" - twice
Date	May 8, 2023 13:45:20
Msg-id	CAFiTN-uKc46MGgkCB9Dim14_Xq23NQznZXSLRdZ-hgjxVBGRYA@mail.gmail.com
In response to	Re: "PANIC: could not open critical system index 2662" - twice (Michael Paquier <michael@paquier.xyz>)
Responses	Re: "PANIC: could not open critical system index 2662" - twice
List	pgsql-general

On Mon, May 8, 2023 at 7:55 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, May 07, 2023 at 10:30:52PM +1200, Thomas Munro wrote:
> > Bug-in-PostgreSQL explanations could include that we forgot it was
> > dirty, or some backend wrote it out to the wrong file; but if we were
> > forgetting something like permanent or dirty, would there be a more
> > systematic failure?  Oh, it could require special rare timing if it is
> > similar to 8a8661828's confusion about permanence level or otherwise
> > somehow not setting BM_PERMANENT, but in the target blocks, so I think
> > that'd require a checkpoint AND a crash.  It doesn't reproduce for me,
> > but perhaps more unlucky ingredients are needed.
> >
> > Bug-in-OS/FS explanations could include that a whole lot of writes
> > were mysteriously lost in some time window, so all those files still
> > contain the zeroes we write first in smgrextend().  I guess this
> > previously rare (previously limited to hash indexes?) use of sparse
> > file hole-punching could be a factor in an it's-all-ZFS's-fault
> > explanation:
>
> Yes, you would need a bit of all that.
>
> I can reproduce the same backtrace here.  That's just my usual laptop
> with ext4, so this would be a Postgres bug.  First, here are the four
> things I run in parallel to get a failure when loading a critical
> index on connection:
> 1) Create and drop a database with WAL_LOG as strategy and the
> regression database as template:
> while true; do
>   createdb --template=regression --strategy=wal_log testdb;
>   dropdb testdb;
> done
> 2) Feeding more data to pg_class in the middle, while testing the
> connection to the database created:
> while true; do
>   psql -c 'create table popo as select 1 as a;' regression > /dev/null 2>&1
>   psql testdb -c "select 1" > /dev/null 2>&1
>   psql -c 'drop table popo' regression > /dev/null 2>&1
>   psql testdb -c "select 1" > /dev/null 2>&1
> done
> 3) Force some checkpoints:
> while true; do psql -c 'checkpoint' > /dev/null 2>&1; sleep 4; done
> 4) Force a few crashes and recoveries:
> while true ; do pg_ctl stop -m immediate ; pg_ctl start ; sleep 4 ; done
>

I am able to reproduce this using the steps given above, and I am
analyzing it further.  I will send an update once I have a clue.
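
For anyone else trying this, the four quoted loops can probably be
driven from a single script instead of four terminals.  This is only a
rough sketch, not what was actually run: the function names and the
RUN_REPRO guard are made up for illustration, and it assumes PGDATA
points at a throwaway cluster that already has the "regression"
database.

```shell
#!/usr/bin/env bash
# Sketch: run the four quoted reproduction loops concurrently.
# Assumptions (not from the original mail): function names, the RUN_REPRO
# guard, and that PGDATA is set for pg_ctl and the cluster is disposable.

create_drop_loop() {        # step 1: create/drop with the WAL_LOG strategy
  while true; do
    createdb --template=regression --strategy=wal_log testdb
    dropdb testdb
  done
}

pg_class_churn_loop() {     # step 2: touch pg_class, then test the connection
  while true; do
    psql -c 'create table popo as select 1 as a;' regression
    psql testdb -c 'select 1'
    psql -c 'drop table popo' regression
    psql testdb -c 'select 1'
  done
}

checkpoint_loop() {         # step 3: force checkpoints
  while true; do psql -c 'checkpoint'; sleep 4; done
}

crash_loop() {              # step 4: force crashes and recoveries
  while true; do pg_ctl stop -m immediate; pg_ctl start; sleep 4; done
}

# Hypothetical guard: nothing is started unless RUN_REPRO=1 is set.
if [ "${RUN_REPRO:-0}" = "1" ]; then
  # On Ctrl-C or TERM, stop the background loops started below.
  trap 'kill $(jobs -p) 2>/dev/null' INT TERM
  create_drop_loop    >/dev/null 2>&1 &
  pg_class_churn_loop >/dev/null 2>&1 &
  checkpoint_loop     >/dev/null 2>&1 &
  crash_loop &
  wait
fi
```

Run it as "RUN_REPRO=1 ./repro.sh" (the file name is arbitrary) and
watch the server log for the PANIC.  Note that this crashes the server
on purpose, so it should never be pointed at a cluster holding real
data.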

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
