Re: pg16: invalid page/page verification failed - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: pg16: invalid page/page verification failed
Date
Msg-id ZSAhKVk9GOO8iL6N@pryzbyj2023
Whole thread Raw
In response to Re: pg16: invalid page/page verification failed  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Fri, Oct 06, 2023 at 09:20:05AM +0900, Michael Paquier wrote:
> On Thu, Oct 05, 2023 at 11:45:18AM -0500, Justin Pryzby wrote:
> > This table is what it sounds like: a partition into which CSV logs are
> > COPY'ed.  It would've been created around 8am.  There's no special
> > params set for the table nor for autovacuum.
> 
> This may be an important bit of information.  31966b151e6a is new as
> of Postgres 16, has changed the way relations are extended and COPY
> was one area touched.  I am adding Andres in CC.

Also, I realized that someone kicked off a process just after 9am which
would've done a lot of INSERT ON CONFLICT DO UPDATE, VACUUM FULL, and
VACUUM.  Which consumed and dirtied buffers about 100x faster than normal.

log_time         | 2023-10-05 10:00:55.794-05
pid              | 31754
left             | duration: 51281.001 ms  statement: VACUUM (FULL,FREEZE) othertable...

log_time         | 2023-10-05 10:01:01.784-05
backend_type     | checkpointer
left             | checkpoint starting: time

log_time         | 2023-10-05 10:01:02.935-05
pid              | 10023
left             | page verification failed, calculated checksum 5074 but expected 5050
context          | COPY postgres_log, line 947
left             | COPY postgres_log FROM '/var/log/postgresql/postgresql-2023-10-05_095600.csv' WITH csv

log_time         | 2023-10-05 10:01:02.935-05
pid              | 10023
left             | invalid page in block 119 of relation base/16409/801594131
context          | COPY postgres_log, line 947
left             | COPY postgres_log FROM '/var/log/postgresql/postgresql-2023-10-05_095600.csv' WITH csv

log_time         | 2023-10-05 10:01:11.636-05
pid              | 31754
left             | duration: 15838.374 ms  statement: VACUUM (FREEZE) othertable...

I meant to point out that the issue is on the last block.

postgres=# SELECT pg_relation_size('"BROKEN_postgres_log_2023_10_05_0900"')/8192;
?column? | 120

It sounds like there may be an issue locking (pinning?) a page, or
rather not locking it, or releasing the lock too early.

-- 
Justin



pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: Add const to values and nulls arguments
Next
From: Dagfinn Ilmari Mannsåker
Date:
Subject: Re: typo in couple of places