Home > mailing lists

Fix checkpointer PANIC due to missing fsync cancel in mdunlinkfork() - Mailing list pgsql-hackers

From	SATYANARAYANA NARLAPURAM
Subject	Fix checkpointer PANIC due to missing fsync cancel in mdunlinkfork()
Date	April 14 04:55:28
Msg-id	CAHg+QDdH+uBc0deP7uPE5rBkn0EjdkVC2_8ykK3KV90V4z6_5Q@mail.gmail.com Whole thread Raw
List	pgsql-hackers

Tree view

Hi hackers,

I think I found a bug in mdunlinkfork() that causes the checkpointer

to PANIC with "could not fsync file ... No such file or directory" when

a relation is moved between tablespaces and the old tablespace is dropped.

mdunlinkfork() has two code paths for unlinking a relation segment file:
(1) The 'if' branch (isRedo, binary upgrade, non-MAIN forknum, or temp
relations) correctly calls register_forget_request() to cancel any pending

fsync before unlinking. (2) The 'else' branch that handles normal

MAIN_FORKNUM unlink is missing register_forget_request().

This leaves a stale SYNC_REQUEST entry in the checkpointer's

pendingOps hash table.

The comment in ProcessSyncRequests() states below:

* The fsync table could contain requests to fsync segments that
* have been deleted (unlinked) by the time we get to them. Rather
* than just hoping an ENOENT (or EACCES on Windows) error can be
* ignored, what we do on error is absorb pending requests and
* then retry. Since mdunlink() queues a "cancel" message before
* actually unlinking, the fsync request is guaranteed to be
* marked canceled after the absorb if it really was this case.
* DROP DATABASE likewise has to tell us to forget fsync requests
* before it starts deletions.
*/

but this guarantee is not provided by the else branch.

Repro:

Reproducing this is not easy as it is a race but the test below

when repeated sufficient number of times you can see ocasionally

checkpointer PANIC. Though I used TABLESPACE for somewhat

consistent repro, this race exists for operations like DROP table,

TRUNCATE table as well I believe.

CREATE TABLESPACE ts LOCATION '/tmp/ts';
CREATE TABLE t (id int, pad text) TABLESPACE ts;
INSERT INTO t SELECT g, repeat('x', 500) FROM generate_series(1,50000) g;

-- Concurrent updates to dirty shared buffers (background sessions)
UPDATE t SET pad = repeat('y', 500) WHERE id <= 25000;

-- Move table away; FlushRelationBuffers sends SYNC_REQUESTs for old path
ALTER TABLE t SET TABLESPACE pg_default;

-- Drop tablespace forces checkpoint + removes directory
DROP TABLESPACE ts;

-- Next checkpoint hits stale entry -> PANIC
CHECKPOINT;

PANIC: could not fsync file "pg_tblspc/41343/PG_19_202604061/5/41348": No such file or directory
LOG: checkpointer process (PID 166749) was terminated by signal 6: Aborted
LOG: terminating any other active server processes
LOG: all server processes terminated; reinitializing

The fix is to add register_forget_request() in the else branch of
mdunlinkfork(), before register_unlink_segment(), matching what the
'if' branch already does.

Thoughts?

Thanks,

Satya

Attachment

0001-Fix-checkpointer-PANIC-due-to-missing-fsync-cancel-i.patch

pgsql-hackers by date:

From: Chengpeng Yan
Date: 14 April, 04:34:02
Subject: Re: [PATCH] ANALYZE: hash-accelerate MCV tracking for equality-only types

From: Chao Li
Date: 14 April, 05:49:38
Subject: Re: Bug: Rule actions see wrong values for generated columns (NEW.gen reads OLD value)

Fix checkpointer PANIC due to missing fsync cancel in mdunlinkfork() - Mailing list pgsql-hackers

Attachment

Previous

Next