BUG #4879: bgwriter fails to fsync the file in recovery mode - Mailing list pgsql-bugs

From Fujii Masao
Subject BUG #4879: bgwriter fails to fsync the file in recovery mode
Msg-id 200906251255.n5PCt77V016240@wwwmaster.postgresql.org
List pgsql-bugs
The following bug has been logged online:

Bug reference:      4879
Logged by:          Fujii Masao
Email address:      masao.fujii@gmail.com
PostgreSQL version: 8.4dev
Operating system:   RHEL5.1 x86_64
Description:        bgwriter fails to fsync the file in recovery mode
Details:

A restartpoint performed by bgwriter in recovery mode failed with the following error.

    ERROR:  could not fsync segment 0 of relation base/11564/16422_fsm: No such file or directory

The following procedure can reproduce this error.

(1) create a warm-standby environment
(2) execute "pgbench -i -s10"
(3) execute the following SQL statements

    TRUNCATE pgbench_accounts ;
    TRUNCATE pgbench_branches ;
    TRUNCATE pgbench_history ;
    TRUNCATE pgbench_tellers ;
    CHECKPOINT ;
    SELECT pg_switch_xlog();

(4) wait a minute; the next restartpoint should then fail with the error
    on the standby server.


Whether the error occurs depends on the timing of the operations, so
you might need to repeat steps (2) and (3) in order to reproduce it.

I suspect that the cause of this error is a race condition between
file deletion by the startup process and fsync by bgwriter: replaying a
TRUNCATE xlog record deletes the corresponding file immediately, even
though bgwriter might still have it scheduled for fsync. Should we leave
the actual file deletion to bgwriter instead of the startup process, as
in normal mode?
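
To make the suspected race concrete, here is a toy model in Python. It is
not PostgreSQL code: ToyBgwriter, replay_truncate, run and the two request
queues are hypothetical names invented for this illustration, and the real
bgwriter bookkeeping is more involved. It only shows that unlinking a file
while an fsync request for it is still queued produces "No such file or
directory", and that handing the unlink to the same process that performs
the fsync avoids it.

```python
import os
import tempfile


class ToyBgwriter:
    """Toy stand-in for bgwriter: holds pending fsync and unlink requests."""

    def __init__(self):
        self.pending_fsync = []
        self.pending_unlink = []

    def request_fsync(self, path):
        self.pending_fsync.append(path)

    def request_unlink(self, path):
        self.pending_unlink.append(path)

    def restartpoint(self):
        # Sync every queued file first, then remove files queued for deletion.
        for path in self.pending_fsync:
            fd = os.open(path, os.O_RDWR)  # raises FileNotFoundError if gone
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
        self.pending_fsync.clear()
        for path in self.pending_unlink:
            os.unlink(path)
        self.pending_unlink.clear()


def replay_truncate(bgwriter, path, defer_unlink):
    """Toy 'startup process' replaying a TRUNCATE record."""
    if defer_unlink:
        # Proposed behaviour (assumption): let bgwriter delete the file.
        bgwriter.request_unlink(path)
    else:
        # Suspected current behaviour: delete immediately.
        os.unlink(path)


def run(defer_unlink):
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "16422_fsm")
        with open(path, "wb") as f:
            f.write(b"x")
        bg = ToyBgwriter()
        bg.request_fsync(path)          # dirty file is queued for fsync
        replay_truncate(bg, path, defer_unlink)
        try:
            bg.restartpoint()
            return "ok"
        except FileNotFoundError:
            return "could not fsync: No such file or directory"
```

In this model, run(False) takes the failing path and run(True) completes
the restartpoint. An alternative design would be to cancel the queued
fsync requests for a file when it is deleted; the sketch does not cover
that variant.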
