Thread: BUG #13287: Database corruption - PANIC: could not fsync file "pg_replslot/[Slot]/state": Bad file descriptor

The following bug has been logged on the website:

Bug reference:      13287
Logged by:          Hillel Eilat
Email address:      Hillel.Eilat@attunity.com
PostgreSQL version: 9.4.1
Operating system:   Windows 7
Description:

Hello

I am in the process of developing a replication application using PostgreSQL
9.4.1 on a "Windows 7" platform.

The "Logical Decoding" feature is used for extracting database changes in
real time.
A dedicated replication slot - say [Replication Slot Name] - is used for
controlling this flow.
Everything works fine until the PostgreSQL service is stopped.
If the PostgreSQL service is stopped cleanly while an active replication
slot exists, any further attempt to restart it fails with:

PostgreSQL - PANIC:  could not fsync file "pg_replslot/[Replication Slot
Name]/state": Bad file descriptor

In effect, the database in question is now corrupted / no longer
operational.
Tonight this also occurred after an unexpected machine shutdown.

This misbehavior now occurs consistently whenever the PostgreSQL service
is stopped while active replication slots are defined.

Your help will be appreciated.

Hillel.
Hi,

On 2015-05-14 10:55:14 +0000, Hillel.Eilat@attunity.com wrote:
> I am in the process of developing a replication application using PostgreSQL
> 9.4.1 on a "Windows 7" platform.
>
> The "Logical Decoding" feature is used for extracting database changes in
> real time.
> A dedicated replication slot - say [Replication Slot Name] - is used for
> controlling this flow.
> Everything works fine until the PostgreSQL service is stopped.
> If the PostgreSQL service is stopped cleanly while an active replication
> slot exists, any further attempt to restart it fails with:
>
> PostgreSQL - PANIC:  could not fsync file "pg_replslot/[Replication Slot
> Name]/state": Bad file descriptor

> In effect, the database in question is now corrupted / no longer
> operational.
> Tonight this also occurred after an unexpected machine shutdown.
>
> This misbehavior now occurs consistently whenever the PostgreSQL service
> is stopped while active replication slots are defined.

This is a known and already fixed problem that only occurs on Windows
(unfortunately, Windows does not allow fsyncing a read-only file
handle). The next 9.4 release will be made public on the 21st.
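
To illustrate the underlying issue, here is a minimal Python sketch (not the
actual PostgreSQL code). It assumes the general remedy is to open the file
with write access before syncing it; the helper name fsync_path is
hypothetical. As far as I know, os.fsync on Windows goes through the C
runtime's _commit(), which rejects handles opened read-only, while POSIX
systems accept either kind of handle.

```python
import os


def fsync_path(path: str) -> None:
    """Flush a file to disk, opening it read-write so the sync is permitted.

    Hypothetical helper for illustration only. Opening with O_RDONLY
    instead is the pattern that is assumed to fail on Windows with
    "Bad file descriptor".
    """
    # O_BINARY exists only on Windows; fall back to 0 elsewhere.
    flags = os.O_RDWR | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

Opening read-write costs nothing on platforms that would also accept a
read-only handle, so the same code path can be used everywhere.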

Until then there are basically three ways to work around the problem:
1) Start the database with fsync turned off, then immediately after
   startup turn it back on and reload the config. If you're concerned
   about the data loss window that could theoretically occur, you can
   use initdb -S.
2) Remove the slot by simply removing its directory. Obviously that's
   problematic because the slot is gone in that case.
3) Apply the fix
   (http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=dfbaed459754e71e01bb0cc90a12802bba3f9786)
   and recompile PostgreSQL. Unfortunately that's not that easy on Windows.
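
Workaround 1 could be scripted roughly as follows. This is only a sketch:
the helper set_fsync is hypothetical, it assumes fsync is set in the main
postgresql.conf (not overridden by an include file or postgresql.auto.conf),
and <datadir> stands for your data directory.

```python
import re

# Matches an active (uncommented) fsync setting line in postgresql.conf.
_FSYNC_LINE = re.compile(r"^\s*fsync\b", re.IGNORECASE)


def set_fsync(conf_text: str, enabled: bool) -> str:
    """Return the config text with exactly one active fsync setting."""
    # Drop existing active fsync lines; commented-out ones are kept.
    kept = [ln for ln in conf_text.splitlines() if not _FSYNC_LINE.match(ln)]
    kept.append("fsync = " + ("on" if enabled else "off"))
    return "\n".join(kept) + "\n"


# Intended sequence (commands shown as comments, not executed here):
#   1. write set_fsync(text, False) to postgresql.conf
#      pg_ctl start -D <datadir>      (or start the Windows service)
#   2. write set_fsync(text, True) to postgresql.conf
#      pg_ctl reload -D <datadir>
#   3. optionally: initdb -S -D <datadir>
```

The function only rewrites text, so the result can be inspected before
anything is written to disk.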

Hope that helps?

Andres