Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot
Msg-id 20150427151229.GG18789@awork2.anarazel.de
In response to Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
On 2015-04-27 11:44:47 -0300, Alvaro Herrera wrote:
> I think this is failing in the fsync_fname() call in slot.c line 1045
> (REL9_4_STABLE).

Patrice has since replied with log_error_verbosity=verbose logs, but
that reply is probably still stuck in moderation:

> 2015-04-25 14:25:59 EDT LOG:  00000: database system was shut down at 2015-04-25 14:25:39 EDT
> 2015-04-25 14:25:59 EDT LOCATION:  StartupXLOG, src\backend\access\transam\xlog.c:6011
> 2015-04-25 14:25:59 EDT PANIC:  XX000: could not fsync file "pg_replslot/node_win2008sec/state": Bad file descriptor
> 2015-04-25 14:25:59 EDT LOCATION:  RestoreSlotFromDisk, src\backend\replication\slot.c:1115
> 2015-04-25 14:25:59 EDT LOG:  00000: startup process (PID 2696) was terminated by exception 0xC0000409
> 2015-04-25 14:25:59 EDT HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
> 2015-04-25 14:25:59 EDT LOG:  00000: aborting startup due to startup process failure
> 2015-04-25 14:25:59 EDT LOCATION:  reaper, src\backend\postmaster\postmaster.c:2604

So it looks to me like it's a straight pg_fsync() failing. Given that
the open apparently succeeded, I'm unsure how that could be. The error
message ("Bad file descriptor") appears to be an EBADF.

Hm. I wonder if it's maybe that the file is opened with O_RDONLY? The
OSs I have access to don't care - for good reason imo, fsync isn't a
write - but it's not inconceivable that Windows might.  I very dimly
remember that that was a problem before at some point. Yep:
http://archives.postgresql.org/message-id/10494.1266903446%40sss.pgh.pa.us

So that's easily enough fixed.
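
For anyone wanting to check the O_RDONLY theory on their own platform,
here is a minimal sketch. It is not PostgreSQL code: the helper names
try_fsync() and demo() are made up for illustration, and a scratch file
stands in for pg_replslot/<slot>/state. It opens the same file once
read-only and once read-write, calls fsync on each descriptor, and
reports the errno. The assumption being tested is that some platforms
(Windows in particular, if the theory above is right) refuse to fsync a
descriptor that was not opened for writing.

```python
import errno
import os
import tempfile


def try_fsync(path, flags):
    """Open path with the given flags, fsync it, return 0 or the errno."""
    fd = os.open(path, flags)
    try:
        os.fsync(fd)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)


def demo():
    """Compare fsync on a read-only and a read-write descriptor."""
    # Scratch file standing in for the slot state file (hypothetical).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"slot state")
        path = tmp.name
    try:
        return {
            "O_RDONLY": try_fsync(path, os.O_RDONLY),
            "O_RDWR": try_fsync(path, os.O_RDWR),
        }
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for mode, err in demo().items():
        status = "ok" if err == 0 else errno.errorcode.get(err, str(err))
        print(f"fsync after open({mode}): {status}")
```

If the read-only case reports EBADF while the read-write case succeeds,
that would be consistent with the theory, and opening the state file
read-write before syncing it would be one plausible fix. That is my
reading of "easily enough fixed", not something confirmed in this thread.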

Greetings,

Andres Freund
