Re: 9.2 recovery/startup problems - Mailing list pgsql-hackers

From Robert Haas
Subject Re: 9.2 recovery/startup problems
Date
Msg-id CA+TgmobzTH0yz=9VX0yB5JShwWAQp3oezqhSPxLR9Y28OGwfhw@mail.gmail.com
In response to Re: 9.2 recovery/startup problems  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Tue, Dec 2, 2014 at 11:54 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> During abort processing after getting a SIGTERM, the back end truncates
> 59288 to zero size, and unlinks all the other files (including 59288_init).
> The actual removal of 59288 is left until the checkpoint.  So if you SIGTERM
> the backend, then take down the server uncleanly before the next checkpoint
> completes, you are left with just 59288.
>
> Here is the strace:
>
> open("base/16416/59288", O_RDWR)        = 8
> ftruncate(8, 0)                         = 0
> close(8)                                = 0
> unlink("base/16416/59288.1")            = -1 ENOENT (No such file or directory)
> unlink("base/16416/59288_fsm")          = -1 ENOENT (No such file or directory)
> unlink("base/16416/59288_vm")           = -1 ENOENT (No such file or directory)
> unlink("base/16416/59288_init")         = 0
> unlink("base/16416/59288_init.1")       = -1 ENOENT (No such file or directory)

Hmm, that's not good.

I guess we can either adopt your suggestion of adjusting
ResetUnloggedRelationsInDbspaceDir() to cope with the possibility that
the situation has changed during recovery, or else figure out how to
be more stringent about the order in which forks get removed.
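
To make the first option concrete, here is a toy model in Python rather
than PostgreSQL code. It replays the file operations from the trace in a
scratch directory, then runs a reset pass that tolerates an init fork
that has disappeared since it was listed. The helper names
(abort_cleanup, reset_from_init, demo) and the list-then-copy structure
are assumptions made for illustration; they are not taken from
ResetUnloggedRelationsInDbspaceDir().

```python
import os
import shutil
import tempfile


def abort_cleanup(dbdir, rel):
    """Replay the traced calls: truncate the main fork, unlink the others."""
    with open(os.path.join(dbdir, rel), "r+b") as f:
        f.truncate(0)
    for suffix in (".1", "_fsm", "_vm", "_init", "_init.1"):
        try:
            os.unlink(os.path.join(dbdir, rel + suffix))
        except FileNotFoundError:
            pass  # corresponds to the ENOENT results in the trace


def reset_from_init(dbdir, rels):
    """Copy each <rel>_init over <rel>, skipping init forks that vanished.

    `rels` stands in for a list gathered earlier (hypothetical structure).
    Returns the names that were skipped.
    """
    skipped = []
    for rel in rels:
        try:
            shutil.copyfile(os.path.join(dbdir, rel + "_init"),
                            os.path.join(dbdir, rel))
        except FileNotFoundError:
            skipped.append(rel)  # the situation changed since the listing
    return skipped


def demo():
    with tempfile.TemporaryDirectory() as dbdir:
        # Start with a main fork and an init fork, as for an unlogged table.
        for name, data in (("59288", b"heap"), ("59288_init", b"init")):
            with open(os.path.join(dbdir, name), "wb") as f:
                f.write(data)
        stale = ["59288"]  # init forks seen before the abort ran
        abort_cleanup(dbdir, "59288")
        left = sorted(os.listdir(dbdir))
        size = os.path.getsize(os.path.join(dbdir, "59288"))
        skipped = reset_from_init(dbdir, stale)
        return left, size, skipped
```

In this model the scratch directory ends up holding only a zero-length
59288, and the reset pass reports the relation as skipped instead of
failing on the missing 59288_init.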

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


