Re: [HACKERS] Unlogged tables cleanup - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [HACKERS] Unlogged tables cleanup
Date	May 23, 2019 16:14:59
Msg-id	CA+TgmoY7oBoeuF5UaLRpx2SgcGVs4iB0UJcTqjkpoQ2S5sx9ug@mail.gmail.com
In response to	Re: [HACKERS] Unlogged tables cleanup (Michael Paquier <michael@paquier.xyz>)
Responses	Re: [HACKERS] Unlogged tables cleanup
List	pgsql-hackers

On Thu, May 23, 2019 at 2:43 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, May 21, 2019 at 08:39:18AM -0400, Robert Haas wrote:
> > Yes.  I thought I had described it.  You create an unlogged table,
> > with an index of a type that does not smgrimmedsync(), your
> > transaction commits, and then the system crashes, losing the _init
> > fork for the index.
>
> The init forks won't magically go away, except in one case for empty
> routines not going through shared buffers.

No magic is required.  If you haven't called fsync(), the file might
not be there after a system crash.

Going through shared_buffers guarantees that the file will be
fsync()'d before the next checkpoint, but I'm talking about a scenario
where you crash before the next checkpoint.

> Then, empty routines going through shared buffers fill in one or more
> buffers, mark it/them as empty, dirty it/them, log the page(s) and then
> unlock the buffer(s).  If a crash happens after the transaction
> commits, so we would still have the init page in WAL, and at the end
> of recovery we would know about it.

Yeah, but the problem is that the current system requires us to know
about it at the *beginning* of recovery.  See my earlier remarks:

Suppose we create an unlogged table and then crash. The main fork
makes it to disk, and the init fork does not.  Before WAL replay, we
remove any main forks that have init forks, but because the init fork
was lost, that does not happen.  Recovery recreates the init fork.
After WAL replay, we try to copy_file() each _init fork to the
corresponding main fork. That fails, because copy_file() expects to be
able to create the target file, and here it can't do that because it
already exists.
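
The sequence above can be walked through with a toy model. This is only
a sketch under loud assumptions: a Python set of file names stands in
for the data directory, "16384" is an arbitrary relation file name, and
every function name except copy_file is made up for illustration. The
copy_file here is a stand-in that simply refuses to overwrite an
existing target, as described above; it is not the real function.

```python
# Toy model only: a set of file names stands in for the data directory.
# None of this is PostgreSQL code; the names are hypothetical.

class TargetExists(Exception):
    """Raised when the copy target is already present."""


def reset_before_replay(files):
    """Before WAL replay: remove each main fork that has an _init fork."""
    for name in sorted(files):
        if name.endswith("_init"):
            files.discard(name[:-len("_init")])


def replay_wal(files, init_forks_in_wal):
    """WAL replay recreates any init fork that was WAL-logged."""
    files.update(init_forks_in_wal)


def copy_file(files, src, dst):
    """Stand-in copy: expects to create the target, so fails if it exists."""
    if dst in files:
        raise TargetExists(dst)
    files.add(dst)


def reset_after_replay(files):
    """After WAL replay: copy each _init fork to its main fork."""
    for name in sorted(files):
        if name.endswith("_init"):
            copy_file(files, name, name[:-len("_init")])


def crash_recovery(files_on_disk, init_forks_in_wal):
    files = set(files_on_disk)
    reset_before_replay(files)
    replay_wal(files, init_forks_in_wal)
    reset_after_replay(files)
    return files
```

In this model, crash_recovery({"16384", "16384_init"}, {"16384_init"})
succeeds, because the main fork is removed up front and then recreated
from the init fork. crash_recovery({"16384"}, {"16384_init"}), the case
where the init fork was lost in the crash, raises TargetExists: nothing
removed the stale main fork before replay brought the init fork back.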

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
