Re: Error with index on unlogged table - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Error with index on unlogged table
Date	March 26, 2015 17:50:33
Msg-id	20150326175024.GJ451@alap3.anarazel.de
In response to	Re: Error with index on unlogged table (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: Error with index on unlogged table
List	pgsql-hackers

On 2015-03-26 15:13:41 +0100, Andres Freund wrote:
> On 2015-03-26 13:55:22 +0000, Thom Brown wrote:
> > I still, however, have a problem with the separate and original issue of:
> > 
> > # insert into utest (thing) values ('moomoo');
> > ERROR:  index "utest_pkey" contains unexpected zero page at block 0
> > HINT:  Please REINDEX it.
> > 
> > I don't see why the user should need to go re-indexing all unlogged tables
> > each time a standby is promoted.  The index should just be empty and ready
> > to use.
> 
> There's definitely something rather broken here. Investigating.

As far as I can see this has been broken at least since the introduction
of fast promotion. WAL replay will update the init fork in shared
memory, but it'll not be guaranteed to be flushed to disk when the reset
happens. d3586fc8a et al. then also made it possible to hit the issue
without fast promotion.

To hit the issue, there must not have been a restartpoint (which requires
a checkpoint on the primary) since the creation of the unlogged table.
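
To make the failure mode concrete, here is a toy model of it. It is only
a sketch: every name in it (ToyInitFork, toy_replay_newpage, and so on)
is hypothetical and none of it is PostgreSQL code. It assumes that replay
only touches the in-memory copy of the init fork page, that a restartpoint
is what writes it out, and that the reset copies from whatever is on disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical toy model: one init fork page, held in two places. */
typedef struct
{
    char disk[TOY_PAGE_SIZE];   /* what is durably on disk */
    char buffer[TOY_PAGE_SIZE]; /* the copy in shared memory */
    bool dirty;                 /* buffer not yet written to disk */
} ToyInitFork;

/* WAL replay of the init fork page: only the in-memory copy changes. */
static void
toy_replay_newpage(ToyInitFork *f, const char *page)
{
    memcpy(f->buffer, page, TOY_PAGE_SIZE);
    f->dirty = true;
}

/* A restartpoint writes dirty buffers out to disk. */
static void
toy_restartpoint(ToyInitFork *f)
{
    if (f->dirty)
    {
        memcpy(f->disk, f->buffer, TOY_PAGE_SIZE);
        f->dirty = false;
    }
}

/* The reset copies the main fork from the init fork as it is on disk. */
static void
toy_reset_unlogged(const ToyInitFork *f, char *main_fork)
{
    memcpy(main_fork, f->disk, TOY_PAGE_SIZE);
}

/* True if the page consists only of zero bytes. */
static bool
toy_page_is_zero(const char *page)
{
    for (int i = 0; i < TOY_PAGE_SIZE; i++)
        if (page[i] != 0)
            return false;
    return true;
}
```

In this model, replay followed directly by the reset leaves the main fork
all zeroes, which corresponds to the "unexpected zero page" error above.
With a restartpoint in between, the reset sees the real page.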

I think the problem here is that the *primary* makes no such
assumptions. Init forks are logged via stuff like

	smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
			  (char *) metapage, true);
	if (XLogIsNeeded())
		log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
					BTREE_METAPAGE, metapage, false);

	/*
	 * An immediate sync is required even if we xlog'd the page, because the
	 * write did not go through shared_buffers and therefore a concurrent
	 * checkpoint may have moved the redo pointer past our xlog record.
	 */
	smgrimmedsync(index->rd_smgr, INIT_FORKNUM);

i.e. the data is written out directly to disk, circumventing
shared_buffers. It's pretty bad that we don't do the same on the
standby. For master I think we should just add a bit to the XLOG_FPI
record saying the data should be forced out to disk. I'm less sure
what's to be done in the back branches. Flushing every HEAP_NEWPAGE
record isn't really an option.
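
A rough sketch of what "a bit in the record" could mean, again as a toy
model rather than anything resembling the real record layout. The flag
name TOY_FPI_FORCE_FLUSH, the structs and toy_redo_fpi are all made up
for illustration. The only point is that redo writes the page out
immediately when the bit is set, and otherwise leaves it dirty in memory
as it does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE       16
#define TOY_FPI_FORCE_FLUSH 0x01    /* hypothetical flag bit */

/* Hypothetical full-page-image record carrying a flags byte. */
typedef struct
{
    uint8_t flags;
    char    page[TOY_PAGE_SIZE];
} ToyFpiRecord;

/* Hypothetical fork with an on-disk and an in-memory copy of one page. */
typedef struct
{
    char disk[TOY_PAGE_SIZE];
    char buffer[TOY_PAGE_SIZE];
    bool dirty;
} ToyFork;

/* Redo: restore the page in memory, and flush it if the record asks. */
static void
toy_redo_fpi(ToyFork *f, const ToyFpiRecord *rec)
{
    memcpy(f->buffer, rec->page, TOY_PAGE_SIZE);
    f->dirty = true;

    if (rec->flags & TOY_FPI_FORCE_FLUSH)
    {
        /* Stand-in for forcing the buffer out to disk right away. */
        memcpy(f->disk, f->buffer, TOY_PAGE_SIZE);
        f->dirty = false;
    }
}
```

Only records that set the bit (init fork pages, in this scenario) would
pay for the flush, so ordinary new-page records would replay as before.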

Greetings,

Andres Freund

-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
