Fundamental error in "no WAL log" index/file creation stuff - Mailing list pgsql-hackers

From Tom Lane
Subject Fundamental error in "no WAL log" index/file creation stuff
Date
Msg-id 28884.1119727671@sss.pgh.pa.us
List pgsql-hackers
I believe I have figured out the problem behind the recent reports we've
been seeing of "index is not a btree" errors.  Here's how to reproduce
it (in 8.0 or HEAD):

-- drop database test if present
checkpoint;
create database test;
\c test
create table t1 ( name varchar(20) primary key );
-- kill -9 either current session or bgwriter, to force XLOG replay
-- now reconnect to test
vacuum t1;
ERROR:  index "t1_pkey" is not a btree

On investigation, the filesystem shows t1_pkey exists but has size 0.

The reason for this is that the only entry in the XLOG concerning
t1_pkey is an "smgr create" record --- we didn't XLOG any of the
data inserted into the index, and particularly not the metapage.

Why is that a problem, if we fsynced the index?  Because *replay of
CREATE DATABASE drops and recreates the entire database directory*.
This is not trivial to avoid, because the only way to generate the
new database is to copy from another database, and it's very hard
to tell what to copy if we want it done selectively.
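
The failure sequence described above can be pictured with a toy model. This is only a sketch, not PostgreSQL code: the record names, the `replay` and `simulate_crash` functions, and the dict standing in for the filesystem are all hypothetical, and the model assumes the last checkpoint precedes CREATE DATABASE, as in the reproduction recipe.

```python
# Toy model of crash recovery. Not PostgreSQL code; all names are hypothetical.

def replay(wal, disk):
    """Apply WAL records in order to a dict mapping file path -> contents."""
    for rec in wal:
        kind = rec[0]
        if kind == "create_database":
            _, template, target = rec
            # Replay drops the whole target directory...
            for path in [p for p in disk if p.startswith(target + "/")]:
                del disk[path]
            # ...and then recopies every file from the template database.
            for path in [p for p in disk if p.startswith(template + "/")]:
                disk[target + path[len(template):]] = disk[path]
        elif kind == "smgr_create":
            _, path = rec
            # An "smgr create" record only ensures the file exists.
            disk.setdefault(path, b"")
    return disk


def simulate_crash():
    """Return the size of the index file after crash recovery."""
    disk = {"template1/pg_class": b"catalog"}
    wal = [
        ("create_database", "template1", "test"),
        ("smgr_create", "test/t1_pkey"),
    ]
    # Normal operation: the logged actions happen once.
    replay(wal, disk)
    # The index build writes and fsyncs its pages without logging them.
    disk["test/t1_pkey"] = b"metapage and index pages"
    # Crash. Recovery replays everything since the last checkpoint.
    replay(wal, disk)
    return len(disk["test/t1_pkey"])


print(simulate_crash())
```

In this model the fsynced index contents are lost because the replayed `create_database` record removes the file, and the only later record that mentions it recreates it empty.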

It seems our choices are (a) somehow fix things so CREATE DATABASE
replay doesn't have to zap the whole directory, (b) force a checkpoint
immediately after any CREATE DATABASE, so that we never have to replay
one except in a PITR situation, or (c) abandon non-WAL-logged index
and table builds.
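
Option (b) can be pictured with a small toy model. It is only a sketch under the assumption that crash recovery replays just the records written after the most recent checkpoint. The record strings and the `records_to_replay` function are hypothetical.

```python
# Toy model of where crash recovery starts. Not PostgreSQL code.

def records_to_replay(wal):
    """Return the records that follow the most recent checkpoint."""
    last = -1
    for i, rec in enumerate(wal):
        if rec == "checkpoint":
            last = i
    return wal[last + 1:]


# Current behavior: CREATE DATABASE is still within the replay range.
without_checkpoint = [
    "checkpoint",
    "create_database test",
    "smgr_create test/t1_pkey",
]

# Option (b): a checkpoint is forced right after CREATE DATABASE.
with_checkpoint = [
    "checkpoint",
    "create_database test",
    "checkpoint",
    "smgr_create test/t1_pkey",
]

print(records_to_replay(without_checkpoint))
print(records_to_replay(with_checkpoint))
```

With the forced checkpoint, the `create_database` record falls before the replay range, so crash recovery never zaps the directory. A PITR replay starts from an older base backup and would still pass through it.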
        regards, tom lane