Re: [BUGS] BUG #6748: sequence value may be conflict in some cases - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [BUGS] BUG #6748: sequence value may be conflict in some cases
Msg-id 4360.1343069014@sss.pgh.pa.us
Responses Re: [BUGS] BUG #6748: sequence value may be conflict in some cases  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

meixiangming@huawei.com writes:
> [ freshly-created sequence has wrong state after crash ]

I didn't believe this at first, but sure enough, it fails just as
described if you force a crash between the first and second nextval
calls for the sequence.  This used to work ...

The change that broke it turns out to be the ALTER SEQUENCE OWNED BY
call that we added to serial-column creation circa 8.2; although on
closer inspection I think any ALTER SEQUENCE before the first nextval
call would be problematic.  The real issue is the ancient kluge in
sequence creation that writes something different into the WAL log
than what it leaves behind in shared buffers:

        /* We do not log first nextval call, so "advance" sequence here */
        /* Note we are scribbling on local tuple, not the disk buffer */
        newseq->is_called = true;
        newseq->log_cnt = 0;

The tuple in buffers has log_cnt = 1, is_called = false, but the initial
XLOG_SEQ_LOG record shows log_cnt = 0, is_called = true.  So if we crash
at this point, after recovery it looks like one nextval() has already
been done.  However, AlterSequence generates another XLOG_SEQ_LOG record
based on what's in shared buffers, so after replay of that, we're back
to the "original" state where it does not appear that any nextval() has
been done.
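
To make the sequence of events concrete, here is a toy model of the scenario. It is only a sketch and not PostgreSQL code: the ToySeq/ToyWal types, every toy_* function, the starting value of 1, and the constant TOY_LOG_VALS are all made up for illustration, and recovery is reduced to "the last WAL record for the sequence wins".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are PostgreSQL's. */
#define TOY_LOG_VALS 32   /* assumed number of values pre-logged per WAL record */
#define TOY_WAL_MAX  8    /* arbitrary capacity of the toy WAL */

typedef struct {
    long last_value;
    bool is_called;
    long log_cnt;
} ToySeq;

typedef struct {
    ToySeq rec[TOY_WAL_MAX];
    int    n;
} ToyWal;

static void toy_wal_append(ToyWal *wal, ToySeq state)
{
    assert(wal->n < TOY_WAL_MAX);
    wal->rec[wal->n++] = state;
}

/* Creation: the buffer keeps one state, the WAL gets a different one (the kluge). */
static ToySeq toy_create(ToyWal *wal)
{
    ToySeq buf = { 1, false, 1 };
    ToySeq logged = buf;

    logged.is_called = true;    /* "advance" only the logged copy */
    logged.log_cnt = 0;
    toy_wal_append(wal, logged);
    return buf;
}

/* ALTER SEQUENCE: logs exactly what is in the buffer. */
static void toy_alter(ToyWal *wal, const ToySeq *buf)
{
    toy_wal_append(wal, *buf);
}

/* nextval with the old behaviour: no WAL record while log_cnt > 0. */
static long toy_nextval(ToyWal *wal, ToySeq *buf)
{
    long result = buf->is_called ? buf->last_value + 1 : buf->last_value;

    if (buf->log_cnt == 0)
    {
        /* Log a state far enough ahead to cover the next TOY_LOG_VALS calls. */
        ToySeq logged = { result + TOY_LOG_VALS, true, 0 };

        toy_wal_append(wal, logged);
        buf->log_cnt = TOY_LOG_VALS;
    }
    else
        buf->log_cnt--;

    buf->last_value = result;
    buf->is_called = true;
    return result;
}

/* Crash recovery: the last WAL record for the sequence wins. */
static ToySeq toy_recover(const ToyWal *wal)
{
    assert(wal->n > 0);
    return wal->rec[wal->n - 1];
}
```

In this model, calling toy_create, then toy_alter, then toy_nextval once, and then continuing from toy_recover makes the next toy_nextval hand out the same value a second time. Leaving out the toy_alter step, the recovered state has is_called = true and the next call moves on to the following value, which matches the "used to work" behaviour.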

I'm of the opinion that this kluge needs to be removed; it's just insane
that we're not logging the same state we leave in our buffers.  To do
that, we need to fix nextval() so that the first nextval call generates
an xlog entry; that is, if we are changing is_called to true we ought to
consider that as a reason to force an xlog entry.  I think way back when
we thought it was a good idea to avoid making two xlog entries during
creation and immediate use of a sequence, but considering all the other
xlog entries involved in creation of a sequence object, this is a pretty
silly "optimization".  (Besides, it merely postpones the first
nextval-driven xlog entry from the first to the second nextval call.)
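
As a rough sketch of the rule proposed above, the decision of whether a nextval call must write an xlog entry could look like the following. The helper name toy_nextval_must_log and its parameters (in particular "fetch", the number of values the call wants to hand out) are assumptions for illustration, not the actual code in nextval().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not an actual PostgreSQL function.
 *   log_cnt:   values still covered by the last xlog entry
 *   fetch:     values this call wants to hand out (assumed >= 1)
 *   is_called: whether any nextval call has happened yet
 */
static bool toy_nextval_must_log(long log_cnt, long fetch, bool is_called)
{
    assert(log_cnt >= 0 && fetch >= 1);

    /* Old rule: log only when the pre-logged values run out. */
    bool ran_out = log_cnt < fetch;

    /* Proposed extra rule: changing is_called to true also forces an entry. */
    bool first_call = !is_called;

    return ran_out || first_call;
}
```

With log_cnt = 1 and is_called = false, which is the state left in buffers after creation, the first clause alone would skip logging; the second clause is what forces the entry on the first call.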
        regards, tom lane

