Re: [BUGS] Bug #613: Sequence values fall back to previously chec - Mailing list pgsql-hackers

From Ben Grimm
Subject Re: [BUGS] Bug #613: Sequence values fall back to previously chec
Date
Msg-id 20020314164739.A14065@zaeon.com
Whole thread Raw
In response to Re: [BUGS] Bug #613: Sequence values fall back to previously chec  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 14 Mar 2002, Tom Lane wrote:
> 
> If you do a SELECT nextval() and then use the returned value externally
> *without waiting for a commit acknowledgement*, then I think you are
> risking trouble; there's no guarantee that the WAL record (if one is
> needed) has hit disk yet, and so a crash could roll back the sequence.
> 
> This isn't an issue for a SELECT nextval() standing on its own ---
> AFAIK the result will not be transmitted to the client until after the
> commit happens.  But it would be an issue for a select executed inside
> a transaction block (begin/commit).
> 

The behavior of SELECT nextval() should not be conditional on being in or 
out of a transaction block.  What you implying by saying that the data is 
that it would be possible to rollback an uncommited call to nextval().
Am I missing some terminology?  

I think I finally realized why my old patch that forces a log right off 
the bat works to fix at least part of the problem.  When the database 
is shutdown properly all of the sequences that are in memory are written 
back to disk in their state at that time.   But the problem with that is
that their state at that time can have log_cnt > 0.  This is why after
startup that the sequence in memory is 'behind' the one on disk, the 
code sees log > fetch and doesn't log.  When you really think about it
log_cnt should not be part of the sequence record at all since there
is never a valid case for storing a log_cnt on disk with a value other 
than 0.  

Maybe the purpose for the on disk value of log_cnt should be changed?
It could be the value used in place of the static SEQ_LOG_VALS, which 
could then be definable on a per sequence basis.  And then log_cnt 
could be moved into elm->log_cnt.  Anyway, that's just a thought.

Here's my latest patch to work around the problem.  If there is a way 
to prevent log_cnt from being written out with a value greater than
zero, that would be better than this.  With this behavior log_cnt is 
reset to 0 each time a backend accesses a sequence for the first time.
That's probably overkill...  But I still believe that the XLogFlush 
after XLogInsert is necessary to ensure that the WAL value is written 
to disk immediately.  In my testing this patch works fine, YMMV.

-- Ben



*** src/backend/commands/sequence.c.orig        Tue Mar 12 18:58:55 2002
--- src/backend/commands/sequence.c     Thu Mar 14 17:34:25 2002
***************
*** 62,67 ****
--- 62,68 ----       int64           cached;       int64           last;       int64           increment;
+       bool            reset_logcnt;       struct SeqTableData *next; } SeqTableData;

***************
*** 270,275 ****
--- 271,277 ----
               PageSetLSN(page, recptr);               PageSetSUI(page, ThisStartUpID);
+               XLogFlush(recptr);       }       END_CRIT_SECTION();

***************
*** 314,321 ****               PG_RETURN_INT64(elm->last);       }

!       seq = read_info("nextval", elm, &buf);          /* lock page' buffer and
!                                                                                                * read tuple */
       last = next = result = seq->last_value;       incby = seq->increment_by;
--- 316,322 ----               PG_RETURN_INT64(elm->last);       }

!       seq = read_info("nextval", elm, &buf);          /* lock page' buffer and read tuple */
       last = next = result = seq->last_value;       incby = seq->increment_by;
***************
*** 331,339 ****               log--;       }

!       if (log < fetch)       {
!               fetch = log = fetch - log + SEQ_LOG_VALS;               logit = true;       }

--- 332,340 ----               log--;       }

!       if (log < fetch)       {
!               fetch = log = fetch - log + SEQ_LOG_VALS * cache;               logit = true;       }

***************
*** 403,409 ****               rdata[0].data = (char *) &xlrec;               rdata[0].len = sizeof(xl_seq_rec);
      rdata[0].next = &(rdata[1]);
 
!               seq->last_value = next;               seq->is_called = true;               seq->log_cnt = 0;
--- 404,410 ----               rdata[0].data = (char *) &xlrec;               rdata[0].len = sizeof(xl_seq_rec);
      rdata[0].next = &(rdata[1]);
 
!               seq->last_value = next;               seq->is_called = true;               seq->log_cnt = 0;
***************
*** 417,423 ****
               PageSetLSN(page, recptr);               PageSetSUI(page, ThisStartUpID);
!               if (fetch)                              /* not all numbers were fetched */                       log -=
fetch;      }
 
--- 418,425 ----
               PageSetLSN(page, recptr);               PageSetSUI(page, ThisStartUpID);
!               XLogFlush(recptr);
!               if (fetch)                              /* not all numbers were fetched */                       log -=
fetch;      }
 
***************
*** 507,513 ****               XLogRecPtr      recptr;               XLogRecData rdata[2];               Page
page = BufferGetPage(buf);
 
!               xlrec.node = elm->rel->rd_node;               rdata[0].buffer = InvalidBuffer;
rdata[0].data= (char *) &xlrec;
 
--- 509,516 ----               XLogRecPtr      recptr;               XLogRecData rdata[2];               Page
page = BufferGetPage(buf);
 
!
!               xlrec.node = elm->rel->rd_node;               rdata[0].buffer = InvalidBuffer;
rdata[0].data= (char *) &xlrec;
 
***************
*** 527,532 ****
--- 530,536 ----
               PageSetLSN(page, recptr);               PageSetSUI(page, ThisStartUpID);
+               XLogFlush(recptr);       }       /* save info in sequence relation */       seq->last_value = next;
   /* last fetched number */
 
***************
*** 660,665 ****
--- 664,674 ----
       seq = (Form_pg_sequence) GETSTRUCT(&tuple);

+       if (elm->reset_logcnt)
+       {
+               seq->log_cnt = 0;
+               elm->reset_logcnt = false;
+       }       elm->increment = seq->increment_by;
       return seq;
***************
*** 703,708 ****
--- 712,718 ----                                name, caller);                       elm->relid =
RelationGetRelid(seqrel);                      elm->cached = elm->last = elm->increment = 0;
 
+                       elm->reset_logcnt = true;               }       }       else
***************
*** 721,726 ****
--- 731,737 ----               elm->relid = RelationGetRelid(seqrel);               elm->cached = elm->last =
elm->increment= 0;               elm->next = (SeqTable) NULL;
 
+               elm->reset_logcnt = true;
               if (seqtab == (SeqTable) NULL)                       seqtab = elm;





pgsql-hackers by date:

Previous
From: "Tom Pfau"
Date:
Subject: Re: [BUGS] Bug #613: Sequence values fall back to previously chec
Next
From: 'Ben Grimm'
Date:
Subject: Re: [BUGS] Bug #613: Sequence values fall back to previously chec