Thread: Sequences - jumped after power failure
Is it possible for a whole set of sequences to suddenly 'jump'?
I have a set of claims tables covering the claim itself, the customer, contact points etc. Yesterday there was a power failure and the server went down immediately, with no clean shutdown. When it came back, everything seemed fine apart from the fact that the claim-related sequences had all jumped, leaving a gap of 33 (the last value before the power failure was 52; the next one allocated afterwards was 85). This seems consistent across all the tables related to the claim (it may be across all the tables in the database - I haven't checked them all yet).
Does this sound feasible, and if so, what is the cause?
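A quick way to see whether the jump affects every sequence in the database is to list them from the catalogs and inspect each one; a minimal sketch, with claim_id_seq standing in for whatever the real sequence names are:

    -- list every sequence in the database
    SELECT n.nspname, c.relname
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relkind = 'S'
     ORDER BY 1, 2;

    -- then look at the state of each one (claim_id_seq is a made-up name)
    SELECT last_value, is_called FROM claim_id_seq;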
On Tue, Apr 15, 2008 at 4:21 AM, Steve T <steve@retsol.co.uk> wrote:
> Is it possible for a whole set of sequences to suddenly 'jump'?

One explanation: if there were uncommitted transactions at the time of the power failure, the sequence would have been advanced, but the corresponding rows would not have entered the database.

Sean
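Sean's point is easy to demonstrate in psql: nextval() is never rolled back, so an aborted transaction still consumes a value. A throwaway sketch (all object names invented):

    CREATE TABLE seq_demo (id integer);
    CREATE SEQUENCE seq_demo_seq;

    BEGIN;
    INSERT INTO seq_demo VALUES (nextval('seq_demo_seq'));
    ROLLBACK;                                  -- the row is discarded ...

    SELECT count(*) FROM seq_demo;             -- 0 rows
    SELECT last_value, is_called FROM seq_demo_seq;
                                               -- ... but is_called is now true:
                                               -- value 1 was consumed and won't be reused

On its own, though, this only explains a gap as large as the number of in-flight inserts at crash time.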
Sean,
I thought that at first, but there are only half a dozen or so people on the system, so I could have accepted a gap of 3-4 from 'transactions in progress'; a gap of 33 is far too big for that - unless it's a caching issue. I have had no reports of missing data, and the physical data seems to confirm that all is well - i.e. the 52 record was written about 2-3 minutes before the crash and the 85 record just after the restart. Typically these tables grow by 30-50 rows a day, so a gap of 33 is a whole day's worth!
I've checked the code this morning and can only find two sets of inserts into the tables in question - one in the manual entry and one in a batch process. I checked the batches around that time and nothing was transferred.
Totally stumped. I could also have accepted 'corruption' of one of the sequences, but there are 5 sequences (one for each related table in this area) that all exhibit the same 33 gap. I'm obviously missing something obvious here, but I just can't see it.
PS The version of PostgreSQL is a bit old - it's 8.0.3.
On Tue, 2008-04-15 at 05:47 -0400, Sean Davis wrote:
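Rather than relying on timestamps, a gap check against the key column would confirm directly that no rows are missing; a sketch with claim and claim_id standing in for the real table and column (generate_series() is available in 8.0):

    -- nothing should have been inserted inside the 53..84 gap
    SELECT count(*) FROM claim WHERE claim_id BETWEEN 53 AND 84;   -- expect 0

    -- and nothing at or below 52 should have disappeared
    SELECT g.id AS missing_id
      FROM generate_series(1, 52) AS g(id)
      LEFT JOIN claim c ON c.claim_id = g.id
     WHERE c.claim_id IS NULL;                                     -- expect no rows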
Maybe it's because the sequence gets a batch of IDs in the cache and discards it on a crash? So the unused sequence IDs get lost, while on a normal shutdown the sequence is reset to the last ID actually used? Just a thought.
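The per-session batching Emil is thinking of is the explicit CACHE clause on a sequence (sketch below, sequence name invented). The default is CACHE 1, though, so by itself it wouldn't produce a 33-value jump here - the actual mechanism is the WAL pre-allocation Tom describes next:

    -- each backend grabs 20 values at a time; any values it doesn't hand out
    -- are lost when the session ends or the server crashes
    CREATE SEQUENCE order_no_seq CACHE 20;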
Steve T <steve@retsol.co.uk> writes:
> Is it possible for a whole set of sequences to suddenly 'jump'?
> Does this sound feasible and if so, what is the cause?

Yeah, this is intentional behavior designed to reduce the amount of disk write traffic generated by nextval()s. From a standing start, a nextval() actually advances the sequence 33 times (1 + SEQ_LOG_VALS), so that the next 32 nextval()s won't need to generate their own WAL records. I guess you must have crashed before that first nextval() was able to commit its result into the database ...

regards, tom lane
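The 1 + SEQ_LOG_VALS behaviour can be watched from psql via the sequence's log_cnt column; a throwaway sketch (sequence name invented, and the numbers assume the stock SEQ_LOG_VALS of 32):

    CREATE SEQUENCE wal_demo_seq;

    SELECT nextval('wal_demo_seq');                  -- returns 1
    SELECT last_value, log_cnt FROM wal_demo_seq;    -- last_value = 1, log_cnt = 32

    -- that first call WAL-logged the sequence as already being at 33, so a crash
    -- at this point would make the next nextval() return 34: a jump of exactly 33,
    -- matching the 52 -> 85 jump seen after the power failure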
Emil,
That would tend to make sense given the results I am seeing. I am/was just paranoid that I'd lost data - but everything says I haven't, and that it's just a blip in the sequencing.
On Tue, 2008-04-15 at 12:50 +0200, Emil Obermayr wrote:
Cheers Tom.
I'll note that down under the section '...things to note should the server crash...' - just under the note that says '...get a UPS...'
On Tue, 2008-04-15 at 10:58 -0400, Tom Lane wrote: