Thread: SEQUENCE values (duplicated) in some corner cases when crash happens

SEQUENCE values (duplicated) in some corner cases when crash happens

From
Vinicius Abrahao
Date:
Hello PG Hackers

Hope you are well and safe!

I'm opening this thread to clarify something that I can observe: duplicated values for sequences.

[My understanding is that duplication is not something we desire. In fact it does NOT happen in the majority of cases, for example, when you immediately insert the value and commit it. But it can eventually happen in some specific scenarios...describe below]

The duplication can be observed when you only makes use of "nextval" (which calls sequence.c / nextval_internal function) without - inserting it for any reason - and the database crashes.
This is reproducible using the steps described on this link:

https://gist.github.com/vinnix/2fe148e3c42e11269bac5fcc5c78a8d1


There are two variants where this is reproducible (steps on the link above):
  • Autocommit + Suspending I/O (simulating a storage issue)
  • Explicitly opening transaction + Not Suspending I/O
** To simulate the "crash" I'm running: `killall -9 postgres`.

I've being debugging sequence.c code, I can see we only want to flush the WAL segments once every 32 requests (per connection) - this is defined by SEQ_LOG_VALS. Since, these values are "persisted" in the WAL file containing the "advanced"/future values; not the current value being retrieved by nextval_internal().

What I'm trying to understand now if this is a "bug" or a "new feature"...

Kind regards,
Vini

Re: SEQUENCE values (duplicated) in some corner cases when crashhappens

From
Alvaro Herrera
Date:
On 2020-Apr-29, Vinicius Abrahao wrote:

> Hello PG Hackers
> 
> Hope you are well and safe!
> 
> I'm opening this thread to clarify something that I can observe: duplicated
> values for sequences.
> 
> [My understanding is that duplication is not something we desire. In fact
> it does NOT happen in the majority of cases, for example, when you
> immediately insert the value and commit it. But it can eventually happen in
> some specific scenarios...describe below]

Hi Vinicius

I'm not sure that a sequence that produces the same value twice, without
writing it to the database the first time, and with an intervening crash
in between, is necessarily a bug that we care to fix.  Especially so if
the fix will cause a large performance regression for the normal case
where the sequence value is written to the DB by a committed transaction.

(I think you could also cause the same problem with an async-commit
transaction that does write the value to the database, but whose writes
are lost in the crash ... since the WAL record for the sequence would
also be lost.)

Is there some valid reason to be interested in that scenario?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Apr-29, Vinicius Abrahao wrote:
>> I'm opening this thread to clarify something that I can observe: duplicated
>> values for sequences.

> I'm not sure that a sequence that produces the same value twice, without
> writing it to the database the first time, and with an intervening crash
> in between, is necessarily a bug that we care to fix.  Especially so if
> the fix will cause a large performance regression for the normal case
> where the sequence value is written to the DB by a committed transaction.

I believe this behavior is 100% intentional: the advance of the sequence
value is logged to WAL, but we don't guarantee to make the WAL entry
persistent until the calling transaction commits.  And while I'm too
lazy to check right now, I think the calling transaction might've had
to cause some additional non-sequence-object updates to happen on disk,
too, else we won't think it has done anything that needs committing.

As you say, doing something different would entail a large performance
penalty for a rather dubious semantic requirement.  The normal expectation
is that we have to protect sequence values that get written into tables
someplace.

            regards, tom lane



Re: SEQUENCE values (duplicated) in some corner cases when crashhappens

From
Jeremy Schneider
Date:
On 4/29/20 11:02, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> On 2020-Apr-29, Vinicius Abrahao wrote:
>>> I'm opening this thread to clarify something that I can observe: duplicated
>>> values for sequences.
> 
>> I'm not sure that a sequence that produces the same value twice, without
>> writing it to the database the first time, and with an intervening crash
>> in between, is necessarily a bug that we care to fix.  Especially so if
>> the fix will cause a large performance regression for the normal case
>> where the sequence value is written to the DB by a committed transaction.
> 
> I believe this behavior is 100% intentional: the advance of the sequence
> value is logged to WAL, but we don't guarantee to make the WAL entry
> persistent until the calling transaction commits.  And while I'm too
> lazy to check right now, I think the calling transaction might've had
> to cause some additional non-sequence-object updates to happen on disk,
> too, else we won't think it has done anything that needs committing.
The behavior we're observing is that a nextval() call in a committed
transaction is not crash-safe. This was discovered because some
applications were using nextval() to get a guaranteed unique sequence
number [or so they thought], then the application did some processing
with the value and later stored it in a relation of the same database.

The nextval() number was not used until the transaction was committed -
but then the fact of a value being generated, returned and committed was
lost on crash. The nextval() call used in isolation did not seem to
provide durability.


> As you say, doing something different would entail a large performance
> penalty for a rather dubious semantic requirement.  The normal expectation
> is that we have to protect sequence values that get written into tables
> someplace.

Whether or not it's dubious is in the eye of the beholder I guess; in
Oracle I believe the equivalent use of sequences provides the usual
durability guarantees. Probably that's why this particular user was
surprised at the behavior. If PostgreSQL isn't going to provide
durability for isolated use of sequences, then IMO that's fine but the
fact should at least be in the documentation.

-Jeremy


-- 
Jeremy Schneider
Database Engineer
Amazon Web Services



On Wed, May 6, 2020 at 1:52 PM Jeremy Schneider <schnjere@amazon.com> wrote:
 
The behavior we're observing is that a nextval() call in a committed
transaction is not crash-safe. This was discovered because some
applications were using nextval() to get a guaranteed unique sequence
number [or so they thought], then the application did some processing
with the value and later stored it in a relation of the same database.

The nextval() number was not used until the transaction was committed -

I don't know what this line means.  You said it was stored in a relation, surely that needs to have happened through some command which preceded the commit chronologically, though formally they may have happened atomically.
 
but then the fact of a value being generated, returned and committed was
lost on crash. The nextval() call used in isolation did not seem to
provide durability.

Are you clarifying the original complaint, or this a new, different complaint? Vini's test cases don't include any insertions.  Do you have test cases that can reproduce your complaint?

Cheers,

Jeff

Re: SEQUENCE values (duplicated) in some corner cases when crash happens

From
Jeremy Schneider
Date:
On 5/14/20 14:58, jeff.janes@gmail.com wrote:

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.


On Wed, May 6, 2020 at 1:52 PM Jeremy Schneider <schnjere@amazon.com> wrote:
 
The behavior we're observing is that a nextval() call in a committed
transaction is not crash-safe. This was discovered because some
applications were using nextval() to get a guaranteed unique sequence
number [or so they thought], then the application did some processing
with the value and later stored it in a relation of the same database.

The nextval() number was not used until the transaction was committed -

I don't know what this line means.  You said it was stored in a relation, surely that needs to have happened through some command which preceded the commit chronologically, though formally they may have happened atomically.

"Later stored it in the table" - I'd have to double check with the other team, but IIUC it was application pseudo-code like this:
  • execute SQL "select nextval()" and store result in my_local_variable_unique_id
  • commit
  • do some processing, tracing, logging, etc identified with my_local_variable_unique_id
  • execute SQL "insert into mytable values(my_local_variable_unique_id, data1, data2)"
  • commit

They weren't expecting that they could get duplicates from a sequence, which leads to unique violations and other problems later.  Maybe a workaround is doing some kind of dummy insert or update or something in the transaction that gets a sequence value.


 
but then the fact of a value being generated, returned and committed was
lost on crash. The nextval() call used in isolation did not seem to
provide durability.

Are you clarifying the original complaint, or this a new, different complaint? Vini's test cases don't include any insertions.  Do you have test cases that can reproduce your complaint?

Clarification of same issue, not a new issue.

Tom also has said as much in his email - he said it's quite plausible that sequences used in isolation aren't crash safe.  I just think we should document it; I'll work on a proposal/doc-update-patch for everyone to bikeshed on when I have a few minutes  :)

-Jeremy


-- 
Jeremy Schneider
Database Engineer
Amazon Web Services

Re: SEQUENCE values (duplicated) in some corner cases when crashhappens

From
Alvaro Herrera
Date:
On 2020-May-14, Jeremy Schneider wrote:

> "Later stored it in the table" - I'd have to double check with the other
> team, but IIUC it was application pseudo-code like this:
> 
>   * execute SQL "select nextval()" and store result in
>     my_local_variable_unique_id
>   * commit

Yes, simply inserting the sequence value in a (logged!) dummy table
before this commit, as you suggest, should fix this problem.  The insert
ensures that the transaction commit is flushed to WAL.  The table need
not have indexes, making the insert faster.  Just make sure to truncate
the table every now and then.

+1 to documenting this.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services