On Mon, 2021-11-22 at 14:57 +0800, Andy Fan wrote:
> Should we guarantee that a sequence's nextval is never rolled back,
> even after crash recovery?
> I can produce the rollback in the following case:
>
> Session 1:
> CREATE SEQUENCE s;
> BEGIN;
> SELECT nextval('s'); \watch 0.01
>
> Session 2:
> kill -9 {sess1.pid}
>
> After the restart, the nextval('s') may be rolled back (less than the
> last value from session 1).
>
> The reason is that we never flush the xlog record written by
> nextval_internal in the above case, so if the system crashes, there is
> nothing to redo from. It can be fixed with the following code change.
>
> @@ -810,6 +810,8 @@ nextval_internal(Oid relid, bool check_permissions)
>          recptr = XLogInsert(RM_SEQ_ID, XLOG_SEQ_LOG);
>
>          PageSetLSN(page, recptr);
> +
> +        XLogFlush(recptr);
>      }
>
>
> If a user passes sequence values to an external system, the
> rolled-back value may surprise them.
> [I didn't run into this issue in any real case; I just studied the
> xlog / sequence code today and found this case.]

I think that is a bad idea.
It will have an intolerable performance impact on OLTP queries, doubling
the number of I/O requests in many cases.

Perhaps it would make sense to document that you should never rely on
sequence values from an uncommitted transaction.
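
To make the trade-off concrete, here is a toy model in C. It is only a
sketch and not PostgreSQL code: ToySeq and the toy_* functions are made
up for illustration, and the model ignores details such as PostgreSQL
WAL-logging sequence values in batches. It assumes that crash recovery
can replay only WAL that was flushed before the crash.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not a PostgreSQL API. */
typedef struct ToySeq
{
    long value;        /* last value handed out (in memory only) */
    long wal_inserted; /* newest value recorded in the in-memory WAL buffer */
    long wal_flushed;  /* newest value whose WAL record is durable on disk */
    long flush_count;  /* number of simulated WAL flushes (I/O requests) */
} ToySeq;

/* Simulate a WAL flush: everything inserted so far becomes durable. */
void toy_flush(ToySeq *s)
{
    s->wal_flushed = s->wal_inserted;
    s->flush_count++;
}

/*
 * Hand out the next value and record it in the WAL buffer.
 * With flush_each_call = true this mimics the proposed patch,
 * which flushes WAL on every logged nextval.
 */
long toy_nextval(ToySeq *s, bool flush_each_call)
{
    s->value++;
    s->wal_inserted = s->value;
    if (flush_each_call)
        toy_flush(s);
    /* Durable state can never be ahead of what was inserted. */
    assert(s->wal_flushed <= s->wal_inserted);
    return s->value;
}

/* Simulate kill -9 plus recovery: only flushed WAL can be replayed. */
void toy_crash_and_recover(ToySeq *s)
{
    s->value = s->wal_flushed;
    s->wal_inserted = s->wal_flushed;
}
```

In this model, calling toy_nextval(&s, false) 100 times on a
zero-initialized ToySeq and then toy_crash_and_recover(&s) leaves
s.value at 0 with no flushes, which is the rollback described above.
With flush_each_call = true the value survives the crash, but at the
price of 100 flushes, one per call.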

Yours,
Laurenz Albe