Thread: BUG #8059: sequence crash recovery is not working properly

BUG #8059: sequence crash recovery is not working properly

From
tarvip@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      8059
Logged by:          Tarvi Pillessaar
Email address:      tarvip@gmail.com
PostgreSQL version: 9.2.4
Operating system:   linux
Description:


Very simple example:

postgres@sbox /usr/local/pgsql $ /usr/local/pgsql/bin/psql test
psql (9.2.4)
Type "help" for help.

test=# create sequence s;
CREATE SEQUENCE
test=# begin;
BEGIN
test=# select nextval('s');
 nextval
---------
       1
(1 row)


Now let's crash the cluster:

postgres@sbox /usr/local/pgsql $ pgrep -lf writer
13638 postgres: writer process
13639 postgres: wal writer process
postgres@sbox /usr/local/pgsql $ kill -9 13638
postgres@sbox /usr/local/pgsql $ tail logfile
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted; last known up at 2013-04-12 14:28:26
EEST
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 0/177C9E0
LOG:  record with zero length at 0/1791888
LOG:  redo done at 0/1791858
LOG:  last completed transaction was at log time 2013-04-12
14:29:48.562356+03
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
postgres@sbox /usr/local/pgsql $




test=# select nextval('s');
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
test=# select nextval('s');
 nextval
---------
       1
(1 row)

test=#


This problem occurs under the same conditions as described in commit
af026b5d9b8ae6ef4c75a796bdac209df6411181.


For another example I'm going to use a very simple Python script:
import psycopg2
cur = psycopg2.connect('dbname=test host=/tmp/').cursor()
val = None
while True:
    try:
        cur.execute("select nextval('s')")
        val = cur.fetchone()[0]
    except Exception, e:
        print "%s\n%s" % (val, e)
        break

This script consumes a lot of values from sequence 's' within a single
transaction; when the crash occurs it prints out the last value that the
script got from the database.

postgres@sbox /usr/local/pgsql $ python test.py
132119
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

postgres@sbox /usr/local/pgsql $ /usr/local/pgsql/bin/psql test
psql (9.2.4)
Type "help" for help.

test=# select nextval('s');
 nextval
---------
  130501
(1 row)

test=#

Re: BUG #8059: sequence crash recovery is not working properly

From
Andres Freund
Date:
On 2013-04-12 12:27:01 +0000, tarvip@gmail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      8059
> Logged by:          Tarvi Pillessaar
> Email address:      tarvip@gmail.com
> PostgreSQL version: 9.2.4
> Operating system:   linux
> Description:
>
> Very simple example:
>
> postgres@sbox /usr/local/pgsql $ /usr/local/pgsql/bin/psql test
> psql (9.2.4)
> Type "help" for help.
>
> test=# create sequence s;
> CREATE SEQUENCE
> test=# begin;
> BEGIN
> test=# select nextval('s');
>  nextval
> ---------
>        1
> (1 row)
>
>
> Now let's crash the cluster:
>
> postgres@sbox /usr/local/pgsql $ pgrep -lf writer
> 13638 postgres: writer process
> 13639 postgres: wal writer process
> postgres@sbox /usr/local/pgsql $ kill -9 13638
> postgres@sbox /usr/local/pgsql $ tail logfile
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> LOG:  all server processes terminated; reinitializing
> LOG:  database system was interrupted; last known up at 2013-04-12 14:28:26
> EEST
> LOG:  database system was not properly shut down; automatic recovery in
> progress
> LOG:  redo starts at 0/177C9E0
> LOG:  record with zero length at 0/1791888
> LOG:  redo done at 0/1791858
> LOG:  last completed transaction was at log time 2013-04-12
> 14:29:48.562356+03
> LOG:  database system is ready to accept connections
> LOG:  autovacuum launcher started
> postgres@sbox /usr/local/pgsql $

That's caused by the fact that you didn't do anything that forces a WAL
flush, since you didn't do any inserts or such. If you did any
DML in the nextval()-calling transaction this shouldn't happen (unless
you use synchronous_commit=off).
Now, there's a legitimate argument to be made that returning the
sequence value to the user kinda persists it. On the other hand, at
least in the first example you haven't even committed the transaction, so
there's nothing that could flush the transaction unless we would
*always* flush nextval() immediately when it needs to get new values, which
doesn't seem reasonable.

I think a reasonable compromise would be that nextval_internal()
acquires a real xid (by calling GetTopTransactionId()). In that case the
commit would force a WAL flush, but there would be no forced WAL flushes
inside the transaction. And it would support async commits.
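The compromise described above can be sketched as a toy model. This is purely a hypothetical simulation, not PostgreSQL source: the `Cluster` class, its counters, and the `"real"` xid flag are stand-ins for the actual WAL machinery, with SEQ_LOG_VALS = 32 matching the batch size discussed in this thread.

```python
# Toy model: nextval() WAL-logs 32 values ahead; commit flushes WAL only
# if the transaction holds a real xid; crash recovery replays only what
# reached disk. All names here are illustrative, not PostgreSQL internals.

SEQ_LOG_VALS = 32


class Cluster:
    def __init__(self, acquire_xid):
        self.acquire_xid = acquire_xid  # simulate the proposed fix?
        self.seq = 0                    # in-memory sequence counter
        self.logged_upto = 0            # highest value covered by a WAL record
        self.flushed_upto = 0           # highest value whose WAL reached disk
        self.xid = None

    def begin(self):
        self.xid = None                 # no real xid until WAL is written

    def nextval(self):
        self.seq += 1
        if self.seq > self.logged_upto:                     # cache exhausted:
            self.logged_upto = self.seq + SEQ_LOG_VALS - 1  # WAL-log ahead
            if self.acquire_xid:
                self.xid = "real"       # proposal: GetTopTransactionId()
        return self.seq

    def commit(self):
        if self.xid is not None:        # a real xid forces a WAL flush here
            self.flushed_upto = self.logged_upto

    def crash_and_recover(self):
        self.seq = self.flushed_upto            # redo replays what hit disk
        self.logged_upto = self.flushed_upto    # in-memory cache is lost


def run(acquire_xid):
    c = Cluster(acquire_xid)
    c.begin()
    for _ in range(5):
        last = c.nextval()              # consume values 1..5
    c.commit()
    c.crash_and_recover()
    c.begin()
    return last, c.nextval()            # first value after recovery


print(run(acquire_xid=False))  # (5, 1): the duplicate from the bug report
print(run(acquire_xid=True))   # (5, 33): skips ahead, but never repeats
```

With the xid, recovery may skip values (the unflushed remainder of the pre-logged batch), but it can never hand out a value that was already returned before the crash.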

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8059: sequence crash recovery is not working properly

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> That's caused by the fact that you didn't do anything that forces a WAL
> flush, since you didn't do any inserts or such. If you did any
> DML in the nextval()-calling transaction this shouldn't happen (unless
> you use synchronous_commit=off).
> Now, there's a legitimate argument to be made that returning the
> sequence value to the user kinda persists it. On the other hand, at
> least in the first example you haven't even committed the transaction, so
> there's nothing that could flush the transaction unless we would
> *always* flush nextval() immediately when it needs to get new values, which
> doesn't seem reasonable.

I believe we've gone around on this before and decided to leave it
as-is.  The thinking is that any realistic use-case would involve
storing the nextval result into some table somewhere, so that additional
WAL and a commit record would be generated.

            regards, tom lane

Re: BUG #8059: sequence crash recovery is not working properly

From
Andres Freund
Date:
On 2013-04-12 10:47:52 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > That's caused by the fact that you didn't do anything that forces a WAL
> > flush, since you didn't do any inserts or such. If you did any
> > DML in the nextval()-calling transaction this shouldn't happen (unless
> > you use synchronous_commit=off).
> > Now, there's a legitimate argument to be made that returning the
> > sequence value to the user kinda persists it. On the other hand, at
> > least in the first example you haven't even committed the transaction, so
> > there's nothing that could flush the transaction unless we would
> > *always* flush nextval() immediately when it needs to get new values, which
> > doesn't seem reasonable.
>
> I believe we've gone around on this before and decided to leave it
> as-is.  The thinking is that any realistic use-case would involve
> storing the nextval result into some table somewhere, so that additional
> WAL and a commit record would be generated.

Hm. I couldn't find anything about that discussion after searching for a few
minutes. I am not really sure I can follow that argument; it's not
uncommon to use database sequences to prepare data that's then
inserted later on, in a separate transaction and possibly even in a
separate connection. In the latter case you wouldn't even notice the
database has restarted.
I don't think flushing the WAL immediately in nextval_internal is a
realistic thing to do, but making it generate an xid if it's doing WAL
logging sounds like a compromise with acceptable overhead. In the normal
usage where you insert the value directly it won't have any additional
overhead, since the xid will have to be generated anyway, and if you use
it in a separate transaction/connection you want the flush at commit.
Sure, if somebody loops around SELECT nextval(...); without using a
transaction that will increase the overhead, but that doesn't sound like
a terribly good idea anyway.

On a tangent: I'd still really like to have nextval(...,
how_many_values) or such...

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #8059: sequence crash recovery is not working properly

From
Tarvi Pillessaar
Date:
On 12.04.2013 17:04, Andres Freund wrote:
> On the other hand, at
> least in the first example you haven't even committed the transaction, so
> there's nothing that could flush the transaction unless we would
> *always* flush nextval() immediately when it needs to get new values, which
> doesn't seem reasonable.
Yes, I know that I haven't committed anything, and I also agree that
flushing always is not reasonable.
I'm not sure that I understand all the details in sequence.c, but there
seems to be some sort of cache used for sequences, so instead of
always flushing, maybe a flush should only happen when the next batch
of values is written to the cache.

Regards,
Tarvi Pillessaar

Re: BUG #8059: sequence crash recovery is not working properly

From
Andres Freund
Date:
On 2013-04-12 20:02:07 +0300, Tarvi Pillessaar wrote:
> On 12.04.2013 17:04, Andres Freund wrote:
> >On the other hand, at
> >least in the first example you haven't even committed the transaction, so
> >there's nothing that could flush the transaction unless we would
> >*always* flush nextval() immediately when it needs to get new values, which
> >doesn't seem reasonable.
> Yes, I know that I haven't committed anything, and I also agree that flushing
> always is not reasonable.
> I'm not sure that I understand all the details in sequence.c, but there
> seems to be some sort of cache used for sequences, so instead of always
> flushing, maybe a flush should only happen when the next batch of values is
> written to the cache.

Flushing every time the cache is used up would have far too high an
overhead. Consider bulk loading new data into a table. If we flushed
WAL every 32 new rows, that would be disastrous from a performance
perspective.
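The overhead argument above can be made concrete with some back-of-the-envelope arithmetic. This is purely illustrative (the row count is made up); the 32-value batch size matches the figure in this thread.

```python
# Count synchronous WAL flushes for a hypothetical bulk load that
# consumes one sequence value per row, comparing flush-per-cache-refill
# against a single flush at commit. Illustrative arithmetic only.

import math

SEQ_LOG_VALS = 32      # values pre-logged per WAL record, per the thread
rows = 1_000_000       # assumed size of the bulk load

# Flushing whenever a new batch of 32 values is WAL-logged:
flushes_per_batch = math.ceil(rows / SEQ_LOG_VALS)

# Flushing only once, at commit (what happens once the transaction
# writes WAL and commits synchronously):
flushes_at_commit = 1

print(flushes_per_batch, flushes_at_commit)  # 31250 1
```

Tens of thousands of synchronous flushes versus one is why flushing at every cache refill is a non-starter for bulk loads.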

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services