Re: [GENERAL] Template zero xid issue - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [GENERAL] Template zero xid issue
Date
Msg-id 25870.1186510454@sss.pgh.pa.us
Whole thread Raw
Responses Re: [GENERAL] Template zero xid issue
List pgsql-hackers
[ redirected to -hackers for discussion of bug fix ]

"Keaton Adams" <kadams@mxlogic.com> writes:
> Some additional information I just received from one of the operations
> personnel:

> 1. I set pg_database.datallowconn to true on template0 in an attempt to
> log into the database with psql and perform a VACUUM there.  It was our
> last-ditch attempt to perform a vacuum against that database without
> shutting down postgres altogether, and it worked fine when I tested it
> in our ORT environment.

Hah, and in fact the log you sent me shows that that is exactly what
started the problem --- everything was fine until here:

<2007-08-06 09:42:50 MDT>LOG:  transaction ID wrap limit is 2147484146, limited by database "template0"
<2007-08-06 09:42:50 MDT>STATEMENT:  UPDATE pg_database SET datallowconn='t' WHERE datname='template0';
<2007-08-06 09:42:50 MDT>WARNING:  database "template0" must be vacuumed within 736811 transactions
<2007-08-06 09:42:50 MDT>HINT:  To avoid a database shutdown, execute a full-database VACUUM in "template0".
<2007-08-06 09:42:50 MDT>STATEMENT:  UPDATE pg_database SET datallowconn='t' WHERE datname='template0';
<2007-08-06 09:42:50 MDT>ERROR:  database is not accepting commands to avoid wraparound data loss in database
"template0"
<2007-08-06 09:42:50 MDT>HINT:  Stop the postmaster and use a standalone backend to vacuum database "template0".

So the whole thing boiled down to pilot error :-( ... if you hadn't
misinterpreted template0's age() reading as something you needed to
Do Something about, everything would have been fine.

FWIW, in 8.2 and up template0 is autovacuumed the same as every other
database, which hopefully will prevent people from making this sort
of mistake in the future.

That explains most of what I was wondering about in connection with the
log excerpts.  The remaining loose end was why, after successfully
stopping the bgwriter, the postmaster chose to restart everything:

<2007-08-06 09:44:07 MDT>LOG:  database system is shut down
<2007-08-06 09:44:07 MDT>FATAL:  the database system is shutting down
[ lots of these snipped ]
<2007-08-06 09:44:07 MDT>LOG:  background writer process (PID 5203) exited with exit code 0
<2007-08-06 09:44:07 MDT>LOG:  terminating any other active server processes
<2007-08-06 09:44:07 MDT>LOG:  all server processes terminated; reinitializing
<2007-08-06 09:44:07 MDT>LOG:  database system was shut down at 2007-08-06 09:44:07 MDT
<2007-08-06 09:44:07 MDT>FATAL:  the database system is shutting down

But I think I see what's going on there: the test for successful
system shutdown is
           if (exitstatus == 0 && Shutdown > NoShutdown && !FatalError &&               !DLGetHead(BackendList) &&
AutoVacPID== 0)
 

ie, we saw bgwriter do exit(0), and we are shutting down, and there
is no other activity in the system.  The problem with this apparently
is that BackendList wasn't empty, which is not surprising given the
storm of clients attempting to reconnect, as evidenced by all the
"system is shutting down" log messages.  Evidently, at the moment the
bgwriter exited, there was at least one child that had been forked but
hadn't yet finished telling its client to go away.

I think this needs to be fixed, because in principle a sufficiently
heavy load of connection requests could prevent the postmaster from
*ever* finishing the shutdown sequence --- after it's restarted and
re-stopped the bgwriter, it will come right back to this same test, and
if there are more transient children it would repeat the whole process.
Since we've not heard of any complaints of such happening in the field,
it probably needn't be back-patched, but I want a fix for 8.3.

I'm not entirely sure what's the most reasonable way to fix it though.
I can see a number of alternatives:

1. Remove the test on BackendList shown above.  This seems like a bad
idea as it puts us at risk of the postmaster shutting down prematurely,
should the bgwriter happen to exit(0) while there are still live
backends.

2. Stop paying attention to incoming connection requests once we've
begun the shutdown sequence, so that no new children will be spawned.
This is probably not acceptable, particularly in the case of a "smart"
shutdown which can go on for a long time; we really want would-be
clients to get some kind of response and not just hang waiting for
a connection.

3. Don't spawn a child process to reject connection requests once
we have begun shutdown, but have the postmaster just send the response
directly, much as it does in the case of fork failure.  This is not
too bad; the only real downside is that we can't allow the postmaster
to block on any one would-be client, and so we'd have to send the
error message on a one-try-and-give-up basis, the same as we do with the
fork failure case.  But the odds of failure for that are really pretty
low.

4. Keep spawning a child, but mark it in the BackendList as known
doomed, and don't count such children when deciding if it's OK to
terminate.  The problem with this idea is that such children will
still be connected to shared memory, and we really don't want to
terminate the postmaster before all connections to shmem are gone.
(This objection also applies to #1, now that I think about it.)

I'm sort of leaning to solution #3, but I wondered if anyone had
a different opinion or a better idea.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: HOT patch, missing things
Next
From: Tom Lane
Date:
Subject: Re: More logging for autovacuum