Thread: BUG #16614: Stale temporary objects makes vacuum ineffective when 1 million transactions remain

BUG #16614: Stale temporary objects makes vacuum ineffective when 1 million transactions remain

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16614
Logged by:          Giorgio Saviane
Email address:      gsaviane@gmail.com
PostgreSQL version: 9.5.23
Operating system:   Linux
Description:

We noticed the same issue described below occurring multiple times in some
of our many Postgres 9.5 deployments.

The application makes extensive use of temporary tables. Although there are
scheduled activities for regularly vacuuming the database, after some months
of uptime the database gets into the 1 million transactions limit. Even
though we perform a manual maintenance by vacuuming in single user mode, the
server keeps returning the same warning and hint:

WARNING:  database "mydb" must be vacuumed within (X<1000000) transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in
"mydb".

Any further vacuum command issued completes but the warning keeps showing
up, and the remaining transactions are decremented.

Only by removing all the stale temporary objects with

 DO $$ 
  DECLARE rec RECORD; 
  BEGIN FOR rec IN 
    SELECT 'DROP schema ' || nspname || ' CASCADE' AS stmt 
    FROM pg_catalog.pg_namespace 
    WHERE nspname LIKE 'pg_temp%' OR nspname LIKE 'pg_toast_temp%' 
      LOOP 
        EXECUTE rec.stmt; 
      END LOOP; 
  END;
$$

Makes the vacuum command complete without warnings.

We think that stale temporary objects are somehow blocking the vacuuming of
the database.


PG Bug reporting form <noreply@postgresql.org> writes:
> We noticed the same issue described below occurring multiple times in some
> of our many Postgres 9.5 deployments.
> The application makes extensive use of temporary tables. Although there are
> scheduled activities for regularly vacuuming the database, after some months
> of uptime the database gets into the 1 million transactions limit.

Couple of possibilities here ...

Autovacuum, and manual vacuum too, cannot do anything with temp tables
belonging to other sessions.  (It's physically impossible because the
live data for a temp table is likely to only exist inside the memory
of the owning session.)  So, if you have temp tables that survive long
enough to need vacuuming because of wraparound considerations, then
the owning session has to do that itself.

Alternatively, maybe you don't have any temp-table-owning sessions
that live that long, but for some reason some temp tables failed to
get dropped when their owning session exited.  (AFAIK this could only
happen as part of a database crash --- is that a common occurrence
for you?)  They're still there and still holding back the global XID
horizon, so the only way to fix things is to forcibly drop them.

PG versions later than 9.5 have added increasingly aggressive rules for
automatically dropping such orphan temp tables.  From memory, 9.5 does
have some behavior of that sort; but it's evidently not firing for you.
What you'll need to do is find the old tables (look for pg_class entries
with large values of age(relfrozenxid)) and manually DROP those tables.

I would not recommend the wholesale removal of temp schemas shown in
your script, because that carries too much risk of removing tables
that are in active use.

            regards, tom lane



On Thu, Sep 10, 2020 at 10:43:54AM -0400, Tom Lane wrote:
> Alternatively, maybe you don't have any temp-table-owning sessions
> that live that long, but for some reason some temp tables failed to
> get dropped when their owning session exited.  (AFAIK this could only
> happen as part of a database crash --- is that a common occurrence
> for you?)  They're still there and still holding back the global XID
> horizon, so the only way to fix things is to forcibly drop them.
>
> PG versions later than 9.5 have added increasingly aggressive rules for
> automatically dropping such orphan temp tables.  From memory, 9.5 does
> have some behavior of that sort; but it's evidently not firing for you.
> What you'll need to do is find the old tables (look for pg_class entries
> with large values of age(relfrozenxid)) and manually DROP those tables.

Yeah.  11 and newer versions have been made even more aggressive with
the cleanup of orphaned tables in autovacuum, particularly the case
where a backend reuses the ID of a past session that crashed, leaving
behind some temporary tables.  Perhaps that was the case here?

> I would not recommend the wholesale removal of temp schemas shown in
> your script, because that carries too much risk of removing tables
> that are in active use.

Dropping directly a temporary schema would put sessions still relying
on them in geopardy, as sessions that have created at least one
temporary table store statically the namespace OID of the temp
schema.  The result will get fun if you create some temp tables
afterwards on those live sessions, and unlikely the result you'd
expect, but the operation can be useful in some cases as that's a
no-brainer, and only a superuser can do that.  Dropping directly the
tables from another session would be fine.
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> Yeah.  11 and newer versions have been made even more aggressive with
> the cleanup of orphaned tables in autovacuum, particularly the case
> where a backend reuses the ID of a past session that crashed, leaving
> behind some temporary tables.  Perhaps that was the case here?

Off-list, the OP indicated that the problem is actually of the other
category, ie the problematic tables belong to extremely long-lived
sessions that are managed by a connection pooler.  I don't quite
understand why the pooler isn't issuing DISCARD ALL between clients,
but anyway it seems that PG is operating as designed here.

            regards, tom lane



On Fri, Sep 11, 2020 at 10:21:21AM -0400, Tom Lane wrote:
> Off-list, the OP indicated that the problem is actually of the other
> category, ie the problematic tables belong to extremely long-lived
> sessions that are managed by a connection pooler.  I don't quite
> understand why the pooler isn't issuing DISCARD ALL between clients,
> but anyway it seems that PG is operating as designed here.

Oh, OK, thanks!  That was not mentioned originally so I got confused.
--
Michael

Attachment
The thing is that we realized this only when the wraparound stop limit was reached, that seems an inevitable cul-de-sac if your application is leaking temporary tables. Even worse, when you try to vacuum in single user mode there is nothing that tells you "drop those bloody temp tables". It keeps telling you that less than 1 mln transactions are remaining, without explaining why. We lost an entire database of one of our customers before being aware of the fact that vacuum is ineffective with temp tables.
The question is: can you implement some warning thrown by vacuum/autovacuum in case orphaned temporary tables are in the way?

The connection pooler we are using is the one provided by Tomcat 7. I'm gonna check if it supports DISCARD ALL on close.

Kind regards

P.S. sorry for replying here, couldn't get how to reply on the thread publicly.

Il giorno ven 11 set 2020 alle ore 16:21 Tom Lane <tgl@sss.pgh.pa.us> ha scritto:
Michael Paquier <michael@paquier.xyz> writes:
> Yeah.  11 and newer versions have been made even more aggressive with
> the cleanup of orphaned tables in autovacuum, particularly the case
> where a backend reuses the ID of a past session that crashed, leaving
> behind some temporary tables.  Perhaps that was the case here?

Off-list, the OP indicated that the problem is actually of the other
category, ie the problematic tables belong to extremely long-lived
sessions that are managed by a connection pooler.  I don't quite
understand why the pooler isn't issuing DISCARD ALL between clients,
but anyway it seems that PG is operating as designed here.

                        regards, tom lane
On Sat, Sep 12, 2020 at 10:59 AM Giorgio Saviane <gsaviane@gmail.com> wrote:
....
> The question is: can you implement some warning thrown by vacuum/autovacuum in case orphaned temporary tables are in
theway?
 
> The connection pooler we are using is the one provided by Tomcat 7. I'm gonna check if it supports DISCARD ALL on
close.

From what I've read the thing is, if the pooler does not work
correctly ( by issuing discard all or otherwise taking care of
temporary tables ), there are few things as the tables are not
orphaned on the server side.

What the server sees is someone connects, creates the temporary
tables, issues a lot of other commands but does not  disconnect, so
the tables must be kept alive in case the the session wants to use
them again. DISCARD ALL will make them orphans.

Francisco Olarte.



Let me clarify.
The sequence of events that led our databases to the wraparound issue is the following:
  1. The application creates temporary tables that should be by default cleaned at the end of the session. As a matter of fact the pooler doesn't actually close the session, neither - apparently - issues a DISCARD ALL, creating leaks among sessions.
  2. Since sessions are kept alive with stale temporary tables the autovacuum process cannot do maintenance on transaction ids.
  3. Since we have no evidence on what is going on, the database reaches the 1 million stop limit to avoid transaction wraparound.
  4. The application gets then out of order. We stop the Postgres server gracefully and we enter single user mode to run the vacuum
  5. Any attempt to vacuum the database ends up with the usual "WARNING: database "mydb" must be vacuumed within XXX transactions" message, revealing that the vacuum could not fix up the transaction ids.
  6. By running this SQL (still in single user mode)
      SELECT nspname || '.' || relname AS objname
      FROM pg_class JOIN pg_namespace
        ON pg_namespace.oid = pg_class.relnamespace
      WHERE relpersistence = 't' AND relkind = 'r'
    We notice the presence of orphaned tables. Sessions are gone at this point, so they are for sure orphaned.
  7. If we manually drop the orphaned tables and run vacuum again the warning message disappears and the database is restored to normal functioning.
No matter this situation is caused by an application misbehavior or a missed temporary table cleanup, I'm asking if there is a reasonable way to prevent this situation improving the alerting with:
  • When vacuum cannot repeatedly perform a transaction fix up in regular multi user mode.
  • When vacuum cannot perform a transaction fix up in single user mode because there are orphaned tables in the way.
If we had these alerts we would not have lost a database.

Kind regards

Girogio Saviane

Il giorno sab 12 set 2020 alle ore 12:59 Francisco Olarte <folarte@peoplecall.com> ha scritto:
On Sat, Sep 12, 2020 at 10:59 AM Giorgio Saviane <gsaviane@gmail.com> wrote:
....
> The question is: can you implement some warning thrown by vacuum/autovacuum in case orphaned temporary tables are in the way?
> The connection pooler we are using is the one provided by Tomcat 7. I'm gonna check if it supports DISCARD ALL on close.

From what I've read the thing is, if the pooler does not work
correctly ( by issuing discard all or otherwise taking care of
temporary tables ), there are few things as the tables are not
orphaned on the server side.

What the server sees is someone connects, creates the temporary
tables, issues a lot of other commands but does not  disconnect, so
the tables must be kept alive in case the the session wants to use
them again. DISCARD ALL will make them orphans.

Francisco Olarte.
Giorgio.

On Sat, Sep 12, 2020 at 1:57 PM Giorgio Saviane <gsaviane@gmail.com> wrote:
... It seems I've hit a new gmail "bug-feature" and it has stripped
numbering on the reply, doing it by hand, misnumberings are mine.....

> The sequence of events that led our databases to the wraparound issue is the following:
>
> 1.The application creates temporary tables that should be by default cleaned at the end of the session. As a matter
offact the pooler doesn't actually close the session, neither - apparently - issues a DISCARD ALL, creating leaks among
sessions.
> 2.Since sessions are kept alive with stale temporary tables the autovacuum process cannot do maintenance on
transactionids.
 
> 3.Since we have no evidence on what is going on, the database reaches the 1 million stop limit to avoid transaction
wraparound.
> 4.The application gets then out of order. We stop the Postgres server gracefully and we enter single user mode to run
thevacuum
 
> 5.Any attempt to vacuum the database ends up with the usual "WARNING: database "mydb" must be vacuumed within XXX
transactions"message, revealing that the vacuum could not fix up the transaction ids.
 
> By running this SQL (still in single user mode)
>   SELECT nspname || '.' || relname AS objname
>   FROM pg_class JOIN pg_namespace
>     ON pg_namespace.oid = pg_class.relnamespace
>   WHERE relpersistence = 't' AND relkind = 'r'
> 6.We notice the presence of orphaned tables. Sessions are gone at this point, so they are for sure orphaned.
> 7.If we manually drop the orphaned tables and run vacuum again the warning message disappears and the database is
restoredto normal functioning.
 

I see your point now. IMO postgres should either have dropped the
temporary tables in 4., graceful shutdown or left them in a state
where thay could be vacuumed, I did not notice the graceful shutdown
step before.

Just one question, PG documents shutdown as smart, fast and immediate,
which one are you using for "graceful"? ( anyway, after a server
restart I would expect tables to be vacuumable, even if it was a power
loss followed by recovery ), they are not supposed to outlive the
session and as you say the sessions are gone.

Francisco Olarte.



Giorgio Saviane <gsaviane@gmail.com> writes:
> Let me clarify.
> The sequence of events that led our databases to the wraparound issue is
> the following:

>    1. The application creates temporary tables that should be by default
>    cleaned at the end of the session. As a matter of fact the pooler doesn't
>    actually close the session, neither - apparently - issues a DISCARD ALL,
>    creating leaks among sessions.
>    2. Since sessions are kept alive with stale temporary tables the
>    autovacuum process cannot do maintenance on transaction ids.
>    3. Since we have no evidence on what is going on, the database reaches
>    the 1 million stop limit to avoid transaction wraparound.
>    4. The application gets then out of order. We stop the Postgres server
>    gracefully and we enter single user mode to run the vacuum
>    5. Any attempt to vacuum the database ends up with the usual "WARNING:
>    database "mydb" must be vacuumed within XXX transactions" message,
>    revealing that the vacuum could not fix up the transaction ids.

So, actually, that was the wrong recovery method.  All you really
needed to do was terminate that long-running session(s) and wait for
a minute or two.  There is code to delete old orphaned temp tables,
even as far back as 9.5.  But *it's part of autovacuum*.  Dropping
down to single-user mode prevents it from running.

I thought briefly about whether VACUUM in single-user mode should
also run that code; but I'm not really convinced it'd be a good thing
to be auto-dropping tables in single-user mode.  And anyway we shouldn't
be encouraging use of single-user mode, and it'd be much better to not
be in this situation to start with.

What seems like the most helpful thing we could do is teach autovacuum to
issue log warnings when it sees old temp tables holding back datfrozenxid.
The trick here is to not spam the log too much.  After a few moment's
thought, maybe warn if:

* a table is temp but not orphaned, and
* its relfrozenxid or relminmxid is equal to datfrozenxid or datminmxid,
proving that some previous autovac run saw it as holding back those
horizons, and
* those xids are older than the relevant freeze limit.

For extra credit maybe the message could mention the PID of the
session owning the table?

            regards, tom lane



Ok, let me add one more step in that sequence.
Before shutting down Postgres and going down to single mode we stopped the application server and other services around it. I'm pretty sure that there was a reasonable window of time before we went to single mode. Moreover, if the database already reached the stop limit for preventing the wraparound, was it still possible for autovacuum to kick in in multiuser mode?
Wasn't the graceful database shutdown enough to let Postgres drop the temporary tables?

Kind regards

Giorgio Saviane

Il giorno sab 12 set 2020 alle ore 17:49 Tom Lane <tgl@sss.pgh.pa.us> ha scritto:
Giorgio Saviane <gsaviane@gmail.com> writes:
> Let me clarify.
> The sequence of events that led our databases to the wraparound issue is
> the following:

>    1. The application creates temporary tables that should be by default
>    cleaned at the end of the session. As a matter of fact the pooler doesn't
>    actually close the session, neither - apparently - issues a DISCARD ALL,
>    creating leaks among sessions.
>    2. Since sessions are kept alive with stale temporary tables the
>    autovacuum process cannot do maintenance on transaction ids.
>    3. Since we have no evidence on what is going on, the database reaches
>    the 1 million stop limit to avoid transaction wraparound.
>    4. The application gets then out of order. We stop the Postgres server
>    gracefully and we enter single user mode to run the vacuum
>    5. Any attempt to vacuum the database ends up with the usual "WARNING:
>    database "mydb" must be vacuumed within XXX transactions" message,
>    revealing that the vacuum could not fix up the transaction ids.

So, actually, that was the wrong recovery method.  All you really
needed to do was terminate that long-running session(s) and wait for
a minute or two.  There is code to delete old orphaned temp tables,
even as far back as 9.5.  But *it's part of autovacuum*.  Dropping
down to single-user mode prevents it from running.

I thought briefly about whether VACUUM in single-user mode should
also run that code; but I'm not really convinced it'd be a good thing
to be auto-dropping tables in single-user mode.  And anyway we shouldn't
be encouraging use of single-user mode, and it'd be much better to not
be in this situation to start with.

What seems like the most helpful thing we could do is teach autovacuum to
issue log warnings when it sees old temp tables holding back datfrozenxid.
The trick here is to not spam the log too much.  After a few moment's
thought, maybe warn if:

* a table is temp but not orphaned, and
* its relfrozenxid or relminmxid is equal to datfrozenxid or datminmxid,
proving that some previous autovac run saw it as holding back those
horizons, and
* those xids are older than the relevant freeze limit.

For extra credit maybe the message could mention the PID of the
session owning the table?

                        regards, tom lane
Giorgio Saviane <gsaviane@gmail.com> writes:
> Ok, let me add one more step in that sequence.
> Before shutting down Postgres and going down to single mode we stopped the
> application server and other services around it. I'm pretty sure that there
> was a reasonable window of time before we went to single mode.

Hmm, but did you stop the connection pooler?  That's what was holding
the database session open, if I'm understanding things correctly.

Unless you've changed the default autovacuum settings (particularly
autovacuum_naptime), I'd have expected autovacuum to start zapping
temp tables within a minute or so after they become orphaned.
Even more to the point, if the database session was allowed to stop
normally, it should've cleaned them up itself.

            regards, tom lane



The connection pooler is a component embedded in the Tomcat container. When Tomcat stops the connection pooler gets shut down too. We can tell it by the number of Postgres processes that disappear from htop.
We never touch autovacuum settings, they are kept to the default.

Kind regards

Giorgio Saviane

Il giorno sab 12 set 2020 alle ore 20:18 Tom Lane <tgl@sss.pgh.pa.us> ha scritto:
Giorgio Saviane <gsaviane@gmail.com> writes:
> Ok, let me add one more step in that sequence.
> Before shutting down Postgres and going down to single mode we stopped the
> application server and other services around it. I'm pretty sure that there
> was a reasonable window of time before we went to single mode.

Hmm, but did you stop the connection pooler?  That's what was holding
the database session open, if I'm understanding things correctly.

Unless you've changed the default autovacuum settings (particularly
autovacuum_naptime), I'd have expected autovacuum to start zapping
temp tables within a minute or so after they become orphaned.
Even more to the point, if the database session was allowed to stop
normally, it should've cleaned them up itself.

                        regards, tom lane