Re: Fun fact about autovacuum and orphan temp tables - Mailing list pgsql-hackers

From Constantin S. Pan
Subject Re: Fun fact about autovacuum and orphan temp tables
Date
Msg-id 20161020153024.19cbc758@ppg
Whole thread Raw
In response to Fun fact about autovacuum and orphan temp tables  (Grigory Smolkin <g.smolkin@postgrespro.ru>)
Responses Re: Fun fact about autovacuum and orphan temp tables
List pgsql-hackers
On Mon, 5 Sep 2016 14:54:05 +0300
Grigory Smolkin <g.smolkin@postgrespro.ru> wrote:

> Hello, hackers!
>
> We were testing how well some application works with PostgreSQL and
> stumbled upon an autovacuum behavior which I fail to understand.
> Application in question have a habit to heavily use temporary tables
> in funny ways.
> For example it creates A LOT of them.
> Which is ok.
> Funny part is that it never drops them. So when backend is finally
> terminated, it tries to drop them and fails with error:
>
> FATAL:  out of shared memory
> HINT:  You might need to increase max_locks_per_transaction
>
> If I understand that rigth, we are trying to drop all these temp
> tables in one transaction and running out of locks to do so.
> After that postgresql.log is flooded at the rate 1k/s with messages
> like that:
>
> LOG:  autovacuum: found orphan temp table "pg_temp_15"."tt38147" in
> database "DB_TEST"
>
> It produces a noticeable load on the system and it`s getting worst
> with every terminated backend or restart.
> I did some RTFS and it appears that autovacuum has no intention of
> cleaning that orphan tables unless
> it`s wraparound time:
>
> src/backend/postmaster/autovacuum.c
>               /* We just ignore it if the owning backend is still
> active */ 2037             if (backendID == MyBackendId ||
> BackendIdGetProc(backendID) == NULL)
>   2038             {
>   2039                 /*
>   2040                  * We found an orphan temp table (which was
> probably left
>   2041                  * behind by a crashed backend).  If it's so
> old as to need
>   2042                  * vacuum for wraparound, forcibly drop it.
> Otherwise just
>   2043                  * log a complaint.
>   2044                  */
>   2045                 if (wraparound)
>   2046                 {
>   2047                     ObjectAddress object;
>   2048
>   2049                     ereport(LOG,
>   2050                             (errmsg("autovacuum: dropping
> orphan temp table \"%s\".\"%s\" in database \"%s\"",
>   2051 get_namespace_name(classForm->relnamespace),
>   2052 NameStr(classForm->relname),
>   2053 get_database_name(MyDatabaseId))));
>   2054                     object.classId = RelationRelationId;
>   2055                     object.objectId = relid;
>   2056                     object.objectSubId = 0;
>   2057                     performDeletion(&object, DROP_CASCADE,
> PERFORM_DELETION_INTERNAL);
>   2058                 }
>   2059                 else
>   2060                 {
>   2061                     ereport(LOG,
>   2062                             (errmsg("autovacuum: found orphan
> temp table \"%s\".\"%s\" in database \"%s\"",
>   2063 get_namespace_name(classForm->relnamespace),
>   2064 NameStr(classForm->relname),
>   2065 get_database_name(MyDatabaseId))));
>   2066                 }
>   2067             }
>   2068         }
>
>
> What is more troubling is that pg_statistic is starting to bloat
> badly.
>
> LOG:  automatic vacuum of table "DB_TEST.pg_catalog.pg_statistic":
> index scans: 0
>          pages: 0 removed, 68225 remain, 0 skipped due to pins
>          tuples: 0 removed, 2458382 remain, 2408081 are dead but not
> yet removable
>          buffer usage: 146450 hits, 31 misses, 0 dirtied
>          avg read rate: 0.010 MB/s, avg write rate: 0.000 MB/s
>          system usage: CPU 3.27s/6.92u sec elapsed 23.87 sec
>
> What is the purpose of keeping orphan tables around and not dropping
> them on the spot?
>
>

Hey Hackers,

I tried to fix the problem with a new backend not being
able to reuse a temporary namespace when it contains
thousands of temporary tables. I disabled locking of objects
during namespace clearing process. See the patch attached.
Please tell me if there are any reasons why this is wrong.

I also added a GUC variable and changed the condition in
autovacuum to let it nuke orphan tables on its way.
See another patch attached.

Regards,
Constantin Pan

Attachment

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Indirect indexes
Next
From: Stephen Frost
Date:
Subject: Re: File content logging during execution of COPY queries (was: Better logging of COPY queries if log_statement='all')