Re: Fun fact about autovacuum and orphan temp tables - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Fun fact about autovacuum and orphan temp tables
Date
Msg-id CAB7nPqSbYT6dRwsXVgiKmBdL_ARemfDZMPA+RPeC_ge0GK70hA@mail.gmail.com
Whole thread Raw
In response to Re: Fun fact about autovacuum and orphan temp tables  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Fun fact about autovacuum and orphan temp tables  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Fri, Nov 18, 2016 at 1:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> So now I think that we probably need to make this logic a bit smarter.
> Add all of the OIDs that need to be dropped to a list.  Then have a
> loop prior to the main loop (where it says "Perform operations on
> collected tables.") which iterates over that list and drops those
> tables one by one, starting a transaction every (say) 100 tables or
> after an error.  For bonus points, if a transaction fails, put all of
> the OIDs except the one that provoked the failure back into the list
> of OIDs to be dropped, so that we still make a progress even if some
> DROPs are failing for some reason.

Okay.

> That might sound adding unnecessary work just for the sake of
> paranoia, but I don't think it is.  Failures here won't be common, but
> since they are entirely automated there will be no human intelligence
> available to straighten things out.  Barring considerable caution,
> we'll just enter a flaming death spiral.

Thinking more paranoid, an extra way to enter in this flaming death
spiral is to not limit the maximum number of failures authorized when
dropping a set of orphaned tables and transactions fail multiple
times. This is basically not important as the relation on which the
drop failed gets dropped from the list but failing on each one of them
is a good way to slow down autovacuum, so putting a limit of say 10
transactions failing is I think really important.

I have played with what you suggested, and finished with the patch
attached. I have run some tests using this function to create some
temp tables with several backends to be sure that multiple backend IDs
are used:

CREATE FUNCTION create_temp_tables(i int) RETURNS void
AS $$
BEGIN
FOR i IN 1..i LOOP
  EXECUTE 'CREATE TEMP TABLE aa' || i || ' (a int);';
END LOOP;
END
$$ LANGUAGE plpgsql;

Then I killed the instance. At restart I could see a bunch of temp
tables in pg_class, and I let autovacuum do the cleanup after restart.
I have tested as well the error code path in the PG_TRY() block by
enforcing manually a elog(ERROR) to be sure that the maximum number of
failures is correctly handled, better safe than sorry.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fix checkpoint skip logic on idle systems by tracking LSN progress
Next
From: Amit Kapila
Date:
Subject: Re: Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly