Re: Per-tablespace autovacuum settings - Mailing list pgsql-hackers

From Oleksii Kliukin
Subject Re: Per-tablespace autovacuum settings
Date
Msg-id ADE5BB1E-5370-4A13-BC1F-20A462BC8D73@hintbits.com
In response to Re: Per-tablespace autovacuum settings  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> wrote:

> Hi,
>
> On 2019-02-14 17:56:17 +0100, Oleksii Kliukin wrote:
>> Is there any interest in making autovacuum parameters available on a
>> tablespace level in order to apply those to all vacuumable objects in the
>> tablespace?
>>
>> We have a set of tables running on ZFS, where autovacuum does us almost no
>> good (except for preventing wraparound) due to the nature of ZFS (FS
>> fragmentation caused by copy-on-write leads to sequential scans doing random
>> access) and the fact that our tables there are append-only. Initially, the
>> team in charge of the application just disabled autovacuum globally, but
>> that led to huge system catalog bloat.
>>
>> At present, we have to re-enable autovacuum globally and then disable it
>> per-table using table storage parameters, but that is inelegant and requires
>> doing it once for existing tables and modifying the script that periodically
>> creates new ones (the whole system is a Postgres-based replacement of an
>> ElasticSearch cluster and we have to create new partitions regularly).
>
> Won't that a) lead to periodic massive anti-wraparound sessions? b)
> prevent any use of index only scans?

Wraparound is hardly an issue there, as the data is transient and only
exists for 14 days (I think the entire date-based partition is dropped,
which is how we ended up with pg_class catalog bloat). Index-only scans can
be an issue, although, IIRC, there is a manual vacuum that runs from time
to time, perhaps following your advice below.

> ISTM you'd be better off running vacuum rarely, with large
> thresholds. That way it'd do most of the writes in one pass, hopefully
> leading to less fragmentation, and it'd set the visibilitymap bits to
> prevent further need to touch those. By doing it only rarely, vacuum
> should process pages sequentially, reducing the fragmentation.
>
>
>> Grouping tables by tablespaces for the purpose of autovacuum configuration
>> seems natural, as tablespaces are often placed on other filesystems/devices
>> that may require changing how often autovacuum runs, making it less/more
>> aggressive depending on the I/O performance, or disabling it
>> altogether as in my example above. Furthermore, given that we allow
>> cost-based options per-tablespace, the infrastructure is already there and
>> the task is mostly to teach autovacuum to look at tablespaces in addition to
>> the relation storage options (in case of a conflict, relation options should
>> always take priority).
>
> While I don't buy the reasoning above, I think this'd be useful for
> other cases.

Even if we don’t want to disable autovacuum completely, we might want to
make it much less frequent by increasing the thresholds or costs/delays to
reduce the I/O strain for a particular tablespace.
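
To make the proposed precedence concrete, here is a minimal Python sketch (a
model of the idea, not PostgreSQL code). It assumes a hypothetical
per-tablespace option set, which does not exist today, layered between the
global settings and the per-relation storage options, with relation options
always winning. The function names and the dictionary layout are illustrative
assumptions; the trigger condition is the documented one (dead tuples above
threshold + scale_factor * reltuples), simplified to ignore the anti-wraparound
and insert-driven cases.

```python
from typing import Dict, Optional, Union

Option = Union[bool, int, float]

# Global settings, using PostgreSQL's documented defaults for the threshold
# and scale factor. "autovacuum_enabled" stands in here for the global
# "autovacuum" setting; the shared key is an assumption made for brevity.
GLOBAL_DEFAULTS: Dict[str, Option] = {
    "autovacuum_enabled": True,
    "autovacuum_vacuum_threshold": 50,
    "autovacuum_vacuum_scale_factor": 0.2,
}


def effective_options(
    relation_opts: Optional[Dict[str, Option]],
    tablespace_opts: Optional[Dict[str, Option]],
    global_opts: Dict[str, Option] = GLOBAL_DEFAULTS,
) -> Dict[str, Option]:
    """Merge options so that relation > tablespace (hypothetical) > global."""
    merged = dict(global_opts)
    merged.update(tablespace_opts or {})  # hypothetical per-tablespace layer
    merged.update(relation_opts or {})    # relation options take priority
    return merged


def needs_vacuum(dead_tuples: int, reltuples: int, opts: Dict[str, Option]) -> bool:
    """Simplified trigger check; anti-wraparound vacuums are not modeled."""
    if not opts["autovacuum_enabled"]:
        return False
    limit = (
        opts["autovacuum_vacuum_threshold"]
        + opts["autovacuum_vacuum_scale_factor"] * reltuples
    )
    return dead_tuples > limit


# A slow tablespace raises the scale factor; one table opts back in to the
# global value by setting its own storage option.
slow_tablespace = {"autovacuum_vacuum_scale_factor": 0.5}
busy_table = {"autovacuum_vacuum_scale_factor": 0.2}

plain = effective_options(None, slow_tablespace)
override = effective_options(busy_table, slow_tablespace)
print(needs_vacuum(300_000, 1_000_000, plain))
print(needs_vacuum(300_000, 1_000_000, override))
```

With 300,000 dead tuples out of 1,000,000, the table that inherits the
tablespace setting stays below its limit (500,050), while the table with its
own storage option exceeds its limit (200,050).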

Regards,
Oleksii Kliukin