Thread: Automatic autovacuum to prevent wraparound - PG13.5
Hello guys,
Have recently upgraded from PG10 to PG13.5 and would like to understand the reason why we are seeing triggered autovacuum to prevent the wraparound while all the metrics are still far off from the multixact/freeze max ages defined. And inclusive there was one time where it was triggered as aggressive.
Some background:
- autovacuum_freeze_max_age: 400M
- autovacuum_multixact_freeze_max_age: 800M
- the activity is mostly insert intensive in one particular table (60M daily)
- the team execute vacuum freeze verbose every day at night to keep the multixact ids down
- we generally reach near 70M mxids before running vacuum freeze at night
- the postgresql is Aurora
The scenario:
- Out of nowhere (during the weekend), without database activity load or batches running, with previous nightly run of vacuum freeze, in the middle of the day, with xids and mxids below 20M we are seeing autovacuum being triggered to prevent wraparound.
My question is why this is occurring, which condition might be responsible for this behaviour?
On Wed, 2022-06-15 at 12:13 +0100, Mauro Farracha wrote: > Have recently upgraded from PG10 to PG13.5 and would like to understand the reason why we > are seeing triggered autovacuum to prevent the wraparound while all the metrics are still > far off from the multixact/freeze max ages defined. And inclusive there was one time where > it was triggered as aggressive. > > Some background: > - autovacuum_freeze_max_age: 400M > - autovacuum_multixact_freeze_max_age: 800M > - the activity is mostly insert intensive in one particular table (60M daily) > - the team execute vacuum freeze verbose every day at night to keep the multixact ids down > - we generally reach near 70M mxids before running vacuum freeze at night > - the postgresql is Aurora > > The scenario: > - Out of nowhere (during the weekend), without database activity load or batches running, > with previous nightly run of vacuum freeze, in the middle of the day, with xids and mxids > below 20M we are seeing autovacuum being triggered to prevent wraparound. > > My question is why this is occurring, which condition might be responsible for this behaviour? A long-running transaction or a prepared transaction. Or an abandoned replication slot with an old "xmin". That would be the answer for PostgreSQL. It might apply to Amazon Aurora, unless they changed the behavior there. Perhaps ask Amazon. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Frankly speaking, Aurora PostgreSQL's behaviour is quite unpredictable.
In our case, the autovacuum was not even getting triggered in spite of crossing the autovacuum_freeze_max_age. Finally, the database went down abruptly, which resolved the issue.
Thanks,
Ninad
On Wed, Jun 15, 2022 at 7:57 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2022-06-15 at 12:13 +0100, Mauro Farracha wrote:
> Have recently upgraded from PG10 to PG13.5 and would like to understand the reason why we
> are seeing triggered autovacuum to prevent the wraparound while all the metrics are still
> far off from the multixact/freeze max ages defined. And inclusive there was one time where
> it was triggered as aggressive.
>
> Some background:
> - autovacuum_freeze_max_age: 400M
> - autovacuum_multixact_freeze_max_age: 800M
> - the activity is mostly insert intensive in one particular table (60M daily)
> - the team execute vacuum freeze verbose every day at night to keep the multixact ids down
> - we generally reach near 70M mxids before running vacuum freeze at night
> - the postgresql is Aurora
>
> The scenario:
> - Out of nowhere (during the weekend), without database activity load or batches running,
> with previous nightly run of vacuum freeze, in the middle of the day, with xids and mxids
> below 20M we are seeing autovacuum being triggered to prevent wraparound.
>
> My question is why this is occurring, which condition might be responsible for this behaviour?
A long-running transaction or a prepared transaction.
Or an abandoned replication slot with an old "xmin".
That would be the answer for PostgreSQL. It might apply to Amazon Aurora, unless they
changed the behavior there. Perhaps ask Amazon.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On Wed, Jun 15, 2022 at 4:13 AM Mauro Farracha <farracha@gmail.com> wrote: > The scenario: > - Out of nowhere (during the weekend), without database activity load or batches running, with previous nightly run ofvacuum freeze, in the middle of the day, with xids and mxids below 20M we are seeing autovacuum being triggered to preventwraparound. > > My question is why this is occurring, which condition might be responsible for this behaviour? There is a behavior that seems like it might be relevant: VACUUM interprets autovacuum_multixact_freeze_max_age in a way that accounts for both MultiXactId consumption and the consumption of "member space" by MultiXacts. Technically there are 2 SLRUs for MultiXacts, either of which can wraparound. This behavior was established by commit 53bb309d2d. It is documented. Admittedly this whole area of the documentation is in dire need of an overhaul. :-( -- Peter Geoghegan