Thread: How can i be certain autovacuum is causing reuse if table still grows

How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Hi, we have a 9.6.5 postgres database with one table that is growing a lot both in length and in number of new lines.
Each day we have an average of table growing 30GB, indexes growing 15GB and 400.000 new lines.
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

flip=# select schemaname as table_schema,
flip-#     relname as table_name,
flip-#     pg_size_pretty(pg_total_relation_size(relid)) as total_size,
flip-#     pg_size_pretty(pg_relation_size(relid)) as data_size,
flip-#     pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid))
flip-#       as external_size
flip-# from pg_catalog.pg_statio_user_tables
flip-# order by pg_total_relation_size(relid) desc,
flip-#          pg_relation_size(relid) desc
flip-# limit 1;
 table_schema |     table_name     | total_size | data_size | external_size
--------------+--------------------+------------+-----------+---------------
 public       | flip_pagina_edicao | 4072 GB    | 2526 GB   | 1546 GB
(1 row

flip=# SELECT   pid,
flip-#          Age(query_start, Clock_timestamp()),
flip-#          usename,
flip-#          query
flip-# FROM     pg_stat_activity
flip-# WHERE    query != '<IDLE>'
flip-# AND      query ilike '%vacuum%'
flip-# ORDER BY query_start ASC;
 pid  |           age            | usename  |                                    query
------+--------------------------+----------+------------------------------------------------------------------------------
 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


There is a high number of updates each day but we are not able to be sure what is really going on.
We are assuming autovacuum is marking dead tuples for reuse but we still see table allocating a lot of more gbytes each day.

We see live tuple length, dead tuple length and free space all growing according to pgstattuple. So no reuse??
But according with pg_freespace the information is very different. So there is reuse??

flip=# SELECT * FROM pgstattuple('public.flip_pagina_edicao');
   table_len   | tuple_count |  tuple_len  | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent |  free_space   | free_percent
---------------+-------------+-------------+---------------+------------------+----------------+--------------------+---------------+--------------
 2713168764928 |   103935134 | 95056542505 |           3.5 |         61449232 |    37711185236 |               1.39 | 2546185255732 |        93.85
(1 row)

flip=# select pg_size_pretty(sum(avail)) from pg_freespace('flip_pagina_edicao');
 pg_size_pretty
----------------
 98 GB
(1 row)


We would appreciate some help to find what can be happening:
1) How can we be sure postgres is reusing if relation is still allocating a lot of new space?
2) Is vacuum achieving marking dead tuples even if it is not ending?
3) We killed a autovacuum execution to see if pgstattuple show some different information, but no changing on its output. Is there a more reliable query or function to map what is really going on?

Thank you.
Sidney Pryor.

Re: How can i be certain autovacuum is causing reuse if table still grows

From
"David G. Johnston"
Date:
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.

RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:


On Mon, Jul 27, 2020 at 4:36 PM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.



Correct, the tuples will only be released once the autovacuum transaction has committed.

If it's been running for days without being committed, I would recommend cancelling it and immediately running a manual vacuum before it starts up again. You'll have to be quick and double-check that your manual vacuum is the one running and not getting blocked by the wraparound vac starting up again. You can check this by looking at pg_stat_activity. You may see the wraparound vacuum try and start up again as well, but it should be blocked by yours and typically cancel out once yours finishes.

A manual vacuum runs with different, more aggressive settings than autovacuum. So you should see this finish much sooner. May still take a few hours, but it shouldn't be much longer. The biggest culprit with autovacuum taking a long time on a given table, besides there being a lot of bloat, is typically autovacuum_vacuum_cost_delay, which can cause autovacuum to pause if autovacuum_vacuum_cost_limit is reached. With a manual vacuum, this delay is disabled (it uses vaccum_cost_delay which is by default 0). You can see more in this section of the documentation - https://www.postgresql.org/docs/9.6/runtime-config-autovacuum.html

You may want to adjust the autovac cost delay (lower it from 2 to 1 maybe). I believe you can do this on a per-table basis vs changing it for the entire database. I would not recommend completely disabling the delay until you at least get your system fully vacuumed and in a better state. I'd also recommend some more aggressive vacuuming be configured for your given table as well, so it doesn't accumulate so many dead rows before vacuum is able to run. I wrote about how to do this here - https://www.keithf4.com/per-table-autovacuum-tuning/

After it finishes running, in addition to upgrading to the latest version of 9.6, I would recommend checking the table & indexes for bloat. You can use the pgstattuple contrib module or this script which makes it a little more user-friendly  https://github.com/keithf4/pg_bloat_check

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com
Another suggestion is to partition the table by a chronological key, so that only a fraction of the table needs to be vacuumed.

On 7/27/20 3:36 PM, Sidney Aloisio Ferreira Pryor wrote:
P {margin-top:0;margin-bottom:0;}
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.


--
Angular momentum makes the world go 'round.

RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you so much Keith.
We will try manual vacuum as you suggested.
I read something like this but did not give the right attention to it.

 
About pgstattuple was a bit confusing its free area information when we compared to pg_freespace.

Can you please explain this difference?

De: Keith Fiske <keith.fiske@crunchydata.com>
Enviado: segunda-feira, 27 de julho de 2020 17:59
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: David G. Johnston <david.g.johnston@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 


On Mon, Jul 27, 2020 at 4:36 PM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.



Correct, the tuples will only be released once the autovacuum transaction has committed.

If it's been running for days without being committed, I would recommend cancelling it and immediately running a manual vacuum before it starts up again. You'll have to be quick and double-check that your manual vacuum is the one running and not getting blocked by the wraparound vac starting up again. You can check this by looking at pg_stat_activity. You may see the wraparound vacuum try and start up again as well, but it should be blocked by yours and typically cancel out once yours finishes.

A manual vacuum runs with different, more aggressive settings than autovacuum. So you should see this finish much sooner. May still take a few hours, but it shouldn't be much longer. The biggest culprit with autovacuum taking a long time on a given table, besides there being a lot of bloat, is typically autovacuum_vacuum_cost_delay, which can cause autovacuum to pause if autovacuum_vacuum_cost_limit is reached. With a manual vacuum, this delay is disabled (it uses vaccum_cost_delay which is by default 0). You can see more in this section of the documentation - https://www.postgresql.org/docs/9.6/runtime-config-autovacuum.html

You may want to adjust the autovac cost delay (lower it from 2 to 1 maybe). I believe you can do this on a per-table basis vs changing it for the entire database. I would not recommend completely disabling the delay until you at least get your system fully vacuumed and in a better state. I'd also recommend some more aggressive vacuuming be configured for your given table as well, so it doesn't accumulate so many dead rows before vacuum is able to run. I wrote about how to do this here - https://www.keithf4.com/per-table-autovacuum-tuning/

After it finishes running, in addition to upgrading to the latest version of 9.6, I would recommend checking the table & indexes for bloat. You can use the pgstattuple contrib module or this script which makes it a little more user-friendly  https://github.com/keithf4/pg_bloat_check

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:

Thank you Ron but this table has no date column.

It has one line for each page of each scanned document.


De: Ron <ronljohnsonjr@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 18:27
Para: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
Another suggestion is to partition the table by a chronological key, so that only a fraction of the table needs to be vacuumed.

On 7/27/20 3:36 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.


--
Angular momentum makes the world go 'round.

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:
As the documentation for pg_freespacemap says, it's values are not exact and they may not be up to date because you are viewing the statistics that were gathered during the last analyze, not the actual freespace map


When you run pgstattuple, it is actually scanning the target object. Also note that scanning a table does not also scan its indexes, those must be done individually. So the free space reported by pgstattuple should be the actual free space, given as both byte size and % of the whole object. 

pg_freespacemap is used by the planner and several other internal parts of PG, so checking it must be quick. pgstattuple can take a while to run on larger objects, but it will give you 100% accurate results as of the time it ran. This is why I prefer it when checking for bloat since many of the queries you see out there also try and base their results on stats. While the query runs quick, I've often seen them be wildly wrong.

Also, I would not recommend partitioning simply to improve vacuuming. Especially if extensive tuning hasn't been tried first. Most times you can get per-table tuning working well enough to get autovacuum running properly. Especially on 9.6 and even more-so on PG11, where autovacuum has itself been improved.



On Mon, Jul 27, 2020 at 5:35 PM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you so much Keith.
We will try manual vacuum as you suggested.
I read something like this but did not give the right attention to it.

 
About pgstattuple was a bit confusing its free area information when we compared to pg_freespace.

Can you please explain this difference?

De: Keith Fiske <keith.fiske@crunchydata.com>
Enviado: segunda-feira, 27 de julho de 2020 17:59
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: David G. Johnston <david.g.johnston@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 


On Mon, Jul 27, 2020 at 4:36 PM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.



Correct, the tuples will only be released once the autovacuum transaction has committed.

If it's been running for days without being committed, I would recommend cancelling it and immediately running a manual vacuum before it starts up again. You'll have to be quick and double-check that your manual vacuum is the one running and not getting blocked by the wraparound vac starting up again. You can check this by looking at pg_stat_activity. You may see the wraparound vacuum try and start up again as well, but it should be blocked by yours and typically cancel out once yours finishes.

A manual vacuum runs with different, more aggressive settings than autovacuum. So you should see this finish much sooner. May still take a few hours, but it shouldn't be much longer. The biggest culprit with autovacuum taking a long time on a given table, besides there being a lot of bloat, is typically autovacuum_vacuum_cost_delay, which can cause autovacuum to pause if autovacuum_vacuum_cost_limit is reached. With a manual vacuum, this delay is disabled (it uses vaccum_cost_delay which is by default 0). You can see more in this section of the documentation - https://www.postgresql.org/docs/9.6/runtime-config-autovacuum.html

You may want to adjust the autovac cost delay (lower it from 2 to 1 maybe). I believe you can do this on a per-table basis vs changing it for the entire database. I would not recommend completely disabling the delay until you at least get your system fully vacuumed and in a better state. I'd also recommend some more aggressive vacuuming be configured for your given table as well, so it doesn't accumulate so many dead rows before vacuum is able to run. I wrote about how to do this here - https://www.keithf4.com/per-table-autovacuum-tuning/

After it finishes running, in addition to upgrading to the latest version of 9.6, I would recommend checking the table & indexes for bloat. You can use the pgstattuple contrib module or this script which makes it a little more user-friendly  https://github.com/keithf4/pg_bloat_check

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:


On Mon, Jul 27, 2020 at 6:19 PM Keith Fiske <keith.fiske@crunchydata.com> wrote:
As the documentation for pg_freespacemap says, it's values are not exact and they may not be up to date because you are viewing the statistics that were gathered during the last analyze, not the actual freespace map
 

Sorry, just wanted to clarify this. You are looking at the actual freespace map. It's that the values actually stored in it are not exact and based off of statistics gathering

 


When you run pgstattuple, it is actually scanning the target object. Also note that scanning a table does not also scan its indexes, those must be done individually. So the free space reported by pgstattuple should be the actual free space, given as both byte size and % of the whole object. 

pg_freespacemap is used by the planner and several other internal parts of PG, so checking it must be quick. pgstattuple can take a while to run on larger objects, but it will give you 100% accurate results as of the time it ran. This is why I prefer it when checking for bloat since many of the queries you see out there also try and base their results on stats. While the query runs quick, I've often seen them be wildly wrong.

Also, I would not recommend partitioning simply to improve vacuuming. Especially if extensive tuning hasn't been tried first. Most times you can get per-table tuning working well enough to get autovacuum running properly. Especially on 9.6 and even more-so on PG11, where autovacuum has itself been improved.



On Mon, Jul 27, 2020 at 5:35 PM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you so much Keith.
We will try manual vacuum as you suggested.
I read something like this but did not give the right attention to it.

 
About pgstattuple was a bit confusing its free area information when we compared to pg_freespace.

Can you please explain this difference?

De: Keith Fiske <keith.fiske@crunchydata.com>
Enviado: segunda-feira, 27 de julho de 2020 17:59
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: David G. Johnston <david.g.johnston@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 


On Mon, Jul 27, 2020 at 4:36 PM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


Its probably best avoid drawing conclusions here until the anti-wraparound vacuum has completed and you’ve updated to the latest minor version of PostgreSQL.

David J.



Correct, the tuples will only be released once the autovacuum transaction has committed.

If it's been running for days without being committed, I would recommend cancelling it and immediately running a manual vacuum before it starts up again. You'll have to be quick and double-check that your manual vacuum is the one running and not getting blocked by the wraparound vac starting up again. You can check this by looking at pg_stat_activity. You may see the wraparound vacuum try and start up again as well, but it should be blocked by yours and typically cancel out once yours finishes.

A manual vacuum runs with different, more aggressive settings than autovacuum. So you should see this finish much sooner. May still take a few hours, but it shouldn't be much longer. The biggest culprit with autovacuum taking a long time on a given table, besides there being a lot of bloat, is typically autovacuum_vacuum_cost_delay, which can cause autovacuum to pause if autovacuum_vacuum_cost_limit is reached. With a manual vacuum, this delay is disabled (it uses vaccum_cost_delay which is by default 0). You can see more in this section of the documentation - https://www.postgresql.org/docs/9.6/runtime-config-autovacuum.html

You may want to adjust the autovac cost delay (lower it from 2 to 1 maybe). I believe you can do this on a per-table basis vs changing it for the entire database. I would not recommend completely disabling the delay until you at least get your system fully vacuumed and in a better state. I'd also recommend some more aggressive vacuuming be configured for your given table as well, so it doesn't accumulate so many dead rows before vacuum is able to run. I wrote about how to do this here - https://www.keithf4.com/per-table-autovacuum-tuning/

After it finishes running, in addition to upgrading to the latest version of 9.6, I would recommend checking the table & indexes for bloat. You can use the pgstattuple contrib module or this script which makes it a little more user-friendly  https://github.com/keithf4/pg_bloat_check

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you for the explanation Keith.

Pgstattuple runs for 8 hours approximately before printing its result.
and pg_freespace runs for "only" 16 minutes so we presumed really that pgstattuple scans the relation and brings more accurate information.

The strange fact is that pg_freespace result is changing from day to day, even without vacuum or analyze finishing.
At day one we start monitoring it was saying it has 108GB and kept changing each day.
And the worse fact is the different values for free space: pgstattupple with 2.3TB free VS pg_freespace with 96GB.

But i just killed autovacuum and started a manual vacuum verbose.
and 20 seconds later a autovacuum started.
it seems they are both running together.

Both sessions gets row exclusive locks.
I think it is better disabling autovacuum, do you agree?



De: Keith Fiske <keith.fiske@crunchydata.com>
Enviado: segunda-feira, 27 de julho de 2020 19:21
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: David G. Johnston <david.g.johnston@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 


On Mon, Jul 27, 2020 at 6:19 PM Keith Fiske <keith.fiske@crunchydata.com> wrote:
As the documentation for pg_freespacemap says, it's values are not exact and they may not be up to date because you are viewing the statistics that were gathered during the last analyze, not the actual freespace map
 

Sorry, just wanted to clarify this. You are looking at the actual freespace map. It's that the values actually stored in it are not exact and based off of statistics gathering

 


When you run pgstattuple, it is actually scanning the target object. Also note that scanning a table does not also scan its indexes, those must be done individually. So the free space reported by pgstattuple should be the actual free space, given as both byte size and % of the whole object. 

pg_freespacemap is used by the planner and several other internal parts of PG, so checking it must be quick. pgstattuple can take a while to run on larger objects, but it will give you 100% accurate results as of the time it ran. This is why I prefer it when checking for bloat since many of the queries you see out there also try and base their results on stats. While the query runs quick, I've often seen them be wildly wrong.

Also, I would not recommend partitioning simply to improve vacuuming. Especially if extensive tuning hasn't been tried first. Most times you can get per-table tuning working well enough to get autovacuum running properly. Especially on 9.6 and even more-so on PG11, where autovacuum has itself been improved.


-- 
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:
What were the results of pgstattuple? Curious what the live tuple_* and dead_tuple*  values vs the free space were. Also note that if you just ran it on the table, that is the results for JUST the table. If there are indexes as well, you'd have to check those separately and they may be even worse (index bloat is often worse than table). The pg_bloat_check script I linked, when run against a single table, will run against the table and all b-tree indexes (it cannot check gin/gist).

Unfortunately, since this is a wraparound vacuum that is starting, disabling autovacuum will not work. Wraparound vacuums run even when autovac is disabled in order to try and keep you from getting into transaction exhaustion territory which means you must take the database offline to fix things and avoid data corruption. As I said autovac will attempt to run again, even if you start yours. And both will attempt to get the lock, but only one will actually succeed in getting it. Check the wait_event_type and wait_event column in pg_stat_activity to determine which one actually obtained the lock and you want to make sure yours is the one that did. Can check the documentation for the meanings of those columns https://www.postgresql.org/docs/9.6/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW

In all honesty, though, if this table is as bloated as it seems, you may want to just try and schedule an outage to run a VACUUM FULL on this table and get it fully cleaned up. 2.3TB of empty space in just the table is quite a lot and each index may be just as bad. Until regular vacuums can efficiently run on this table it's only going to get progressively worse and just keep taking longer and longer. I would also highly suggest getting on the latest version of 9.6 before doing so that you're sure all bug fixes are in place.

An outage now may be costly, but it will be nowhere near as costly as autovacuum never being able to run on this table and reaching transaction exhaustion.

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


On Mon, Jul 27, 2020 at 6:49 PM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you for the explanation Keith.

Pgstattuple runs for approximately 8 hours before printing its result, while pg_freespace runs for "only" 16 minutes, so we presumed that pgstattuple actually scans the relation and returns more accurate information.

The strange fact is that the pg_freespace result changes from day to day, even though no vacuum or analyze has finished.
On the first day we started monitoring it reported 108GB, and it has kept changing each day.
Worse, the free-space values disagree: pgstattuple reports 2.3TB free vs. pg_freespace's 96GB.

But I just killed autovacuum and started a manual VACUUM VERBOSE, and 20 seconds later an autovacuum started.
It seems they are both running together.

Both sessions get row-exclusive locks.
I think it is better to disable autovacuum; do you agree?



From: Keith Fiske <keith.fiske@crunchydata.com>
Sent: Monday, July 27, 2020 19:21
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: David G. Johnston <david.g.johnston@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 


On Mon, Jul 27, 2020 at 6:19 PM Keith Fiske <keith.fiske@crunchydata.com> wrote:
As the documentation for pg_freespacemap says, its values are not exact and may not be up to date, because you are viewing statistics that were gathered during the last analyze, not the actual free space map.
 

Sorry, just wanted to clarify this. You are looking at the actual free space map; it's that the values stored in it are not exact and are based on statistics gathering.

 


When you run pgstattuple, it actually scans the target object. Also note that scanning a table does not also scan its indexes; those must be done individually. So the free space reported by pgstattuple should be the actual free space, given as both a byte size and a % of the whole object.

pg_freespacemap is used by the planner and several other internal parts of PG, so checking it must be quick. pgstattuple can take a while to run on larger objects, but it will give you 100% accurate results as of the time it ran. This is why I prefer it when checking for bloat: many of the bloat queries you see out there also base their results on stats, and while they run quickly, I've often seen them be wildly wrong.
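As a sketch, the two checks side by side (pgstattuple and pg_freespacemap are both contrib extensions; the table name is the one from this thread):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;

-- Exact, but performs a full scan of the relation (hours on a 2.5TB table):
SELECT tuple_count, tuple_len, dead_tuple_count, dead_tuple_len,
       free_space, free_percent
FROM pgstattuple('public.flip_pagina_edicao');

-- Fast, but only as accurate as the stored free space map:
SELECT pg_size_pretty(sum(avail)) AS fsm_free
FROM pg_freespace('public.flip_pagina_edicao');
```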

Also, I would not recommend partitioning simply to improve vacuuming. Especially if extensive tuning hasn't been tried first. Most times you can get per-table tuning working well enough to get autovacuum running properly. Especially on 9.6 and even more-so on PG11, where autovacuum has itself been improved.


RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you for the explanation Keith.
We will really improve our minor-release update policy.
We have only a few Postgres databases here, but they are important.

I guess an outage would really be better.
Index bloat is smaller than table bloat, but it is present too.

So far the manual vacuum has processed 3 of 8 indexes.
An ALTER TABLE to disable autovacuum and an autovacuum are both waiting on the manual VACUUM VERBOSE.

flip=# SELECT   pid,
flip-#          Age(query_start, Clock_timestamp()),
flip-#          usename,
flip-#          query,
flip-#          state,
flip-#          wait_event
flip-# FROM     pg_stat_activity
flip-# WHERE    query != '<IDLE>'
flip-# AND      query ilike '%vacuum%'
flip-# ORDER BY query_start ASC;
  pid  |       age        | usename  |                                    query                                     | state  | wait_event
-------+------------------+----------+------------------------------------------------------------------------------+--------+------------
 25849 | -01:59:22.309787 | postgres | vacuum verbose flip_pagina_edicao;                                           | active |
 22317 | -00:55:09.848074 | postgres | ALTER TABLE flip_pagina_edicao SET (autovacuum_enabled = false);             | active | relation
 27718 | -00:53:00.368776 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound) | active | relation


I am truly optimistic this will solve it.
So far no user session is locked or waiting.

 




I would cancel that ALTER TABLE to disable autovacuum. It's not going to go through until your manual vacuum finishes, and at that point you won't want autovacuum disabled anymore.



RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Yes, that is true.
But I think it would be better to disable it just after the manual vacuum, so I can tune autovacuum's cost_delay to make it more aggressive and keep it from running for days in the future.
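A sketch of that kind of cost tuning on a single table (values are illustrative; on 9.6 the default autovacuum_vacuum_cost_delay is 20ms, with the cost limit inherited from vacuum_cost_limit = 200):

```sql
-- Let autovacuum do more work per sleep cycle on this table only:
ALTER TABLE public.flip_pagina_edicao SET (
    autovacuum_vacuum_cost_delay = 2,     -- ms to sleep after each cost batch
    autovacuum_vacuum_cost_limit = 2000   -- cost budget accrued before sleeping
);
```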


From: Keith <keith@keithf4.com>
Sent: Monday, July 27, 2020 21:49
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: Keith Fiske <keith.fiske@crunchydata.com>; David G. Johnston <david.g.johnston@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
I would cancel that ALTER TABLE to disable autovacuum. It's not going to go through until your manual vacuum finishes, and at that point you won't want autovacuum disabled anymore.



Any key that "increases linearly over time" is a chronological key; sequences are a perfect example.

On 7/27/20 4:38 PM, Sidney Aloisio Ferreira Pryor wrote:

Thank you, Ron, but this table has no date column.

It has one line for each page of each scanned document.


From: Ron <ronljohnsonjr@gmail.com>
Sent: Monday, July 27, 2020 18:27
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
Another suggestion is to partition the table by a chronological key, so that only a fraction of the table needs to be vacuumed.

On 7/27/20 3:36 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you, David.
So can I assume that autovacuum only releases dead tuples when it finishes its execution?

Autovacuum is not finishing and has been running for days.
We did not find a parameter to limit autovacuum execution time.
Do you have any recommendation?

Thank you.
Sidney.

From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Monday, July 27, 2020 17:17
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 rows)


It's probably best to avoid drawing conclusions here until the anti-wraparound vacuum has completed and you've updated to the latest minor version of PostgreSQL.

David J.


--
Angular momentum makes the world go 'round.

--
Angular momentum makes the world go 'round.
On 7/27/20 5:19 PM, Keith Fiske wrote:
[snip]
> Also, I would not recommend partitioning simply to improve vacuuming. 
> Especially if extensive tuning hasn't been tried first. Most times you can 
> get per-table tuning working well enough to get autovacuum running 
> properly. Especially on 9.6 and even more-so on PG11, where autovacuum has 
> itself been improved.

SIMPLY to improve vacuum performance?  No.  But there are reasons that 
partitioning was "invented", and minimizing the work needed to be done on 
the whole of a Very Large and Rapidly Growing table is one of them.

-- 
Angular momentum makes the world go 'round.



RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you, Ron.
Yes, there is a column "id" whose value is provided by a sequence.

But it was not clear to me exactly how to partition. Is the idea for each partition to cover a limited range of "id", so there are fewer rows to vacuum in each operation?

Because we will still eventually need to vacuum all the others:
all rows, even old ones, might be updated by our application.
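For reference, 9.6 predates declarative partitioning (added in PG10), so range partitioning by id there is inheritance-based. A minimal sketch with a hypothetical simplified schema and boundary values (extensions such as pg_partman automate this pattern):

```sql
-- Hypothetical parent table:
CREATE TABLE flip_pagina_edicao_part (id bigint NOT NULL, pagina bytea);

-- One child per id range; the CHECK constraint enables constraint exclusion:
CREATE TABLE flip_pagina_edicao_p0 (
    CHECK (id >= 0 AND id < 10000000)
) INHERITS (flip_pagina_edicao_part);

-- Route inserts on the parent to the right child:
CREATE OR REPLACE FUNCTION flip_part_insert() RETURNS trigger AS $$
BEGIN
    IF NEW.id < 10000000 THEN
        INSERT INTO flip_pagina_edicao_p0 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for id %', NEW.id;
    END IF;
    RETURN NULL;  -- suppress the insert into the parent itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER flip_part_insert_trg
    BEFORE INSERT ON flip_pagina_edicao_part
    FOR EACH ROW EXECUTE PROCEDURE flip_part_insert();
```

Note that, as pointed out above, this only shrinks each individual vacuum: since old rows still get updated, every child still needs vacuuming eventually.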





> All lines even old ones might be updated by our application.

Ah.  Even so, there are benefits to maintaining "lots of smaller chunks" instead of One Giant Table: "pg_dump --jobs=X" will run faster, for example, when there are more small-to-mid-sized tables instead of A Few Giant Tables.  Ditto "reindexdb --jobs=X". 

Given that even "old" records get updated, I'd look into some other key that you can "naturally" partition on.

--
Angular momentum makes the world go 'round.

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Holger Jakobs
Date:

So you can use your SERIAL (or GENERATED AS IDENTITY) column for this. If you don't have one, you might add a DATE DEFAULT CURRENT_DATE column, even if it's of no use except as the partition key.
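A sketch of that fallback (the column name is hypothetical; existing rows would need their dates backfilled, since the DEFAULT only applies to new inserts):

```sql
ALTER TABLE flip_pagina_edicao ADD COLUMN created_on date DEFAULT CURRENT_DATE;
```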


Am 28.07.20 um 05:32 schrieb Ron:
Any key that "increases linearly over time" is a chronological key; sequences are a perfect example.

On 7/27/20 4:38 PM, Sidney Aloisio Ferreira Pryor wrote:
P {margin-top:0;margin-bottom:0;}

Thank you Ron but this table has no date column.

It has one line for each page of each scanned document.


De: Ron <ronljohnsonjr@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 18:27
Para: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
Another suggestion is to partition the table by a chronological key, so that only a fraction of the table needs to be vacuumed.

On 7/27/20 3:36 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you, David.
So can i assume that autovacuum only release dead tuples when it ends its execution?

Autovacuum is not finishing and is running for days.
We did not find a parameter to limit autovacuum executing time.
Do you have any recomendation?

Thank you.
Sidney.

De: David G. Johnston <david.g.johnston@gmail.com>
Enviado: segunda-feira, 27 de julho de 2020 17:17
Para: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Assunto: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On Monday, July 27, 2020, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi, we have a 9.6.5 postgres
 
Autovacuum is set with default parameters and is running on this table for days without naping or stopping.

 3285 | -5 days -23:33:10.792166 | postgres | autovacuum: VACUUM ANALYZE public.flip_pagina_edicao (to prevent wraparound)
(1 row)


It's probably best to avoid drawing conclusions here until the anti-wraparound vacuum has completed and you've updated to the latest minor version of PostgreSQL.

David J.


--
Angular momentum makes the world go 'round.

--
Angular momentum makes the world go 'round.
-- 
Holger Jakobs, Bergisch Gladbach, Tel. +49-178-9759012

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you Ron
I would appreciate it if you could share good documentation or a review link about partitioning tables in Postgres.

Sent from my iPhone

On Jul 28, 2020, at 02:16, Ron <ronljohnsonjr@gmail.com> wrote:


> All lines even old ones might be updated by our application.

Ah.  Even so, there are benefits to maintaining "lots of smaller chunks" instead of One Giant Table: "pg_dump --jobs=X" will run faster, for example, when there are more small-to-mid-sized tables instead of A Few Giant Tables.  Ditto "reindexdb --jobs=X". 

Given that even "old" records get updated, I'd look into some other key that you can "naturally" partition on.

On 7/27/20 10:54 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you Ron.
Yes, there is an "id" column whose value is provided by a sequence.

But it was not clear to me exactly how to partition. The idea is to cover a limited range of "id" so that there are fewer rows to vacuum in each operation?

Because we will still eventually need to vacuum all the others.
Any row, even old ones, might be updated by our application.


From: Ron <ronljohnsonjr@gmail.com>
Sent: Tuesday, July 28, 2020 00:39
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On 7/27/20 5:19 PM, Keith Fiske wrote:
[snip]
> Also, I would not recommend partitioning simply to improve vacuuming.
> Especially if extensive tuning hasn't been tried first. Most times you can
> get per-table tuning working well enough to get autovacuum running
> properly. Especially on 9.6 and even more-so on PG11, where autovacuum has
> itself been improved.

SIMPLY to improve vacuum performance?  No.  But there are reasons that
partitioning was "invented", and minimizing the work needed to be done on
the whole of a Very Large and Rapidly Growing table is one of them.

--
Angular momentum makes the world go 'round.



--
Angular momentum makes the world go 'round.

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:
Again, I would caution against going with partitioning until you can first try and tune the table itself. I have quite a bit of experience with it as I'm the author of pg_partman so you can see I have nothing against partitioning itself, but I'm only for using it when it's necessary. If you decide to go that route, it would be a good tool to help you automate much of the maintenance process of partitioning either by time or integer.


I say this because, while it could possibly help with making vacuum more efficient, there are many caveats to partitioning. Since you're on 9.6, there is no native partitioning, so you'd have to do trigger based partitioning which has a HUGE impact on write performance (at least 2x slower if not worse). Even with native partitioning in 10+, there is a performance penalty during tuple routing and you would want to test that impact. Especially if you are doing updates which would move data across child partitions (updates that move data prior to PG10 are very, very complicated and not supported by pg_partman). Also, if you have any unique keys to maintain, they are not enforced across the partition set, and only supported in native if they are also part of the partition key itself. You're also just adding to the general complexity of your database overall. 

I would in general say the primary reason you would want to consider partitioning in PG is if you're trying to expire old data out. Dropping a table is much more efficient than running large/many DELETE statements due to the very issues you're encountering now: bloat. But it sounds like you are not expiring data, so you also have to take into account how many partitions you will have long term. It wasn't until PG12 that having over just even a few hundred partitions could have major performance impact overall. So, before even considering partitioning, you would want to get onto the latest major release.

Partitioning can help with vacuum and general maintenance, but that's really only when the actual live data in a single table starts getting very, very large. And in that case you may want to first consider your hardware resources before going down the road of partitioning since that may be the better benefit long term. You've got nearly 2.3TB of space to clean up in this table, so, again, you want to solve that problem first. If you had that much free space before this vacuum finished, you likely have even more now after it finishes. And just note that a regular vacuum is likely not going to clean up this space. It may clean up some, but vacuum only returns space to the system in very specific circumstances (the tail-end pages are empty). You could use something like pg_repack or pg_squeeze to do this without a long outage, but with bloat this size, you may run into issues with how long those tools would need to run. Especially if the table is still in use. So, again, I would recommend an outage to do a VACUUM FULL which will lock and rewrite the table. I'd be curious to see what your total size of the entire table before and after is. The following function will give that total size (table + indexes)

select pg_total_relation_size('public.mytable');
 
So, while partitioning could possibly be the answer long term, I would not recommend it until you've cleaned up the existing bloat on this table and its indexes and tried to tune autovacuum to run more efficiently. If that works, you've saved yourself a great deal of complexity. And again, you'd definitely want to be on the latest major version of PG, where partitioning has improved tremendously, so if a major version upgrade isn't in the works for a while, I wouldn't even consider it.
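Per-table tuning of the kind described above might look like this sketch (the thresholds are purely illustrative; the right values depend on the table's write rate and should be tested):

```sql
-- Illustrative per-table autovacuum settings: trigger vacuum after ~0.5% of rows
-- change instead of the global default of 20%, and let each run do more work
-- per cost cycle so it finishes sooner.
ALTER TABLE flip_pagina_edicao SET (
    autovacuum_vacuum_scale_factor = 0.005,
    autovacuum_vacuum_threshold    = 10000,
    autovacuum_vacuum_cost_limit   = 2000
);
```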

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


On Tue, Jul 28, 2020 at 6:18 AM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you Ron
I would appreciate it if you could share good documentation or a review link about partitioning tables in Postgres.

Sent from my iPhone

On Jul 28, 2020, at 02:16, Ron <ronljohnsonjr@gmail.com> wrote:


> All lines even old ones might be updated by our application.

Ah.  Even so, there are benefits to maintaining "lots of smaller chunks" instead of One Giant Table: "pg_dump --jobs=X" will run faster, for example, when there are more small-to-mid-sized tables instead of A Few Giant Tables.  Ditto "reindexdb --jobs=X". 

Given that even "old" records get updated, I'd look into some other key that you can "naturally" partition on.

On 7/27/20 10:54 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you Ron.
Yes, there is an "id" column whose value is provided by a sequence.

But it was not clear to me exactly how to partition. The idea is to cover a limited range of "id" so that there are fewer rows to vacuum in each operation?

Because we will still eventually need to vacuum all the others.
Any row, even old ones, might be updated by our application.


From: Ron <ronljohnsonjr@gmail.com>
Sent: Tuesday, July 28, 2020 00:39
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On 7/27/20 5:19 PM, Keith Fiske wrote:
[snip]
> Also, I would not recommend partitioning simply to improve vacuuming.
> Especially if extensive tuning hasn't been tried first. Most times you can
> get per-table tuning working well enough to get autovacuum running
> properly. Especially on 9.6 and even more-so on PG11, where autovacuum has
> itself been improved.

SIMPLY to improve vacuum performance?  No.  But there are reasons that
partitioning was "invented", and minimizing the work needed to be done on
the whole of a Very Large and Rapidly Growing table is one of them.

--
Angular momentum makes the world go 'round.



--
Angular momentum makes the world go 'round.


RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you very much for this complete explanation and opinion, Keith.

The first goal of posting this mail to the list was to learn how to measure free space precisely: pgstattuple is the right tool.
The second goal is in progress: a manual vacuum to achieve reuse of that large free area (I will post pgstattuple output for the table when this vacuum ends).
And the third goal is to release and shrink this huge allocation of space: scheduling an outage to run VACUUM FULL.
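For reference, a minimal sketch of the pgstattuple measurement (note that pgstattuple reads the entire relation, which is slow on a multi-TB table; pgstattuple_approx, available since 9.5, gives a cheaper estimate from the visibility map):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Exact dead-tuple and free-space figures (full scan of the relation):
SELECT * FROM pgstattuple('public.flip_pagina_edicao');

-- Cheaper estimate that skips all-visible pages:
SELECT * FROM pgstattuple_approx('public.flip_pagina_edicao');
```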

Again, I am truly grateful for your precious help.
And I will post pg_total_relation_size here before and after running VACUUM FULL when we schedule that outage.

Best Regards,
Sidney Pryor


From: Keith Fiske <keith.fiske@crunchydata.com>
Sent: Tuesday, July 28, 2020 10:55
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
Again, I would caution against going with partitioning until you can first try and tune the table itself. I have quite a bit of experience with it as I'm the author of pg_partman so you can see I have nothing against partitioning itself, but I'm only for using it when it's necessary. If you decide to go that route, it would be a good tool to help you automate much of the maintenance process of partitioning either by time or integer.


I say this because, while it could possibly help with making vacuum more efficient, there are many caveats to partitioning. Since you're on 9.6, there is no native partitioning, so you'd have to do trigger based partitioning which has a HUGE impact on write performance (at least 2x slower if not worse). Even with native partitioning in 10+, there is a performance penalty during tuple routing and you would want to test that impact. Especially if you are doing updates which would move data across child partitions (updates that move data prior to PG10 are very, very complicated and not supported by pg_partman). Also, if you have any unique keys to maintain, they are not enforced across the partition set, and only supported in native if they are also part of the partition key itself. You're also just adding to the general complexity of your database overall. 

I would in general say the primary reason you would want to consider partitioning in PG is if you're trying to expire old data out. Dropping a table is much more efficient than running large/many DELETE statements due to the very issues you're encountering now: bloat. But it sounds like you are not expiring data, so you also have to take into account how many partitions you will have long term. It wasn't until PG12 that having over just even a few hundred partitions could have major performance impact overall. So, before even considering partitioning, you would want to get onto the latest major release.

Partitioning can help with vacuum and general maintenance, but that's really only when the actual live data in a single table starts getting very, very large. And in that case you may want to first consider your hardware resources before going down the road of partitioning since that may be the better benefit long term. You've got nearly 2.3TB of space to clean up in this table, so, again, you want to solve that problem first. If you had that much free space before this vacuum finished, you likely have even more now after it finishes. And just note that a regular vacuum is likely not going to clean up this space. It may clean up some, but vacuum only returns space to the system in very specific circumstances (the tail-end pages are empty). You could use something like pg_repack or pg_squeeze to do this without a long outage, but with bloat this size, you may run into issues with how long those tools would need to run. Especially if the table is still in use. So, again, I would recommend an outage to do a VACUUM FULL which will lock and rewrite the table. I'd be curious to see what your total size of the entire table before and after is. The following function will give that total size (table + indexes)

select pg_total_relation_size('public.mytable');
 
So, while partitioning could possibly be the answer long term, I would not recommend it until you've cleaned up the existing bloat on this table and its indexes and tried to tune autovacuum to run more efficiently. If that works, you've saved yourself a great deal of complexity. And again, you'd definitely want to be on the latest major version of PG, where partitioning has improved tremendously, so if a major version upgrade isn't in the works for a while, I wouldn't even consider it.

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


On Tue, Jul 28, 2020 at 6:18 AM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you Ron
I would appreciate it if you could share good documentation or a review link about partitioning tables in Postgres.

Sent from my iPhone

On Jul 28, 2020, at 02:16, Ron <ronljohnsonjr@gmail.com> wrote:


> All lines even old ones might be updated by our application.

Ah.  Even so, there are benefits to maintaining "lots of smaller chunks" instead of One Giant Table: "pg_dump --jobs=X" will run faster, for example, when there are more small-to-mid-sized tables instead of A Few Giant Tables.  Ditto "reindexdb --jobs=X". 

Given that even "old" records get updated, I'd look into some other key that you can "naturally" partition on.

On 7/27/20 10:54 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you Ron.
Yes, there is an "id" column whose value is provided by a sequence.

But it was not clear to me exactly how to partition. The idea is to cover a limited range of "id" so that there are fewer rows to vacuum in each operation?

Because we will still eventually need to vacuum all the others.
Any row, even old ones, might be updated by our application.


From: Ron <ronljohnsonjr@gmail.com>
Sent: Tuesday, July 28, 2020 00:39
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On 7/27/20 5:19 PM, Keith Fiske wrote:
[snip]
> Also, I would not recommend partitioning simply to improve vacuuming.
> Especially if extensive tuning hasn't been tried first. Most times you can
> get per-table tuning working well enough to get autovacuum running
> properly. Especially on 9.6 and even more-so on PG11, where autovacuum has
> itself been improved.

SIMPLY to improve vacuum performance?  No.  But there are reasons that
partitioning was "invented", and minimizing the work needed to be done on
the whole of a Very Large and Rapidly Growing table is one of them.

--
Angular momentum makes the world go 'round.



--
Angular momentum makes the world go 'round.


RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Hi Keith, how are you?
I hope you are doing well.

As I promised, we ran VACUUM FULL on that table in a test environment similar to production, and it took 11 hours to complete.
It reduced the table from 2.3TB to 100GB, as shown below (note that production is still growing...).
So we are now proposing to our management a maintenance window to run VACUUM FULL on production.

Original:
 table_schema |     table_name      | total_size | data_size | external_size
--------------+---------------------+------------+-----------+---------------
 public       | flip_pagina_edicao  | 4578 GB    | 2880 GB   | 1699 GB

Copy of Original (after VACUUM FULL):
 table_schema |     table_name      | total_size | data_size | external_size
--------------+---------------------+------------+-----------+---------------
 public       | flip_pagina_edicao  | 180 GB     | 100 GB    | 80 GB


In the meantime, could you please help us understand some curious facts below about VACUUM VERBOSE (not FULL)?
I am running another VACUUM VERBOSE (not FULL) on production, because the first one was stuck on the same indexes for 3 days.
But it is now on a 3rd sequential pass over all indexes of that table, each followed by a scan of the table itself.
It is not finishing, even after completing these passes.

1) Why is it scanning again instead of finishing the vacuum?

2) I noticed that on every pass it spends 12 hours just on one index, on a bpchar(1) column (character(1) DEFAULT 'N'::bpchar).
All the other indexes, and even the table itself, are processed much faster. Can you guess a reason for this?

CREATE INDEX ocrindex
  ON public.flip_pagina_edicao
  USING btree
  (ocr COLLATE pg_catalog."default");

3) It is repeatedly scanning 11,184,520 (or 521, or 528) row versions.
Why does it not finish, instead of scanning the same number of row versions again?

flip=# vacuum verbose flip_pagina_edicao;
INFO:  vacuuming "public.flip_pagina_edicao"
INFO:  scanned index "id_caderno" to remove 11184520 row versions
DETAIL:  CPU 155.34s/540.40u sec elapsed 1101.49 sec
INFO:  scanned index "ordem_index" to remove 11184520 row versions
DETAIL:  CPU 253.35s/870.77u sec elapsed 1962.36 sec
INFO:  scanned index "primary00024" to remove 11184520 row versions
DETAIL:  CPU 134.10s/478.79u sec elapsed 922.02 sec
INFO:  scanned index "ordem" to remove 11184520 row versions
DETAIL:  CPU 265.16s/932.90u sec elapsed 1878.20 sec
INFO:  scanned index "flip_pagina_edicao_pkey" to remove 11184520 row versions
DETAIL:  CPU 149.51s/523.42u sec elapsed 949.68 sec
INFO:  scanned index "flip_flippagedic_idcaderno" to remove 11184520 row versions
DETAIL:  CPU 460.19s/1171.75u sec elapsed 2696.89 sec
INFO:  scanned index "nomepdfindex" to remove 11184520 row versions
DETAIL:  CPU 598.88s/1286.84u sec elapsed 3543.26 sec
INFO:  scanned index "ocrindex" to remove 11184520 row versions
DETAIL:  CPU 25657.30s/18907.82u sec elapsed 46652.49 sec
INFO:  scanned index "idr_documento_index" to remove 11184520 row versions
DETAIL:  CPU 321.96s/968.75u sec elapsed 2004.91 sec
INFO:  "flip_pagina_edicao": removed 11184520 row versions in 3762529 pages
DETAIL:  CPU 165.65s/95.06u sec elapsed 647.49 sec
INFO:  scanned index "id_caderno" to remove 11184521 row versions
DETAIL:  CPU 175.89s/461.26u sec elapsed 1098.74 sec
INFO:  scanned index "ordem_index" to remove 11184521 row versions
DETAIL:  CPU 301.57s/887.26u sec elapsed 2052.44 sec
INFO:  scanned index "primary00024" to remove 11184521 row versions
DETAIL:  CPU 150.12s/461.92u sec elapsed 947.74 sec
INFO:  scanned index "ordem" to remove 11184521 row versions
DETAIL:  CPU 286.29s/896.03u sec elapsed 1954.21 sec
INFO:  scanned index "flip_pagina_edicao_pkey" to remove 11184521 row versions
DETAIL:  CPU 140.75s/423.38u sec elapsed 901.04 sec
INFO:  scanned index "flip_flippagedic_idcaderno" to remove 11184521 row versions
DETAIL:  CPU 455.24s/1043.27u sec elapsed 2551.54 sec
INFO:  scanned index "nomepdfindex" to remove 11184521 row versions
DETAIL:  CPU 644.78s/1163.80u sec elapsed 3469.67 sec
INFO:  scanned index "ocrindex" to remove 11184521 row versions
DETAIL:  CPU 25951.95s/18712.36u sec elapsed 49880.37 sec
INFO:  scanned index "idr_documento_index" to remove 11184521 row versions
DETAIL:  CPU 274.05s/710.37u sec elapsed 1799.98 sec
INFO:  "flip_pagina_edicao": removed 11184521 row versions in 2358457 pages
DETAIL:  CPU 68.83s/36.81u sec elapsed 353.40 sec
INFO:  scanned index "id_caderno" to remove 11184528 row versions
DETAIL:  CPU 156.47s/372.75u sec elapsed 1022.31 sec
INFO:  scanned index "ordem_index" to remove 11184528 row versions
DETAIL:  CPU 257.13s/669.86u sec elapsed 2057.05 sec
INFO:  scanned index "primary00024" to remove 11184528 row versions
DETAIL:  CPU 116.29s/319.86u sec elapsed 842.55 sec
INFO:  scanned index "ordem" to remove 11184528 row versions
DETAIL:  CPU 232.78s/562.22u sec elapsed 2676.35 sec
INFO:  scanned index "flip_pagina_edicao_pkey" to remove 11184528 row versions
DETAIL:  CPU 91.21s/241.00u sec elapsed 875.27 sec
INFO:  scanned index "flip_flippagedic_idcaderno" to remove 11184528 row versions
DETAIL:  CPU 285.71s/585.63u sec elapsed 2593.08 sec
INFO:  scanned index "nomepdfindex" to remove 11184528 row versions
DETAIL:  CPU 413.55s/692.77u sec elapsed 3843.78 sec

Thank you.
Sidney Pryor.



From: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Sent: Tuesday, July 28, 2020 15:20
To: Keith Fiske <keith.fiske@crunchydata.com>
Cc: Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: RE: How can i be certain autovacuum is causing reuse if table still grows
 
Thank you very much for this complete explanation and opinion, Keith.

The first goal of posting this mail to the list was to learn how to measure free space precisely: pgstattuple is the right tool.
The second goal is in progress: a manual vacuum to achieve reuse of that large free area (I will post pgstattuple output for the table when this vacuum ends).
And the third goal is to release and shrink this huge allocation of space: scheduling an outage to run VACUUM FULL.

Again, I am truly grateful for your precious help.
And I will post pg_total_relation_size here before and after running VACUUM FULL when we schedule that outage.

Best Regards,
Sidney Pryor


From: Keith Fiske <keith.fiske@crunchydata.com>
Sent: Tuesday, July 28, 2020 10:55
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
Again, I would caution against going with partitioning until you can first try and tune the table itself. I have quite a bit of experience with it as I'm the author of pg_partman so you can see I have nothing against partitioning itself, but I'm only for using it when it's necessary. If you decide to go that route, it would be a good tool to help you automate much of the maintenance process of partitioning either by time or integer.


I say this because, while it could possibly help with making vacuum more efficient, there are many caveats to partitioning. Since you're on 9.6, there is no native partitioning, so you'd have to do trigger based partitioning which has a HUGE impact on write performance (at least 2x slower if not worse). Even with native partitioning in 10+, there is a performance penalty during tuple routing and you would want to test that impact. Especially if you are doing updates which would move data across child partitions (updates that move data prior to PG10 are very, very complicated and not supported by pg_partman). Also, if you have any unique keys to maintain, they are not enforced across the partition set, and only supported in native if they are also part of the partition key itself. You're also just adding to the general complexity of your database overall. 

I would in general say the primary reason you would want to consider partitioning in PG is if you're trying to expire old data out. Dropping a table is much more efficient than running large/many DELETE statements due to the very issues you're encountering now: bloat. But it sounds like you are not expiring data, so you also have to take into account how many partitions you will have long term. It wasn't until PG12 that having over just even a few hundred partitions could have major performance impact overall. So, before even considering partitioning, you would want to get onto the latest major release.

Partitioning can help with vacuum and general maintenance, but that's really only when the actual live data in a single table starts getting very, very large. And in that case you may want to first consider your hardware resources before going down the road of partitioning since that may be the better benefit long term. You've got nearly 2.3TB of space to clean up in this table, so, again, you want to solve that problem first. If you had that much free space before this vacuum finished, you likely have even more now after it finishes. And just note that a regular vacuum is likely not going to clean up this space. It may clean up some, but vacuum only returns space to the system in very specific circumstances (the tail-end pages are empty). You could use something like pg_repack or pg_squeeze to do this without a long outage, but with bloat this size, you may run into issues with how long those tools would need to run. Especially if the table is still in use. So, again, I would recommend an outage to do a VACUUM FULL which will lock and rewrite the table. I'd be curious to see what your total size of the entire table before and after is. The following function will give that total size (table + indexes)

select pg_total_relation_size('public.mytable');
 
So, while partitioning could possibly be the answer long term, I would not recommend it until you've cleaned up the existing bloat on this table and its indexes and tried to tune autovacuum to run more efficiently. If that works, you've saved yourself a great deal of complexity. And again, you'd definitely want to be on the latest major version of PG, where partitioning has improved tremendously, so if a major version upgrade isn't in the works for a while, I wouldn't even consider it.

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


On Tue, Jul 28, 2020 at 6:18 AM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you Ron
I would appreciate it if you could share good documentation or a review link about partitioning tables in Postgres.

Sent from my iPhone

On Jul 28, 2020, at 02:16, Ron <ronljohnsonjr@gmail.com> wrote:


> All lines even old ones might be updated by our application.

Ah.  Even so, there are benefits to maintaining "lots of smaller chunks" instead of One Giant Table: "pg_dump --jobs=X" will run faster, for example, when there are more small-to-mid-sized tables instead of A Few Giant Tables.  Ditto "reindexdb --jobs=X". 

Given that even "old" records get updated, I'd look into some other key that you can "naturally" partition on.

On 7/27/20 10:54 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you Ron.
Yes, there is an "id" column whose value is provided by a sequence.

But it was not clear to me exactly how to partition. The idea is to cover a limited range of "id" so that there are fewer rows to vacuum in each operation?

Because we will still eventually need to vacuum all the others.
Any row, even old ones, might be updated by our application.


From: Ron <ronljohnsonjr@gmail.com>
Sent: Tuesday, July 28, 2020 00:39
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On 7/27/20 5:19 PM, Keith Fiske wrote:
[snip]
> Also, I would not recommend partitioning simply to improve vacuuming.
> Especially if extensive tuning hasn't been tried first. Most times you can
> get per-table tuning working well enough to get autovacuum running
> properly. Especially on 9.6 and even more-so on PG11, where autovacuum has
> itself been improved.

SIMPLY to improve vacuum performance?  No.  But there are reasons that
partitioning was "invented", and minimizing the work needed to be done on
the whole of a Very Large and Rapidly Growing table is one of them.

--
Angular momentum makes the world go 'round.



--
Angular momentum makes the world go 'round.


Re: How can i be certain autovacuum is causing reuse if table still grows

From
Guillaume Lelarge
Date:
On Mon, Aug 3, 2020 at 23:36, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi Keith, how are you?
I hope you are doing well.

As I promised, we ran VACUUM FULL on that table in a test environment similar to production, and it took 11 hours to complete.
It reduced the table from 2.3TB to 100GB, as shown below (note that production is still growing...).
So we are now proposing to our management a maintenance window to run VACUUM FULL on production.

Original:
 table_schema |     table_name      | total_size | data_size | external_size
--------------+---------------------+------------+-----------+---------------
 public       | flip_pagina_edicao  | 4578 GB    | 2880 GB   | 1699 GB

Copy of Original (after VACUUM FULL):
 table_schema |     table_name      | total_size | data_size | external_size
--------------+---------------------+------------+-----------+---------------
 public       | flip_pagina_edicao  | 180 GB     | 100 GB    | 80 GB


In the meantime, could you please help us understand some curious facts below about VACUUM VERBOSE (not FULL)?
I am running another VACUUM VERBOSE (not FULL) on production, because the first one was stuck on the same indexes for 3 days.
But it is now on a 3rd sequential pass over all indexes of that table, each followed by a scan of the table itself.
It is not finishing, even after completing these passes.

1) Why is it scanning again instead of finishing the vacuum?


Your maintenance_work_mem parameter value is probably too low.

2) I noticed that on every pass it spends 12 hours just on one index, on a bpchar(1) column (character(1) DEFAULT 'N'::bpchar).
All the other indexes, and even the table itself, are processed much faster. Can you guess a reason for this?

CREATE INDEX ocrindex
  ON public.flip_pagina_edicao
  USING btree
  (ocr COLLATE pg_catalog."default");


Not used, so not in cache? Or much bigger than the other ones?

3) It is repeatedly scanning 11,184,520 (or 521, or 528) row versions.
Why does it not finish, instead of scanning the same number of row versions again?


Well, VACUUM is a multi-step operation. It first scans the table and stores in memory every TID that needs to be deleted. Once that is done, it scans the indexes. But if there are more TIDs than that memory can hold, it has to make multiple passes over the table and the indexes. It seems clear in your case that when VACUUM has found 11,184,520 TIDs, memory is full. That memory is limited by a parameter named maintenance_work_mem. So what is its current value?
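The numbers line up with the default setting: VACUUM stores dead tuple pointers at 6 bytes each, so a 64MB maintenance_work_mem holds 64 * 1024 * 1024 / 6 ≈ 11,184,810 TIDs, right around the ~11,184,520 row versions removed in each pass of the log. A quick sketch to check this on the server:

```sql
-- pg_settings reports maintenance_work_mem in kB; each dead tuple pointer
-- costs 6 bytes, which bounds the TIDs collected per index-cleaning pass.
SELECT setting::bigint AS maintenance_work_mem_kb,
       (setting::bigint * 1024) / 6 AS max_dead_tids_per_pass
FROM pg_settings
WHERE name = 'maintenance_work_mem';
```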

flip=# vacuum verbose flip_pagina_edicao;
INFO:  vacuuming "public.flip_pagina_edicao"
INFO:  scanned index "id_caderno" to remove 11184520 row versions
DETAIL:  CPU 155.34s/540.40u sec elapsed 1101.49 sec
INFO:  scanned index "ordem_index" to remove 11184520 row versions
DETAIL:  CPU 253.35s/870.77u sec elapsed 1962.36 sec
INFO:  scanned index "primary00024" to remove 11184520 row versions
DETAIL:  CPU 134.10s/478.79u sec elapsed 922.02 sec
INFO:  scanned index "ordem" to remove 11184520 row versions
DETAIL:  CPU 265.16s/932.90u sec elapsed 1878.20 sec
INFO:  scanned index "flip_pagina_edicao_pkey" to remove 11184520 row versions
DETAIL:  CPU 149.51s/523.42u sec elapsed 949.68 sec
INFO:  scanned index "flip_flippagedic_idcaderno" to remove 11184520 row versions
DETAIL:  CPU 460.19s/1171.75u sec elapsed 2696.89 sec
INFO:  scanned index "nomepdfindex" to remove 11184520 row versions
DETAIL:  CPU 598.88s/1286.84u sec elapsed 3543.26 sec
INFO:  scanned index "ocrindex" to remove 11184520 row versions
DETAIL:  CPU 25657.30s/18907.82u sec elapsed 46652.49 sec
INFO:  scanned index "idr_documento_index" to remove 11184520 row versions
DETAIL:  CPU 321.96s/968.75u sec elapsed 2004.91 sec
INFO:  "flip_pagina_edicao": removed 11184520 row versions in 3762529 pages
DETAIL:  CPU 165.65s/95.06u sec elapsed 647.49 sec
INFO:  scanned index "id_caderno" to remove 11184521 row versions
DETAIL:  CPU 175.89s/461.26u sec elapsed 1098.74 sec
INFO:  scanned index "ordem_index" to remove 11184521 row versions
DETAIL:  CPU 301.57s/887.26u sec elapsed 2052.44 sec
INFO:  scanned index "primary00024" to remove 11184521 row versions
DETAIL:  CPU 150.12s/461.92u sec elapsed 947.74 sec
INFO:  scanned index "ordem" to remove 11184521 row versions
DETAIL:  CPU 286.29s/896.03u sec elapsed 1954.21 sec
INFO:  scanned index "flip_pagina_edicao_pkey" to remove 11184521 row versions
DETAIL:  CPU 140.75s/423.38u sec elapsed 901.04 sec
INFO:  scanned index "flip_flippagedic_idcaderno" to remove 11184521 row versions
DETAIL:  CPU 455.24s/1043.27u sec elapsed 2551.54 sec
INFO:  scanned index "nomepdfindex" to remove 11184521 row versions
DETAIL:  CPU 644.78s/1163.80u sec elapsed 3469.67 sec
INFO:  scanned index "ocrindex" to remove 11184521 row versions
DETAIL:  CPU 25951.95s/18712.36u sec elapsed 49880.37 sec
INFO:  scanned index "idr_documento_index" to remove 11184521 row versions
DETAIL:  CPU 274.05s/710.37u sec elapsed 1799.98 sec
INFO:  "flip_pagina_edicao": removed 11184521 row versions in 2358457 pages
DETAIL:  CPU 68.83s/36.81u sec elapsed 353.40 sec
INFO:  scanned index "id_caderno" to remove 11184528 row versions
DETAIL:  CPU 156.47s/372.75u sec elapsed 1022.31 sec
INFO:  scanned index "ordem_index" to remove 11184528 row versions
DETAIL:  CPU 257.13s/669.86u sec elapsed 2057.05 sec
INFO:  scanned index "primary00024" to remove 11184528 row versions
DETAIL:  CPU 116.29s/319.86u sec elapsed 842.55 sec
INFO:  scanned index "ordem" to remove 11184528 row versions
DETAIL:  CPU 232.78s/562.22u sec elapsed 2676.35 sec
INFO:  scanned index "flip_pagina_edicao_pkey" to remove 11184528 row versions
DETAIL:  CPU 91.21s/241.00u sec elapsed 875.27 sec
INFO:  scanned index "flip_flippagedic_idcaderno" to remove 11184528 row versions
DETAIL:  CPU 285.71s/585.63u sec elapsed 2593.08 sec
INFO:  scanned index "nomepdfindex" to remove 11184528 row versions
DETAIL:  CPU 413.55s/692.77u sec elapsed 3843.78 sec

Thank you.
Sidney Pryor.



From: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Sent: Tuesday, July 28, 2020 15:20
To: Keith Fiske <keith.fiske@crunchydata.com>
Cc: Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: RE: How can i be certain autovacuum is causing reuse if table still grows
 
Thank you very much for this complete explanation and opinion, Keith.

The first goal of posting this mail to the list was to learn how to measure free space precisely: pgstattuple is the correct tool.
The second goal is still in progress: a manual vacuum to make that large free area reusable. (I will post pgstattuple output for the table when this vacuum ends.)
And the third goal is to free and downsize this huge allocation of space: schedule an outage to run VACUUM FULL.

Again, I am truly grateful for your precious help.
I will also post pg_total_relation_size before and after running VACUUM FULL when we schedule that outage.

Best Regards,
Sidney Pryor


From: Keith Fiske <keith.fiske@crunchydata.com>
Sent: Tuesday, July 28, 2020 10:55
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
Again, I would caution against going with partitioning until you can first try and tune the table itself. I have quite a bit of experience with it as I'm the author of pg_partman so you can see I have nothing against partitioning itself, but I'm only for using it when it's necessary. If you decide to go that route, it would be a good tool to help you automate much of the maintenance process of partitioning either by time or integer.


I say this because, while it could possibly help with making vacuum more efficient, there are many caveats to partitioning. Since you're on 9.6, there is no native partitioning, so you'd have to do trigger based partitioning which has a HUGE impact on write performance (at least 2x slower if not worse). Even with native partitioning in 10+, there is a performance penalty during tuple routing and you would want to test that impact. Especially if you are doing updates which would move data across child partitions (updates that move data prior to PG10 are very, very complicated and not supported by pg_partman). Also, if you have any unique keys to maintain, they are not enforced across the partition set, and only supported in native if they are also part of the partition key itself. You're also just adding to the general complexity of your database overall. 

I would in general say the primary reason you would want to consider partitioning in PG is if you're trying to expire old data out. Dropping a table is much more efficient than running large/many DELETE statements due to the very issues you're encountering now: bloat. But it sounds like you are not expiring data, so you also have to take into account how many partitions you will have long term. It wasn't until PG12 that having over just even a few hundred partitions could have major performance impact overall. So, before even considering partitioning, you would want to get onto the latest major release.

Partitioning can help with vacuum and general maintenance, but that's really only when the actual live data in a single table starts getting very, very large. And in that case you may want to first consider your hardware resources before going down the road of partitioning since that may be the better benefit long term. You've got nearly 2.3TB of space to clean up in this table, so, again, you want to solve that problem first. If you had that much free space before this vacuum finished, you likely have even more now after it finishes. And just note that a regular vacuum is likely not going to clean up this space. It may clean up some, but vacuum only returns space to the system in very specific circumstances (the tail-end pages are empty). You could use something like pg_repack or pg_squeeze to do this without a long outage, but with bloat this size, you may run into issues with how long those tools would need to run. Especially if the table is still in use. So, again, I would recommend an outage to do a VACUUM FULL which will lock and rewrite the table. I'd be curious to see what your total size of the entire table before and after is. The following function will give that total size (table + indexes)

select pg_total_relation_size('public.mytable');
 
So, while partitioning could possibly be the answer long term, I would not recommend it until you've cleaned up the existing bloat on this table and its indexes and tried to tune autovacuum to run more efficiently. If that works, you've saved yourself a great deal of complexity. And again, you'd definitely want to get on the latest major version of PG, where partitioning has improved tremendously, so if a major version upgrade isn't in the works for a while, I wouldn't even consider it.
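If per-table tuning is the route taken, autovacuum's storage parameters can be overridden on just this table. A sketch with placeholder values, not recommendations; they would need testing against the actual write rate:

```sql
ALTER TABLE public.flip_pagina_edicao SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- trigger at ~1% dead rows instead of the 20% default
    autovacuum_vacuum_cost_delay   = 10,    -- in ms; a lower delay lets vacuum work more aggressively
    autovacuum_vacuum_cost_limit   = 1000
);

-- Confirm the table-level overrides took effect:
SELECT reloptions FROM pg_class WHERE relname = 'flip_pagina_edicao';
```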

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


On Tue, Jul 28, 2020 at 6:18 AM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Thank you Ron
I would appreciate it if you could share a good documentation or review link about partitioning tables in Postgres.

Sent from my iPhone

Em 28 de jul de 2020, à(s) 02:16, Ron <ronljohnsonjr@gmail.com> escreveu:


> All lines even old ones might be updated by our application.

Ah.  Even so, there are benefits to maintaining "lots of smaller chunks" instead of One Giant Table: "pg_dump --jobs=X" will run faster, for example, when there are more small-to-mid-sized tables instead of A Few Giant Tables.  Ditto "reindexdb --jobs=X". 

Given that even "old" records get updated, I'd look into some other key that you can "naturally" partition on.

On 7/27/20 10:54 PM, Sidney Aloisio Ferreira Pryor wrote:
Thank you Ron.
Yes, there is a column "id" whose value is provided by a sequence.

But it was not clear to me exactly how to partition.
Is the idea to cover a limited range of "id" so there will be fewer rows to vacuum in each operation?

Because we will still eventually need to vacuum all the others.
All rows, even old ones, might be updated by our application.


From: Ron <ronljohnsonjr@gmail.com>
Sent: Tuesday, July 28, 2020 00:39
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 
On 7/27/20 5:19 PM, Keith Fiske wrote:
[snip]
> Also, I would not recommend partitioning simply to improve vacuuming.
> Especially if extensive tuning hasn't been tried first. Most times you can
> get per-table tuning working well enough to get autovacuum running
> properly. Especially on 9.6 and even more-so on PG11, where autovacuum has
> itself been improved.

SIMPLY to improve vacuum performance?  No.  But there are reasons that
partitioning was "invented", and minimizing the work needed to be done on
the whole of a Very Large and Rapidly Growing table is one of them.

--
Angular momentum makes the world go 'round.



--
Angular momentum makes the world go 'round.




--
Guillaume.

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:


On Tue, Aug 4, 2020 at 5:03 AM Guillaume Lelarge <guillaume@lelarge.info> wrote:
On Mon, Aug 3, 2020 at 23:36, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
Hi Keith, how are you?
I hope you are doing very well.

As I promised, we ran VACUUM FULL on that table in a test environment similar to production, and it took 11 hours to complete.
It reduced the original table from 2.3TB to 100GB, as shown below (note that it is still growing...).
So we are now proposing to our management a maintenance window to run VACUUM FULL in production.

Original:
 table_schema |     table_name      | total_size | data_size | external_size
--------------+---------------------+------------+-----------+---------------
 public       | flip_pagina_edicao  | 4578 GB    | 2880 GB   | 1699 GB

Copy of original (after VACUUM FULL):
 table_schema |     table_name      | total_size | data_size | external_size
--------------+---------------------+------------+-----------+---------------
 public       | flip_pagina_edicao  | 180 GB     | 100 GB    | 80 GB


In the meantime, could you please help us understand some curious facts below about VACUUM VERBOSE (not FULL)?
I am running another VACUUM VERBOSE (not FULL) on production because the first one got stuck on the same indexes for 3 days.
But it is now on a third sequential pass over all the indexes of that table, followed by a scan of the table itself.
It does not finish, even after completing these passes.

1) Why is it scanning again instead of finishing the vacuum?


Your maintenance_work_mem parameter value is probably too low.

2) I noticed that on every pass it spends 12 hours just on an index on a bpchar(1) column (character(1) DEFAULT 'N'::bpchar).
All the other indexes, and even the table itself, go much faster. Can you guess a reason for this?

CREATE INDEX ocrindex
  ON public.flip_pagina_edicao
  USING btree
  (ocr COLLATE pg_catalog."default");


Not used, so not in cache? Or much bigger than the other ones?
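One way to test the "much bigger" hypothesis is to compare per-index sizes directly (a sketch; the table name is taken from this thread):

```sql
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'flip_pagina_edicao'
ORDER BY pg_relation_size(indexrelid) DESC;
```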

3) It is repeatedly scanning 11,184,520 (or 521, or 528) row versions.
Why does it not finish instead of scanning the same number of row versions again?


Well, VACUUM is a multi-step operation. It first scans the table and collects every TID that needs to be deleted in memory. Once done, it will scan the indexes. But if there are more TIDs than memory can hold, it will have to do multiple passes over the table and the indexes. It seems clear in your case that memory is full once VACUUM has collected 11184520 TIDs. That memory is limited by a parameter named maintenance_work_mem. So what is its current value?

[snip]

Thank you.
Sidney Pryor.


--
Guillaume.

Guillaume answered most of your questions (thanks!).

Since this shrunk the size of your table so dramatically, the best thing going forward after fixing things would be to tune this table specifically until you find the point where vacuum is running often enough to clean up enough space for future updates/deletes to just use that space instead of allocating more. For reference again


There will likely always be a certain level of bloat, but that's not a bad thing as long as it doesn't continually keep growing over time. So I'd recommend running something like pg_bloat_check on this table maybe once a week, or even once a day, during off-peak hours to keep an eye on it. It may take a while to narrow down the sweet spot of how and when to get autovacuum to run and keep the free space in balance with future writes. And if you have particularly heavy periods of writes, it may require manually scheduling vacuums around those times as well.
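For those spot checks, pgstattuple (mentioned earlier in the thread) reports reusable space directly. A sketch, best run off-peak since it scans the whole table:

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT pg_size_pretty(free_space)      AS reusable_space,
       round(free_percent::numeric, 2) AS free_pct,
       dead_tuple_count
FROM pgstattuple('public.flip_pagina_edicao');
```

If free_pct stays roughly stable over time, vacuum is keeping up; if it climbs steadily, autovacuum needs to run more often.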

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:

Thank you very much Guillaume and Keith for your straight and quick answers.

I was looking at this database's size and tup_update history.
It never shrank, even with autovacuum enabled (it grew 100GB a month).
The number of updates was always this high.
But in the last 2 months its size started to grow a lot (50GB a day).

We will schedule a VACUUM FULL to stop adding disks.
But I am truly afraid Postgres is not suitable for this application.
We have been talking with the dev team for the last three weeks,
and so far there is no answer about reducing the number of updates.

It is true we have never executed a VACUUM FULL on this database,
but its size, even though growing, was never alarming.
And if Postgres was never able to keep up with the number of updates and reuse space efficiently,
I am afraid autovacuum tuning may not be sufficient after we finish the VACUUM FULL and downsize the database from 4TB to 200GB.






From: Keith Fiske <keith.fiske@crunchydata.com>
Sent: Tuesday, August 4, 2020 10:49
To: Guillaume Lelarge <guillaume@lelarge.info>
Cc: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>; Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 


[snip]

Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:


On Tue, Aug 4, 2020 at 11:46 AM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:

[snip]




You stated in your opening email that you just had the default autovacuum settings in place. The defaults are often nowhere near sufficient for most production use cases. But everyone's production use case is different, so common settings that work with the least issues are the default. If autovacuum is not running frequently enough, especially on larger tables with frequent updates, you will see exactly what happened here over time: autovacuum will not keep up with the write rate and take longer and longer to run over time.

If autovacuum, or manual vacuums, are scheduled often enough to keep up with marking old rows as reusable space, then any new writes will use that empty space vs allocating additional pages and constantly growing your table's size. This includes indexes as well. If you're adding NEW rows, of course it will continue to grow, but that is not autovacuum's problem. The trick is, for now, figuring out how often autovac will need to run to keep up with your write rate. You may need to adjust your cost settings as well to ensure autovac runs a little more aggressively, at least on this table. The blog I just shared goes over how to figure out how to ensure autovacuum is run at least once per day based on a consistent avg row change per day. If you have occasions where higher writes than normal are occurring, autovac may run more often in that case or you can even schedule a manual vacuum.
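As a rough sketch of that math, using the row counts quoted elsewhere in this thread as placeholder inputs (the defaults are autovacuum_vacuum_threshold = 50 and autovacuum_vacuum_scale_factor = 0.2):

```sql
-- Trigger point = threshold + scale_factor * live tuples.
-- With ~165M live rows and ~30M updated rows/day, the defaults mean
-- roughly one autovacuum per day; a smaller scale factor (e.g. 0.01)
-- would make it run many times a day instead.
SELECT 50 + 0.2 * 165531847                          AS dead_rows_to_trigger,
       round((50 + 0.2 * 165531847) / 30000000.0, 1) AS approx_days_between_runs;
```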

Thankfully you are on at least PG 9.6 as well, where vacuum has been made to run much more efficiently. Please just make sure you are on the latest minor release. Also, more recent major versions of PostgreSQL have improved upon this even more, so I would highly suggest starting to plan for a major version upgrade. PG9.6 is scheduled to be EOL next year as well, so you want to be looking into that not just for vacuum efficiency. 

If you get vacuum tuned properly, on a table that isn't as massively bloated as that one was, I think you will be fine.

 
--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you so much Keith for sharing your opinion again.

I will have a discussion this afternoon with my superior,
and I will propose VACUUM FULL as soon as possible, follow your blog's tuning guidance, and push for a PG version upgrade.



From: Keith Fiske <keith.fiske@crunchydata.com>
Sent: Tuesday, August 4, 2020 13:34
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: Guillaume Lelarge <guillaume@lelarge.info>; Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 


[snip]

RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
I read your blog, and something is not right besides PG tuning.

As I saw in the Zabbix monitoring history, tup_update was always high. Nowadays I count an average of 30 million updates a day in pg_stat_all_tables. Each day I record the values in Excel and compute the difference from the previous reading. Historically, the number of updates Zabbix registered was approximately 70% of the 30M I see these days. So even with the default settings, I calculated that the table always exceeded the "20% + 50" threshold that autovacuum waits for before running.

Besides this, I looked at pg_stat_tables, and the vacuum and autovacuum count for this table is 0. That cannot be right, because this very view registered a finished autovacuum on May 10th.
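For reference, the "20% + 50" rule is the default autovacuum trigger condition: a table is queued for vacuum once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. A small sketch of that arithmetic, using the n_tup_live figure from the stats below:

```python
def autovacuum_trigger(reltuples, threshold=50, scale_factor=0.2):
    """Dead tuples needed before autovacuum queues the table (PostgreSQL defaults)."""
    return threshold + scale_factor * reltuples

# With ~165.5 million live rows, the default settings only trigger a vacuum
# after roughly 33 million dead tuples have accumulated:
print(int(autovacuum_trigger(165_531_847)))  # -> 33106419
```

This is why large tables are usually tuned with a much smaller scale factor: at the default 20%, each autovacuum cycle on a table this size has an enormous amount of work to do.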

date n_tup_ins n_tup_upd n_tup_del n_tup_live n_tup_dead last_autovacuum last_autoanalyze
30/06/2020 13:30 57.308.133 6.744.431.635 1.614.325 165.531.847 2.458.943.354 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03
30/06/2020 19:00 57.386.558 6.748.230.176 1.618.162 165.606.435 2.462.729.202 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03
01/07/2020 13:30 57.736.340 6.783.682.703 1.636.765 165.937.614 2.483.584.486 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03
01/07/2020 19:30 57.837.003 6.837.618.849 1.650.215 166.024.827 2.489.316.701 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03
02/07/2020 13:30 58.143.439 6.865.274.910 1.660.968 166.320.510 2.515.717.862 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03
02/07/2020 19:30 58.226.810 6.882.586.543 1.663.168 166.401.681 2.527.673.618 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03
03/07/2020 13:30 58.544.013 6.908.650.471 1.687.055 166.694.933 2.551.229.982 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03
03/07/2020 19:30 58.640.756 6.931.648.963 1.693.146 166.785.585 2.564.810.596 2020-05-10 19:31:11.792893-03 2020-05-10 19:33:55.497008-03

And when this database was restarted on July 6th, the last_autovacuum value shown above was gone. I have no history of vacuum or autovacuum counts.

Nor does altering the fillfactor seem like a solution in my case; please tell me if I am wrong. I do not see anything that would improve (auto)vacuum execution performance itself (please consider that I have already completed a successful VACUUM FULL) besides the two changes below:
1) raise maintenance_work_mem, to remove the memory limitation;
2) lower autovacuum_vacuum_cost_delay and raise autovacuum_vacuum_cost_limit relative to the defaults, to remove the I/O throttling.
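A minimal sketch of those two changes at the instance level; the values here are illustrative assumptions for the sketch, not recommendations from this thread:

```sql
-- Illustrative values only; tune against your own memory and I/O capacity.
ALTER SYSTEM SET maintenance_work_mem = '1GB';           -- more memory per vacuum pass
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;    -- default -1 (falls back to vacuum_cost_limit = 200)
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';   -- default 20ms on 9.6; lower = less throttling
SELECT pg_reload_conf();
```

Note the direction of the delay change: raising autovacuum_vacuum_cost_delay makes vacuum sleep longer and fall further behind; it must be lowered to make vacuum faster.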



From: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Sent: Tuesday, August 4, 2020 13:50
To: Keith Fiske <keith.fiske@crunchydata.com>
Cc: Guillaume Lelarge <guillaume@lelarge.info>; Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: RE: How can i be certain autovacuum is causing reuse if table still grows
 
Thank you so much Keith for sharing your opinion again.

I will have a discussion this afternoon with my superior.
And I will propose a VACUUM FULL as soon as possible, follow your blog's tuning advice, and push for the PG version upgrade.



From: Keith Fiske <keith.fiske@crunchydata.com>
Sent: Tuesday, August 4, 2020 13:34
To: Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br>
Cc: Guillaume Lelarge <guillaume@lelarge.info>; Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: How can i be certain autovacuum is causing reuse if table still grows
 


On Tue, Aug 4, 2020 at 11:46 AM Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:

Thank you very much Guillaume and Keith for your straight and quick answers.

I was looking at this database's size and tup_update history. It never shrank, even with autovacuum enabled (it grew 100 GB a month). The number of updates was always this high. But in the last 2 months its size started to grow a lot (50 GB a day).

We will schedule a VACUUM FULL to stop adding disks, but I am truly afraid Postgres is not suitable for this application. We have been talking with the dev team for the last three weeks, and so far there is no answer about reducing the number of updates.

It is true we have never executed a VACUUM FULL on this database, but its size, even while growing, was not much to worry about. And if Postgres was never able to keep up with the number of updates and reuse space efficiently, I am afraid autovacuum tuning may not be sufficient to sustain reuse after we finish the VACUUM FULL and downsize the database from 4 TB to 200 GB.





Re: How can i be certain autovacuum is causing reuse if table still grows

From
Scott Ribe
Date:
> On Aug 4, 2020, at 12:31 PM, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
>
> Beside this, i saw on pg_stat_tables and the vacuum and autovacuum count for this table is 0.

Those values are not persistent; they are only since the last start of PG.




RE: How can i be certain autovacuum is causing reuse if table still grows

From
Sidney Aloisio Ferreira Pryor
Date:
Thank you Scott.
Sorry for that; I had not considered that it is a view.



Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:


On Tue, Aug 4, 2020 at 3:54 PM Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> On Aug 4, 2020, at 12:31 PM, Sidney Aloisio Ferreira Pryor <sidney@tjrj.jus.br> wrote:
>
> Beside this, i saw on pg_stat_tables and the vacuum and autovacuum count for this table is 0.

Those values are not persistent; they are only since the last start of PG.


This is not correct. Statistics are persistent through restarts. They are only reset during specific cases such as a crash or a major version upgrade. They can also be reset at any time manually.
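One way to check this on a live system is to query the standard statistics views (these columns exist in 9.6):

```sql
-- Per-table vacuum counters and timestamps (cumulative since the last stats reset):
SELECT relname, last_vacuum, last_autovacuum, vacuum_count, autovacuum_count
FROM pg_stat_user_tables
WHERE relname = 'flip_pagina_edicao';

-- When statistics for this database were last reset (NULL = never):
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();
```

If stats_reset is recent, the zeroed vacuum counters reflect a reset (e.g. crash recovery), not an absence of vacuum activity.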

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

Re: How can i be certain autovacuum is causing reuse if table still grows

From
MichaelDBA
Date:
Hi Keith,

I think statistics are also reset, and thus not persistent through restarts, when the pg_ctl option "-m immediate" is used to shut down PG, since it forces recovery mode when PG starts back up.  By the way, on Aurora PostgreSQL, statistics are reset automatically every time Aurora forces PG to restart or reboot, or the user triggers a reboot manually.  I reported this as a bug with Aurora a few months ago.  They said it was not a bug, but a feature that they might implement in the future.  YUK.

Regards,
Michael Vitale


Keith Fiske wrote on 8/4/2020 6:25 PM:



Re: How can i be certain autovacuum is causing reuse if table still grows

From
Keith Fiske
Date:


On Wed, Aug 5, 2020 at 9:29 AM MichaelDBA <MichaelDBA@sqlexec.com> wrote:



An "immediate" shutdown is treated as a crash by Postgres. So yes, it likely will reset the statistics.

 




--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com