Re: [GENERAL] Vacuum and state_change - Mailing list pgsql-general

From armand pirvu
Subject Re: [GENERAL] Vacuum and state_change
Date
Msg-id DAA7660F-3EB1-4E65-9710-08AB933AE1B5@gmail.com
Whole thread Raw
In response to Re: [GENERAL] Vacuum and state_change  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: [GENERAL] Vacuum and state_change  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general

On Jun 9, 2017, at 11:23 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 06/09/2017 09:13 AM, armand pirvu wrote:
On Jun 9, 2017, at 11:01 AM, Adrian Klaver <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

On 06/09/2017 08:45 AM, armand pirvu wrote:
Hi
Had a couple of processes blocking the vacuum so I terminated them using
select pg_terminate_backend(pid);
Running the following
select distinct pid, backend_start, query_start, state_change, state, query from pg_catalog.pg_stat_activity order by 1;
 pid  |         backend_start         |          query_start          |         state_change          | state  | query
-------+-------------------------------+-------------------------------+-------------------------------+--------+--------------------------------------------------------------------------------------------------------------------------
10677 | 2017-06-09 10:25:49.189848-05 | 2017-06-09 10:33:43.598805-05 | 2017-06-09 10:33:43.599091-05 | idle   | SELECT 1
11096 | 2017-06-09 10:27:03.686588-05 | 2017-06-09 10:33:56.28736-05  | 2017-06-09 10:33:56.287364-05 | active | select distinct pid, backend_start, query_start, state_change, state, query from pg_catalog.pg_stat_activity order by 1;
13277 | 2017-06-09 07:48:49.506686-05 | 2017-06-09 07:48:52.887185-05 | 2017-06-09 07:48:52.887188-05 | active | autovacuum: VACUUM csischema.tf_purchased_badge
13484 | 2017-06-09 10:31:54.127672-05 | 2017-06-09 10:33:47.137938-05 | 2017-06-09 10:33:47.138226-05 | idle   | SELECT 1
16886 | 2017-06-09 07:56:49.033893-05 | 2017-06-09 07:56:49.078369-05 | 2017-06-09 07:56:49.078371-05 | active | autovacuum: VACUUM csischema.tf_purchases_person
25387 | 2017-06-09 05:32:08.079397-05 | 2017-06-09 05:32:08.385728-05 | 2017-06-09 05:32:08.385731-05 | active | autovacuum: VACUUM csischema.tf_demographic_response_person
37465 | 2017-06-09 08:50:58.992002-05 | 2017-06-09 08:51:21.506829-05 | 2017-06-09 08:51:21.506831-05 | active | autovacuum: VACUUM csischema.tf_transaction_item_person
I did notice that state_change did not change one bit

Did the state change?

No and that was what got me worried

Are these large tables?



I would say yes

select count(*) from csischema.tf_purchased_badge;
 9380749

select count(*) from csischema.tf_purchases_person;
 19902172

select count(*) from csischema.tf_demographic_response_person;
 80868561

select count(*) from csischema.tf_transaction_item_person;
 3281084

Interesting enough two completed

          relname           | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum |        last_autovacuum        | autovacuum_count 
----------------------------+----------+--------------+----------+---------------+-----------+-----------+------------+------------+---------------------+-------------+-------------------------------+------------------
 tf_transaction_item_person |      160 |            0 |   476810 |       1946119 |      2526 |    473678 |    3226110 |          0 |              116097 |             | 2017-06-09 11:15:24.701997-05 |                2
 tf_purchased_badge         |      358 |   1551142438 |  2108331 |       7020502 |      5498 |   1243746 |    9747336 |     107560 |              115888 |             | 2017-06-09 15:09:16.624363-05 |                1


I did notice though that checkpoints seem a bit too often aka below 5 min from start to end


2017-06-09 14:18:38.552 CDT,,,888,,593a1810.378,271,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:21:12.210 CDT,,,888,,593a1810.378,272,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint complete: wrote 12070 buffers (9.2%); 0 transaction log file(s) added, 0 removed, 4 recycled; write=148.714 s, sync=4.834 s, total=153.657 s; sync files=17, longest=1.841 s, average=0.284 s; distance=89452 kB, estimate=89452 kB",,,,,,,,,""

2017-06-09 14:23:38.278 CDT,,,888,,593a1810.378,273,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:24:38.629 CDT,,,888,,593a1810.378,274,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint complete: wrote 593 buffers (0.5%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=59.825 s, sync=0.474 s, total=60.350 s; sync files=8, longest=0.355 s, average=0.059 s; distance=26952 kB, estimate=83202 kB",,,,,,,,,""

And also 

SELECT
total_checkpoints,
seconds_since_start / total_checkpoints / 60 AS minutes_between_checkpoints
FROM
(SELECT
EXTRACT(EPOCH FROM (now() - pg_postmaster_start_time())) AS seconds_since_start,
(checkpoints_timed+checkpoints_req) AS total_checkpoints
FROM pg_stat_bgwriter
) AS sub;


 total_checkpoints | minutes_between_checkpoints 
-------------------+-----------------------------
               285 |            3.33731205871345



These tables suffer quite some data changes IIRC but that comes via some temp tables which reside in a temp  schema and some previous messages from the log suggest that it might have ran into  ladder locking in early stages, aka tmp table locked from vacuum  and any further processing waiting for it and causing some other waits on those largish tables

Considering the temp ones are only for load and yes some processing goes in there , I am thinking disabling auto vacuum for the temp tables . Or should I disable auto vacuum all together and run say as a bath job on a weekend night ?


Aside that there are vacuum improvements and such, any other strong compelling reason to upgrade to 9.6 ?



Does that mean that something is not quite right with the vacuums ?

Might want to take a look at:

https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW


Thank you
Armand


--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>


--
Adrian Klaver
adrian.klaver@aklaver.com

pgsql-general by date:

Previous
From: Thomas Kellerer
Date:
Subject: Re: [GENERAL] pg_catalog tables don't have constraints?
Next
From: Adrian Klaver
Date:
Subject: Re: [GENERAL] Vacuum and state_change