Re: autoanalyze criteria - Mailing list pgsql-general

From Stefan Andreatta
Subject Re: autoanalyze criteria
Date
Msg-id 512B45E7.1000007@synedra.com
Whole thread Raw
In response to Re: autoanalyze criteria  (Alban Hertroys <haramrae@gmail.com>)
List pgsql-general
On 02/25/2013 09:00 AM, Alban Hertroys wrote:
> On Feb 25, 2013, at 7:23, Stefan Andreatta <s.andreatta@synedra.com
> <mailto:s.andreatta@synedra.com>> wrote:
>
>> On 02/24/2013 12:52 PM, Alban Hertroys wrote:
>>> On Feb 23, 2013, at 14:11, Stefan Andreatta <s.andreatta@synedra.com
>>> <mailto:s.andreatta@synedra.com>> wrote:
>>>
>>>> And we are still missing a number for rows updated since the last
>>>> analyse.
>>>
>>> In MVCC an update is an insert + delete, so you already got those
>>> numbers.
>>>
>> Good point. But because they are an update and a delete, they cancel
>> each other out and do not show up in pg_stat_user_tables.n_live_tup -
>> and that's the only value for which we have a reference value from
>> the time of the last analyze (pg_class.reltuples).
>
> I'm pretty sure that an update results in 1 live + 1 dead tuple, so
> they don't cancel each other out - they end up adding to different
> statistics. Assuming those statistics are both since last vacuum,
> added together they are the total number of changed records since last
> vacuum.
> What gain do you expect from a number of updated tuples?
>
> And it seems to me those numbers are since last vacuum, not since last
> analyse - analyse doesn't change the amount of dead tuples (it just
> updates them to closer match reality), but vacuum does.
>
> Disclaimer: I'm not intimately familiar with the planner statistics,
> but knowing what vacuum and analyse do in an MVCC database, like I
> described above it makes sense to me. I might be wrong though.
1 update = 1 insert + 1 delete cancel each other out with respect to
pg_stat_user_tables.n_live_tup. Naturally, they dont't cancel each other
out with pg_stat_user_tables.n_tup_ins or n_tup_del - they don't even
show up in those values, presumably because that's what n_tup_upd is
there for. However the update adds to n_dead_tup.

VACUUM does not reset *any* of the statistics values that can be
accessed via pg_stat_user_tables, apart from n_dead_tup (hopefully ;-)
Anyway, to estimate the autoanalyze trigger, I would need statistics
that get reset by autoanalyze not autovacuum.


I wrote a test script to show the behaviour. Be sure to wait a second
each time before accessing pg_stat_user_tables as there is a delay in
getting those data:

CREATE TABLE test_stat (id BIGINT, some_number BIGINT);
INSERT INTO test_stat (SELECT generate_series(1,10000) AS i, random() AS r);
SELECT count(*) FROM test_stat;
ANALYZE test_stat;

-- wait here (0.5 s) for statistics collector to catch up
SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del,
n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 'test_stat';

INSERT INTO test_stat (SELECT generate_series(10001,10900) AS i,
random() AS r);

-- wait here (0.5 s) for statistics collector to catch up
SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del,
n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 'test_stat';

DELETE FROM test_stat WHERE id > 10000;

-- wait here (0.5 s) for statistics collector to catch up
SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del,
n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 'test_stat';

UPDATE test_stat set some_number = 1 where id > 9100;

-- wait here (0.5 s) for statistics collector to catch up
SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del,
n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 'test_stat';

ANALYZE test_stat;

SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del,
n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 'test_stat';

VACUUM test_stat;

-- wait here (0.5 s) for statistics collector to catch up
SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del,
n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 'test_stat';


DROP TABLE test_stat;



Output from a postgres 9.2 database:
--------------------------------------------------

test=# CREATE TABLE test_stat (id BIGINT, some_number BIGINT);
CREATE TABLE
test=# INSERT INTO test_stat (SELECT generate_series(1,10000) AS i,
random() AS r);
INSERT 0 10000
test=# SELECT count(*) FROM test_stat;
  count
-------
  10000
(1 row)

test=# ANALYZE test_stat;
ANALYZE
test=# SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
   relname  | reltuples
-----------+-----------
  test_stat |     10000
(1 row)

test=# SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd,
n_tup_del, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname =
'test_stat';
   relname  | n_live_tup | n_dead_tup | n_tup_ins | n_tup_upd |
n_tup_del | n_tup_hot_upd
-----------+------------+------------+-----------+-----------+-----------+---------------
  test_stat |      10000 |          0 |     10000 |         0 |
0 |             0
(1 row)

test=# INSERT INTO test_stat (SELECT generate_series(10001,10900) AS i,
random() AS r);
INSERT 0 900
test=# SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
   relname  | reltuples
-----------+-----------
  test_stat |     10000
(1 row)

test=# SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd,
n_tup_del, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname =
'test_stat';
   relname  | n_live_tup | n_dead_tup | n_tup_ins | n_tup_upd |
n_tup_del | n_tup_hot_upd
-----------+------------+------------+-----------+-----------+-----------+---------------
  test_stat |      10900 |          0 |     10900 |         0 |
0 |             0
(1 row)

test=# DELETE FROM test_stat WHERE id > 10000;
DELETE 900
test=# SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
   relname  | reltuples
-----------+-----------
  test_stat |     10000
(1 row)

test=# SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd,
n_tup_del, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname =
'test_stat';
   relname  | n_live_tup | n_dead_tup | n_tup_ins | n_tup_upd |
n_tup_del | n_tup_hot_upd
-----------+------------+------------+-----------+-----------+-----------+---------------
  test_stat |      10000 |        900 |     10900 |         0 | 900
|             0
(1 row)

test=# UPDATE test_stat set some_number = 1 where id > 9100;
UPDATE 900
test=# -- wait here (0.5 s) for statistics collector to catch up
test=# SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
   relname  | reltuples
-----------+-----------
  test_stat |     10000
(1 row)

test=# SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd,
n_tup_del, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname =
'test_stat';
   relname  | n_live_tup | n_dead_tup | n_tup_ins | n_tup_upd |
n_tup_del | n_tup_hot_upd
-----------+------------+------------+-----------+-----------+-----------+---------------
  test_stat |      10000 |       1800 |     10900 |       900 | 900
|            10
(1 row)

test=# ANALYZE test_stat;
ANALYZE
test=#
test=# SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
   relname  | reltuples
-----------+-----------
  test_stat |     10000
(1 row)

test=# SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd,
n_tup_del, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname =
'test_stat';
   relname  | n_live_tup | n_dead_tup | n_tup_ins | n_tup_upd |
n_tup_del | n_tup_hot_upd
-----------+------------+------------+-----------+-----------+-----------+---------------
  test_stat |      10000 |       1800 |     10900 |       900 | 900
|            10
(1 row)

test=# VACUUM test_stat;
VACUUM
test=# SELECT relname, reltuples FROM pg_class WHERE relname = 'test_stat';
   relname  | reltuples
-----------+-----------
  test_stat |     10000
(1 row)

test=# SELECT relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_upd,
n_tup_del, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname =
'test_stat';
   relname  | n_live_tup | n_dead_tup | n_tup_ins | n_tup_upd |
n_tup_del | n_tup_hot_upd
-----------+------------+------------+-----------+-----------+-----------+---------------
  test_stat |      10000 |          0 |     10900 |       900 | 900
|            10
(1 row)

test=# DROP TABLE test_stat;
DROP TABLE


Regards,
Stefan

pgsql-general by date:

Previous
From: Alban Hertroys
Date:
Subject: Re: autoanalyze criteria
Next
From: Ali Pouya
Date:
Subject: Partitionning by trigger