From: Robert Haas
Subject: Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Msg-id: CA+TgmoZ2uFekmgJHM__LYkhUKtq4-Arh5OOLpsy+YY-6-0W58g@mail.gmail.com
In response to: Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Kevin Grittner <kgrittn@gmail.com>)
List: pgsql-hackers
On Sat, Jun 11, 2016 at 11:29 AM, Kevin Grittner <kgrittn@gmail.com> wrote:
> I have reviewed the code and run tests to try to find something
> here which could be considered a bug, without finding any problem.
> When reading pages for the random sample for ANALYZE (or
> auto-analyze) there is no age check, so ANALYZE completes
> without error, keeping statistics up-to-date.
>
> There really is no difference in behavior except in the case that:
>
> (1)  old_snapshot_threshold >= 0 to enable the "snapshot too old"
>        feature (see the configuration sketch after this list), and
> (2)  there were tuples that were dead as the result of completed
>        transactions, and
> (3)  those tuples became older than the threshold, and
> (4)  those tuples were pruned or vacuumed away, and
> (5)  an ANALYZE process would have read those dead tuples had they
>        not been removed.
>
> In such a case the irrevocably dead, permanently removed tuples are
> not counted in the statistics.  I have trouble seeing a better
> outcome than that.  Among my tests, I specifically checked for an
> ANALYZE of a table having an index on an expression, using an old
> snapshot:
>
> -- connection 1
> drop table if exists t1;
> create table t1 (c1 int not null);
> drop table if exists t2;
> create table t2 (c1 int not null);
> insert into t1 select generate_series(1, 10000);
> drop function if exists mysq(i int);
> create function mysq(i int)
>   returns int
>   language plpgsql
>   immutable
> as $mysq$
> begin
>   return (i * i);
> end
> $mysq$;
> create index t1_c1sq on t1 ((mysq(c1)));
> begin transaction isolation level repeatable read;
> select 1;
>
> -- connection 2
> vacuum analyze verbose t1;
> delete from t1 where c1 between 1000 and 1999;
> delete from t1 where c1 = 8000;
> insert into t2 values (1);
> select pg_sleep_for('2min');
> vacuum verbose t1;  -- repeat if necessary to see the dead rows disappear
>
> -- connection 1
> analyze verbose t1;
>
> This runs to completion, as I would want and expect.
>
> I am closing this item on the "PostgreSQL 9.6 Open Items" page.  If
> anyone feels that I've missed something, please provide a test to
> show the problem, or a clear description of the problem and how you
> feel behavior should be different.

So what happens in this scenario:

1. ANALYZE runs really slowly - maybe the user-defined function it's
running for the expression index is extremely long-running.
2. Eventually, the snapshot for ANALYZE is older than the configured
value of old_snapshot_threshold.
3. Then, ANALYZE selects a page with an LSN new enough that it might
have been pruned.

Presumably, the ANALYZE ought to error out in this scenario, just as
it would in any other situation where an old snapshot sees a new page.
No?
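
For concreteness, here is roughly the shape of what I have in mind
(untested, and "slow_sq" is just a made-up name): an expression-index
function slow enough that the ANALYZE snapshot crosses the threshold
while the expression stats are being computed.

create function slow_sq(i int)
  returns int
  language plpgsql
  immutable
as $slow_sq$
begin
  perform pg_sleep(0.01);  -- deliberately slow; "immutable" is a white lie
  return i * i;
end
$slow_sq$;
create index t1_c1_slow on t1 ((slow_sq(c1)));
-- With roughly 10000 sampled rows this should take a minute or two,
-- which is past a one-minute old_snapshot_threshold, while connection 2
-- deletes rows and vacuums them away.  Does this then get "snapshot too
-- old" when it reads a pruned page?
analyze verbose t1;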

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

