Re: Question about behavior of snapshot too old feature - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Question about behavior of snapshot too old feature
Msg-id CAD21AoCrQDwUjYk5gSLhH-Xw_nVLDqe8daEgxJnoEC_vMdzarQ@mail.gmail.com
In response to Re: Question about behavior of snapshot too old feature  (Kevin Grittner <kgrittn@gmail.com>)
Responses Re: Question about behavior of snapshot too old feature  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers
On Fri, Oct 14, 2016 at 11:29 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
> On Fri, Oct 14, 2016 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, Oct 14, 2016 at 1:40 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>>> For example, I set old_snapshot_threshold = 1min and prepare a table
>>> and two terminals.
>>> And I did the following steps.
>>>
>>> 1. [Terminal 1] Begin transaction and get snapshot data and wait.
>>>      BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
>>>      SELECT * FROM test;
>>>
>>> 2. [Terminal 2] Another session updates test table in order to make
>>> snapshot dirty.
>>>      BEGIN;
>>>      UPDATE test SET c = c + 100;
>>>      COMMIT;
>>>
>>> 3. [Terminal 1] 1 minute after, read the test table again in same
>>> transaction opened at #1. I got no error.
>>>     SELECT * FROM test;
>>>
>>> 4. [Terminal 2] Another session reads the test table.
>>>      BEGIN;
>>>      SELECT * FROM test;
>>>      COMMIT;
>>>
>>> 5. [Terminal 1] 1 minute after, read the test table again, and got
>>> "snapshot too old" error.
>>>      SELECT * FROM test;
>>>
>>> Since #2 makes a snapshot I got at #1 dirty, I expected to get
>>> "snapshot too old" error at #3 where I read test table again after
>>> enough time. But I could never get "snapshot too old" error at #3.
>>>
>>
>> Here, the basic idea is that until the corresponding page has been
>> pruned or the table has been vacuumed, this error won't occur.
>> So, I think what is happening here is that step #3 or step #4
>> pruned the table, after which you started getting the error.
>
> The pruning might be one factor.  Another possible issue is that
> effectively it doesn't start timing that 1 minute until the clock
> hits the start of the next minute (i.e., 0 seconds after the next
> minute).  The old_snapshot_threshold does not attempt to guarantee
> that the snapshot too old error will happen at the earliest
> opportunity, but that the error will *not* happen until the
> snapshot is *at least* that old.  Keep in mind that the expected
> useful values for this parameter are from a small number of hours
> to a day or two, depending on the workload.  The emphasis was on
> minimizing overhead, even when it meant the cleanup might not be
> quite as "eager" as it could otherwise be.
>

Thanks, understood.
I tested with autovacuum = off, so the pruning must have happened at step #4.
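
As a rough illustration of the timing point quoted above (the clock
effectively starts at the next minute boundary, and the error cannot
happen before the snapshot is at least old_snapshot_threshold old), here
is a minimal Python sketch. It models only that description, not
PostgreSQL's actual implementation. The function name
earliest_possible_error is hypothetical, and treating a snapshot taken
exactly on a minute boundary as already aligned is an assumption.

```python
from datetime import datetime, timedelta


def earliest_possible_error(snapshot_taken: datetime,
                            threshold: timedelta) -> datetime:
    """Earliest time the error could occur under the simplified model.

    Assumption: timing starts at the next whole-minute boundary after the
    snapshot was taken (or at the snapshot time itself if it is already
    exactly on a boundary), and the threshold is counted from there.
    """
    # Truncate to the start of the current minute.
    floor = snapshot_taken.replace(second=0, microsecond=0)
    # Round up to the next minute boundary unless already aligned.
    start = floor if floor == snapshot_taken else floor + timedelta(minutes=1)
    return start + threshold


if __name__ == "__main__":
    taken = datetime(2016, 10, 14, 12, 0, 20)
    print(earliest_possible_error(taken, timedelta(minutes=1)))
```

Under this model, a snapshot taken at 12:00:20 with
old_snapshot_threshold = 1min could not fail before 12:02:00, so reading
the table again "1 minute after" may simply be too early. Even past that
point, the error is only a possibility: it still depends on the page
having been pruned or vacuumed, as discussed above.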

When I set old_snapshot_threshold = 0, I got the error at step #3, which
means that the error occurs even without table pruning.
We have a regression test for this feature, but it sets
old_snapshot_threshold = 0, so I doubt that it tests the feature properly.
Am I missing something?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


