Re: snapshot too old, configured by time - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: snapshot too old, configured by time
Date
Msg-id CAMkU=1xOYSuYSwB2QBq94Apx2GJL7LhEjCK0qVJg6MDQcOmsPw@mail.gmail.com
Whole thread Raw
In response to Re: snapshot too old, configured by time  (Kevin Grittner <kgrittn@gmail.com>)
Responses Re: snapshot too old, configured by time  (Peter Geoghegan <pg@heroku.com>)
Re: snapshot too old, configured by time  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers
On Wed, Mar 30, 2016 at 12:34 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
> On Sat, Mar 19, 2016 at 1:27 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> I'm not sure if this is operating as expected.
>>
>> I set the value to 1min.
>>
>> I set up a test like this:
>>
>> pgbench -i
>>
>> pgbench -c4 -j4 -T 3600 &
>>
>> ### watch the size of branches table
>> while (true) ; do psql -c "\dt+" | fgrep _branches; sleep 10; done &
>>
>> ### set up a long lived snapshot.
>> psql -c 'begin; set transaction isolation level repeatable read;
>> select sum(bbalance) from pgbench_branches; select pg_sleep(300);
>> select sum(bbalance) from pgbench_branches;'
>>
>> As this runs, I can see the size of the pgbench_branches bloating once
>> the snapshot is taken, and continues bloating at a linear rate for the
>> full 300 seconds.
>>
>> Once the 300 second pg_sleep is up, the long-lived snapshot holder
>> receives an error once it tries to access the table again, and then
>> the bloat stops increasing.  But shouldn't the bloat have stopped
>> increasing as soon as the snapshot became doomed, which would be after
>> a minute or so?
>
> This is actually operating as intended, not a bug.  Try running a
> manual VACUUM command about two minutes after the snapshot is taken
> and you should get a handle on what's going on.  The old tuples
> become eligible for vacuuming after one minute, but that doesn't
> necessarily mean that autovacuum jumps in and that the space starts
> getting reused.

I can verify that a manual vacuum does stop the bloat from continuing
to increase.  But I don't see why autovacuum is not already stopping
the bloat.  It is running often enough that it really ought to do so
(as verified by setting log_autovacuum_min_duration = 0 and looking in
the log files to see that it is vacuuming the table once per nap-time,
although it is not accomplishing much by doing so as no tuples can be
removed.)

Also, HOT-cleanup should stop the bloat increase once the snapshot
crosses the old_snapshot_threshold without even needing to wait until
the next autovac runs.

Does the code intentionally only work for manual vacuums?  If so, that
seems quite surprising.  Or perhaps I am missing something else here.

Thanks,

Jeff



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Using quicksort for every external sort run
Next
From: Tomas Vondra
Date:
Subject: Re: PATCH: use foreign keys to improve join estimates v1