Thread: [HACKERS] Transaction held open by autoanalyze can be a bottleneck

[HACKERS] Transaction held open by autoanalyze can be a bottleneck

From
Jeff Janes
Date:

Autovacuum's analyze starts a transaction when it starts on a table, and holds it for the duration. This holds back the xmin horizon.

On a TPC-B-like benchmark, this can be a problem.  While it is autoanalyzing pgbench_accounts and pgbench_history, dead-but-for-analyze tuples accumulate rapidly in pgbench_tellers and pgbench_branches.  Now the UPDATES to those tables have to walk the unprunable HOT chains to find their tuples to update, greatly slowing them down.

The analyze actually takes quite a while, because it is frequently setting hint bits and so dirtying pages and so sleeping for autovacuum_vacuum_cost_delay.  

If I set autovacuum_vacuum_cost_delay=0, then tps averaged over an hour goes from 12,307.6 to 24,955.2.  I can get a similar gain just by changing the relopts for those two tables to autovacuum_analyze_threshold = 2000000000.  I don't think these are particularly attractive solutions, but they do demonstrate the nature of the problem.

Does analyze need all of its work done under the same transaction?  Is there an elegant way to make it periodically discard the transaction and get a new one, so that the xmin horizon can advance? I think doing so every time vacuum_delay_point decides to sleep would be a good time to do that, but that would expand its contract quite a bit. And it is probably possible to have analyze take a long time without ever deciding to sleep, so doing it there would not be a fully general solution.

Cheers,

Jeff

Re: [HACKERS] Transaction held open by autoanalyze can be abottleneck

From
Andres Freund
Date:
Hi,

On 2017-05-10 13:09:38 -0700, Jeff Janes wrote:
> Autovacuum's analyze starts a transaction when it starts on a table, and
> holds it for the duration. This holds back the xmin horizon.

Yea, I also complained about this:
http://archives.postgresql.org/message-id/20151031145303.GC6064%40alap3.anarazel.de


> Does analyze need all of its work done under the same transaction?

It's imo, as pointed out in the above email, not trivial to change it,
but it's imo doable.

Greetings,

Andres Freund