Avoiding bloat in the presence of a long-running transaction (Re: Freezing without write I/O) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Avoiding bloat in the presence of a long-running transaction (Re: Freezing without write I/O)
Date
Msg-id 51B23834.5020803@vmware.com
Whole thread Raw
In response to Re: Freezing without write I/O  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 07.06.2013 22:15, Robert Haas wrote:
> On Fri, Jun 7, 2013 at 3:10 PM, Simon Riggs<simon@2ndquadrant.com>  wrote:
>> The long running query problem hasn't ever been looked at, it seems,
>> until here and now.
>
> For what it's worth (and that may not be much), I think most people
> will die a horrible death due to bloat after holding a transaction
> open for a tiny fraction of 2B XIDs.  :-(

Yeah, we should fix that too ;-).

While we're at it: I've been thinking that we should try harder to 
vacuum dead tuples that are no longer visible to any snapshot, even if 
there's an even old snapshot. The stereotypical scenario is a table with 
a single row that's updated very very frequently. Like a counter. 
Autovacuum can normally keep it in check, but if you have a long-running 
transaction, it will bloat horrendously. But if you only have one 
long-running transaction with one really old snapshot, and everything 
else is recent, you'd really only need to keep one old tuple around for 
the old snapshot to see, and a recent version or two for the rest. At 
worst, the database needs to bloat to double the size, but not more than 
that.

To know which tuples are dead at such a fine-grained level, vacuum would 
need to know in more detail what snapshots the backends have. I'm really 
excited about Ants Aasma's proposal to use a CSN for snapshots, or more 
precisely the variant using commit record's LSN for that. If a snapshot 
is just a single integer, it becomes easier for backends to share their 
snapshots, in limited amount of shared memory.

- Heikki



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bad error message on valuntil
Next
From: Kevin Grittner
Date:
Subject: Re: Redesigning checkpoint_segments