Re: Allow "snapshot too old" error, to prevent bloat - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Allow "snapshot too old" error, to prevent bloat
Date
Msg-id 1816752140.3926212.1424623723314.JavaMail.yahoo@mail.yahoo.com
In response to Re: Allow "snapshot too old" error, to prevent bloat  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Allow "snapshot too old" error, to prevent bloat  (Andrew Dunstan <andrew@dunslane.net>)
Re: Allow "snapshot too old" error, to prevent bloat  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> wrote:

> It seems to me that SQL Server also uses a similar mechanism to
> avoid the bloat in its version store (the place to store previous
> versions of records).

> I think if other leading databases provide a way to control the
> bloat, it indicates that most of the customers having
> write-intensive workloads would like to see such an option.

Yeah, I think many users are surprised by the damage that can be
done to a database by an idle connection or long-running read-only
query, especially when coming from other databases which have
protections against that.

The bad news is that changing to specifying the limit in time
rather than transaction IDs is far more complicated than I thought.
Basically, we need to associate times in the past with the value of
latestCompletedXid (or derived values like snapshot xmin) at those
times.  I was initially thinking that by adding a timestamp to the
snapshot we could derive this from active snapshots.  For any one
connection, it's not that hard for it to scan the pairingheap of
snapshots and pick out the right values; but the problem is that it
is process-local memory and this needs to work while the connection
is sitting idle for long periods of time.  I've got a couple ideas
on how to approach it, but neither seems entirely satisfying:

(1) Use a new timer ID to interrupt the process whenever its
oldest snapshot which hasn't yet crossed the "old snapshot"
threshold does so.  The problem is that this seems as though it
would need to do an awful lot more work (scanning the pairing heap
for the best match) than anything else is currently doing in a
timeout handler.  One alternative would be to keep a second pairing
heap to track snapshots which have not crossed the threshold,
removing items as they get old -- that would make the work needed
in the timeout handler closer to other handlers.
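To make (1) concrete, the second heap could be ordered by the time each
snapshot was taken, holding only snapshots which have not yet crossed the
threshold; the timeout handler then pops entries from the top until the
top is young enough, so its work is proportional to the snapshots that
just crossed.  A minimal sketch (all names hypothetical, using a plain
binary min-heap rather than PostgreSQL's pairingheap):

```c
#include <stdint.h>

#define MAX_SNAPSHOTS 64

typedef struct
{
    int64_t  whenTaken;  /* timestamp at which the snapshot was taken */
    uint32_t xmin;       /* snapshot xmin to expose once it gets old */
} SnapshotEntry;

typedef struct
{
    SnapshotEntry entries[MAX_SNAPSHOTS];
    int           size;
} SnapshotHeap;

/* Insert a snapshot, keeping the min-heap ordered by whenTaken. */
static void
heap_push(SnapshotHeap *h, SnapshotEntry e)
{
    int i = h->size++;

    while (i > 0 && h->entries[(i - 1) / 2].whenTaken > e.whenTaken)
    {
        h->entries[i] = h->entries[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    h->entries[i] = e;
}

/* Remove and return the oldest snapshot. */
static SnapshotEntry
heap_pop(SnapshotHeap *h)
{
    SnapshotEntry top = h->entries[0];
    SnapshotEntry last = h->entries[--h->size];
    int i = 0;

    for (;;)
    {
        int child = 2 * i + 1;

        if (child >= h->size)
            break;
        if (child + 1 < h->size &&
            h->entries[child + 1].whenTaken < h->entries[child].whenTaken)
            child++;
        if (last.whenTaken <= h->entries[child].whenTaken)
            break;
        h->entries[i] = h->entries[child];
        i = child;
    }
    h->entries[i] = last;
    return top;
}

/*
 * Timeout handler body: pop every snapshot that has crossed the
 * threshold, advancing the "old snapshot" xmin; the oldest remaining
 * entry would set the next timer.  (The xid comparison here is a
 * simplistic stand-in for real wraparound-aware comparison.)
 */
static uint32_t
advance_old_snapshot(SnapshotHeap *h, int64_t now, int64_t threshold,
                     uint32_t oldXmin)
{
    while (h->size > 0 && h->entries[0].whenTaken <= now - threshold)
    {
        SnapshotEntry e = heap_pop(h);

        if (e.xmin > oldXmin)
            oldXmin = e.xmin;
    }
    return oldXmin;
}
```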

(2) Use a coarse enough granularity on time and a short enough
maximum for the GUC to just keep a circular buffer of the mappings
in memory.  We might be able to make this dense enough that one
minute resolution for up to 60 days could fit in 338kB.  Obviously
that could be reduced with coarser granularity or a shorter
maximum.
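For scale: one-minute resolution over 60 days is 60 * 1440 = 86,400
slots, and at four bytes per xid that is ~337.5kB, which matches the
figure above.  A minimal sketch of such a circular buffer (types and
names hypothetical, not actual PostgreSQL code; gaps between samples
are filled with the older xid so lookups stay conservative):

```c
#include <stdint.h>

#define MAPPING_SLOTS (60 * 24 * 60)    /* one-minute slots, 60-day max */

typedef uint32_t TransactionId;

typedef struct
{
    int64_t       head_minute;  /* minute number of the newest entry */
    int           head_index;   /* slot holding the newest entry */
    int           count;        /* valid entries, <= MAPPING_SLOTS */
    TransactionId xid_by_minute[MAPPING_SLOTS];  /* ~337.5kB */
} OldSnapshotMapping;

/* Record latestCompletedXid (or derived xmin) for the given minute. */
static void
mapping_advance(OldSnapshotMapping *m, int64_t minute, TransactionId xid)
{
    if (m->count == 0)
    {
        m->head_minute = minute;
        m->head_index = 0;
        m->count = 1;
        m->xid_by_minute[0] = xid;
        return;
    }
    while (m->head_minute < minute)
    {
        /* carry the previous value into skipped minutes */
        TransactionId prev = m->xid_by_minute[m->head_index];

        m->head_minute++;
        m->head_index = (m->head_index + 1) % MAPPING_SLOTS;
        m->xid_by_minute[m->head_index] = prev;
        if (m->count < MAPPING_SLOTS)
            m->count++;
    }
    m->xid_by_minute[m->head_index] = xid;
}

/*
 * Look up the xid recorded at the given minute; returns 0
 * (InvalidTransactionId) if the time predates what is retained.
 */
static TransactionId
mapping_lookup(const OldSnapshotMapping *m, int64_t minute)
{
    int64_t back = m->head_minute - minute;
    int     idx;

    if (back < 0)
        back = 0;               /* future time: clamp to newest */
    if (back >= m->count)
        return 0;               /* older than anything retained */
    idx = (int) (((int64_t) m->head_index - back + MAPPING_SLOTS)
                 % MAPPING_SLOTS);
    return m->xid_by_minute[idx];
}
```

A connection checking an old snapshot would then round its snapshot's
timestamp down to a minute and call mapping_lookup() to find the xid
limit in effect at that time.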

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


