Re: Allow "snapshot too old" error, to prevent bloat - Mailing list pgsql-hackers
From: Kevin Grittner
Subject: Re: Allow "snapshot too old" error, to prevent bloat
Date:
Msg-id: 1816752140.3926212.1424623723314.JavaMail.yahoo@mail.yahoo.com
In response to: Re: Allow "snapshot too old" error, to prevent bloat (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Allow "snapshot too old" error, to prevent bloat (Andrew Dunstan <andrew@dunslane.net>)
           Re: Allow "snapshot too old" error, to prevent bloat (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> wrote:

> It seems to me that SQL Server also uses a similar mechanism to
> avoid the bloat in the version store (the place to store previous
> versions of records).
>
> I think that if other leading databases provide a way to control
> the bloat, it indicates that most of the customers having
> write-intensive workloads would like to see such an option.

Yeah, I think many users are surprised by the damage that an idle connection or a long-running read-only query can do to a database, especially when they come from other databases which have protections against that.

The bad news is that changing to specifying the limit in time rather than in transaction IDs is far more complicated than I thought. Basically, we need to associate times in the past with the value of latestCompletedXid (or derived values, like the snapshot xmin) at those times. I was initially thinking that by adding a timestamp to the snapshot we could derive this from the active snapshots. For any one connection it's not that hard to scan the pairing heap of snapshots and pick out the right values; the problem is that this is process-local memory, and this needs to work while the connection is sitting idle for long periods of time.

I've got a couple of ideas on how to approach it, but neither seems entirely satisfying:

(1) Use a new timer ID to interrupt the process whenever its oldest snapshot which hasn't yet crossed the "old snapshot" threshold does so. The problem is that this seems as though it would need to do a lot more work (scanning the pairing heap for the best match) than anything else currently does in a timeout handler. One alternative would be to keep a second pairing heap tracking only the snapshots which have not yet crossed the threshold, removing items as they get old -- that would make the work needed in the timeout handler closer to that of other handlers.
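The second-heap variant of idea (1) could look roughly like the sketch below: an auxiliary structure ordered by snapshot-taken time, from which the timeout handler only ever pops the head instead of scanning everything. All names here are made up for illustration (PostgreSQL's real pairingheap.h and timeout APIs differ), and a sorted array stands in for the heap:

```c
/* Hypothetical sketch of idea (1): keep a second min-heap of snapshots,
 * ordered by the time each was taken, holding only snapshots that have
 * NOT yet crossed the "old snapshot" threshold.  The timeout handler can
 * then pop expired entries from the top in O(removed) heap operations
 * instead of scanning the whole pairing heap.  All identifiers here are
 * invented; this is not PostgreSQL's actual pairingheap or timeout API. */
#include <assert.h>
#include <stddef.h>

typedef long TimestampTz;        /* stand-in for PostgreSQL's TimestampTz */
typedef unsigned int TransactionId;

typedef struct Snap {
    TimestampTz taken_at;        /* when the snapshot was taken */
    TransactionId xmin;          /* oldest xid the snapshot can see */
} Snap;

#define MAX_SNAPS 64

/* Sorted-array stand-in for a min-heap keyed on taken_at. */
typedef struct SnapHeap {
    Snap items[MAX_SNAPS];
    int n;
} SnapHeap;

static void heap_push(SnapHeap *h, Snap s)
{
    int i = h->n++;
    /* insertion keeps the oldest snapshot at items[0] */
    while (i > 0 && h->items[i - 1].taken_at > s.taken_at) {
        h->items[i] = h->items[i - 1];
        i--;
    }
    h->items[i] = s;
}

/* What a timeout handler would do: remove every snapshot whose age has
 * crossed the threshold and report the newest xmin among them, i.e. how
 * far the "old snapshot" horizon may now advance.  Returns 0 when no
 * snapshot has crossed the threshold. */
static TransactionId heap_expire(SnapHeap *h, TimestampTz now,
                                 TimestampTz threshold)
{
    TransactionId horizon = 0;
    int removed = 0;

    while (removed < h->n && now - h->items[removed].taken_at >= threshold) {
        if (h->items[removed].xmin > horizon)
            horizon = h->items[removed].xmin;
        removed++;
    }
    /* slide the surviving (still-fresh) snapshots down */
    for (int i = removed; i < h->n; i++)
        h->items[i - removed] = h->items[i];
    h->n -= removed;
    return horizon;
}
```

The point of the second heap is exactly the property shown here: the handler touches only entries that have actually expired, so its cost stays comparable to other timeout handlers.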
(2) Use a coarse enough granularity on time, and a short enough maximum for the GUC, to just keep a circular buffer of the mappings in memory. We might be able to make this dense enough that one-minute resolution for up to 60 days could fit in 338kB. Obviously that could be reduced with coarser granularity or a shorter maximum.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
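Idea (2) can be sketched as below, with the size arithmetic made explicit: 60 days at one-minute resolution is 60 * 24 * 60 = 86,400 slots, and at 4 bytes per xid that is 345,600 bytes, i.e. 337.5 kB, matching the ~338kB figure above. The type and function names are invented for illustration, not PostgreSQL internals:

```c
/* Hypothetical sketch of idea (2): a fixed circular buffer mapping each
 * minute in a rolling window to the latestCompletedXid observed at that
 * minute.  With one-minute granularity and a 60-day maximum the buffer
 * holds 60 * 24 * 60 = 86,400 four-byte xids = 337.5 kB.  All names are
 * made up; this is not PostgreSQL's actual implementation. */
#include <assert.h>
#include <string.h>

typedef unsigned int TransactionId;

#define MINUTES_PER_DAY  (24 * 60)
#define WINDOW_DAYS      60
#define WINDOW_MINUTES   (WINDOW_DAYS * MINUTES_PER_DAY)   /* 86,400 slots */

typedef struct XidTimeMap {
    TransactionId xid_at[WINDOW_MINUTES]; /* slot = minute % WINDOW_MINUTES */
    long newest_minute;                   /* most recent minute recorded */
} XidTimeMap;

/* Record the latest completed xid seen as of the given minute. */
static void map_record(XidTimeMap *m, long minute, TransactionId xid)
{
    m->xid_at[minute % WINDOW_MINUTES] = xid;
    if (minute > m->newest_minute)
        m->newest_minute = minute;
}

/* Look up the xid associated with 'minute'.  Times older than the window
 * clamp to the oldest slot still retained; 0 means "not recorded". */
static TransactionId map_lookup(const XidTimeMap *m, long minute)
{
    long oldest = m->newest_minute - (WINDOW_MINUTES - 1);

    if (minute > m->newest_minute)
        return 0;
    if (minute < oldest)
        minute = oldest;
    return m->xid_at[minute % WINDOW_MINUTES];
}
```

The appeal of this approach over idea (1) is that lookup and update are both O(1) and require no per-snapshot bookkeeping; the cost is the fixed memory footprint, which shrinks linearly with coarser granularity or a shorter maximum, exactly as noted above.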