Home > mailing lists

Re: Allow "snapshot too old" error, to prevent bloat - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Allow "snapshot too old" error, to prevent bloat
Date	February 23, 2015 16:31:47
Msg-id	CAA4eK1LDyy4L+=qHDjDf=XERCT+SfYNtyWzryCVtTE-947_fLg@mail.gmail.com Whole thread Raw
In response to	Re: Allow "snapshot too old" error, to prevent bloat (Kevin Grittner <kgrittn@ymail.com>)
Responses	Re: Allow "snapshot too old" error, to prevent bloat
List	pgsql-hackers

Tree view

On Sun, Feb 22, 2015 at 10:18 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > It seems to me that SQL Server also uses similar mechanism to
> > avoid the bloat in version store (place to store previous
> > versions or record).
>
> > I think if other leading databases provide a way to control the
> > bloat, it indicates that most of the customers having
> > write-intesive workload would like to see such an option.
>
> Yeah, I think many users are surprised by the damage that can be
> done to a database by an idle connection or long-running read-only
> query, especially when coming from other databases which have
> protections against that.
>
> The bad news is that changing to specifying the limit in time
> rather than transaction IDs is far more complicated than I thought.
> Basically, we need to associate times in the past with the value of
> latestCompletedXid (or derived values like snapshot xmin) at those
> times. I was initially thinking that by adding a timestamp to the
> snapshot we could derive this from active snapshots. For any one
> connection, it's not that hard for it to scan the pairingheap of
> snapshots and pick out the right values; but the problem is that it
> is process-local memory and this needs to work while the connection
> is sitting idle for long periods of time.

Could you please explain in slightly more detail why can't it work

if we use timestamp instead of snapshot->xmin in your patch in

function TestForOldSnapshot()?

> I've got a couple ideas
> on how to approach it, but neither seems entirely satisfying:
>
> (1) Use a new timer ID to interrupt the process whenever its
> oldest snapshot which hasn't yet crossed the "old snapshot"
> threshold does so. The problem is that this seems as though it
> would need to do an awful lot more work (scanning the pairing heap
> for the best match) than anything else is currently doing in a
> timeout handler. One alternative would be to keep a second pairing
> heap to track snapshots which have not crossed the threshold,
> removing items as they get old -- that would make the work needed
> in the timeout handler closer to other handlers.
>
> (2) Use a course enough granularity on time and a short enough
> maximum for the GUC to just keep a circular buffer of the mappings
> in memory. We might be able to make this dense enough that one
> minute resolution for up to 60 days could fit in 338kB. Obviously
> that could be reduced with courser granularity or a shorter
> maximum.
>

How exactly will this allow to return "snapshot old" error, do we

need a check for page (page lsn) as in your current patch?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: David Steele
Date: 23 February 2015, 16:29:38
Subject: Re: pgaudit - an auditing extension for PostgreSQL

From: Eric Grinstein
Date: 23 February 2015, 16:39:49
Subject: Re: Query Rewrite with Postgres' materialized views

Re: Allow "snapshot too old" error, to prevent bloat - Mailing list pgsql-hackers

Previous

Next