Re: On-the-fly index tuple deletion vs. hot_standby - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: On-the-fly index tuple deletion vs. hot_standby
Msg-id BANLkTi=hZgSSttATKsWC=J8bySdbnT_f_g@mail.gmail.com
In response to Re: On-the-fly index tuple deletion vs. hot_standby  (Noah Misch <noah@leadboat.com>)
Responses Re: On-the-fly index tuple deletion vs. hot_standby
List pgsql-hackers
On Thu, Jun 9, 2011 at 10:38 PM, Noah Misch <noah@leadboat.com> wrote:
> On Fri, Apr 22, 2011 at 11:10:34AM -0400, Noah Misch wrote:
>> On Tue, Mar 15, 2011 at 10:22:59PM -0400, Noah Misch wrote:
>> > On Mon, Mar 14, 2011 at 01:56:22PM +0200, Heikki Linnakangas wrote:
>> > > On 12.03.2011 12:40, Noah Misch wrote:
>> > >> The installation that inspired my original report recently upgraded from 9.0.1
>> > >> to 9.0.3, and your fix did significantly decrease its conflict frequency.  The
>> > >> last several conflicts I have captured involve XLOG_BTREE_REUSE_PAGE records.
>> > >> (FWIW, the index has generally been pg_attribute_relid_attnam_index.)  I've
>> > >> attached a test script demonstrating the behavior.  _bt_page_recyclable approves
>> > >> any page deleted no more recently than RecentXmin, because we need only ensure
>> > >> that every ongoing scan has witnessed the page as dead.  For the hot standby
>> > >> case, we need to account for possibly-ongoing standby transactions.  Using
>> > >> RecentGlobalXmin covers that, albeit with some pessimism: we really only need
>> > >> LEAST(RecentXmin, PGPROC->xmin of walsender_1, .., PGPROC->xmin of walsender_N)
>> > >> - vacuum_defer_cleanup_age.  Not sure the accounting to achieve that would pay
>> > >> off, though.  Thoughts?
>> > >
>> > > Hmm, instead of bloating the master, I wonder if we could detect more
>> > > accurately if there are any on-going scans, in the standby. For example,
>> > > you must hold a lock on the index to scan it, so only transactions
>> > > holding the lock need to be checked for conflict.
>> >
>> > That would be nice.  Do you have an outline of an implementation in mind?
>>
>> In an attempt to resuscitate this thread, here's my own shot at that.  Apologies
>> in advance if it's just an already-burning straw man.
> [full proposal at http://archives.postgresql.org/message-id/20110422151034.GA8150@tornado.gateway.2wire.net]
>
> Anyone care to comment?  On this system, which has vacuum_defer_cleanup_age set
> to 3 peak hours worth of xid consumption, the problem caps recovery conflict
> hold off at 10-20 minutes.  It will have the same effect on standby feedback in
> 9.1.  I think we should start by using RecentGlobalXmin instead of RecentXmin as
> the reuse horizon when wal_level = hot_standby, and backpatch that.  Then,
> independently consider for master a bloat-avoidance improvement like I outlined
> most recently; I'm not sure whether that's worth it.  In any event, I'm hoping
> to get some consensus on the way forward.

I like your ideas.
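
For concreteness, here is a toy Python model of the horizon arithmetic in
the quoted message. It is only a sketch under stated assumptions: xids are
plain integers with no wraparound handling, the function names are made up
for illustration and are not PostgreSQL symbols, and the "no more recently
than" test is written as <= to follow the wording above rather than any
actual source code.

```python
def reuse_horizon(recent_xmin, walsender_xmins, defer_cleanup_age=0):
    """Hypothetical helper: LEAST(RecentXmin, walsender xmins) minus
    vacuum_defer_cleanup_age, as described in the quoted message.
    Xids are modelled as plain integers (no wraparound)."""
    return min([recent_xmin, *walsender_xmins]) - defer_cleanup_age


def page_recyclable(page_deleted_xid, horizon):
    """Toy version of the recyclability test: the page may be reused
    only if it was deleted no more recently than the horizon."""
    return page_deleted_xid <= horizon


# Illustrative numbers only, not taken from any real system.
recent_xmin = 1000
walsender_xmins = [940, 990]
defer_cleanup_age = 50
page_deleted_xid = 950

# Horizon based on RecentXmin alone (ignores standby transactions).
master_only = page_recyclable(page_deleted_xid, recent_xmin)

# Narrower horizon that also accounts for the standbys.
standby_aware = page_recyclable(
    page_deleted_xid,
    reuse_horizon(recent_xmin, walsender_xmins, defer_cleanup_age),
)

print("recyclable with RecentXmin only:", master_only)
print("recyclable with standby-aware horizon:", standby_aware)
```

With these numbers the page is reusable under the first rule but held back
under the second, which is the bloat-versus-conflict trade-off in question.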

(Also, I note that using xids in this way unnecessarily keeps bloat
around for a long time, if we have periods of mostly read-only
activity. Interesting point.)

I think we would only get away with this approach on leaf pages of the
index. It doesn't seem worth trying for the locks on pages higher up
the tree.
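
To illustrate the lock-based check suggested upthread, a rough Python
sketch follows. Everything in it is an assumption for illustration: the
StandbyBackend record, the conflicting_backends function, and the rule that
a backend conflicts when its xmin is not newer than the record's removed
xid are a simplified model, not the actual recovery conflict code.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyBackend:
    """Hypothetical view of a standby backend for this sketch."""
    pid: int
    xmin: int                      # oldest xid its snapshot still needs
    locked_indexes: set = field(default_factory=set)


def conflicting_backends(backends, index_id, removed_xid, lock_aware):
    """Return pids that would have to be cancelled before replaying a
    page-reuse record for index_id.

    lock_aware=False models a check on snapshots alone;
    lock_aware=True additionally requires a lock on the index.
    """
    pids = []
    for b in backends:
        if b.xmin > removed_xid:
            continue               # snapshot is too new to be affected
        if lock_aware and index_id not in b.locked_indexes:
            continue               # cannot be scanning this index
        pids.append(b.pid)
    return pids


# Arbitrary illustrative values; 16384 is just a placeholder identifier.
backends = [
    StandbyBackend(pid=1, xmin=900, locked_indexes={16384}),
    StandbyBackend(pid=2, xmin=900),
    StandbyBackend(pid=3, xmin=1200, locked_indexes={16384}),
]

print(conflicting_backends(backends, 16384, 950, lock_aware=False))
print(conflicting_backends(backends, 16384, 950, lock_aware=True))
```

In this model the lock-aware variant spares the old-snapshot backend that
never touched the index.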

On the standby side, it's possible this could generate additional
buffer pin deadlocks and/or contention. So I would also want to look
at some deferral mechanism, so that we can mark the block removed, but
not actually do so until some time later, or we really need to, for
example when we write new data to that page.
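
One way to picture the deferral, as a minimal Python sketch: replaying the
reuse record only notes the block as pending, and the conflict is resolved
later, when new data is first written to that block. The DeferredReuse
class and its method names are hypothetical, chosen for this example, and
do not correspond to existing recovery code.

```python
class DeferredReuse:
    """Toy model of deferring page-reuse conflicts on a standby."""

    def __init__(self):
        self.pending = {}    # block number -> removed xid awaiting resolution
        self.resolved = []   # (block, xid) pairs, in the order handled

    def replay_reuse_record(self, block, removed_xid):
        # Mark the block as removed, but do not resolve conflicts yet.
        self.pending[block] = removed_xid

    def replay_page_write(self, block):
        # New data is about to land on the block, so any deferred
        # conflict has to be dealt with now. Returns True if one was.
        removed_xid = self.pending.pop(block, None)
        if removed_xid is None:
            return False
        self.resolved.append((block, removed_xid))
        return True


standby = DeferredReuse()
standby.replay_reuse_record(block=7, removed_xid=950)
print("pending after reuse record:", standby.pending)
standby.replay_page_write(block=7)
print("resolved after page write:", standby.resolved)
```

A time-based flush of the pending set ("some time later") could be added
on top of this; it is left out to keep the sketch short.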

Got time for a patch in this coming CF?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

