Re: Strange query stalls on replica in 9.3.9 - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Strange query stalls on replica in 9.3.9
Date
Msg-id CAMkU=1yewz7mZ37EEh_98pc3p3PKL9eKNWxLFmaC2zhrYyZTnA@mail.gmail.com
Whole thread Raw
In response to Re: Strange query stalls on replica in 9.3.9  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance

On Fri, Aug 14, 2015 at 9:54 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Fri, Aug 14, 2015 at 9:34 AM, Josh Berkus <josh@agliodbs.com> wrote:

On 08/13/2015 01:59 PM, Jeff Janes wrote: execute on the

> Once the commit of the whole-table update has replayed, the problem
> should go way instantly because at that point each backend doing the
> seqscan will find the the transaction has committed and so will set the
> hint bit that means all of the other seqscan backends that come after it
> can skip the proc array scan for that tuple.

Yes ... and given that the commit on the master took < 3 seconds, it's
not likely to take 30 seconds on the replica.  That aside, the pattern
of behavior does look similar to the planner issue.

Another thought.  Who actually sets the hint bits on a replica?  

Do the read-only processes on the replica which discovers a tuple to have been securely committed set the hint bits?

My benchmarking suggests not.

The hint bits only get set if the commit lsn of the transaction of the tuple being hinted (*not* the page lsn) thinks it has already been flushed to WAL. On master the transaction commit record usually would have already flushed its own WAL, or if async then wal writer is going to take care of this fairly soon if nothing else gets to it first.

On the standby, it looks like the only thing that updates the thinks-it-has-been-flushed-to marker (which is stored in the control file, rather than memory) is either the eviction of a dirty buffer, or the completion of a restartpoint.  I could easily be wrong on that, though.  

In any case, you empirically can have committed but unhintable tuples hanging around for prolonged amounts of time on the standby.  Perhaps standbys need a wal writer process.

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Strange query stalls on replica in 9.3.9
Next
From: "Graeme B. Bell"
Date:
Subject: Re: incredible surprise news from intel/micron right now...