Re: Hot Standby startup with overflowed snapshots - Mailing list pgsql-hackers

From Chris Redekop
Subject Re: Hot Standby startup with overflowed snapshots
Date
Msg-id CAC2SuRJN1+qk0gVzd1mr_e5qoGL6vXY4t-BzpGkqunB1kK7xMw@mail.gmail.com
In response to Re: Hot Standby startup with overflowed snapshots  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Hot Standby startup with overflowed snapshots
List pgsql-hackers
Hmm, still basically the same behaviour. I think it might be a *little* better with this patch. Before, under load, it would start up quickly maybe 2 or 3 times out of 10 attempts; with this patch it might be up to 4 or 5 times out of 10, though that could just be a fluke. I'm still only seeing your log statement a single time (I'm running at debug2).

I have discovered something, though: when the standby is in this state, if I force a checkpoint on the primary then the standby comes right up.

Is there anything I can check or try to help you figure this out? Or is it actually by design that it could take 10-ish minutes to start up even after all clients have disconnected from the primary?


On Thu, Oct 27, 2011 at 11:27 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, Oct 27, 2011 at 5:26 PM, Chris Redekop <chris@replicon.com> wrote:

> Thanks for the patch Simon, but unfortunately it does not resolve the issue
> I am seeing.  The standby still refuses to finish starting up until long
> after all clients have disconnected from the primary (>10 minutes).  I do
> see your new log statement on startup, but only once - it does not repeat.
> Is there any way for me to see what the oldest xid on the standby is via
> controldata or something like that?  The standby does stream to keep up with
> the primary while the primary has load, and then it becomes idle when the
> primary becomes idle (when I kill all the connections), so it appears to
> be current, but it just doesn't finish starting up.
> I'm not sure if it's relevant, but after it has sat idle for a couple
> minutes I start seeing these statements in the log (with the same offset
> every time):
> DEBUG:  skipping restartpoint, already performed at 9/95000020
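
To confirm that the restartpoint location really is identical on every repeat, the standby log can be scanned for the message quoted above. The following is only a rough sketch: the helper names `parse_lsn` and `restartpoint_stuck` are made up for illustration, and combining the two hex halves of the location into one integer is an assumption made purely so that values are easy to compare.

```python
import re

# Matches the DEBUG message quoted above. The WAL location is printed as
# two hexadecimal halves separated by a slash, e.g. "9/95000020".
RESTARTPOINT_RE = re.compile(
    r"skipping restartpoint, already performed at ([0-9A-Fa-f]+)/([0-9A-Fa-f]+)"
)


def parse_lsn(line):
    """Return the location in a log line as one integer, or None if absent."""
    match = RESTARTPOINT_RE.search(line)
    if match is None:
        return None
    high, low = (int(part, 16) for part in match.groups())
    # Assumption: pack the halves as (high << 32) | low, for comparison only.
    return (high << 32) | low


def restartpoint_stuck(lines):
    """True if at least two skip messages were seen and all share one location."""
    locations = [lsn for lsn in map(parse_lsn, lines) if lsn is not None]
    return len(locations) >= 2 and len(set(locations)) == 1
```

Feeding the standby's log lines to `restartpoint_stuck` returns `True` only when the message has repeated with a single location, which is the situation described here.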

OK, so it looks like there are 2 opportunities to improve, not just one.

Try this.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
