Re: [REVIEW] pg_last_xact_insert_timestamp - Mailing list pgsql-hackers

From Greg Smith
Subject Re: [REVIEW] pg_last_xact_insert_timestamp
Date
Msg-id 4EE67B73.20407@2ndQuadrant.com
Whole thread Raw
In response to Re: [REVIEW] pg_last_xact_insert_timestamp  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 12/12/2011 08:45 AM, Robert Haas wrote:
> But I'm skeptical that anything that we only update once per 
> checkpoint cycle will help much in
> calculating an accurate lag value.

I'm sure there is no upper bound on how much WAL lag you can build up 
between commit/abort records either; they can be far less frequent than 
checkpoints.  All it takes is a multi-hour COPY with no other commits to 
completely hose lag measured by that advance, and that is not an unusual 
situation at all.  Overnight daily ETL or reporting MV-ish roll-ups, 
scheduled specifically for when no one is normally at the office, are 
the first thing that spring to mind.

Anyway, I wasn't suggesting checkpoints as anything other than a worst 
case behavior.  We can always thump out more frequent updates to reduce 
lag, and in what I expect to the most common case the WAL send/receive 
stuff will usually do much better.  I see the XID vs. WAL position UI 
issues as being fundamentally unsolvable, which really bothers me.  I'd 
have preferred to run screaming away from this thread if it hadn't.

> It also strikes me that anything that is based on augmenting the walsender/walreceiver protocol leaves
> anyone who is using WAL shipping out in the cold.  I'm not clear from
> the comments you or Simon have made how important you think that use
> case still is.
>    

There's a number of reasons why we might want more timestamps streamed 
into the WAL; this might be one.  We'd just need one to pop out one as 
part of the archive_timeout switch to in theory make it possible for 
these people to be happy.  I think Simon was hoping to avoid WAL 
timestamps, I wouldn't bet too much on that myself.  The obvious 
implementation problem here is that the logical place to put the 
timestamps is right at the end of the WAL file, just before it's closed 
for archiving.  But that position isn't known until you've at least 
started processing it, which you clearly are not doing fast enough if 
lag exists.

As far as who's still important here, two observations.  Note that the 
pg_last_xact_insert_timestamp approach can fail to satisfy WAL shipping 
people who are going to a separate network, where it's impractical to 
connect to both servers with libpq.  I have some customers who like 
putting a one-way WAL wall (sorry) between production and the standby 
server, with the log shipping being the only route between them; that's 
one reason why they might still be doing this instead of using 
streaming.  There's really no good way to make these people happy and 
provide time lag monitoring inside the database.

I was actually the last person I recall who suggested some extra 
monitoring mainly aimed at WAL shipping environments:  
http://archives.postgresql.org/pgsql-hackers/2010-01/msg01522.php  Had 
some pg_standby changes I was also working on back then, almost two 
years ago.  I never circled back to any of it due to having zero demand 
since 9.0 shipped, the requests I had been regularly getting about this 
all dried up.  While I'm all for keeping new features working for 
everyone when it doesn't hold progress back, it's not unreasonable to 
recognize we can't support every monitoring option through all of the 
weird ways WAL files can move around.  pg_stat_replication isn't very 
helpful for 9.0+ WAL shippers either, yet they still go on doing their 
thing.

In the other direction, people who will immediately adopt the latest 
hotness, cascading is a whole new layer of use case concerns on top of 
the ones considered so far.  Now you're talking two layers of 
connections users have to navigate though to compute master->cascaded 
standby lag.  Cascade the WALSender timestamps instead, which seems 
pretty simple to do, and then people can just ask their local standby.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



pgsql-hackers by date:

Previous
From: Alexander Shulgin
Date:
Subject: WIP: URI connection string support for libpq
Next
From: Peter van Hardenberg
Date:
Subject: Re: WIP: URI connection string support for libpq