Re: Streaming replication status - Mailing list pgsql-hackers
From | Greg Smith |
---|---|
Subject | Re: Streaming replication status |
Date | |
Msg-id | 4B4D88A5.2050905@2ndquadrant.com Whole thread Raw |
In response to | Re: Streaming replication status (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>) |
Responses |
Re: Streaming replication status
|
List | pgsql-hackers |
Stefan Kaltenbrunner wrote: > so is there an actually concrete proposal of _what_ interals to expose? ' The pieces are coming together...summary: -Status quo: really bad, but could probably ship anyway because existing PITR is no better and people manage to use it -Add slave pg_current_xlog_location() and something like pg_standby_received_xlog_location(): Much better, gets rid of the worst issues here. -Also add pg_standbys_xlog_location() on the master: while they could live without it, this really helps out the "alert/monitor" script writer whose use cases keep popping up here. Details...the original idea from Fujii was: "I'm thinking something like pg_standbys_xlog_location() [on the primary] which returns one row per standby servers, showing pid of walsender, host name/ port number/user OID of the standby, the location where the standby has written/flushed WAL. DBA can measure the gap from the combination of pg_current_xlog_location() and pg_standbys_xlog_location() via one query on the primary." After some naming quibbles and questions about what direction that should happen in, Tom suggested the initial step here is: "It seems to me that we should have at least two functions available on the slave: latest xlog location received and synced to disk by walreceiver (ie, we are guaranteed to be able to replay up to here); and latest xlog location actually replayed (ie, the state visible to queries on the slave). The latter perhaps could be pg_current_xlog_location()." So there's the first two of them: on the slave, pg_current_xlog_location() giving the latest location replayed, and a new one named something like pg_standby_received_xlog_location(). If you take the position that an unreachable standby does provide answers to these questions too (you just won't like them), this pair might be sufficient to ship. To help a lot at dealing with all the error situations where the standby isn't reachable and segments are piling up (possibly leading to full disk), the next figure that seems to answer the most questions is asking the primary "what's the location of the last WAL segment file in the pile of ones to be archived/distributed that has been requested (or processed if that's the easier thing to note) by the standby?". That's what is named pg_standbys_xlog_location() in the first paragraph I quoted. If you know enough to identify that segment file on disk, you can always look at its timestamp (and the ones on the rest of the files in that directory) in a monitoring script to turn that information into segments or a time measurement instead--xlog segments are nicely ordered after all. -- Greg Smith 2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.com
pgsql-hackers by date: