Re: Reporting the commit LSN at commit time - Mailing list pgsql-hackers
From | Craig Ringer |
---|---|
Subject | Re: Reporting the commit LSN at commit time |
Date | |
Msg-id | 53E4227E.6030702@2ndquadrant.com Whole thread Raw |
In response to | Re: Reporting the commit LSN at commit time (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On 08/07/2014 11:42 PM, Robert Haas wrote: > I doubt whether it makes sense to do this without a broader > understanding of how the client-side failover mechanism would work. > If we're going to add something like this, it should include libpq > support for actually doing something useful with it. I'm currently interested in targeting PgBouncer and PgJDBC, not libpq, though I can see that exposing helpers for it in libpq could be useful. The goal here is to permit a client to safely switch from one server to another - either in a multimaster async replication system like BDR, or routing read-only queries to hot standbys with streaming replication - and know for sure that its last commit is visible on the server it is now connected to. For hot standby that means it can avoid running queries that won't see the latest work it did if the standby is lagging - deciding to run them on the upstream instead, or wait, as appropriate. For BDR it'll permit the client to safely perform transparent failover to another node and resume write operations without risking conflicts with its own prior transactions . (I wrote some explanations about this on -general in the thread here: http://www.postgresql.org/message-id/84184AEF-887D-49DF-8F47-6377B1D6EE9F@gmail.com ). Broadly, what I'm thinking of is: * Whenever a client issues a transaction that gets a txid assigned, and that tx commits, the server reports the LSN that includes the commit. * The client keeps track of which server it is connected to using the server's (sysid, timelineid, databaseoid) or a similar identifier - probably specific to the replication protocol in use, unless something generic proves practical. * When the client has to switch to a new server or chooses to do so, it checks pg_stat_replication or pg_replication_slots, finds the server it was previously connected to, and checks to see if the new server has replayed up to the last write transaction this client performed on the previous server. If not, it can make a policy-driven decision: wait until replay catchup, wait for a while then bail out, etc. This is admittedly all a bit hand-wavey. I'm looking at ways to do it, not a firm implementation plan. Notably, the LSN (and timelineID) aren't the only way to keep track of the replay progress of a server and check it from another server. If the commit timestamps work is merged and the timestamp of the last replayed commit record is exposed in pg_replication_slots, the client could use the server-reported commit timestamp to the same effect. In the above you'll note that the client has to make some choices. The client might be picking different servers for failover, read load spreading, or other things I haven't thought of. It might be retaining the old connection and making new ones it wants to be consistent up to a certain point on the old connection (read spreading), or it might be dropping the old connection and making a new one (failover). If the new server to connect to isn't caught up yet it might want to wait indefinitely, wait a short while, or bail out immediately and try a different server. There's a lot of client/application specific policy going to be involved here, so I'm not sure it makes sense to try to make it transparent in libpq. I can see it being useful to expose some tools in libpq for it, without a doubt, so clients can do these sorts of things usefully. (There's also another whole new question: how do you pick which alternative server to connect to? But that's not really within the scope of this.) -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: