Re: Transaction Snapshots and Hot Standby - Mailing list pgsql-hackers
From | Hannu Krosing |
---|---|
Subject | Re: Transaction Snapshots and Hot Standby |
Date | |
Msg-id | 1221219702.7026.41.camel@huvostro |
In response to | Re: Transaction Snapshots and Hot Standby (Hannu Krosing <hannu@2ndQuadrant.com>) |
Responses | Re: Transaction Snapshots and Hot Standby |
List | pgsql-hackers |
On Fri, 2008-09-12 at 12:31 +0300, Hannu Krosing wrote:
> On Fri, 2008-09-12 at 09:45 +0100, Simon Riggs wrote:
> > On Thu, 2008-09-11 at 15:42 +0300, Heikki Linnakangas wrote:
> > > Gregory Stark wrote:
> > > > b) vacuum on the server which cleans up a tuple the slave has in scope has to
> > > > block WAL reply on the slave (which I suppose defeats the purpose of having
> > > > a live standby for users concerned more with fail-over latency).
> > >
> > > One problem with this, BTW, is that if there's a continuous stream of
> > > medium-length transaction in the slave, each new snapshot taken will
> > > prevent progress in the WAL replay, so the WAL replay will advance in
> > > "baby steps", and can fall behind indefinitely. As soon as there's a
> > > moment that there's no active snapshot, it can catch up, but if the
> > > slave is seriously busy, that might never happen.
> >
> > It should be possible to do mixed mode.
> >
> > Stall WAL apply for up to X seconds, then cancel queries. Some people
> > may want X=0 or low, others might find X = very high acceptable (Merlin
> > et al).
>
> Or even milder version.
>
> * Stall WAL apply for up to X seconds,
> * then stall new queries, let old ones run to completion (with optional
>   fallback to canceling after Y sec),
> * apply WAL.
> * Repeat.

Now that I have thought a little more about delegating the keeping of
old versions to filesystem-level (ZFS, XFS+LVM) snapshots, I'd like to
propose the following:

0. run queries and apply WAL freely until WAL application would
   remove old rows.
1. stall applying WAL for up to N seconds
2. stall starting new queries for up to M seconds
3. if some backends are still running long queries, then
   3.1. make a filesystem-level snapshot (FS snapshot),
   3.2. mount the FS snapshot somewhere (maybe as data.at.OldestXmin
        in parallel to $PGDATA) and
   3.3. hand this mounted FS snapshot over to those backends
4. apply WAL
5. go to 0.

Of course, we need to make the filesystem-level snapshot in step 3 only
if the long-running queries don't already have one given to them. Or
maybe they also need a new one if they are running in READ COMMITTED
mode and have acquired a new PG snapshot since they got their FS
snapshot.

Also, snapshots need to be reference counted, so we can unmount and
destroy them once all their users have finished.

I think that enabling long-running queries this way is both low-hanging
fruit (or at least medium-height-hanging ;) ) and also consistent with
the PostgreSQL philosophy of not duplicating effort. As an example, we
trust the OS's file system cache and don't try to write our own.

----------------
Hannu
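To make steps 3.1 to 3.3 and the reference counting a bit more concrete, here is a minimal Python sketch. It is not taken from any patch, and everything in it is an assumption made for illustration: the names FsSnapshot, SnapshotRegistry, hand_over and the actions list are hypothetical, and the real ZFS or LVM snapshot, mount, unmount and destroy calls are replaced by entries appended to a list.

```python
import posixpath
from dataclasses import dataclass, field


@dataclass
class FsSnapshot:
    """One mounted FS snapshot, shared by the backends still reading from it."""
    mount_point: str
    refcount: int = 0


@dataclass
class SnapshotRegistry:
    """Reference-counted FS snapshots, keyed by OldestXmin (hypothetical design)."""
    pgdata: str
    snapshots: dict = field(default_factory=dict)  # oldest_xmin -> FsSnapshot
    actions: list = field(default_factory=list)    # stand-in for real FS commands

    def acquire(self, oldest_xmin):
        """Steps 3.1 and 3.2: create and mount a snapshot unless one exists."""
        snap = self.snapshots.get(oldest_xmin)
        if snap is None:
            # Mount next to $PGDATA, e.g. /var/lib/pgsql/data.at.1000
            mount = posixpath.join(
                posixpath.dirname(self.pgdata), f"data.at.{oldest_xmin}"
            )
            self.actions.append(("snapshot+mount", mount))
            snap = self.snapshots[oldest_xmin] = FsSnapshot(mount)
        snap.refcount += 1
        return snap

    def release(self, oldest_xmin):
        """Drop one user; unmount and destroy once the last user is gone."""
        snap = self.snapshots[oldest_xmin]
        snap.refcount -= 1
        if snap.refcount == 0:
            self.actions.append(("unmount+destroy", snap.mount_point))
            del self.snapshots[oldest_xmin]


def hand_over(registry, backends, oldest_xmin):
    """Step 3.3: give each long-running backend without an FS snapshot one.

    backends maps a backend id to the key of the FS snapshot it already
    holds, or None if it has none yet.
    """
    for backend_id, held in backends.items():
        if held is None:
            registry.acquire(oldest_xmin)
            backends[backend_id] = oldest_xmin
    return backends
```

In this sketch a backend that already holds an FS snapshot is left alone, so one snapshot is made per replay cycle at most, and WAL can be applied (step 4) as soon as hand_over returns. The READ COMMITTED case, where a backend may need a fresh FS snapshot, is not modelled.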