Re: Transaction Snapshots and Hot Standby - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Transaction Snapshots and Hot Standby |
Date | |
Msg-id | 1222328406.4445.711.camel@ebony.2ndQuadrant Whole thread Raw |
In response to | Re: Transaction Snapshots and Hot Standby (Bruce Momjian <bruce@momjian.us>) |
List | pgsql-hackers |
On Wed, 2008-09-24 at 21:19 -0400, Bruce Momjian wrote: > Simon Riggs wrote: > > 2. Master ignores Standby's OldestXmin > > Effects: > > * Long running queries on standby... > > Have no effect on primary > > Can delay apply of WAL records on standby > > * Queries on standby give consistent answers in all cases. > > Just for clarification, if you set a max_slave_delay it means it is the > maximum amount of time WAL replay can be delayed on the slave, _and_ it > is the maximum amount of time a query/snapshot can be guaranteed to run > without the possibility of being canceled. My point is that these two > concepts are linked to the same setting. I didn't even see them as separate, but now you mention it they could be. Startup process may need to wait up to max_slave_delay before applying WAL, if it sees that it must cancel a query to continue. max_slave_delay is set in postgresql.conf and honoured by the startup process. What is not yet defined is whether max_slave_delay is a setting per-blockage event, or a total time that could occur because of a single long blockage or many smaller ones. The latter makes more sense, but would only be sensible to calculate it when we have WAL streaming using calculations similar to the bgwriter delay calculations. That definition makes sense for the sysadmin because it is the amount of extra time a standby could take to startup after a failover. If we take max_slave_delay to mean the latter then there is no guarantee for user queries, since we may already have used up all our time waiting on one query and there may be no wait time left for the current user query. It would depend heavily on the rows accessed, so queries might easily run to completion even though they exceed the max_slave_delay. That makes sense because if you are asking a question like "what is the current total of widgets available" you understand that answer changes quickly over time, whereas the "how many widgets were reordered on day X" doesn't change over time at all. If you define this as a "per wait event" setting it provides a guarantee to the user queries, but also puts WAL apply into free-fall since you can't control number of blockage points queries might cause. >From what Merlin was saying, it would be sensible to have multiple slaves: one with a max_slave_delay of 30 seconds to be your HA fast-startup standby and another where we set max_slave_delay to 5 hours to guarantee execution time for large reporting queries. Anyway, its clear that max_slave_delay needs to be settable while running. >From my perspective, all I can say is "all things are possible" and I appreciate the need for options to satisfy different use cases. I'll build the basic mechanics and then we can add fine tuning over time. So I'll get the basic mechanics in place, suggest we observe how that works and then decide on the control mechanisms because they sound relatively simple to add. Another important note is Hannu's suggestion of using snapshot filesystems with Postgres. We can include that feature very quickly and it will complement Hot Standby very well. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
pgsql-hackers by date: